What's Missing Between LLMs and AGI - Vishal Misra & Martin Casado - The a16z Show Recap
Podcast: The a16z Show
Published: 2026-03-17
Duration: 48 min
Guests: Vishal Misra
What Happened
Vishal Misra discusses the limitations of large language models (LLMs) in achieving artificial general intelligence (AGI). According to Misra, LLMs like GPT-3 are excellent at pattern matching and correlation but fail to build models of cause and effect, which is essential for AGI. He explains that while LLMs can perform Bayesian updating, they lack the ability to retain learning between sessions, a key aspect of human intelligence.
Misra recounts getting GPT-3 to translate natural language into a domain-specific language (DSL) it had never encountered, an approach he applied in a project for ESPN and a striking illustration of in-context learning. He cautions, however, that this does not amount to true understanding or intelligence: the model is simply updating probabilities conditioned on the prompt.
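The mechanics of in-context learning described above can be sketched as a few-shot prompt: worked examples placed before the query condition the model's next-token distribution toward the DSL's syntax without any weight updates. The DSL and example queries below are invented for illustration; they are not the actual language used in Misra's ESPN project.

```python
# Hypothetical few-shot prompt for NL -> DSL translation.
# The DSL and examples are invented; Misra's ESPN system used its own query language.

FEW_SHOT_EXAMPLES = [
    ("How many runs did the batter score in 2019?",
     "SELECT runs WHERE role=batter AND year=2019"),
    ("List matches won by the home team",
     "SELECT match WHERE winner=home"),
]

def build_prompt(question: str) -> str:
    """Assemble a few-shot prompt: each example pair conditions the model's
    next-token probabilities (Bayesian updating in context, not training)."""
    lines = [f"Q: {nl}\nDSL: {dsl}" for nl, dsl in FEW_SHOT_EXAMPLES]
    lines.append(f"Q: {question}\nDSL:")  # open slot for the model to complete
    return "\n\n".join(lines)

prompt = build_prompt("Which bowler took the most wickets?")
print(prompt.count("DSL:"))  # → 3: two worked examples plus the open slot
```

The point of the sketch is that everything the model "learns" about the DSL lives in the prompt string; start a fresh session without the examples and the capability is gone, which is exactly the retention gap Misra describes.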
The conversation delves into Misra's work on a mathematical model of how LLMs, and transformers in particular, operate. He describes a conceptual matrix in which each row corresponds to a possible prompt and each column to a vocabulary token, so that a row gives the probability distribution over the next token. This abstraction helps explain the mechanics of LLMs and why they fall short of AGI.
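A toy version of this matrix can make the abstraction concrete. The "model" below is a hand-made bigram table over a three-token vocabulary (all numbers invented for the sketch); enumerating every two-token prompt yields a small matrix whose rows are next-token distributions, standing in for the conceptually enormous matrix Misra describes.

```python
import itertools
import numpy as np

VOCAB = ["a", "b", "c"]
BIGRAM = {  # P(next | last token); probabilities invented for illustration
    "a": [0.1, 0.6, 0.3],
    "b": [0.5, 0.2, 0.3],
    "c": [0.3, 0.3, 0.4],
}

# One row per possible 2-token prompt, one column per vocabulary token.
prompts = ["".join(p) for p in itertools.product(VOCAB, repeat=2)]  # 9 prompts
matrix = np.array([BIGRAM[p[-1]] for p in prompts])

assert matrix.shape == (len(prompts), len(VOCAB))
assert np.allclose(matrix.sum(axis=1), 1.0)  # every row is a valid distribution
```

For a real LLM the row index ranges over all possible prompts, so the matrix can never be materialized; the value of the abstraction is analytical, not computational.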
Misra and Martin Casado discuss the 'Bayesian wind tunnel', a setup for testing AI architectures on tasks whose solutions cannot be memorized. Stripping away memorization reveals the true Bayesian updating capabilities of transformers compared with other architectures such as MLPs and LSTMs.
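One way such a wind tunnel can work, sketched here under assumed details (the specific task and prior are not given in the episode): generate data from a known process, compute the exact Bayesian posterior predictive in closed form, and score an architecture's next-token probabilities against that ground truth. A coin with a fresh hidden bias each episode makes memorization useless.

```python
import numpy as np

rng = np.random.default_rng(0)
theta = rng.uniform()            # hidden coin bias, redrawn every episode
seq = rng.random(20) < theta     # observed binary sequence

def bayes_predictive(prefix) -> float:
    """Exact P(next=1 | prefix) under a uniform Beta(1,1) prior:
    the rule of succession, (heads + 1) / (n + 2)."""
    n = len(prefix)
    return (np.sum(prefix) + 1) / (n + 2)

# A trained transformer would see the same prefixes; the wind tunnel measures
# how closely model(prefix) tracks bayes_predictive(prefix).
preds = [bayes_predictive(seq[:t]) for t in range(len(seq))]
assert preds[0] == 0.5                      # no data yet: prior predictive is 1/2
assert all(0.0 < p < 1.0 for p in preds)    # predictions stay proper probabilities
```

Because the optimal answer is computable exactly, any gap between an architecture's predictions and the closed-form posterior is attributable to the architecture itself, not to ambiguity in the task.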
The episode also touches on the current state of AI and how the architecture of LLMs is better suited to finding correlations than to simulating causal relationships. Misra argues that simply scaling up models will not fix this fundamental limitation, and that a shift from correlation to causation is necessary.
Misra highlights the need for continual learning and causality to bridge the gap to AGI. He suggests that current models lack the plasticity needed to update their learning over time without risking catastrophic forgetting of previous knowledge.
He points to the work of Judea Pearl and his causal hierarchy as a potential framework to move from correlation to causation. Misra believes this could provide a theoretical underpinning for the next generation of AI models capable of simulating and intervening in complex environments.
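The gap between the first two rungs of Pearl's hierarchy, seeing versus doing, can be shown with a toy simulation. The causal structure and all probabilities below are invented for the sketch: a confounder Z drives both X and Y, so the observational association P(Y=1 | X=1) overstates the interventional effect P(Y=1 | do(X=1)), which a correlation-only learner cannot distinguish.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Invented structural model: Z -> X, Z -> Y, X -> Y.
z = rng.random(n) < 0.5
x_obs = rng.random(n) < np.where(z, 0.9, 0.1)        # confounder Z strongly drives X
y_obs = rng.random(n) < 0.2 + 0.3 * x_obs + 0.4 * z  # both X and Z raise P(Y=1)

seeing = y_obs[x_obs].mean()        # rung 1: P(Y=1 | X=1), observational

x_do = np.ones(n, dtype=bool)       # rung 2: do(X=1) severs the Z -> X edge
y_do = rng.random(n) < 0.2 + 0.3 * x_do + 0.4 * z
doing = y_do.mean()                 # P(Y=1 | do(X=1)), interventional

# Confounding inflates the observed association well above the causal effect
# (analytically ~0.86 vs ~0.70 for these invented parameters).
assert seeing > doing + 0.05
```

A model trained only on (x_obs, y_obs) pairs recovers `seeing`; answering what happens under intervention requires the causal structure, which is the shift Misra is calling for.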
In conclusion, Misra emphasizes that while LLMs are part of the solution, new architectures or mechanisms will be needed to achieve AGI. He remains optimistic about future advancements but stresses the importance of focusing research on causality and continual learning.
Key Insights
- Vishal Misra highlights that LLMs, including GPT-3, excel at pattern matching but cannot build models of cause and effect, a capability he considers necessary for AGI.
- Misra's work involves a 'Bayesian wind tunnel' to test AI architectures like transformers, revealing their precise Bayesian updating capabilities, unlike LSTMs and MLPs.
- The conversation underscores that LLMs perform Bayesian inference but lack the plasticity to retain learning between sessions, a key difference from human cognition.
- Misra argues for a paradigm shift from correlation to causation in AI research, suggesting that Judea Pearl's causal hierarchy could guide the development of more advanced AI models.