977: Attention, World Models and the Future of AI, with Prof. Kyunghyun Cho (Recap)
Podcast: Super Data Science: ML & AI Podcast with Jon Krohn
Published: 2026-03-24
Duration: 4694 seconds (about 78 minutes)
Guests: Kyunghyun Cho
What Happened
Professor Kyunghyun Cho, a leading figure in AI with over 200,000 citations, discusses the future trajectory of the field, emphasizing active data collection. He argues that current models have already absorbed most of the correlations available in passively collected data, so the harder problem now is deciding which data to gather actively in order to yield genuinely new insights.
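To make "actively choosing which data to collect" concrete, here is a minimal uncertainty-sampling sketch in Python. It is a generic active-learning heuristic, not Cho's specific proposal, and the `predict_proba` interface is an assumption borrowed from scikit-learn-style classifiers:

```python
import numpy as np

def pick_most_informative(model, unlabeled_pool, batch_size=10):
    """Select the unlabeled examples the model is least certain about.

    Assumes `model` exposes predict_proba(X) -> (n_samples, n_classes),
    as scikit-learn classifiers do. This is a standard active-learning
    heuristic (uncertainty sampling), shown only to illustrate the idea
    of choosing which data to collect next.
    """
    probs = model.predict_proba(unlabeled_pool)               # (n, k) class probabilities
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)  # predictive entropy per example
    return np.argsort(entropy)[-batch_size:]                  # indices of the most uncertain
```

The selected examples would then be labeled (or otherwise gathered) and added to the training set, closing the active-collection loop.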
Cho delves into the debate on world models, questioning whether AI requires high-fidelity, step-by-step imagination or whether a high-level latent representation suffices. His collaboration with Yann LeCun on 'Planning with Latent Dynamics Models' suggests that agents can plan directly over abstract latent states, improving how they handle complex environments.
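As a rough illustration of that idea (not the paper's actual architecture), the sketch below plans by rolling candidate action sequences forward in a learned latent space; `encode`, `dynamics`, and `cost` are placeholder callables standing in for learned networks:

```python
def plan_in_latent_space(encode, dynamics, cost, obs, candidate_plans):
    """Score candidate action sequences by imagining them in latent space.

    encode(obs) -> z maps an observation to a latent state;
    dynamics(z, a) -> z' predicts the next latent state;
    cost(z) scores a latent state. All three would be learned models in
    practice. Note that no pixel-level, step-by-step reconstruction is
    needed: planning happens entirely on abstract latent states.
    """
    best_plan, best_cost = None, float("inf")
    for actions in candidate_plans:      # e.g. randomly sampled action sequences
        z = encode(obs)
        total = 0.0
        for a in actions:
            z = dynamics(z, a)           # imagine the next latent state
            total += cost(z)             # accumulate predicted cost
        if total < best_cost:
            best_plan, best_cost = actions, total
    return best_plan
```

In practice this loop would sit inside a model-predictive-control setup, replanning after each executed action.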
A surprising revelation from Cho's teaching experience is that 80% of his computer science students had never installed a coding agent, despite companies like Google and Microsoft offering students free access to these tools. This points to a gap between the availability of AI tooling and its practical uptake.
Cho recounts the inception of the attention mechanism, introduced in a 2014 paper he co-authored that significantly influenced the later Transformer architecture. The mechanism, which lets a model focus on the most relevant parts of its input, grew out of a collaboration between Cho and then-intern Dzmitry (Dima) Bahdanau.
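The core computation from that 2014 paper is compact enough to sketch. Below is additive ("Bahdanau") attention in NumPy; the dimensions and randomly initialized parameters in the toy usage are illustrative stand-ins for learned weights:

```python
import numpy as np

def additive_attention(decoder_state, encoder_states, W_s, W_h, v):
    """Bahdanau-style additive attention.

    decoder_state:  (d_dec,)    current decoder hidden state s
    encoder_states: (T, d_enc)  encoder hidden states h_1..h_T
    W_s, W_h, v:    learned projection parameters (random stand-ins here)
    Returns the context vector: an attention-weighted sum of encoder states.
    """
    # Alignment scores: e_t = v . tanh(W_s s + W_h h_t) for each source position t
    scores = np.tanh(encoder_states @ W_h.T + decoder_state @ W_s.T) @ v
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                  # softmax over source positions
    return weights @ encoder_states           # context vector, shape (d_enc,)

# Toy usage with random parameters (all dimensions are illustrative)
rng = np.random.default_rng(0)
T, d_enc, d_dec, d_att = 5, 8, 8, 4
context = additive_attention(
    rng.normal(size=d_dec),                   # decoder state s
    rng.normal(size=(T, d_enc)),              # encoder states h_1..h_T
    rng.normal(size=(d_att, d_dec)),          # W_s
    rng.normal(size=(d_att, d_enc)),          # W_h
    rng.normal(size=d_att),                   # v
)
```

The Transformer later replaced this additive scoring with scaled dot products, but the core idea of softmax-weighted focus over the input is the same.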
The episode also covers the importance of sample efficiency in AI, contrasting current algorithms' need for large data volumes with biological systems' ability to learn from fewer examples. Cho suggests that active data collection could improve AI's sample efficiency, potentially leading to more frequent breakthroughs.
Cho provides insights into the Global AI Frontier Lab at NYU, which he co-directs. Funded by the Korean government, the lab fosters international collaboration, particularly between Korean researchers and those in New York. This initiative aims to enhance global AI research efforts.
Key Insights
- Professor Kyunghyun Cho emphasizes the importance of active data collection in AI, highlighting that current models have exhausted most correlations in passive data. He believes the next big step involves actively choosing which data to collect for more meaningful insights.
- The debate on world models centers around whether AI needs detailed, step-by-step imagination or if a high-level latent representation is sufficient. Cho's collaborative work with Yann LeCun on 'Planning with Latent Dynamics Models' suggests that latent models can effectively enhance AI's understanding of complex environments.
- In teaching machine learning at NYU, Cho discovered that 80% of his students had never used coding agents, despite access provided by major tech companies. This reveals a gap in hands-on experience with AI tools, which Cho addresses through practical coursework.
- The attention mechanism, pivotal in the development of the Transformer architecture, originated from a 2014 paper co-authored by Cho. This innovation enables models to prioritize the most relevant parts of their input, and it significantly advanced natural language processing.