Captaining IMO Gold, Deep Think, On-Policy RL, Feeling the AGI in Singapore - Yi Tay - Latent Space: The AI Engineer Podcast Recap

Podcast: Latent Space: The AI Engineer Podcast

Published: 2026-01-23

Duration: 1 hr 32 min

Guests: Yi Tay

Summary

Yi Tay discusses his journey and insights from working on the AI effort that earned an IMO gold medal, the significance of on-policy reinforcement learning, and the impact of AI advancements in Singapore.

What Happened

Yi Tay returns to the podcast after 1.5 years, discussing his return to Google DeepMind (GDM) Singapore and his new team, Reasoning and AGI. He reflects on his experiences at Google, noting how seamless rejoining was and how much progress had been made since his departure. Tay expresses his enthusiasm for research, particularly reinforcement learning (RL), and describes his shift from architecture work to RL and reasoning models.

The episode delves into the intricacies of on-policy reinforcement learning, contrasting it with off-policy approaches. Tay emphasizes the importance of models learning from their own outputs rather than imitating others, drawing parallels with human learning and the Montessori approach. He shares his insights on the philosophical aspects of AI learning, stressing the significance of self-discovery in both AI models and human education.
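The on-policy versus imitation contrast can be illustrated with a toy sketch. This is not Tay's or Google DeepMind's method: it is a minimal three-armed bandit, assuming a softmax policy, a REINFORCE-style update, and a hypothetical demonstrator that always picks a suboptimal arm. The function names, reward values, and learning rate are all assumptions chosen for illustration.

```python
import math
import random

BEST_ARM = 2  # assumption: only this arm pays a reward in the toy setup


def softmax(logits):
    """Convert logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample(probs, rng):
    """Draw one action index from a categorical distribution."""
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1


def reward_fn(action):
    """Toy reward: 1.0 for the best arm, 0.0 otherwise."""
    return 1.0 if action == BEST_ARM else 0.0


def train_on_policy(steps, rng, lr=0.1):
    """On-policy: the policy samples its own action, then learns from the reward."""
    logits = [0.0, 0.0, 0.0]
    for _ in range(steps):
        probs = softmax(logits)
        action = sample(probs, rng)  # data comes from the current policy
        reward = reward_fn(action)
        # REINFORCE: the gradient of log pi(action) w.r.t. logits is onehot - probs
        logits = [
            l + lr * reward * ((1.0 if i == action else 0.0) - p)
            for i, (l, p) in enumerate(zip(logits, probs))
        ]
    return softmax(logits)


def train_imitation(steps, demo_action, lr=0.1):
    """Imitation: the policy copies a demonstrator's action and never sees a reward."""
    logits = [0.0, 0.0, 0.0]
    for _ in range(steps):
        probs = softmax(logits)
        # Cross-entropy gradient step toward the demonstrated action
        logits = [
            l + lr * ((1.0 if i == demo_action else 0.0) - p)
            for i, (l, p) in enumerate(zip(logits, probs))
        ]
    return softmax(logits)


if __name__ == "__main__":
    rng = random.Random(0)
    print("on-policy:", train_on_policy(2000, rng))
    print("imitation:", train_imitation(2000, demo_action=1))
```

In this toy setup the on-policy learner concentrates its probability on the rewarded arm because it learns from the outcomes of its own samples, while the imitation learner converges on whatever the demonstrator did, whether or not that choice was rewarded.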

Tay discusses the achievement of using AI to win a gold medal at the International Mathematical Olympiad (IMO). He explains the decision to move from AlphaProof to an end-to-end text model, framing it in terms of his belief in AI's potential to reach AGI. The result, he notes, was a collective effort, with contributions from teams across different time zones.

The conversation shifts to the challenges and potential of AI in domains such as coding and image generation. Tay highlights how much both have improved over the years, to the point that AI coding and image generation are now practical tools for professionals.

Tay also touches on the philosophical aspects of AI research, discussing the balance between imitation and innovation in AI learning. He shares his perspective on the implications of machine learning insights for human learning and the importance of updating one's learning rate to adapt to new paradigms.

The episode concludes with Tay's reflections on the importance of health and wellness in maintaining productivity in AI research. He shares his personal journey of achieving peak physical health and how it has positively impacted his work, underscoring the connection between physical well-being and intellectual performance.

Key Insights

In on-policy reinforcement learning, a model learns from its own outputs rather than imitating others, which Tay likens to the Montessori educational approach and its emphasis on self-directed learning.
AI achieved a gold medal at the International Mathematical Olympiad after the team moved from AlphaProof to an end-to-end text model, which Tay sees as evidence of AI's progress toward artificial general intelligence.
AI coding and image generation have improved substantially in recent years and are now practical tools for professionals.
Tay links physical health to productivity in AI research, citing the positive effect that reaching peak physical health has had on his own work.