#324 Sharon Zhou: Inside AMD's Plan to Build Self-Improving AI - Eye On A.I. Recap

Podcast: Eye On A.I.

Published: 2026-02-27

Duration: 46 min

Summary

Sharon Zhou discusses AMD's efforts in developing self-improving AI models that can optimize their own code and operations on GPUs. She highlights the importance of hardware design and kernel generation in advancing AI capabilities.

What Happened

In this episode of Eye On A.I., Sharon Zhou, Vice President of AI at AMD, delves into the concept of self-improving AI, particularly focusing on how these models can enhance their own performance by optimizing their underlying code. She explains that self-improvement encompasses various aspects, including data refinement and model architecture adjustments. Zhou emphasizes that her team is developing methods for AI to write the kernel code that allows models to run more efficiently on AMD GPUs, which is crucial for enabling broader access to AI computation.

Zhou elaborates on collaborative efforts within the industry, mentioning partnerships with organizations like Meta, Google DeepMind, and Nvidia to advance kernel generation using AI. The discussion highlights the significance of kernel optimization, particularly matrix multiplication, which is foundational for language models. Historically, these kernels were manually crafted by engineers with extensive knowledge of GPU architecture. With AI now writing or assisting with this code, the process is becoming far more efficient, a productivity leap that could significantly impact both AMD and the field of AI as a whole.

Key Questions Answered

What is self-improving AI according to Sharon Zhou?

Sharon Zhou defines self-improving AI as the capability of models to edit any part of themselves to enhance performance. This includes refining their training data, modifying their architecture, and evaluating their outputs. Specifically, her focus lies in how these models can optimize their execution speed on GPUs by generating kernel code that allows them to run more efficiently.

How does AMD's hardware design relate to AI model development?

While AMD is not directly involved in developing its own AI models, the design of AMD hardware plays a crucial role in enabling more efficient AI computations. Zhou notes that there is a strong focus on enhancing hardware capabilities to support the growing demands of AI, particularly as self-improving models require optimized environments to function effectively.

What role does kernel generation play in AI performance?

Kernel generation is essential for optimizing the execution of AI models on GPUs. Zhou explains that kernels are small pieces of software that execute specific tasks, such as matrix multiplications, which are central to the functioning of neural networks. By improving these kernels, AI models can achieve faster processing speeds and better performance, which is vital for scalability and efficiency.
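To make the kernel concept concrete, here is a minimal sketch of the computation a matmul kernel performs. This is illustrative only, not AMD's actual kernel code: real GPU kernels distribute this work across thousands of threads in HIP or CUDA, but the blocking (tiling) idea shown below, keeping small sub-matrices in fast memory, is the core optimization that kernel engineers, and now AI systems, tune.

```python
import numpy as np

def naive_matmul(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Textbook triple loop: computes one output element at a time.
    Correct but slow -- every inner-loop step re-reads memory."""
    n, k = a.shape
    k2, m = b.shape
    assert k == k2, "inner dimensions must match"
    out = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            for p in range(k):
                out[i, j] += a[i, p] * b[p, j]
    return out

def tiled_matmul(a: np.ndarray, b: np.ndarray, tile: int = 16) -> np.ndarray:
    """Blocked version: multiply tile-by-tile sub-matrices so each
    block fits in fast memory (registers / shared memory on a GPU).
    Choosing tile sizes to match the hardware is a large part of
    what hand-written -- and AI-generated -- kernels optimize."""
    n, k = a.shape
    _, m = b.shape
    out = np.zeros((n, m))
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for p0 in range(0, k, tile):
                out[i0:i0 + tile, j0:j0 + tile] += (
                    a[i0:i0 + tile, p0:p0 + tile]
                    @ b[p0:p0 + tile, j0:j0 + tile]
                )
    return out
```

Both functions produce the same result as NumPy's `a @ b`; the difference is memory access pattern, which dominates performance on real hardware.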

How is AI being used to assist kernel engineers?

AI is being utilized to either autonomously write kernel code or assist kernel engineers in the process. Zhou emphasizes that this shift represents a significant productivity gain, reducing the reliance on manual coding by highly specialized engineers who must understand both GPU architecture and the intricacies of the models being deployed.

What collaborative efforts exist in the field of AI kernel generation?

Zhou discusses AMD's involvement in collaborative initiatives with various institutions, including Meta and Google DeepMind. These partnerships are aimed at advancing techniques for generating kernels through AI agents, facilitating a more efficient framework for AI development. This collective approach is critical for pushing the boundaries of what is possible in AI and making advanced compute resources more accessible.