959: Building Agents 101: Design Patterns, Evals and Optimization (with Sinan Ozdemir) - Super Data Science: ML & AI Podcast with Jon Krohn Recap
Podcast: Super Data Science: ML & AI Podcast with Jon Krohn
Published: 2026-01-20
Duration: 1 hr 5 min
Guests: Sinan Ozdemir
Summary
This episode provides an in-depth look into building agentic AI systems, focusing on the distinctions between workflows and agents, evaluation metrics, and how to choose the right model size (parameter count) for specific tasks.
What Happened
Sinan Ozdemir joins Jon Krohn to discuss his latest book, 'Building Agentic AI', and provides a comprehensive overview of the key concepts in designing and deploying agentic systems. He explains the critical distinction between workflows and agents, highlighting that an agentic AI is an LLM with access to tools, allowing it to make decisions about which tools to use and in what order, unlike a workflow that follows a deterministic path.
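The workflow-vs-agent distinction can be made concrete in a few lines of code. The sketch below is illustrative only: the function names, the stand-in tools, and the stubbed "LLM" are hypothetical, not from the episode or any specific framework. The point is structural: the workflow hard-codes the tool sequence, while the agent lets the model choose which tool to call next.

```python
# Minimal sketch of the workflow-vs-agent distinction. All names here
# (run_workflow, run_agent, stub_llm, the toy tools) are illustrative.

def search(query: str) -> str:          # stand-in tool
    return f"results for {query!r}"

def summarize(text: str) -> str:        # stand-in tool
    return f"summary of {text!r}"

TOOLS = {"search": search, "summarize": summarize}

def run_workflow(query: str) -> str:
    """Workflow: a fixed, deterministic sequence of steps."""
    return summarize(search(query))

def stub_llm(history: list[str]) -> str:
    """Stand-in for an LLM that picks the next tool (or 'done')."""
    if not any(h.startswith("search:") for h in history):
        return "search"
    if not any(h.startswith("summarize:") for h in history):
        return "summarize"
    return "done"

def run_agent(query: str, max_steps: int = 5) -> list[str]:
    """Agent: the model decides which tool to call and in what order."""
    history: list[str] = []
    state = query
    for _ in range(max_steps):
        choice = stub_llm(history)      # the decision the workflow lacks
        if choice == "done":
            break
        state = TOOLS[choice](state)
        history.append(f"{choice}: {state}")
    return history
```

In a real system the stub would be a tool-calling LLM, and `max_steps` is the kind of guardrail that keeps an agent's open-ended decision loop bounded.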
Sinan describes how to decide whether a solution should be more deterministic or agentic by evaluating the existing process and its conditional elements. He also provides guidance on selecting the appropriate LLM parameter count, offering a breakdown of small, medium, and large models based on parameter size and their respective use cases.
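A model-size decision like the one described can be expressed as a simple routing function. The tier boundaries below are hypothetical placeholders for illustration, not the actual breakdown Sinan gives in the episode.

```python
# Hypothetical sketch of routing by parameter count; the thresholds
# are placeholders, not figures from the episode.

def model_tier(params_billions: float) -> str:
    if params_billions < 8:        # e.g. simple extraction, on-device use
        return "small"
    if params_billions < 70:       # e.g. classification, routine RAG
        return "medium"
    return "large"                 # e.g. open-ended reasoning
```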
The episode delves into the importance of evaluation metrics beyond accuracy, like precision and recall, and how these metrics expose different failure modes of a model. Sinan stresses the need for a comprehensive evaluation framework tailored to specific task types, such as retrieval, classification, and generation, to ensure AI reliability.
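For classification-style tasks, the point about failure modes is easy to see in code: precision penalizes false positives while recall penalizes false negatives, so a model can score well on one while failing badly on the other. This is a plain-Python sketch; in practice a library such as scikit-learn provides equivalent functions.

```python
# Precision and recall expose different failure modes than accuracy alone.

def precision_recall(y_true, y_pred, positive=1):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0  # penalizes false positives
    recall = tp / (tp + fn) if tp + fn else 0.0     # penalizes false negatives
    return precision, recall

# A model that always predicts positive gets perfect recall but poor
# precision -- a failure mode accuracy alone would partly hide.
p, r = precision_recall([1, 1, 0, 0], [1, 1, 1, 1])
```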
Sinan shares surprising insights from his research, such as the lack of correlation between reasoning time and model performance, challenging the assumption that more reasoning leads to better results. He also discusses the trade-offs involved in optimizing models for speed, cost, and accuracy, pointing out that quantization and distillation can lead to performance hits, effectively creating a 'cousin' model with different outputs.
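The "cousin model" point about quantization can be illustrated with a toy round-trip: mapping weights to 8-bit integers and back loses information, so the restored weights (and therefore the model's outputs) differ from the original. This is a simplified symmetric quantization scheme for illustration, not any specific library's implementation.

```python
# Toy illustration of quantization drift: int8 round-tripping perturbs
# weights, producing a "cousin" of the original model.

def quantize_int8(weights):
    scale = max(abs(w) for w in weights) / 127 or 1.0
    return [round(w / scale) for w in weights], scale

def dequantize(q, scale):
    return [v * scale for v in q]

weights = [0.013, -0.472, 0.891, -0.005]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The restored weights are close but not identical to the originals.
max_err = max(abs(w - r) for w, r in zip(weights, restored))
```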
Towards the end, Sinan emphasizes the value of a hybrid approach combining agentic systems with predefined workflows to manage complex tasks efficiently. He explains how this can enhance AI systems' reliability by providing a structured pathway with room for agentic decision-making.
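The hybrid pattern can be sketched as a deterministic pipeline with one delegated decision point: fixed steps before and after, an agentic choice in the middle. All names below are illustrative and not from any specific framework.

```python
# Sketch of a hybrid system: a fixed workflow spine with one step
# delegated to agentic decision-making. Names are illustrative.

def validate(order: dict) -> dict:
    order["valid"] = bool(order.get("items"))
    return order

def stub_agent_route(order: dict) -> str:
    """Stand-in for an LLM deciding how to handle the case."""
    return "fulfill" if order["valid"] else "escalate"

def fulfill(order: dict) -> dict:
    order["status"] = "fulfilled"
    return order

def escalate(order: dict) -> dict:
    order["status"] = "escalated"
    return order

HANDLERS = {"fulfill": fulfill, "escalate": escalate}

def process_order(order: dict) -> dict:
    order = validate(order)            # deterministic step
    choice = stub_agent_route(order)   # agentic decision point
    return HANDLERS[choice](order)     # deterministic step
```

Constraining the agent's choices to a known set of handlers is what keeps the overall pathway structured while still leaving room for model judgment.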
Throughout the conversation, Sinan's expertise in AI shines through, as he shares practical insights and real-world examples from his consulting work. His engaging storytelling and clear explanations make complex AI concepts accessible to a broad audience, encouraging listeners to explore the potential of agentic AI in their own work.
Key Insights
- Agentic AI systems differ from workflows by using large language models (LLMs) with tool access, allowing them to make decisions about tool usage and sequence, unlike workflows that follow predetermined paths.
- Evaluation metrics like precision and recall are crucial for identifying different failure modes in AI models, beyond just relying on accuracy. These metrics need to be tailored to specific task types such as retrieval, classification, and generation.
- Sinan's research found no correlation between reasoning time and model performance, challenging the assumption that longer reasoning leads to better results. This suggests that optimizing models for speed does not necessarily compromise accuracy.
- Combining agentic systems with predefined workflows can enhance AI reliability by providing structured pathways with room for decision-making, effectively managing complex tasks more efficiently.