Dealing with increasingly complicated agents - Practical AI Recap

Podcast: Practical AI

Published: 2025-10-16

Duration: 55 min

Guests: Donato Capitella

Summary

The episode dives into the evolving complexity of AI agents, particularly focusing on security challenges that arise when these agents interact with external tools and data sources. Insightful discussions cover design patterns to secure AI applications and the balance between usability and security in enterprise environments.

What Happened

The episode starts with Donato Capitella discussing his recent experiences in AI cybersecurity, highlighting the rapid growth in generative AI (GenAI) security work and research. He shares insights from his recent presentations at Black Hat in Toronto and the SecureAI conference in Stockholm, where discussions included major players like OpenAI and Microsoft. Donato explains the shift from simple LLM applications to more complex agentic workflows that use external tools and APIs, which introduces new security vulnerabilities. He emphasizes the need for robust access control mechanisms, as these systems are often exposed to the internet unexpectedly, leading to potential security breaches through prompt injection attacks.

The conversation explores the parallels between the current complexity of AI applications and the earlier era of microservices, where root cause analysis became critical due to interconnected systems. Donato points out the challenges of managing multiple data sources in a single LLM call, which can cause confidentiality and integrity issues if not properly secured. He provides a practical example of how attackers can exploit these vulnerabilities to manipulate AI agents into sending unauthorized emails.

Chris Benson queries the security landscape, prompting Donato to discuss the offensive-driven approach to pentesting AI applications. Donato reveals that while most of their work is preventative, researchers have already demonstrated vulnerabilities, such as the Ecoleak vulnerability in CoPilot, which was a clever markdown syntax attack. He stresses the importance of understanding that attackers can exploit any part of the LLM input to gain unauthorized access to internal APIs.

The episode then covers the strategic approach companies should take towards AI security, balancing the need for productivity with the risks of introducing shadow AI. Donato categorizes enterprises into extremely risk-averse and more relaxed environments, each with its challenges. He notes that overly strict security measures can stifle innovation, while too lax an approach can lead to security breaches.

Donato introduces design patterns that help secure LLM agents against prompt injection, highlighting a paper co-authored by industry experts. He details the 'code then execute' pattern, which involves the LLM generating a plan before any untrusted input is processed, ensuring that all operations are validated against a set of policies. This approach is likened to SELinux on a Linux kernel, focusing on system design rather than model design to enhance security.

Finally, the discussion turns to Spiky, an open-source tool developed by Donato's team. Spiky allows for customizable penetration testing of AI applications, enabling testers to create datasets tailored to specific client needs. It automates the process of testing for data exfiltration, social engineering, and other vulnerabilities without requiring an OpenAI key. Donato emphasizes the importance of practical, adaptable tools in the evolving landscape of AI security.

Key Insights

Generative AI security is evolving rapidly with a shift towards complex agentic workflows that utilize external tools and APIs, which introduces new security vulnerabilities such as prompt injection attacks.
The 'code then execute' design pattern enhances AI security by ensuring that all operations are validated against a set of policies before processing untrusted input, similar to how SELinux operates on a Linux kernel.
The Ecoleak vulnerability in CoPilot demonstrated how markdown syntax can be exploited to gain unauthorized access, highlighting the need for offensive-driven penetration testing in AI applications.
Spiky, an open-source tool developed for AI security, allows customizable penetration testing without requiring an OpenAI key, automating tests for data exfiltration and social engineering vulnerabilities.