Proactive Agents for the Web with Devi Parikh - The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence) Recap

Podcast: The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Published: 2025-11-19

Duration: 56 min

Summary

In this episode, Devi Parikh discusses the future of web interaction through proactive AI agents that automate workflows on our behalf. She shares insights from her extensive background in AI and the founding of her company, Yutori, which aims to transform how we engage with online tasks.

What Happened

Host Sam Charrington welcomes Devi Parikh back to the podcast after five years, noting the significant advancements in AI during that time. Devi shares her 20-year journey in AI, starting with her PhD in computer vision and evolving into multimodal research that combines vision and language. She highlights her previous roles at Meta, where she led multimodal research efforts, including the development of generative models for various media.

The conversation shifts to Devi's current focus, Yutori, and her vision of a transformed web. Rather than manually clicking buttons or filling in forms, users will interact with the web at a higher level of abstraction and let AI agents execute tasks proactively. These agents are designed to be always on, personalized, and capable of handling workflows in the background, giving users a more meaningful and less distracted digital experience. She explains that the vision is motivated by a desire to improve productivity and efficiency in daily life.

Key Questions Answered

What is Devi Parikh's background in AI?

Devi Parikh has been working in AI for about 20 years, beginning with her PhD thesis in computer vision. Over the years, she shifted her focus toward multimodal problems at the intersection of vision and language, exploring how people can interact with AI systems more naturally. Her experience includes significant roles at Meta, where she led multimodal research efforts, contributing to generative models for images, videos, and music.

What is the vision of Yutori?

Yutori aims to change how we interact with the web by moving beyond clicking buttons and filling in forms. Devi Parikh describes a future in which users communicate their needs to AI agents, which then proactively manage tasks on their behalf. The approach is meant to improve productivity and create a more meaningful digital experience by reducing users' cognitive load.

How does Devi Parikh's work influence web agents?

Devi's extensive background in AI, particularly in multimodal research, informs her approach to developing web agents. She notes that while robotics presents challenges in physical environments, the transition to web agents allows for a different set of capabilities and efficiencies. Her work with AI at Meta has provided her with insights that are now being applied to create more reliable and autonomous workflows on the web.

What does the name 'Yutori' signify?

The name 'Yutori' comes from a Japanese word that conveys the sense of well-being derived from mental spaciousness. Devi Parikh explains that this concept is central to the company's mission: to relieve users of constant distractions and tasks so they can focus on what truly matters in their lives. It reflects the company's goal of a more efficient and fulfilling user experience.

What are the roles of the co-founders of Yutori?

Yutori was co-founded by Devi Parikh, her husband Dhruv Batra, and Abhishek Das. Devi and Dhruv have worked closely together for nearly two decades, often sharing the same workplace and collaborating on research. Abhishek, also a close collaborator, was Dhruv's PhD student at Georgia Tech. Together, they apply their combined expertise to the challenges of building proactive web agents.