Engineering AI Systems for Autonomy and Resilience with Krishna Sai - Software Engineering Daily Recap

Podcast: Software Engineering Daily

Published: 2026-02-24

Duration: 53 min

Summary

In this episode, Krishna Sai discusses how SolarWinds is leveraging AI to enhance observability and operational efficiency in complex IT environments. The conversation highlights the shift towards agentic AI systems that can autonomously manage operational data and reduce the burden on IT teams.

What Happened

The episode opens with Krishna Sai, the CTO of SolarWinds, explaining how the company has evolved from its origins in network monitoring into a provider of tools for managing modern IT environments. He emphasizes that SolarWinds now offers a product portfolio spanning observability, incident response, and service management, aimed at helping IT teams meet their SLAs and SLOs. As systems grow more complex with microservices and distributed architectures, understanding failures and responding to them quickly remains a pressing challenge.

Krishna then turns to 'agentic AI,' which SolarWinds is integrating into its offerings. He describes a shift from traditional statistical approaches to AI that can reason about operational data and take autonomous action when things go wrong. The goal is to reduce operational toil, particularly the disruption of alert storms that send teams scrambling for a resolution at inconvenient hours. With agentic AI, SolarWinds aims not just to provide data but to enable systems that proactively manage and remediate issues, changing how IT operations and engineering teams work.

Key Questions Answered

How is SolarWinds redefining its role in IT management?

Krishna Sai describes SolarWinds as having a broad product portfolio that spans observability, incident response, and service management. This evolution reflects the company’s response to the increasingly complex IT environments that its customers operate in, which now include applications, containers, and AI workloads.

What challenges do IT teams face with modern observability tools?

Sai highlights the persistent challenge of understanding why systems fail despite having access to numerous metrics and alerts. The complexity of modern IT environments makes it difficult for teams to pinpoint the causes of issues, which is a significant barrier to effective incident management.

What is agentic AI and how is SolarWinds implementing it?

Agentic AI represents a shift towards AI systems that can autonomously reason about operational data and take necessary actions. SolarWinds is focusing on this approach to enhance its solutions, aiming to reduce operational toil and improve the capacity of IT teams to manage incidents proactively.

How has AI-assisted programming impacted SolarWinds' engineering teams?

Krishna mentions that all engineers at SolarWinds are using AI-assisted coding tools, which has led to a significant increase in code generation and commit velocity. This investment in AI tools has positively influenced deployment frequency and overall productivity, although challenges remain in code review processes.

What future developments can we expect from SolarWinds in the AI space?

Sai indicates that as the tools and models continue to mature, SolarWinds will keep prioritizing agentic AI. The company is actively exploring how these advancements can apply to broader enterprise software use cases, so that AI-assisted coding and operational management evolve hand in hand.