AIOps Evolution Podcast | Season 2
Interview with Trent Fitz and Ani Gujrathi from Zenoss
In this episode of AIOps Evolution, we look at how AIOps has evolved and continues to evolve. We’ve come a long way from Gen 1 AIOps, which focused mainly on root cause analysis, that is, pinpointing a problem for the IT team. Gen 2 AIOps is the next step up: it delivers faster insights by building topology and connectivity into the AIOps system. What does that mean in practice? The IT team can jump in and resolve an issue faster than before. Sean discusses what’s next in the AIOps evolution with Trent Fitz and Ani Gujrathi from Zenoss.
Gen 1 AIOps: Pioneering the Potential of AI
The basic premise behind Generation One (Gen 1) AIOps platforms is event and alert monitoring in the IT environment. It starts with a business-wide problem in the organization. Event and alert data is fed into the AIOps platform, which produces insights for root cause analysis in the infrastructure. From there, the IT team can accelerate problem resolution.
The problem with Gen 1 AIOps is time to value
It is a promising approach, but what we have learned is that machine learning is not yet capable of making sense of a stream of events it has no understanding of (for example, the platform doesn’t know what a server is). Trent says, “It doesn’t know what a switch or a router is, or a storage array. It’s just seeing a bunch of purples and greens, and trying to make sense out of that. The reality is the algorithms are not capable of doing that without some other context.”
The problem the industry now sees with Gen 1 AIOps is time to value. The platform has billions of data points fed to it daily, but the AI only learns that a pattern of events is a problem after it has experienced that pattern several times. The learned pattern is called a fingerprint. Unless the platform runs into the exact same fingerprint again, it will not identify an anomaly as an issue. In modern IT infrastructure, it’s uncommon to run into the exact same issue repeatedly. This creates a lag in root cause analysis, slows resolution, and ultimately affects business outcomes.
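To make the fingerprint limitation concrete, here is a minimal Python sketch of exact-match fingerprinting. It is an illustration only, not how any particular Gen 1 product works: the `FingerprintMatcher` class, the `MIN_SIGHTINGS` threshold, and the host and alert names are all hypothetical.

```python
from collections import Counter

# Hypothetical threshold: how many times a pattern must recur before it is "learned".
MIN_SIGHTINGS = 3


def fingerprint(events):
    """Reduce a window of (source, alert) events to an order-independent signature."""
    return frozenset(events)


class FingerprintMatcher:
    def __init__(self, min_sightings=MIN_SIGHTINGS):
        self.min_sightings = min_sightings
        self.sightings = Counter()

    def observe(self, events):
        """Record a window of events that turned out to be a real incident."""
        self.sightings[fingerprint(events)] += 1

    def is_known_issue(self, events):
        """True only if this exact signature has recurred often enough."""
        return self.sightings[fingerprint(events)] >= self.min_sightings


matcher = FingerprintMatcher()
incident = [("db-01", "high_latency"), ("app-03", "timeout")]
for _ in range(3):
    matcher.observe(incident)

# Same kind of failure, but on a different database host.
variant = [("db-02", "high_latency"), ("app-03", "timeout")]

print(matcher.is_known_issue(incident))  # True
print(matcher.is_known_issue(variant))   # False
```

The variant differs by a single host name, yet it goes unrecognized because the matcher has no idea that `db-01` and `db-02` play the same role. That missing context is the gap described above.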
An AIOps platform needs context about how things are connected and how they depend on each other. That’s where Gen 2 AIOps has stepped up to the plate.
Gen 2 AIOps: Building topology and connectivity for faster problem resolution
What has evolved over the last couple of years is what analysts refer to as Generation Two (Gen 2) AIOps platforms. These are AI-powered tools that proactively collect other data types besides events: metrics in particular, but also logs and streaming data. Sometimes agents are involved; sometimes the data collection is agentless. The point is that the platform builds a model from the topology data and metrics it collects.
With a dependency map of a specific IT service, you can feed machine learning algorithms that are built to detect issues within those parameters. This addresses the time-to-value problem of Gen 1 AIOps.
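As a rough illustration of why a dependency map helps, the Python sketch below narrows a burst of alerts down to the components that have no alerting dependency beneath them. It uses a simple graph heuristic in place of machine learning, and the `DEPENDS_ON` map, the component names, and both helper functions are hypothetical, not taken from any vendor's product.

```python
# Hypothetical dependency map: each component lists what it depends on.
DEPENDS_ON = {
    "checkout-service": ["app-server"],
    "app-server": ["database", "switch-a"],
    "database": ["storage-array"],
    "storage-array": [],
    "switch-a": [],
}


def upstream(component, deps):
    """Return every component that `component` depends on, directly or indirectly."""
    seen = set()
    stack = list(deps.get(component, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(deps.get(node, []))
    return seen


def probable_root_causes(alerting, deps):
    """Keep only alerting components that have no alerting component beneath them."""
    alerting = set(alerting)
    return sorted(c for c in alerting if not (upstream(c, deps) & alerting))


alerts = ["checkout-service", "app-server", "database", "storage-array"]
print(probable_root_causes(alerts, DEPENDS_ON))  # ['storage-array']
```

Four alerts collapse to one candidate without the system ever having seen this exact combination before, which is the time-to-value advantage that topology provides.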
Gen 2 AIOps comes from vendors that were already collecting this data. Now they’ve added machine learning on the back end to do better root cause analysis and accelerate problem resolution. But where do we go from here?
Gen 3 AIOps: What’s next in the AIOps evolution?
So we come to the final part of this conversation, which is: what’s next for AIOps? Generation Two AIOps can only lead to Generation Three (Gen 3) AIOps. Trent and Ani say that the next logical step in the evolution of AIOps is building a trust factor. Trent says, “So the biggest obstacle in my opinion, is being able to trust the machine, to do what a human being can do.”
How do we get humans to team up with AI? The way forward is explainability. The AIOps community has to get better at explaining to people how the machine reaches its conclusions. Many people become apprehensive when they hear that a machine can automate their job. All they hear is “you’re replaceable.” That is far from the goal of AIOps. “They’re trying to put the people on the problems that will advance their business, and those problems that still require human brains, but we have to get better at automating all of the things that can be done by the machines,” Trent says.
The ultimate team: AI + Human Collaboration
The way forward is collaboration and showing people how much more they can accomplish with AIOps as a strategy in their toolbox. Over the next few years, we will see a huge effort to help people understand and engage with the machine learning that is going on. We have to explain how the machines arrive at their conclusions, and drive home the idea that this is what will free people up to focus on higher business priorities.