Inflection replatformed to autonomously managed AWS Lambda serverless functions with dramatic improvements to customer experience & ops team productivity
FCI Reduction
NPS Gain
Operations toil reduction
Performance Improvement
Availability Improvement
Ops Productivity
Autonomous Optimization
Autonomous Remediation
Release Intelligence
AWS Lambda
AWS CloudWatch
SaaS
North America
Inflection had decided to replatform from Kubernetes to serverless for both development and operational efficiencies. This was a complex task given that Inflection served a large number of customers with APIs built to process massive transaction volumes and deliver actionable results in real time. Inflection sourced and aggregated its data from more than 2,000 in-house sources, 3,100 county courts, and 94 federal courts, and used advanced algorithms to classify and filter more than 230M aggregated criminal records.
Inflection wanted to free up engineer time to focus on core functionality and, alongside the move to serverless, was looking for a lower-effort way to manage operations. Inflection CTO Siddharth Ram defined operations as “the work done before and after code is deployed into production to maintain or improve operational quality.”
Inflection decided to implement Sedai’s autonomous cloud management platform in a way that would allow them to heavily prioritize latency improvements over cost reductions. Although the setup and integration of Sedai’s autonomous cloud management platform typically takes fifteen minutes for Lambda, the team from Sedai worked closely with Inflection’s SRE team during the implementation.
Immediately after the setup was completed, Sedai’s autonomous system started accessing Inflection’s cloud accounts with the goal of understanding their infrastructure and detecting their production environment topology. Once the initial discovery process was completed, Inflection’s SRE team was able to see all identified resources including all serverless functions and applications. Sedai then began collecting metrics data and developing optimization & remediation recommendations.
Inflection saw major improvements in customer experience through the combination of replatforming and autonomous management. In particular, Inflection reduced Failed Customer Interactions (FCIs) from 3.2% to 0.02% (a greater than 95% reduction) over three quarters, and FCIs approached zero over the following year.
The FCI metric represents a subset of errors and slow services that impact customer experience; Amazon advises that companies should target an FCI rate of less than 0.025%, which Inflection achieved. Furthermore, latency was reduced by 14%, improving service for Inflection customers, with individual services showing up to 81% latency reductions.
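The FCI figures above can be checked with a little arithmetic. The following is a minimal sketch, not Inflection's or Sedai's actual tooling: the function names `relative_reduction` and `meets_target` are hypothetical, and only the 3.2%, 0.02%, and 0.025% values come from the case study. It shows that the move from 3.2% to 0.02% works out to roughly a 99.4% relative reduction, comfortably above the "greater than 95%" stated, and that the final rate sits under the cited target.

```python
# Illustrative arithmetic only; function names are hypothetical.

def relative_reduction(before: float, after: float) -> float:
    """Fractional drop from `before` to `after` (0.5 means a 50% reduction)."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before


def meets_target(rate_pct: float, target_pct: float = 0.025) -> bool:
    """True if an FCI rate (in percent) is below the target (in percent)."""
    return rate_pct < target_pct


# Values reported in the case study, both expressed in percent.
FCI_BEFORE_PCT = 3.2
FCI_AFTER_PCT = 0.02

reduction = relative_reduction(FCI_BEFORE_PCT, FCI_AFTER_PCT)
print(f"Relative FCI reduction: {reduction:.1%}")
print(f"Below 0.025% target: {meets_target(FCI_AFTER_PCT)}")
```

Keeping both rates in the same unit (percent) avoids the common mistake of comparing a percentage against a fraction.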
These improvements contributed to NPS (Net Promoter Score) gains. NPS is calculated directly from customer ratings of their likelihood to recommend Inflection to peers & colleagues. Inflection’s NPS improved from 63% to 68% in the first two quarters after the change, and then to 70% over the next year.
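For readers unfamiliar with the metric, here is a minimal sketch of the standard NPS calculation, assuming the usual 0 to 10 survey scale where ratings of 9 or 10 count as promoters and 0 through 6 as detractors. The case study does not describe Inflection's survey method, so the function name `net_promoter_score` and the sample ratings are hypothetical and purely illustrative.

```python
# Standard NPS formula: percentage of promoters minus percentage of detractors.
# The sample ratings below are made up for illustration; they are not Inflection data.

def net_promoter_score(ratings: list[int]) -> float:
    """Return NPS in points (-100 to 100) from 0-10 likelihood-to-recommend ratings."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    promoters = sum(1 for r in ratings if r >= 9)   # ratings of 9 or 10
    detractors = sum(1 for r in ratings if r <= 6)  # ratings of 0 through 6
    return 100.0 * (promoters - detractors) / len(ratings)


sample_ratings = [10, 9, 9, 10, 8, 7, 9, 10, 6, 9]  # hypothetical survey responses
score = net_promoter_score(sample_ratings)
print(f"NPS: {score:.0f}")
```

Ratings of 7 and 8 (passives) count toward the total number of responses but toward neither group, which is why small shifts at the top of the scale can move the score noticeably.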
With Sedai’s autonomous cloud management platform up & running, Inflection’s SRE team was able to move away from human-led, scripted automation for managing cloud-native workloads towards a machine learning system for autonomous management. Being able to autonomously resolve issues significantly reduced the toil for their SRE team while optimizing for both cost and performance.
The operational productivity gains supported the team’s wider goal of spending time on ‘core’ work and minimizing time on ‘context’ activities. Siddharth’s view is that core work is the secret sauce of the company: “anything that is a competitive advantage or the key value add in the company”. In contrast, ‘context’ work is necessary, but does not necessarily need to be done by your engineering team.
Sedai also provided insights to the team on the underlying code quality as Inflection developers released new code.
Sedai helped Inflection deliver improved customer experience while also enabling Inflection’s engineering team to spend more time on the “core” work that advanced the company’s competitive advantage. Based on his experience, Siddharth now expects growing adoption of autonomous management for modern applications.
With Sedai’s fully autonomous management, 500 errors due to scaling issues went away. So almost all the errors we were having were because of coding issues, where someone had not thought through all the code paths.
We didn’t actually need SREs to work on serverless. We just looked at Sedai dashboards once a week. Both vertical and horizontal scaling were being done by Sedai. We had a 90% busywork reduction for serverless workloads.
Sedai Release Intelligence metrics were interesting. Every now and then it would tell us there was an issue with a release. And we would go in and look at the release and ask ourselves should we even be deploying it?
Irrespective of whether you’re a Kubernetes fan or you’re going serverless, autonomous is the way things are going.