Story Highlights
The Company
KnowBe4 is the global leader in Human and AI Risk Management. More than 70,000 organizations use its platform, which is built entirely on AWS.
The Challenge
Explosive growth drove massive cloud scale, making manual optimization unsustainable.
The Solution
KnowBe4 moved from manual optimization to Sedai’s safe, autonomous optimization in production.
The Results
- $1.2M+ cloud costs saved
- Up to 99.5% latency reduction
- Zero incidents caused by Sedai
The Challenge: KnowBe4’s Success Created Operational Chaos
When it came to the cloud, KnowBe4 was a victim of its own success.
As the global leader in Human and AI Risk Management, the company saw its customer base grow rapidly to more than 70,000 organizations. But that success brought unprecedented scaling challenges in its AWS environment.
Matt Duren, VP of Engineering at KnowBe4, led the team that was bringing new services to market every single month while handling massive traffic volumes. He saw that scaling was non-negotiable.
“We're seeing tons of growth right now,” said Matt. “Just from our day-to-day, we peak out at thousands and thousands of requests per second.”
Running entirely on AWS with the bulk of infrastructure on Lambda and ECS Fargate, the KnowBe4 engineering teams were operating at a scale they could no longer manage manually:
- 58% YoY ECS growth across 3,000+ services with 2,000–4,000+ peak tasks
- 422% YoY Lambda growth across 2,500+ functions
- 250M+ daily Lambda invocations
- Software releases every ~20 minutes
How AWS’ Complexity Impacted KnowBe4’s Engineering Team
The frequent releases and high performance required for real-time cybersecurity delivery put immense pressure on the engineering team to optimize resources continually.
“My team doesn’t really have the luxury of spending time doing toil,” said Matt. “They can’t wait for a production release to see how it responds, and then wait for another team to optimize the service for them.”
For Nate Singletary, Staff Site Reliability Engineer at KnowBe4, it was impossible to know whether ECS services were truly optimized. Engineers were stuck monitoring metrics & alerts while balancing under- & over-provisioning.
“If a service was running too low on memory or CPU, we’d see performance issues,” explained Nate. “But if it was running too rich, we were missing out on maybe hundreds or thousands of dollars of cost savings across several services.”
The Risk of Autonomous Optimization in Production
KnowBe4 needed a solution for continuous optimization, and Sedai was an obvious fit.
As an autonomous optimization platform, Sedai reduces cloud costs, boosts performance, and improves availability. Its patented ML models learn real app behavior to make safe, production-aware optimizations.
To date, Sedai has executed millions of autonomous actions with zero incidents, making it a strong fit for teams struggling to keep up with growth.
But Matt had real concerns about letting a system make changes in production without human intervention.
“We needed something that we could take out of the hands of developers and put into the hands of an agent that would allow us to continue to scale our engineering teams,” said Matt. “Which is terrifying.”
That fear was real. Because availability is critical to KnowBe4’s platform, Matt’s engineers worried one bad optimization could immediately impact customers.
But building a team solely dedicated to optimization wasn’t viable. There was simply no way to scale the SRE team fast enough to keep up with KnowBe4’s growth.
Although Matt was initially skeptical about autonomous optimization, Sedai’s measured approach to implementation assured him it was a safe solution.
“We took a big bet that Sedai wouldn’t cause problems after a short but rigorous test trial period. And really, the proof is in the pudding there. We never had a major outage, that’s for sure.”
The Solution: Sedai’s Autonomous Optimization
The Crawl-Walk-Run Implementation Framework
With the decision to adopt Sedai, KnowBe4 wanted to ensure engineering could optimize safely with clear guardrails and without putting customers at risk.
As part of the plan, the team deliberately validated Sedai against real production traffic so they could understand how it behaved in real-world conditions.
To do this, the team took a phased approach to adopting full autonomy, allowing them to prove it was safe, predictable, and trustworthy before scaling further.
This approach became its “Crawl, Walk, Run” framework:
Crawl
- Connect Sedai to the AWS environment
- Establish cost-saving goals
- Enable autonomous optimization on low-risk services
Walk
- Analyze early results
- Expand to include flagship products
- Set service-specific cost and performance goals
Run
- Roll out autonomous optimization across infrastructure
- Optimize all services autonomously by default
- Integrate Sedai across all AWS accounts & regions
This strategy proved KnowBe4 could balance cost efficiency & performance across its environments, without risking production.
How Sedai Optimized AWS ECS Fargate and Lambda
Sedai was deployed across KnowBe4’s two primary compute layers: ECS Fargate and AWS Lambda.
Its key optimization areas included:
- Service right-sizing: Automatically tunes CPU, memory, and task counts as traffic and releases change
- Dynamic scaling: Adjusts horizontal & vertical scaling to handle peak traffic without over-provisioning
- Application-aware optimization: Optimizes instance types & configurations based on real workload behavior & latency
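To make the right-sizing idea concrete, here is a minimal sketch of picking an ECS Fargate task size from observed usage. It is not Sedai's algorithm, which the story does not describe. The function name `right_size`, the p95 inputs, and the 30% headroom default are all assumptions for illustration. The size table covers only a subset of the CPU and memory combinations Fargate allows (sizes up to 4 vCPU).

```python
# Subset of valid Fargate task sizes: CPU units -> allowed memory values (MiB).
FARGATE_SIZES = {
    256: [512, 1024, 2048],
    512: list(range(1024, 4096 + 1, 1024)),
    1024: list(range(2048, 8192 + 1, 1024)),
    2048: list(range(4096, 16384 + 1, 1024)),
    4096: list(range(8192, 30720 + 1, 1024)),
}


def right_size(p95_cpu_units: float, p95_memory_mib: float, headroom: float = 0.3):
    """Return the first valid (cpu, memory) pair, in ascending CPU order,
    that covers observed p95 usage plus a safety headroom (hypothetical policy)."""
    need_cpu = p95_cpu_units * (1 + headroom)
    need_mem = p95_memory_mib * (1 + headroom)
    for cpu in sorted(FARGATE_SIZES):
        if cpu < need_cpu:
            continue  # not enough CPU at this size
        for mem in FARGATE_SIZES[cpu]:
            if mem >= need_mem:
                return cpu, mem
    raise ValueError("workload exceeds the sizes modeled in this sketch")


# Hypothetical service using ~200 CPU units and ~700 MiB at p95.
print(right_size(200, 700))
```

A production optimizer would also weigh latency, traffic patterns, and release history, as the list above describes. This sketch only shows the sizing constraint that any such change has to respect.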
Closing the Loop with CI/CD and IaC
With releases happening every ~20 minutes, the team wanted to ensure the frequent deployments did not overwrite the autonomous optimizations. To do that, KnowBe4 integrated Sedai into its existing CI/CD and GitOps workflows.
This approach preserved the IaC as the source of truth, while allowing Sedai to safely make and maintain optimizations. Engineers no longer had to manually update configs or worry about reverting optimizations during deployments.
“We have a full autonomous feedback loop,” said Nate.
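The story does not detail how the integration works. One common way to keep IaC as the source of truth is to write recommended values back into the IaC settings as a reviewable change, so the next deployment carries them forward instead of reverting them. The sketch below assumes that pattern. The function `apply_recommendations`, the settings shape, and the service names are hypothetical and are not Sedai's or KnowBe4's actual interface.

```python
from typing import Dict, List, Tuple

# Hypothetical shape: service name -> {"cpu": ..., "memory": ...}
Settings = Dict[str, Dict[str, int]]


def apply_recommendations(iac: Settings, recommended: Settings) -> Tuple[Settings, List[str]]:
    """Merge recommended values into a copy of the IaC settings and
    return the updated settings plus a human-readable change list."""
    updated = {name: dict(values) for name, values in iac.items()}
    changes: List[str] = []
    for name, rec in sorted(recommended.items()):
        if name not in updated:
            continue  # only touch services the IaC already declares
        for key, new_value in sorted(rec.items()):
            old_value = updated[name].get(key)
            if old_value != new_value:
                updated[name][key] = new_value
                changes.append(f"{name}.{key}: {old_value} -> {new_value}")
    return updated, changes


# Hypothetical example values.
iac = {"api": {"cpu": 1024, "memory": 4096}, "worker": {"cpu": 512, "memory": 2048}}
rec = {"api": {"cpu": 512, "memory": 2048}}
updated, changes = apply_recommendations(iac, rec)
print(changes)
```

Because the merged values live in the same settings the pipeline deploys from, a release every ~20 minutes would reapply the optimized values rather than overwrite them.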
The Results
With Sedai fully implemented, the optimization results were staggering.
“We achieved ROI in just five months. And Finance is very pleased about that,” said Matt. “If you look back in our savings history, it’s gone up by hundreds of thousands of dollars a year, even when we added new resources, accounts, services, and products.”
AWS Cloud Cost Savings
For cloud costs alone, the team was able to save 27% with Sedai’s autonomous optimization.
Overall Results
- 27% overall cloud compute cost savings
- $1.2M+ total cumulative savings since adopting Sedai
- $81K average monthly savings
Development Environment Results
- Up to 87% cost reduction on some ECS services
- ~35% average reduction, equating to ~$390K annually
Production Environment Results
- Up to 50% cost savings across ECS services

AWS Lambda Cost & Performance Improvements
For Lambda functions specifically, cost savings and latency reductions reached nearly 100% on individual functions.
Overall Results
- Up to 99.3% cost savings on individual Lambda functions
- Production Lambda function:
  - 31% cost decrease
  - 54% latency decrease

Performance & Latency Improvements
The performance impact was equally impressive. For one Lambda function serving real customer traffic, execution time dropped from 18.5 seconds to ~80 milliseconds (a 99.5% duration reduction) after Sedai autonomously adjusted memory allocation.
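The arithmetic behind that figure can be checked with a short sketch. The duration values come from the paragraph above. The memory settings (128 MB before, 1024 MB after) are hypothetical, since the story does not state them. They are included only to show why a larger memory allocation can still lower cost: Lambda bills compute by allocated memory multiplied by duration (GB-seconds).

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage by which `after` is lower than `before`."""
    return (before - after) / before * 100


def gb_seconds(memory_mb: int, duration_s: float) -> float:
    """Billable compute for one invocation, in GB-seconds."""
    return memory_mb / 1024 * duration_s


# Durations from the story: 18.5 s before, ~80 ms after.
reduction = percent_reduction(18.5, 0.08)
print(f"duration reduction: {reduction:.2f}%")

# Hypothetical memory settings, NOT from the story.
before = gb_seconds(128, 18.5)
after = gb_seconds(1024, 0.08)
print(f"GB-seconds per invocation: {before:.4f} -> {after:.4f}")
```

With exactly 80 ms the reduction works out to roughly 99.6%, in line with the reported 99.5% given that the ~80 ms figure is approximate. Under the assumed memory values, billable GB-seconds per invocation also fall even though the allocation is eight times larger.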
“By having Sedai in place, we’re not just saving money. We’re preventing would-be customer problems, before they become an issue,” said Matt.
Overall Results
- 10,000+ days of latency eliminated
- Up to 99.5% latency reduction
- ~80% faster processing
Autonomous Operational Efficiency at Scale
Operational efficiency saw a massive boost, and engineers didn’t need to constantly tune services by hand anymore.
“Most of the engineering team probably doesn’t know what Sedai is at all. And I think that’s pretty great,” Matt explained. “They expect that optimization just happens for them because it does.”
Overall Results
- 98% of ~9,500 services running autonomously
- 1,100+ autonomous actions executed in first 3 months
- Continuous optimization with no manual intervention
Reducing Engineering Toil
This high level of autonomy significantly reduced the manual workload on the engineering team and enabled them to focus on work that matters.
“We've been able to widen the impact that each individual SRE has by not having to focus on that low-value work,” said Matt.
Why Autonomous Cloud Optimization Works
By adopting safe and autonomous cloud management with Sedai, KnowBe4 optimized its operations and laid the groundwork for future growth.
But beyond the day-to-day cost savings, one of the biggest value drivers was giving Matt’s developers the freedom to build and innovate.
“Builders want to focus on building,” Matt said. “What Sedai really lets us do is keep people focused on high-value work that they’re passionate about, while being a really good cost-saving tool.”
Scale AWS Without Toil
Cut your cloud costs. Boost performance. Let your engineers build.

