What is Sedai and how does it optimize Amazon ECS workloads?
Sedai is an autonomous cloud optimization platform that uses machine learning to manage and optimize production environments, including Amazon ECS workloads. For KnowBe4, Sedai autonomously rightsized ECS services, adjusted autoscaling, and continuously validated changes to reduce engineering toil and improve efficiency. Sedai's safety-by-design approach ensures all optimizations are gradual and validated in real time, minimizing risk of incidents or SLO breaches. Note: Sedai's effectiveness depends on integration with supported cloud environments and may require additional setup for complex workflows. Learn more.
What problems did KnowBe4 face before using Sedai for ECS optimization?
KnowBe4's engineers manually managed ECS service configurations, responding to monitoring feedback about resource over- or under-provisioning, performance issues, and cost inefficiencies. This process was time-consuming, error-prone, and required continuous manual intervention across thousands of microservices. The lack of autonomous optimization meant missed cost savings and potential customer experience impacts. Note: Manual optimization may still be required for highly custom or unsupported workloads.
How did KnowBe4 implement Sedai's autonomous optimization for ECS?
KnowBe4 adopted Sedai using a three-phase approach: Crawl (initial integration and low-risk services), Walk (expanding to flagship products with tailored cost and performance goals), and Run (full-scale autonomous optimization across all services and regions). This phased rollout built trust in Sedai's autonomous actions and allowed KnowBe4 to validate results at each stage. Note: Organizations with strict change management policies may require additional validation steps before full autonomy.
What measurable results did KnowBe4 achieve with Sedai on Amazon ECS?
Sedai projected a 27% reduction in ECS cloud costs across a group of KnowBe4 accounts, worth over $400,000 in cloud spend, with 10% realized by August 2023. Sedai executed 1,100+ autonomous actions in three months, and 98% of KnowBe4's 9,491 services now run autonomously. In one cluster, potential savings reached 36%. These results came from Sedai's continuous rightsizing, autoscaling, and idle resource cleanup. Note: Actual savings may vary based on workload characteristics and adoption scope. (Source: Sedai blog, April 2024)
How does Sedai ensure safe autonomous optimization in production environments?
Sedai's patented safety-by-design approach includes continuous health verification, automatic rollbacks, and incremental changes for real-time validation. This minimizes the risk of outages or SLO breaches during autonomous optimization. Sedai's system is designed to make gradual, validated optimizations rather than all-at-once changes. Note: Detailed limitations not publicly documented; ask sales for specifics regarding edge cases or highly regulated environments. Learn more about Sedai's safety and compliance.
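The pattern described above (incremental changes, health verification after each one, and rollback on failure) can be illustrated with a minimal Python sketch. This is not Sedai's implementation: the function name apply_incrementally, the is_healthy callback, and the memory values are all hypothetical and exist only to show the shape of the loop.

```python
from typing import Callable, List


def apply_incrementally(
    current: int,
    target: int,
    step: int,
    is_healthy: Callable[[int], bool],
) -> List[int]:
    """Move a setting from current toward target in small steps.

    After each step the hypothetical is_healthy callback is consulted. On the
    first failure the last known-good value is restored and the loop stops.
    Returns the history of applied values, including any rollback.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    history = [current]
    value = current
    while value != target:
        direction = 1 if target > value else -1
        # Never overshoot the target on the final step.
        candidate = value + direction * min(step, abs(target - value))
        history.append(candidate)
        if not is_healthy(candidate):
            history.append(value)  # roll back to the last known-good value
            break
        value = candidate
    return history


# Example: shrink memory from 2048 MiB toward 1024 MiB in 256 MiB steps,
# using a stand-in health check that fails below 1536 MiB.
steps = apply_incrementally(2048, 1024, 256, lambda mib: mib >= 1536)
print(steps)
```

In this example the change stops and reverts as soon as the stand-in health check fails, so the service ends on the smallest size that passed validation rather than on the original target.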
Features & Capabilities
What are the key features of Sedai's autonomous optimization platform?
Sedai offers autonomous optimization, application-aware intelligence, proactive issue resolution, full-stack cloud coverage (across AWS, Azure, GCP, Kubernetes), safety-by-design (continuous health checks, rollbacks), release intelligence, and plug-and-play implementation. Sedai integrates with tools such as GitLab, Datadog, and Amazon CloudWatch, and supports ECS Fargate. Note: Some features may require additional configuration or may not be available for all cloud providers. See solution briefs.
What integrations does Sedai support for cloud optimization?
Sedai integrates with monitoring and APM tools (Prometheus, Datadog, CloudWatch, Azure Monitor), Kubernetes autoscalers (HPA/VPA, Karpenter), CI/CD and IaC tools (GitHub, GitLab, Bitbucket, Terraform), ITSM systems (ServiceNow, PagerDuty, Jira), notification platforms, and serverless environments (AWS Lambda, AWS Fargate). Note: Integration availability may vary by environment; check documentation for specifics. See technical docs.
Implementation & Onboarding
How long does it take to implement Sedai for ECS optimization?
Initial onboarding for Sedai's agentless or agent-based deployment typically takes about 15 minutes to begin reading metrics from your environment. Additional setup for CI/CD and integrations may require more time depending on complexity. KnowBe4 used a phased approach (Crawl, Walk, Run) to gradually expand adoption and validate results. Note: Complex environments or custom integrations may extend setup time. See getting started guide.
What technical documentation is available for Sedai users?
Sedai provides a Getting Started Guide, Kubernetes Optimization Guide, and a Platform Overview. These resources cover onboarding, optimization strategies, and platform capabilities. Access documentation at docs.sedai.io and sedai.io/resources. Note: Some advanced topics may require direct support or consultation.
Pricing & Plans
How is Sedai priced for ECS and other cloud optimization?
Sedai uses a volume-based pricing model, charging based on the resources optimized (e.g., ECS tasks, Kubernetes pods, VMs). Pricing is transparent, with a free tier and a 30-day free trial available. For Kubernetes and ECS environments, Sedai recommends booking a demo to discuss specific needs. Note: Actual costs depend on resource usage and optimization scope. See pricing details.
Security & Compliance
What security and compliance certifications does Sedai have?
Sedai is SOC 2 certified, demonstrating adherence to industry standards for data protection and compliance. This certification covers security, availability, and confidentiality requirements. For more details, visit the Sedai Security page. Note: For additional certifications or region-specific compliance, contact Sedai directly.
Customer Success & Case Studies
What are some real-world results from customers using Sedai?
KnowBe4 achieved a 27% reduction in ECS cloud costs, 1,100+ autonomous actions in three months, and 98% of services running autonomously. Other customers include Palo Alto Networks (saved $3.5 million), Belcorp (reduced AWS Lambda latency by 77%), and Campspot (34% reduction in Lambda latency). See more case studies at sedai.io/customers. Note: Results may vary by customer and workload.
Which industries have benefited from Sedai's autonomous optimization?
Sedai's platform is used in cybersecurity (KnowBe4, Palo Alto Networks), financial services (Experian), healthcare, e-commerce (Wayfair, Campspot), IT/technology (HP, Freshworks), consumer goods (Belcorp), and digital commerce (Informed). This demonstrates broad applicability across sectors. Note: Industry-specific requirements may affect implementation details. See resources.
KnowBe4 implemented a three-part approach (Crawl, Walk, Run) to gradually adopt autonomous optimization, resulting in significant cost savings and performance gains.
KnowBe4's autonomous journey has led to 98% of their services running autonomously, with a 27% cost reduction and over 1,100 autonomous actions in the past 3 months.
Introduction
This article covers KnowBe4’s experience applying autonomous optimization (both cost optimization and performance optimization) and is based on part of the presentation “Mastering Autonomous Optimization for Amazon ECS” at autocon. The KnowBe4 portion was presented by Nate Singletary, Senior SRE at KnowBe4. You can see the full video here, and read the blog covering more detail on the optimization strategies for Amazon ECS behind KnowBe4’s case here.
About KnowBe4
KnowBe4 provides the world's largest security-awareness training and simulated phishing platform. KnowBe4 is used by over 34,000 organizations globally, and has the world's largest library of security awareness training content. KnowBe4 ranks at the top of many great places to work lists, and was ranked number one in Energage's top workplaces in the USA.
KnowBe4’s Workloads
KnowBe4’s Amazon ECS workloads support a diverse suite of products ranging from security awareness tools to human detection response and more. KnowBe4 also integrates externally with their customers’ security stack to provide real-time coaching to their users in response to risky behavior. All these products are spread across thousands of microservices, functions, and data stores. And these services are all deployed in AWS across several compute and data storage services including Amazon ECS as shown below:
KnowBe4's Tech Stack
KnowBe4’s platform architecture is straightforward. They commit all their code in GitLab, and their CI/CD runs in GitLab on GitLab runners. Their production workloads are deployed on AWS. Their monitoring, metrics, and alerting are in Datadog (which includes logs and metrics exported from Amazon CloudWatch). They don't use a wide range of vendors, and onboarding new vendors is a rare event for the company.
KnowBe4’s ECS Optimization Challenge
KnowBe4 saw that they had an optimization “void” after their commit, deploy, and monitoring workflow:
KnowBe4 had ECS services running in AWS and wanted to ensure they were running efficiently. The challenge was knowing whether they were in fact running efficiently and, if they were not, how to react and fix the issue.
KnowBe4 was using engineers to fill that void. The engineers had to respond to the feedback from the monitoring system including:
a service may be running too low in memory or too low in CPU
a service may be peaking from a traffic perspective and experiencing performance issues
a service may be running too rich and be overprovisioned
These issues could be hurting customer experience or causing KnowBe4 to miss out on hundreds of thousands of dollars in cost savings across several services.
So the question was how KnowBe4 should fix it. The KnowBe4 engineers had to commit code to update the ECS config, deploy it, and then wait for feedback to see whether they had rightsized correctly. And this was a continuous process across many services: in fact, thousands of microservices and functions. This manual process and feedback cycle was not ideal.
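To make the monitoring signals above concrete, here is a minimal Python sketch of the kind of check an engineer would otherwise run by hand for each service. It is illustrative only: the ServiceStats fields, the classify function, and the 30% and 85% thresholds are assumptions chosen for the example, not values from KnowBe4 or Sedai.

```python
from dataclasses import dataclass


@dataclass
class ServiceStats:
    name: str
    peak_cpu_pct: float     # peak CPU utilization over the window, 0-100
    peak_memory_pct: float  # peak memory utilization over the window, 0-100


def classify(stats: ServiceStats, low: float = 30.0, high: float = 85.0) -> str:
    """Label a service using illustrative thresholds (assumed, not Sedai's)."""
    # The tighter of the two resources decides the label.
    peak = max(stats.peak_cpu_pct, stats.peak_memory_pct)
    if peak >= high:
        return "under-provisioned"  # running too low in CPU or memory
    if peak <= low:
        return "over-provisioned"   # running too rich
    return "ok"


# Hypothetical services used only to exercise the function.
for svc in (
    ServiceStats("svc-a", peak_cpu_pct=92.0, peak_memory_pct=40.0),
    ServiceStats("svc-b", peak_cpu_pct=12.0, peak_memory_pct=20.0),
    ServiceStats("svc-c", peak_cpu_pct=55.0, peak_memory_pct=60.0),
):
    print(svc.name, classify(svc))
```

Repeating a check like this, then committing, deploying, and re-measuring for thousands of services is the manual loop the paragraph above describes.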
KnowBe4 Uses Sedai to Fill the ECS Optimization Void
KnowBe4 decided to use Sedai’s autonomous optimization to fill this void. Sedai allowed them to reduce the toil on their engineers and autonomize the feedback loop of checking what impact a production change had on critical metrics.
The key drivers for KnowBe4 to move to an autonomous platform architecture were three-fold:
Reducing complexity in managing their cloud and reducing toil for their engineers.
Keeping their cloud efficient on a continuous basis.
Managing availability and performance at the highest levels for their customers.
Reducing toil for KnowBe4’s engineers would also allow engineers to focus on the things they like to do - releasing new products and features.
KnowBe4 also wanted to make sure their workloads were running efficiently. That meant keeping up release velocity and keeping cost at the front of their minds, while also ensuring their services were performant.
To achieve this, KnowBe4:
Ran their containerized workloads on ECS Fargate, so KnowBe4 doesn't need to worry about managing the cluster or the underlying host.
Allowed Sedai to autonomously rightsize their services and adjust their auto scaling.
How KnowBe4 Adopted Autonomous Optimization
To adopt autonomous optimization with Sedai, KnowBe4 took a three-part Crawl, Walk, Run approach as shown below:
Crawl
In the first stage, Crawl, KnowBe4 set up the Sedai integration. They set an initial goal of achieving around 10% cost reduction.
At this stage, Sedai could analyze KnowBe4's workloads, see where KnowBe4 might be overprovisioned, and identify the opportunities for cost reduction or performance gains.
KnowBe4 then enabled autonomous optimization on a set of services. They were not “diving off the deep end” at that stage, as these were low-risk services. KnowBe4’s goal with these services was to see how they reacted to the autonomous optimizations.
Walk
In the Walk stage, KnowBe4 had now seen some evaluations. These showed opportunities for significant cost reduction and significant performance gains. KnowBe4 had also seen some realized cost reduction and performance gains in the set of low-risk services that they had enabled.
At this stage KnowBe4 was impressed by the results of autonomous optimization and decided to roll out Sedai more aggressively by turning on autonomous optimization for their flagship products.
Before turning on autonomous optimization, KnowBe4 created groups. KnowBe4 divided these groups by product and region and set goals for cost and performance that were tailored to each product. KnowBe4 has some products with services that are more latency tolerant, and set a more aggressive cost reduction goal for those services. KnowBe4 also has services where they need to maintain the highest levels of availability and performance, and is less aggressive with those. Once these groups were set up and goals defined, KnowBe4 turned on autonomous optimization for them.
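The grouping described above can be pictured as a small data structure keyed by product and region, with a goal that depends on how latency tolerant the product is. The sketch below is a hypothetical Python model, not Sedai's configuration format: the class names, the two goal labels, and the product names are invented for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Goal(Enum):
    COST = "favor cost reduction"
    PERFORMANCE = "favor availability and performance"


@dataclass(frozen=True)
class OptimizationGroup:
    product: str            # hypothetical product name
    region: str             # AWS region the group covers
    latency_tolerant: bool  # whether slower responses are acceptable

    @property
    def goal(self) -> Goal:
        # Latency-tolerant products get the more aggressive cost goal.
        return Goal.COST if self.latency_tolerant else Goal.PERFORMANCE


groups = [
    OptimizationGroup("batch-reporting", "us-east-1", latency_tolerant=True),
    OptimizationGroup("training-portal", "eu-west-1", latency_tolerant=False),
]

# Look up the goal for each (product, region) pair.
goals = {(g.product, g.region): g.goal for g in groups}
for key, goal in goals.items():
    print(key, "->", goal.value)
```

Keeping the goal attached to the group rather than to individual services means a newly released service inherits the right trade-off as soon as it is placed in a group.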
KnowBe4's phased adoption proves that the trust required for full autonomous optimization is buildable — and the results at each stage validate the approach. Book a demo to start your own Crawl, Walk, Run journey for ECS optimization.
Run
In the Run phase KnowBe4 allowed Sedai to “take the wheel”. Services are autonomously optimized by default. If an engineer releases a service in ECS, it's automatically managed by Sedai.
Sedai was integrated across all of KnowBe4’s AWS accounts, and is managing services across all regions.
KnowBe4 is now working towards integrating Sedai into the CI/CD flow so that KnowBe4 will have a fully autonomized feedback loop. Sedai has now filled KnowBe4’s optimization void.
Realized Savings and Highlights at KnowBe4
Below is an example of the opportunities KnowBe4 has seen in Sedai across a group of accounts. Sedai is projecting a 27% cost saving, or over $400,000 in cloud spend, which KnowBe4 considers a significant reduction in cloud cost.
Below is an example of an individual cluster with a 36% potential saving.
Below is another example showing some of the realized savings at KnowBe4. This shows some of the Lambda services where KnowBe4 has not only reduced cost by 30%, but also reduced duration and increased performance by 86%.
Highlights from KnowBe4’s autonomous journey
KnowBe4’s highlights now include:
98% of KnowBe4’s 9,491 services now run autonomously
1,100+ autonomous actions in the past 3 months
27% cost reduction, with 10% realized by August 2023
KnowBe4 is now in the process of integrating Sedai back into their CI/CD processes. Once this is completed, they will have a fully autonomous workflow, with the IaC remaining the source of truth for KnowBe4’s configs.