Rightsizing Kubernetes Dev/Test Environments: Saving $500K/yr in 60 Days

Published on
Last updated on

May 4, 2024

Introduction

A major Kubernetes user cut its cloud costs by $500,000 a year (on an annual run-rate basis) within 60 days using Sedai's autonomous optimization. This saving represented a 25% cost reduction. In this initial phase of the Kubernetes cost optimization project we focused on their Development/Test environments, which included development, user acceptance testing (UAT) and staging.

Why did they go Autonomous?

Kubernetes Optimization Challenges

The company was seeing a significant strain on their engineering teams due to a combination of factors: 

  • complexity of managing numerous Kubernetes services
  • rapid growth in the business including expanding functionality and growing end user traffic 
  • quick release cycles through their CI/CD pipeline
  • demands for high performance given the real-time nature of their services
  • expectations for cloud cost efficiency. 

The company wanted to streamline operations and optimize cost and performance without consuming a large amount of engineering time.

In Dev/Test specifically, they faced the challenge of maintaining configurations that differ from production:

  • Traffic is lower so the dev configurations should use fewer cloud resources
  • The consequences of performance or availability issues are less severe, so the environments can operate with less of a buffer.

In addition, the microservice architecture they ran on Kubernetes made reducing resource usage and cost challenging. Each service had a relatively small spend, often just a few thousand dollars a year. Under the traditional manual optimization model, the savings on any one service would not justify the opportunity cost of diverting engineers to optimization tasks.
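To make that trade-off concrete, here is a minimal break-even sketch. The engineer hours and hourly cost are hypothetical placeholders, not figures from the customer; only the "few thousand dollars a year" of spend per service and the 25% saving rate come from this case study.

```python
def manual_rightsizing_net_benefit(annual_spend, saving_rate, engineer_hours, hourly_cost):
    """Net first-year benefit (in dollars) of rightsizing one service by hand.

    All inputs are illustrative assumptions, not customer data.
    """
    annual_saving = annual_spend * saving_rate
    engineering_cost = engineer_hours * hourly_cost
    return annual_saving - engineering_cost


# Hypothetical service: $2,000/year spend, 25% saving,
# 6 engineer hours at $100/hour to analyze, change and verify it.
net = manual_rightsizing_net_benefit(2_000, 0.25, 6, 100)
print(net)  # a negative value means the manual effort costs more than it saves
```

With these assumed inputs the manual effort does not pay for itself in the first year, which is the situation the customer described.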

The company also wanted to prevent the complexity of running Kubernetes from causing high stress and burnout, which posed a long-term risk to talent retention. There was a safety concern as well: human error becomes more likely when engineers are tasked with large volumes of optimization work.

Implementing Autonomous Optimization

The customer adopted Sedai using a "bring your own cloud" deployment model to meet their security and access requirements.  In this model, Sedai would run inside their Google Kubernetes Engine (GKE) environments.

Sedai was granted permissions to map the account topology and understand behavior patterns.   Understanding how workloads behave is foundational to Kubernetes cost optimization in an AI-based autonomous approach.

Optimization goals were then set up.  The focus was cost optimization and not Kubernetes application performance. Sedai was instructed to find the lowest cost for these Kubernetes resources while maintaining current performance.

Once Sedai had access to the Kubernetes metrics, it was able to quickly assess 1,400 Kubernetes services.  Sedai then made recommendations for how to optimize resource consumption.

Rightsizing was the highest impact tactic in the dev/test environments.  We used Sedai’s non-production mode, which permits a more aggressive optimization approach. Sedai then recommended the best configurations for Kubernetes resources.  Recommendations were made at two levels:

  • workload level, where Sedai recommended alternative CPU, memory and pod count
  • Kubernetes cluster level, where Sedai recommended the number and type of instances and node groupings
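As an illustration of the workload-level idea only, the sketch below derives a CPU request from observed usage plus a headroom buffer, with a smaller buffer outside production. This is not Sedai's algorithm: the 95th-percentile choice, the 10% and 30% buffers, and the function names are all assumptions made for the example, and memory and pod count are left out.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def recommend_cpu_request(usage_millicores, non_production=False):
    """Suggest a CPU request in millicores from observed usage samples.

    Assumed policy: 95th-percentile usage plus 30% headroom in production,
    or 10% headroom in non-production where a thinner buffer is acceptable.
    """
    headroom_pct = 10 if non_production else 30  # hypothetical buffers
    peak = percentile(usage_millicores, 95)
    return math.ceil(peak * (100 + headroom_pct) / 100)


# Hypothetical usage samples (millicores) for one service
usage = [120, 150, 140, 180, 160, 200, 170, 130, 155, 165]
prod = recommend_cpu_request(usage)
dev = recommend_cpu_request(usage, non_production=True)
print(prod, dev)
```

The same usage data yields a lower request in non-production mode, which is where the dev/test savings in this kind of approach would come from.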

Often an individual service would go through multiple cycles of changes to make sure it stayed stable.

These Kubernetes rightsizing suggestions were manually reviewed and implemented weekly by the customer in the initial phase.

Outcomes Achieved

Optimization efforts driven by rightsizing resulted in $500,000 of cloud spend savings on an annual run rate basis. These savings were spread across more than 1,400 Kubernetes resources. Below is the latest view of the Sedai dashboard for the account showing the total savings and some of the individual services (there are over 100 pages listing all the services!).

Impact of AI-Driven Rightsizing Kubernetes Dev/Test Environments with Sedai

Sedai determined that many environments were oversized, often replicating production configurations. That capacity was not needed given the lower traffic in dev/test.

The average savings per service were surprisingly small (<$400/year). This confirmed the customer’s view that it would not previously have been economical to pursue these savings. The screenshots below show some of the optimization details for one of the services. In this case, five optimization actions produced an overall saving of around $4,000/year. Some of the gains were small, with the optimization shown in the middle panel providing just $398/year of savings by adjusting Kubernetes requests and limits:

Example of a Kubernetes Service in Dev/Test Undergoing a Series of Rightsizing Impacts

 

Previously the cost of allocating engineering resources would have outweighed the benefits for many services. Sedai’s autonomous approach aggregated these small improvements into substantial overall Kubernetes cost savings.
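The aggregation effect can be checked with simple arithmetic. The two inputs below are the figures reported in this case study; the variable names are just for illustration.

```python
total_annual_savings = 500_000  # dollars per year, annual run rate (from this case study)
services_optimized = 1_400      # Kubernetes services assessed (from this case study)

average_saving = total_annual_savings / services_optimized
print(f"Average saving per service: ${average_saving:,.0f}/year")
# prints "Average saving per service: $357/year"
```

An average of roughly $357 per service is consistent with the "<$400/year" figure above, and is only meaningful in aggregate.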

Future Direction of Kubernetes Optimization

The customer is planning to expand their optimization efforts further:

  • Adding the rest of the dev/test environments.  We are currently setting up Sedai to optimize another cluster.
  • Using more advanced Sedai features (e.g., unique settings for ML based workloads).
  • Moving from manual to automated to autonomous operations
  • Extending optimization to production environments

As well as engineering optimizations, we will also look holistically at their Kubernetes infrastructure costs, including purchasing optimizations such as spot instances, savings plans and reserved instances. We’ll also look at scheduled shutdowns for Kubernetes environments. These shutdowns would need to be compatible with the customer's global operations and variable team working hours.
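One way to reason about the shutdown constraint is to keep an environment up whenever any team is inside its local working hours. The sketch below is a hypothetical illustration, not a Sedai feature: the team offsets and hours are invented, and fixed UTC offsets are used for simplicity, so daylight saving time is ignored.

```python
from datetime import datetime, time, timedelta, timezone

# Hypothetical team schedules: (UTC offset in hours, local start, local end)
TEAM_HOURS = [
    (-8, time(9, 0), time(18, 0)),
    (5.5, time(9, 0), time(18, 0)),
]


def environment_needed(now_utc, team_hours=TEAM_HOURS):
    """Return True if any team is within its local weekday working hours.

    now_utc must be a timezone-aware datetime.
    """
    for offset_hours, start, end in team_hours:
        local = now_utc.astimezone(timezone(timedelta(hours=offset_hours)))
        if local.weekday() < 5 and start <= local.time() < end:
            return True
    return False


# Monday 17:00 UTC is 09:00 for the UTC-8 team, so the environment stays up.
print(environment_needed(datetime(2024, 5, 6, 17, 0, tzinfo=timezone.utc)))
```

A scheduler could call a check like this before scaling a dev/test environment down, and skip the shutdown whenever it returns True.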

Conclusion

Using AI-based autonomous optimization to uncover Kubernetes savings in dev/test environments has been an effective way to reduce costs. We were able to overcome the challenge of optimizing allocated resources across a large number of Kubernetes microservices. This first experience in dev/test has been a significant step in the journey to full autonomy.
