Frequently Asked Questions

Amazon ECS Cost Optimization & AWS Pricing Models

What are the main AWS pricing models for compute resources?

AWS offers four primary pricing models for compute resources: On-Demand, Reserved Instances, Savings Plans, and Spot Instances. Each model is designed for different usage patterns and budget requirements. On-Demand provides pay-as-you-go flexibility, Reserved Instances and Savings Plans offer discounts for long-term commitments, and Spot Instances provide steep discounts for using spare capacity with the risk of interruptions.

How much can I save using AWS Spot Instances compared to On-Demand?

Spot Instances can achieve up to 90% savings compared to On-Demand rates. However, they come with the risk of interruptions, so they are best suited for stateless or fault-tolerant workloads.

What are the risks of using Spot Instances in AWS?

The main risk of using Spot Instances is that AWS can reclaim them at any time to fulfill rising On-Demand requests. You receive a 2-minute interruption notice before the instance is terminated. As of March 2024, only about 5% of spot instances were interrupted in the previous three months, and following best practices like diversification can reduce downtime risk.

How does ECS support Spot Instances?

Amazon ECS has built-in support for Spot Instances. It manages compute diversification across instance pools and automatically replaces interrupted Spot Instances. ECS integrates with AWS Auto Scaling Groups to handle replacement and supports automated spot instance draining for graceful shutdowns and rescheduling of tasks.

What is the benefit of mixing On-Demand and Spot Instances in ECS?

Mixing On-Demand and Spot Instances allows you to achieve significant cost savings while maintaining service availability. For example, if you overprovision an On-Demand fleet by 50% using Spot capacity, you pay only about 5-10% more than your original On-Demand cost for the extra headroom, while dynamic capacity management keeps the service available.

How do Reserved Instances and Savings Plans help optimize AWS costs?

Reserved Instances and Savings Plans provide significant discounts (often 20-58% off On-Demand prices) in exchange for committing to 1- or 3-year terms. They are ideal for steady or predictable workloads and can be combined with On-Demand and Spot for optimal cost efficiency.

What are the typical discount percentages for different AWS commitment options?

Discounts vary by payment option and term length. For example, Compute Savings Plans and Convertible RIs offer 21-50% off for 1- or 3-year terms, with higher discounts for partial or all upfront payments. Standard RIs can reach up to 58% off with a 3-year all upfront commitment.

How does ECS handle Spot Instance interruptions?

ECS integrates with AWS Auto Scaling Groups to automatically replace interrupted Spot Instances. It also supports automated draining, where tasks receive a SIGTERM and then a SIGKILL signal, allowing for graceful shutdown and rescheduling on available instances.

What is the advantage of using multiple Spot Instance pools?

Using multiple Spot Instance pools increases diversity and reduces the risk of downtime due to interruptions. The more pools you use, the less likely you are to be affected by a single pool's capacity fluctuations.

How does Fargate differ from EC2 for running ECS tasks with Spot Instances?

With EC2, you control the instance type and pools for Spot Instances, generally achieving higher discounts. In Fargate mode, AWS manages the infrastructure and Spot selection, simplifying operations but with less control over instance types.

What is the impact of a comprehensive purchasing strategy for AWS compute?

A comprehensive strategy that combines On-Demand, Spot, Reserved Instances, and Savings Plans can achieve up to a 50% reduction in overall compute costs, while balancing risk and availability for your workloads.

How does ECS automate spot instance draining?

ECS can be configured to automatically place instances in a draining state when a Spot interruption notice is received. This triggers a SIGTERM signal to running tasks, followed by a SIGKILL after 30 seconds, allowing for graceful shutdown and deregistration from load balancers.

What are best practices for minimizing Spot Instance interruptions?

Best practices include diversifying across multiple Spot Instance pools, using fault-tolerant workloads, and configuring ECS Auto Scaling Groups for automatic replacement. Following these practices can reduce the impact of interruptions, which are already infrequent (about 5% as of March 2024).

How does ECS reschedule tasks after a Spot Instance is interrupted?

When a Spot Instance is interrupted, ECS places the instance in draining mode, sends termination signals to running tasks, deregisters them from the load balancer, and attempts to reschedule them on available instances in the cluster.

What is the role of AWS Auto Scaling Groups in ECS Spot management?

AWS Auto Scaling Groups (ASG) work with ECS to manage the lifecycle of Spot Instances. When an interruption occurs, the ASG attempts to launch a replacement instance from another pool, maintaining desired capacity and availability.

How do I specify the mix of On-Demand and Spot Instances in ECS?

You can specify the weight of each capacity type (On-Demand and Spot) in your Auto Scaling Group configuration. ECS and ASG will then provision instances according to these weights, allowing you to control the cost and availability balance.

What is the difference between EC2 Spot and Fargate Spot in ECS?

EC2 Spot gives you control over instance types and pools, generally offering higher discounts but requiring more management. Fargate Spot is managed by AWS, which selects the underlying infrastructure, simplifying operations but with less control and potentially lower discounts.

How do Spot Instance pool prices and capacities fluctuate?

Spot Instance pool prices and capacities fluctuate independently based on supply and demand in each Availability Zone and instance type. Diversifying across pools helps mitigate the risk of interruptions due to these fluctuations.

What is the maximum cost reduction possible with a balanced AWS purchasing strategy?

In a scenario where On-Demand, Spot, Reserved Instances, and Savings Plans are used in equal shares at maximum discount levels, you could achieve up to a 50% reduction in overall compute costs.

Sedai Platform Features & Capabilities

What is Sedai and how does it help with cloud cost optimization?

Sedai is an autonomous cloud management platform that optimizes cloud resources for cost, performance, and availability using machine learning. It can reduce cloud costs by up to 50%, improve performance by reducing latency by up to 75%, and proactively resolve issues before they impact users.

What are the key features of Sedai's autonomous cloud optimization platform?

Sedai offers autonomous optimization, proactive issue resolution, full-stack cloud coverage (across AWS, Azure, GCP, and Kubernetes), release intelligence, enterprise-grade governance, and plug-and-play implementation. It supports multiple modes: Datapilot (observability), Copilot (one-click optimizations), and Autopilot (fully autonomous execution).

How does Sedai's platform improve application performance?

Sedai enhances application performance by reducing latency by up to 75%. For example, Belcorp achieved a 77% reduction in AWS Lambda latency using Sedai, significantly improving user experience.

What integrations does Sedai support?

Sedai integrates with monitoring and APM tools (Cloudwatch, Prometheus, Datadog, Azure Monitor), Kubernetes autoscalers (HPA/VPA, Karpenter), IaC and CI/CD tools (GitLab, GitHub, Bitbucket, Terraform), ITSM tools (ServiceNow, Jira), notification tools (Slack, Microsoft Teams), and various runbook automation platforms.

How quickly can Sedai be implemented?

Sedai's setup process takes just 5 minutes for general use cases and up to 15 minutes for specific scenarios like AWS Lambda. The platform uses agentless integration via IAM and offers comprehensive onboarding support, including one-on-one sessions and detailed documentation.

What types of companies and roles benefit most from Sedai?

Sedai is designed for platform engineering, IT/cloud ops, technology leadership, site reliability engineering (SRE), and FinOps roles in organizations with significant cloud operations. Industries include cybersecurity, IT, financial services, healthcare, travel, e-commerce, and more.

What business impact can Sedai deliver?

Sedai can reduce cloud costs by up to 50%, improve performance (up to 75% latency reduction), deliver up to 6X productivity gains, and reduce failed customer interactions by up to 50%. Customers like Palo Alto Networks saved $3.5 million, and KnowBe4 achieved 50% cost savings in production.

How does Sedai compare to other cloud optimization tools?

Sedai differentiates itself with 100% autonomous optimization, proactive issue resolution, application-aware intelligence, full-stack cloud coverage, release intelligence, and rapid plug-and-play implementation. Unlike competitors that rely on manual adjustments or static rules, Sedai continuously optimizes based on real application behavior.

What security and compliance certifications does Sedai have?

Sedai is SOC 2 certified, demonstrating adherence to stringent security and compliance standards for data protection.

What technical documentation is available for Sedai?

Sedai provides detailed technical documentation covering features, setup, and usage. Access it at docs.sedai.io/get-started. Additional resources, including case studies and datasheets, are available at sedai.io/resources.

What customer feedback has Sedai received about ease of use?

Customers highlight Sedai's quick plug-and-play setup (5–15 minutes), agentless integration, personalized onboarding, and extensive support resources. The 30-day free trial allows users to experience the platform's value risk-free.

What are the most common pain points Sedai solves for cloud teams?

Sedai addresses cost inefficiencies, operational toil, performance and latency issues, lack of proactive issue resolution, complexity in multi-cloud environments, and misaligned priorities between engineering and FinOps teams. It automates routine tasks and aligns cost and performance goals.

What industries have benefited from Sedai's platform?

Sedai's case studies span cybersecurity (Palo Alto Networks), IT (HP), financial services (Experian, CapitalOne), security awareness (KnowBe4), travel (Expedia), healthcare (GSK), car rental (Avis), retail/e-commerce (Belcorp), SaaS (Freshworks), and digital commerce (Campspot).

Can you share specific customer success stories with Sedai?

Yes. KnowBe4 achieved 50% cost savings and saved $1.2 million on AWS. Palo Alto Networks saved $3.5 million and reduced Kubernetes costs by 46%. Belcorp reduced AWS Lambda latency by 77%. See more at sedai.io/resources.

Who are some of Sedai's notable customers?

Sedai's customers include Palo Alto Networks, HP, Experian, KnowBe4, Expedia, CapitalOne Bank, GSK, and Avis. These organizations trust Sedai to optimize their cloud environments and improve operational efficiency.

What modes of operation does Sedai offer?

Sedai offers three modes: Datapilot (observability), Copilot (one-click optimizations), and Autopilot (fully autonomous execution). This flexibility allows teams to choose the level of automation that fits their needs.

How does Sedai ensure safe and auditable changes in cloud environments?

Sedai integrates with Infrastructure as Code (IaC), IT Service Management (ITSM), and compliance workflows to ensure all changes are safe, validated, and auditable. Every optimization is constrained, validated, and reversible, supporting enterprise-grade governance.

How does Sedai help align engineering and FinOps priorities?

Sedai provides actionable insights and autonomous optimization that align engineering goals (performance, reliability) with FinOps objectives (cost efficiency), helping teams achieve both without manual trade-offs.

Using Amazon ECS Spot, Savings Plan and Reserved Instances to Optimize Costs

John Jamie

Content Writer

May 10, 2024

Summary

This article explains how you can use Spot Instances, Savings Plans, and Reserved Instances with Amazon ECS to optimize cloud costs.

  • There are four AWS pricing models: On-Demand, Reserved Instances, Savings Plans, and Spot Instances, each tailored to different usage patterns and budgetary constraints.
  • Spot Instances can achieve up to 90% savings compared to On-Demand rates, provided you manage the risks associated with their potential interruptions.
  • Combine On-Demand and Spot Instances within ECS using Auto Scaling Groups to adjust dynamically to workload demands, balancing cost efficiency and service availability.
  • Reserved Instances and Savings Plans can provide discounts for long-term commitments based on your consistent workload requirements.
  • Implement a comprehensive purchasing strategy that incorporates all available options, optimizing cost efficiency while mitigating risks like Spot interruptions or overcommitments.

AWS Pricing Models 

Compute in AWS is primarily offered with four purchasing options:

  • On-Demand
  • Reserved Instances
  • Savings Plans
  • Spot Instances

On-demand instances follow the standard pay-as-you-go model, where you are billed by the second. This model requires no long-term commitment, frees you from the complexities of planning and purchasing compute, and is best suited for fluctuating workloads that need high availability.

Reserved Instances and Savings Plans are pricing models in which you commit to long-term compute usage (usually for 1 or 3 years, with varying degrees of flexibility) in exchange for significant discounts. RIs and Savings Plans are well suited to workloads with steady usage, or to covering the base load of unpredictable workloads.

Spot instances are spare compute capacity in AWS, available at steep discounts of up to 90% compared to on-demand prices. They are best suited for stateless or fault-tolerant workloads, as AWS can reclaim these instances to serve rising demand.

Spot Instances

All the purchasing options provided by AWS rely on the same underlying EC2 infrastructure, which behaves the same way regardless of how it is purchased. Spot is no exception.

Spot instances are idle EC2 instances that are not being used to fulfill on-demand requests, and are therefore offered at discounts of up to 90% off on-demand prices. Spot prices vary with time and demand.

These price benefits come with a catch. Spot instances can be interrupted to fulfill rising on-demand requests. The instance is given a 2-minute notice, after which it is terminated. Fault-tolerant or stateless workloads therefore work best with spot instances, as an interruption has little to no impact on them.

AWS spot capacity is divided into spot instance pools. All spot instances of a specific instance type running in an Availability Zone (AZ) constitute a spot instance pool. For example, all c5.xlarge spot instances running in us-east-1a form one spot instance pool, while all c5.2xlarge spot instances in the same AZ form another. Likewise, if you use the same instance type in three different AZs, you are consuming capacity from three different spot pools.
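As a rough illustration of the pool definition above, the following Python sketch enumerates pools as (instance type, AZ) pairs. The function name `spot_pools` and the extra AZ names are assumptions made for the example, not part of any AWS API.

```python
from itertools import product


def spot_pools(instance_types, availability_zones):
    """Return every (instance type, AZ) pair; each pair is one spot pool."""
    return list(product(instance_types, availability_zones))


# Two instance types across three AZs give six independent pools.
pools = spot_pools(
    ["c5.xlarge", "c5.2xlarge"],
    ["us-east-1a", "us-east-1b", "us-east-1c"],
)
for instance_type, az in pools:
    print(f"{instance_type} in {az}")
print(len(pools))
```

Adding one more instance type or one more AZ multiplies the number of pools available to the workload, which is the arithmetic behind the diversification advice.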

The prices and capacities of these pools fluctuate independently of each other. Tying this back to spot interruptions: the more instance pools you use, the more diversified you are, and the less downtime you face when demand increases. The "don't put all your eggs in one basket" principle applies to spot.

As of March 2024, with the capabilities now available, data shows that spot interruptions have become fairly infrequent: only 5% of spot instances were interrupted in the preceding three months. The more closely we follow best practices and diversify properly, the better spot as a whole will function.

Spot on ECS

ECS comes with built-in support for spot instances. Everything from compute diversification across instance pools to automatic replacement of interrupted spot instances is taken care of by ECS.

A task in ECS can be launched in two ways:

  • You can run it on an EC2 instance, where you have full control over the underlying instance type and operating system.
  • Or you can run it in Fargate mode, where AWS relieves you of the operational burden of maintaining servers.

In both approaches, you can use spot capacity to cut costs significantly, with EC2 spot generally providing higher discounts than Fargate Spot. EC2 spot also requires you to choose the backing instance pools, whereas with Fargate, AWS makes that choice.

ECS automates spot lifecycle management by integrating with AWS Auto Scaling Groups. When there is an interruption, the ASG tries to provide a replacement instance from another spot instance pool, depending on your configuration. For most fault-tolerant workloads, replacing an interrupted instance is enough to ensure availability.

ECS also supports automated spot instance draining. This can be enabled by passing a parameter to the ECS container agent via user data of your container instance. 
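The parameter in question is the container agent setting `ECS_ENABLE_SPOT_INSTANCE_DRAINING`, written to `/etc/ecs/ecs.config`. The Python sketch below renders a user data script that sets it. The helper name `render_user_data` and the cluster name are hypothetical, and the sketch assumes an ECS-optimized AMI where the agent reads that file at startup.

```python
def render_user_data(cluster_name: str) -> str:
    """Build a user data script that enables spot instance draining.

    Assumes an ECS-optimized AMI, where the container agent reads
    /etc/ecs/ecs.config when it starts.
    """
    lines = [
        "#!/bin/bash",
        # Register the instance with the intended cluster.
        f"echo ECS_CLUSTER={cluster_name} >> /etc/ecs/ecs.config",
        # Ask the agent to drain the instance on a spot interruption notice.
        "echo ECS_ENABLE_SPOT_INSTANCE_DRAINING=true >> /etc/ecs/ecs.config",
    ]
    return "\n".join(lines) + "\n"


# "my-spot-cluster" is a placeholder cluster name.
user_data = render_user_data("my-spot-cluster")
print(user_data)
```

The rendered script would typically be supplied as the user data of the launch template behind the Auto Scaling Group.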

Once enabled, ECS places instances in a draining state when it receives the 2-minute spot interruption notice. All tasks running on these instances are first sent a SIGTERM signal, followed by a SIGKILL signal 30 seconds later. This window lets you stop your application gracefully, or do some last-minute log collection. ECS also deregisters these tasks from the load balancer target group, while trying to reschedule them on the remaining available instances.
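To benefit from the window between SIGTERM and SIGKILL, the application inside the container has to handle SIGTERM itself. The following is a minimal Python sketch of one way to do that. The names `run_worker`, `process_one_item`, and `flush_logs` are hypothetical, and the cleanup is assumed to finish well within the 30 seconds mentioned above.

```python
import signal
import threading

# Set when SIGTERM arrives, so the main loop can exit cleanly.
shutdown_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Record the shutdown request; the actual cleanup runs in the main loop."""
    shutdown_requested.set()


def install_sigterm_handler():
    """Register the handler; call this once from the main thread at startup."""
    signal.signal(signal.SIGTERM, handle_sigterm)


def run_worker(process_one_item, flush_logs, poll_seconds=1.0):
    """Process work until SIGTERM is received, then flush logs and return."""
    while not shutdown_requested.is_set():
        process_one_item()
        # Sleep between items, but wake immediately if shutdown is requested.
        shutdown_requested.wait(poll_seconds)
    # Last-minute cleanup before the SIGKILL deadline.
    flush_logs()
```

A container entrypoint would call `install_sigterm_handler()` first and then `run_worker(...)` with its own work and cleanup callables. Keeping the handler itself trivial and doing the cleanup in the main loop avoids running slow code inside a signal handler.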

Mixing On-Demand and Spot

A mix of on-demand and spot capacity can bring considerable savings while ensuring availability. For example, suppose you have a fleet running entirely on on-demand instances, which you want to overprovision by 50%. By using spot capacity for the overprovisioning, you only need to pay 5-10% more than your original cost.
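The 5-10% figure follows from simple arithmetic: the extra capacity costs the overprovisioned fraction multiplied by whatever share of the on-demand price spot still charges. The sketch below works this through in Python. The 80% lower bound on the spot discount is an assumption inferred from the quoted range; the article itself only states "up to 90%".

```python
def overprovision_extra_cost(overprovision_fraction, spot_discount):
    """Extra cost, as a fraction of the original on-demand bill, of adding
    `overprovision_fraction` more capacity bought at `spot_discount` off."""
    return overprovision_fraction * (1.0 - spot_discount)


# 50% extra capacity, bought on spot at an assumed 80% and at 90% discount.
for discount in (0.80, 0.90):
    extra = overprovision_extra_cost(0.50, discount)
    print(f"Spot discount {discount:.0%}: extra cost {extra:.0%}")
```

The same 50% of headroom bought on-demand would add 50% to the bill, so the saving comes entirely from where the headroom is purchased.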

With ECS, it is easy to achieve dynamic capacity type splits. You specify the weight of each capacity type, and the backing ASG handles the rest. For example, if you give on-demand a weight of two and spot a weight of three, and the ASG provisions 10 instances, then four of those instances will be on-demand and six will be spot.
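The split in this example is plain proportional arithmetic, sketched below in Python. The function `split_by_weight` is a hypothetical helper for illustration only; in particular, how it hands out leftover instances is an assumption and not a description of how ECS or the ASG actually rounds.

```python
def split_by_weight(total_instances, weights):
    """Split `total_instances` across capacity types in proportion to `weights`."""
    total_weight = sum(weights.values())
    counts = {
        name: total_instances * weight // total_weight
        for name, weight in weights.items()
    }
    # Illustrative rounding rule (an assumption): give any leftover
    # instances to the highest-weight capacity types first.
    leftover = total_instances - sum(counts.values())
    for name in sorted(weights, key=weights.get, reverse=True)[:leftover]:
        counts[name] += 1
    return counts


# Weights of 2 (on-demand) and 3 (spot) over 10 instances.
print(split_by_weight(10, {"on_demand": 2, "spot": 3}))
```

With weights of 2 and 3, on-demand receives 2/5 of the fleet and spot receives 3/5, which for 10 instances is the four and six described above.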

Savings Plans and Reserved Instances

Choosing Savings Plans and Reserved Instances can provide further savings. Savings Plans accommodate more flexible usage patterns than Reserved Instances. Below is a comparison of effective discounts relative to on-demand prices for a range of purchase options.

Source: Mark Butcher via LinkedIn

Overall Impacts of Purchasing Strategy

Your final implementation should take all these options into consideration while keeping their potential downsides in mind. You can run a dynamic mix of on-demand and spot capacity in which spot interruptions have minimal impact, alongside multi-year commitments sized from a proper understanding of your workload requirements.

As a very rough scenario, a mix of on-demand, spot, Reserved Instances, and Savings Plans in equal shares, each at its maximum discount level, could achieve about a 50% reduction in overall cost.
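The rough 50% figure can be reproduced as a weighted average of discounts. The Python sketch below uses the maximum discounts quoted in this article: up to 90% for spot, 58% for a 3-year all-upfront Standard RI, and 50% for a 3-year all-upfront Compute Savings Plan. Which exact values the original estimate used is an assumption, and `blended_discount` is a hypothetical helper.

```python
def blended_discount(mix):
    """Weighted-average discount for a mix of {name: (share, discount)}."""
    total_share = sum(share for share, _ in mix.values())
    weighted = sum(share * discount for share, discount in mix.values())
    return weighted / total_share


# Equal shares of each purchasing option, each at its maximum discount.
mix = {
    "on_demand": (0.25, 0.00),
    "spot": (0.25, 0.90),
    "standard_ri_3yr_all_upfront": (0.25, 0.58),
    "compute_sp_3yr_all_upfront": (0.25, 0.50),
}
print(f"Blended discount: {blended_discount(mix):.1%}")
```

Under these assumptions the blended discount comes to roughly half of the on-demand bill. Real savings depend on how much of the workload can actually tolerate spot interruptions and on how accurately the commitments match steady usage.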

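| Type | Compute SP for EC2, 1yr | Compute SP for EC2, 3yr | EC2 Convertible RI, 1yr | EC2 Convertible RI, 3yr | EC2 SP, 1yr | EC2 SP, 3yr | EC2 Standard RI, 1yr | EC2 Standard RI, 3yr |
|---|---|---|---|---|---|---|---|---|
| No upfront | 21% | 45% | 21% | 45% | 31% | 52% | 31% | 52% |
| Partial upfront | 24% | 49% | 25% | 49% | 34% | 55% | 34% | 55% |
| All upfront | 26% | 50% | 26% | 50% | 36% | 58% | 36% | 58% |

Discounts from On-Demand Prices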