Kubernetes Taints and Tolerations: Best Practices

Benjamin Thomas

CTO

May 26, 2026

Key Takeaways

  • Taints repel pods from nodes; tolerations let specific pods bypass that repulsion. Misconfigured tolerations are the upstream cause of GPU underutilization and bin-packing breakdown.
  • The CNCF 2025 Annual Survey found 82% of container users run Kubernetes in production, up from 66% in 2023. Placement is now a platform-wide concern, not a per-team one.
  • Kubernetes v1.35 (released December 2025) introduced numeric toleration operators (Gt, Lt) in alpha, expanding placement logic beyond exact-match keys to threshold-based decisions.
  • The three taint effects (NoSchedule, PreferNoSchedule, NoExecute) are not interchangeable; treating them as policy levels rather than knobs prevents a class of placement bugs that only surface under traffic pressure.

If you run a mixed Kubernetes cluster with GPU inference services sitting alongside latency-sensitive APIs, batch jobs & spot-eligible workloads, taints and tolerations are the configuration layer holding all of it together. Or breaking it. One blanket toleration on an older internal service, added by someone who left the team two years ago, and suddenly your API pods are landing on GPU nodes, your ML inference jobs are contending for capacity they were supposed to own, and your cost attribution report no longer reflects which team consumed which capacity.

That scenario is not unusual. The CNCF 2025 Annual Survey found 82% of container users now run Kubernetes in production, with 56% of organizations using containers for most or all production applications. At that scale, workload placement is a platform-wide concern. A taint misconfiguration in one deployment spec is a cluster-wide failure waiting for a traffic spike.

This article covers what taints and tolerations actually do, how the Kubernetes scheduler uses them, what built-in system taints look like & where misconfigured tolerations cost real money. The consistent example throughout: a mixed EKS cluster with nvidia.com/gpu=present:NoSchedule on GPU nodes and spot=true:NoSchedule on spot nodes, and a legacy internal service with a blanket toleration that matches both. The downstream consequences of that single misconfiguration touch bin-packing, GPU utilization, SLO stability & cost attribution at once.

Summary Table

What is a Kubernetes taint?

A key=value:effect rule on a node that tells the scheduler which pods are not allowed to land there unless they tolerate it.

What is a Kubernetes toleration?

A matching key/value pair on a pod spec that lets the pod bypass a node's taint. It permits placement; it doesn't force it.

What are the three taint effects?

NoSchedule (hard reject), PreferNoSchedule (soft reject), NoExecute (reject new pods and evict already-running pods that don't tolerate the taint).

When do misconfigured tolerations cost real money?

When a blanket toleration lets non-GPU pods land on GPU nodes, non-prod pods land on dedicated nodes, or sticky pods evade spot eviction. All three break bin-packing or attribution.

How do taints differ from node affinity?

Taints repel (node-side); affinity attracts (pod-side). The cleanest production posture uses both: taints to protect nodes, and affinity to direct pods.

What's new in Kubernetes 2026?

v1.35 introduced numeric toleration operators (Gt, Lt) in alpha, enabling threshold-based placement rather than just exact-match keys.

Answer Capsule: What Are Kubernetes Taints and Tolerations?

Kubernetes taints and tolerations are the scheduler controls that decide which pods are allowed to land on which nodes. A taint is a key/value/effect rule applied to a node that repels pods; a toleration is a matching declaration on a pod that lets it bypass that repulsion. The three taint effects (NoSchedule, PreferNoSchedule, NoExecute) determine how strictly the rule is enforced. Per the Kubernetes scheduling and eviction documentation, this mechanism is the primary way to keep workloads off the wrong nodes, including GPU nodes, spot capacity, and dedicated tenants.

How are Taints and Tolerations Configured?

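A taint is a property set on a node (stored in the node's spec.taints, separately from its labels) that tells the scheduler, "Don't place pods here unless they explicitly say they can handle this." The syntax is key=value:effect, where the value is optional and the effect is one of three values: NoSchedule, PreferNoSchedule, or NoExecute. You apply a taint to a node with kubectl taint nodes <node-name> key=value:NoSchedule and remove it by appending a hyphen: kubectl taint nodes <node-name> key=value:NoSchedule-.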

A toleration is the pod-side counterpart. It says, "I can tolerate this taint. Place me on tainted nodes if needed." A toleration must match the taint's key, value (or use the Exists operator to match any value), and effect. Critically, a toleration permits placement on a tainted node; it does not force it. That distinction matters when you're debugging unexpected pod placement.
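The matching rule can be sketched in a few lines of Python. This is an illustrative model of the documented Equal/Exists behavior, not the scheduler's actual code; the Taint and Toleration classes and the tolerates function are names made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taint:
    key: str
    effect: str          # "NoSchedule", "PreferNoSchedule" or "NoExecute"
    value: str = ""


@dataclass(frozen=True)
class Toleration:
    key: str = ""              # empty key + Exists matches every taint key
    operator: str = "Equal"    # "Equal" (the default) or "Exists"
    value: str = ""
    effect: str = ""           # empty effect matches every effect


def tolerates(tol: Toleration, taint: Taint) -> bool:
    """Return True if this single toleration matches this single taint."""
    if tol.effect and tol.effect != taint.effect:
        return False
    if tol.key and tol.key != taint.key:
        return False
    if tol.operator == "Exists":
        return True
    if tol.operator == "Equal":
        # An empty key is only meaningful together with Exists.
        return bool(tol.key) and tol.value == taint.value
    return False


# The two taints from the EKS example in this article.
gpu_taint = Taint("nvidia.com/gpu", "NoSchedule", "present")
spot_taint = Taint("spot", "NoSchedule", "true")

exact = Toleration(key="nvidia.com/gpu", operator="Equal",
                   value="present", effect="NoSchedule")
blanket = Toleration(operator="Exists")   # the legacy service's wildcard

print(tolerates(exact, gpu_taint), tolerates(exact, spot_taint))      # True False
print(tolerates(blanket, gpu_taint), tolerates(blanket, spot_taint))  # True True
```

The exact toleration unlocks only the GPU taint, while the keyless Exists toleration matches both taints, which is the blanket behavior the rest of this article warns about.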

The three effects are not interchangeable. NoSchedule is a hard constraint: the scheduler rejects the pod if no toleration matches, but pods already running on the node stay where they are. PreferNoSchedule is a soft constraint: the scheduler avoids tainted nodes but uses them if no better node is available. NoExecute is the most aggressive: it blocks new pods and immediately evicts already-running pods that don't tolerate it. A pod that does tolerate a NoExecute taint stays indefinitely unless its toleration sets tolerationSeconds, in which case it is evicted once that many seconds have passed. The Kubernetes docs define this as the primary mechanism for keeping workloads off the wrong nodes.
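The NoExecute timing is the part most often misread, so here is a small sketch of it as documented by Kubernetes: a pod that does not tolerate the taint is evicted immediately, and tolerationSeconds only bounds how long a tolerating pod may stay. The function name eviction_delay is hypothetical.

```python
from typing import Optional


def eviction_delay(tolerated: bool, toleration_seconds: Optional[int]) -> Optional[float]:
    """Seconds a running pod stays after a NoExecute taint lands on its node.

    0.0  -> evicted immediately (the pod does not tolerate the taint)
    None -> never evicted by this taint (tolerated, no tolerationSeconds)
    N    -> evicted after N seconds (tolerated with tolerationSeconds: N)
    """
    if not tolerated:
        return 0.0
    if toleration_seconds is None:
        return None
    # Zero and negative values are treated as "evict immediately".
    return float(max(toleration_seconds, 0))


print(eviction_delay(False, None))   # 0.0
print(eviction_delay(True, None))    # None
print(eviction_delay(True, 300))     # 300.0
```

The middle case is the one to watch: a toleration without tolerationSeconds keeps the pod on the node for as long as the taint remains.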

How Does the Scheduler Use Taints at Placement Time?

The Kubernetes scheduler runs two phases for every pod: filtering and scoring. Filtering eliminates nodes that can't host the pod (insufficient resources, taint mismatches, policy violations). Matching for NoSchedule and NoExecute taints happens in filtering, so an untolerated taint with either effect is a hard exclusion, not a preference; PreferNoSchedule is the exception and is weighed during scoring. This is why Kubernetes cluster scaling challenges often trace back to placement intent: a cluster can have available CPU and memory on nodes, but if the taint topology says "inference only," those resources are invisible to every other workload class.

Tolerations permit placement but never attract a pod to a tainted node: the only taint-related input to scoring is a penalty for PreferNoSchedule taints the pod doesn't tolerate. Without a matching nodeSelector or node affinity rule, a toleration-carrying pod may land anywhere in the eligible set. The cost of a bad placement shows up in the Google SRE Book's four golden signals: latency, errors, traffic & saturation. A misplaced API on a GPU node doesn't fail immediately. It fails when the GPU workload saturates the node's memory bus and p99 latency drifts past SLO at the next traffic surge.
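A simplified model of the two phases makes the eligible-set problem concrete. This sketch covers taints only and ignores every other filter and scoring input the real scheduler uses; the node names and the maintenance taint key are hypothetical.

```python
HARD_EFFECTS = {"NoSchedule", "NoExecute"}


def tolerates(tol: dict, taint: dict) -> bool:
    """Equal/Exists matching; an empty key or effect on the toleration is a wildcard."""
    if tol.get("effect") and tol["effect"] != taint["effect"]:
        return False
    if tol.get("key") and tol["key"] != taint["key"]:
        return False
    if tol.get("operator", "Equal") == "Exists":
        return True
    return bool(tol.get("key")) and tol.get("value", "") == taint.get("value", "")


def untolerated(taints: list, tolerations: list, effects: set) -> list:
    """Taints with one of the given effects that no toleration matches."""
    return [t for t in taints if t["effect"] in effects
            and not any(tolerates(tol, t) for tol in tolerations)]


def schedule(nodes: dict, tolerations: list) -> list:
    """Return eligible node names, best first (taint logic only)."""
    # Filtering: any untolerated hard taint removes the node outright.
    eligible = [name for name, taints in nodes.items()
                if not untolerated(taints, tolerations, HARD_EFFECTS)]
    # Scoring: fewer untolerated PreferNoSchedule taints ranks higher.
    return sorted(eligible, key=lambda name: len(
        untolerated(nodes[name], tolerations, {"PreferNoSchedule"})))


nodes = {
    "gpu-1": [{"key": "nvidia.com/gpu", "value": "present", "effect": "NoSchedule"}],
    "spot-1": [{"key": "spot", "value": "true", "effect": "NoSchedule"}],
    "general-1": [],
    "general-2": [{"key": "maintenance", "value": "soon", "effect": "PreferNoSchedule"}],
}
gpu_only = [{"key": "nvidia.com/gpu", "operator": "Equal",
             "value": "present", "effect": "NoSchedule"}]

print(schedule(nodes, []))                        # ['general-1', 'general-2']
print(schedule(nodes, gpu_only))                  # ['gpu-1', 'general-1', 'general-2']
print(schedule(nodes, [{"operator": "Exists"}]))  # ['gpu-1', 'spot-1', 'general-1', 'general-2']
```

Note the second result: the GPU toleration makes gpu-1 eligible but does not rank it above general-1. Only a nodeSelector or node affinity rule does that.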

What Taints Does Kubernetes Apply on Its Own?

Kubernetes applies several built-in taints automatically, and they control behavior you rely on even if you've never manually set a taint. The well-known labels, annotations, and taints reference documents the full list. The most operationally significant ones:

  • node.kubernetes.io/not-ready: applied when a node's Ready condition is False. Pods receive a default toleration of 300 seconds.
  • node.kubernetes.io/unreachable: applied when the node controller can't reach the node. The default toleration is also 300 seconds.
  • node.kubernetes.io/memory-pressure: applied when memory pressure is detected.
  • node.kubernetes.io/disk-pressure: applied when disk pressure is detected.
  • node.kubernetes.io/unschedulable: applied when a node is cordoned.
  • node.kubernetes.io/network-unavailable: applied when the node network isn't configured.

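These system taints matter for one specific reason: if your pod spec carries a toleration with operator: Exists and no key, it tolerates every system taint as well as every user-defined taint. The pressure taints use the NoSchedule effect, so the scheduler can keep placing the pod on a node that is already reporting memory or disk pressure. (Tolerations do not shield a pod from kubelet node-pressure eviction, which is a separate mechanism.) The legacy service in the EKS example almost certainly has this problem. Without tolerationSeconds, a wildcard Exists toleration also matches node.kubernetes.io/not-ready and node.kubernetes.io/unreachable indefinitely, so the pod stays bound to a failed node instead of being evicted after the default 300 seconds. And because it matches node.kubernetes.io/unschedulable, replacement pods can be scheduled onto cordoned nodes during a rolling node update, adding pressure exactly when the cluster needs to drain quickly.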

Where Do Misconfigured Tolerations Break Cluster Efficiency?

The EKS cluster from the intro has two user-defined taints protecting two capacity pools: nvidia.com/gpu=present:NoSchedule on GPU nodes and spot=true:NoSchedule on spot nodes. The legacy service's wildcard toleration defeats both. Four distinct failures follow.

How Does A Blanket Toleration Cause GPU Underutilization?

API pods land on GPU nodes because nothing blocks them. They occupy allocatable CPU & memory while the GPU hourly rate accumulates for work that never touches the GPU. The inference service can't scale cleanly: its pods contend with the API pods for the same nodes or wait for new nodes to be provisioned. For the full cost mechanics, see GPU resource management in Kubernetes.

Why Do Wildcard Tolerations Cause P99 Latency Drift?

The spot=true:NoSchedule taint provides no protection when the pod carries a wildcard toleration, so latency-sensitive pods from the legacy service end up on spot nodes. When a spot node is reclaimed, those pods reschedule wherever the scheduler finds room, often back on GPU nodes. Each interruption triggers a round of rescheduling, and it tends to happen during the same traffic surge that caused the spot scale-out, which is when p99 latency has the least headroom.

How Does Toleration Drift Break Cost Attribution?

Cost attribution in a Kubernetes cluster depends on workload-to-node mapping. When API pods run on GPU nodes, the GPU line item in your billing export is no longer "the ML team's." Finance & FinOps can't attribute it correctly, & every cost-reduction initiative starts from wrong baseline data. Improving bin-packing in Kubernetes requires a reliable workload-to-cost mapping, & misattributed spend removes that baseline.

Why Does A Single Blanket Toleration Break Bin-Packing Cluster-Wide?

The scheduler fills GPU nodes with API pods because they fit. The node appears busy. Karpenter or Cluster Autoscaler doesn't see room to consolidate. The cluster runs at apparent high utilization, while actual work output per dollar is low. This is the bin-packing failure that taints were designed to prevent. One blanket toleration undoes the entire design.

How Do Taints Differ From Node Affinity & Pod Affinity?

Taints and tolerations operate on a repulsion model: the node pushes away pods that don't belong. Node affinity operates on an attraction model: the pod pulls itself toward nodes that match. They're complementary, not alternatives.

| Mechanism | Side | Direction | Strength |
|---|---|---|---|
| Taint | Node | Repels pods | Hard (NoSchedule) or soft (PreferNoSchedule) |
| Toleration | Pod | Permits despite repulsion | Unlocks eligibility |
| Node affinity | Pod | Attracts to matching nodes | Required or preferred |
| Pod affinity | Pod | Co-locates with other pods | Required or preferred |

The production pattern that works: taint the nodes to protect them (NoSchedule), then use node affinity on the pods to direct them toward the right nodes. Taints without affinity let tolerating pods land anywhere in the eligible set. Affinity without taints lets other pods compete for the protected capacity. You need both.
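As a sketch of the pod side of that pattern, the helper below builds a pod-spec fragment that both tolerates the GPU taint and requires a GPU node label, then prints it as JSON (which kubectl accepts alongside YAML). The function name and the node-pool=gpu-inference label are assumptions for illustration; substitute whatever label your GPU nodes actually carry.

```python
import json


def gpu_placement(taint_key: str, taint_value: str,
                  label_key: str, label_value: str) -> dict:
    """Pod-spec fragment: tolerate the GPU taint AND require the GPU node label."""
    return {
        # Toleration: unlocks the tainted nodes, nothing more.
        "tolerations": [{
            "key": taint_key,
            "operator": "Equal",
            "value": taint_value,
            "effect": "NoSchedule",
        }],
        # Required node affinity: restricts the pod to the labeled nodes.
        "affinity": {
            "nodeAffinity": {
                "requiredDuringSchedulingIgnoredDuringExecution": {
                    "nodeSelectorTerms": [{
                        "matchExpressions": [{
                            "key": label_key,
                            "operator": "In",
                            "values": [label_value],
                        }],
                    }],
                },
            },
        },
    }


# "node-pool=gpu-inference" is a hypothetical node label.
fragment = gpu_placement("nvidia.com/gpu", "present", "node-pool", "gpu-inference")
print(json.dumps(fragment, indent=2))
```

The two keys belong under spec in a Pod, or under spec.template.spec in a Deployment. The toleration uses an exact key, value, and effect rather than a keyless Exists, so it unlocks one taint only.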

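For spot capacity, pair taints with tolerationSeconds on the pod side. tolerationSeconds only takes effect on tolerations for NoExecute taints, so it does nothing on a toleration for spot=true:NoSchedule; it applies to a NoExecute taint placed on the node when an interruption notice arrives (for example, by an interruption handler). Batch jobs that can absorb pre-emption tolerate that taint with tolerationSeconds: 30 to drain inside the AWS two-minute spot interruption window. Latency-sensitive workloads should carry no spot toleration at all.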

What Are the Best Practices for Taints and Tolerations?

Most taint problems aren't introduced in the initial cluster setup. They accumulate over time as teams add taints for new node pools, forget to update namespace-level defaults, or inherit deployments from other teams without auditing their toleration specs. The practices below close the most common drift vectors.

How Should You Name Taint Keys?

Use workload-centric key names, not organizational ones. workload=ml-inference:NoSchedule ages better than team=ml:NoSchedule. Orgs restructure; workloads stay. Namespace your keys in multi-tenant clusters: inference.platform.io/gpu=present:NoSchedule is unambiguous two years later; single-word keys like gpu=present collide when multiple teams manage node pools.

The AWS EKS scheduling best practices documentation recommends the <domain>/<key>=<value> pattern for managed Kubernetes environments. The cost of a naming convention is low. A naming collision that routes the wrong workload to a GPU node is not.
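A naming convention is easy to enforce mechanically. The sketch below is a hypothetical lint helper, not a Kubernetes API: it flags taint keys that lack a domain prefix and checks that the key follows the qualified-name shape Kubernetes uses for keys (an optional DNS-subdomain prefix, a slash, and a name of at most 63 characters).

```python
import re

# Name part: alphanumeric at both ends; dashes, underscores, dots inside; max 63 chars.
NAME = re.compile(r"[A-Za-z0-9]([A-Za-z0-9_.-]{0,61}[A-Za-z0-9])?")
# Prefix part: lowercase DNS subdomain made of dot-separated labels.
PREFIX = re.compile(r"[a-z0-9]([a-z0-9-]*[a-z0-9])?(\.[a-z0-9]([a-z0-9-]*[a-z0-9])?)*")


def lint_taint_key(key: str) -> list:
    """Return a list of convention problems for a taint key (empty list = OK)."""
    problems = []
    prefix, sep, name = key.rpartition("/")
    if not sep:
        problems.append("no domain prefix; may collide with other teams' keys")
    elif len(prefix) > 253 or not PREFIX.fullmatch(prefix):
        problems.append("prefix is not a valid DNS subdomain")
    if not NAME.fullmatch(name):
        problems.append("name part is not a valid label name")
    return problems


for key in ("inference.platform.io/gpu", "nvidia.com/gpu", "gpu"):
    print(key, lint_taint_key(key))
```

Run against the keys from this section, the two namespaced keys pass and the bare gpu key is flagged for its missing prefix.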

When Should You Use NoSchedule vs. NoExecute?

Use NoSchedule for placement intent: you want new pods to avoid this node unless they belong here. It's the right choice for GPU nodes, dedicated tenant nodes, and spot pools. Existing workloads on the node are unaffected.

Use NoExecute for hard isolation: you want existing pods that don't tolerate the taint to be evicted. Apply it deliberately. Adding NoExecute to a running node pool without verifying that all existing pods tolerate it triggers mass eviction.

PreferNoSchedule is a scoring hint. The scheduler avoids tainted nodes but uses them under pressure. Don't substitute it for NoSchedule when the boundary is real.

How Often Should You Audit Tolerations?

Quarterly, scoped to long-lived deployments. A service deployed 18 months ago may have a toleration that matched the cluster topology at the time but no longer matches it. List all pods with non-trivial tolerations and verify that each toleration has a corresponding taint in the current cluster. Blanket tolerations (operator: Exists without a key) are the ones that bite; orphaned tolerations for deleted taints are harmless but signal drift.
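One way to run that audit is to feed the parsed output of kubectl get pods -A -o json and kubectl get nodes -o json into a short script. The sketch below assumes exactly that input shape; the function name, the sample pods, and the old-pool key are hypothetical.

```python
def audit_tolerations(pod_list: dict, node_list: dict) -> list:
    """Return (namespace, pod, finding) tuples for blanket and orphaned tolerations."""
    # Every taint key currently present on any node.
    live_keys = {
        taint["key"]
        for node in node_list.get("items", [])
        for taint in node.get("spec", {}).get("taints") or []
    }
    findings = []
    for pod in pod_list.get("items", []):
        meta = pod["metadata"]
        for tol in pod.get("spec", {}).get("tolerations") or []:
            key = tol.get("key", "")
            if not key and tol.get("operator") == "Exists":
                findings.append((meta["namespace"], meta["name"], "blanket toleration"))
            # Skip node.kubernetes.io/* keys: those tolerations are injected by default.
            elif key and not key.startswith("node.kubernetes.io/") and key not in live_keys:
                findings.append((meta["namespace"], meta["name"],
                                 f"orphaned toleration: {key}"))
    return findings


# Tiny stand-ins for the two kubectl JSON documents.
pods = {"items": [
    {"metadata": {"namespace": "legacy", "name": "internal-api"},
     "spec": {"tolerations": [{"operator": "Exists"}]}},
    {"metadata": {"namespace": "ml", "name": "inference"},
     "spec": {"tolerations": [
         {"key": "nvidia.com/gpu", "operator": "Equal",
          "value": "present", "effect": "NoSchedule"},
         {"key": "node.kubernetes.io/not-ready", "operator": "Exists",
          "effect": "NoExecute", "tolerationSeconds": 300}]}},
    {"metadata": {"namespace": "batch", "name": "report"},
     "spec": {"tolerations": [{"key": "old-pool", "operator": "Exists"}]}},
]}
nodes = {"items": [
    {"spec": {"taints": [{"key": "nvidia.com/gpu", "value": "present",
                          "effect": "NoSchedule"}]}},
    {"spec": {}},
]}

for finding in audit_tolerations(pods, nodes):
    print(finding)
```

Expect some legitimate hits: node-level agents deployed as DaemonSets often carry blanket tolerations on purpose, so treat the output as a review list rather than a list of violations.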

How Should You Handle GPU and Spot Taints?

Treat GPU and spot taints as cost boundaries, not routing hints. Any pod carrying a toleration for nvidia.com/gpu=present:NoSchedule should be reviewed: does it actually need GPU capacity? If not, remove the toleration. Spot instance workloads in Kubernetes demand the same discipline: not all workloads belong on spot, and the toleration spec is how you enforce that in the scheduler. Batch jobs set tolerationSeconds: 30 on their tolerations for NoExecute spot-eviction taints to drain fast; latency-sensitive services carry no spot toleration at all.

For Karpenter-managed clusters, scaling GPU nodes with Karpenter depends on the taint/toleration topology being correct. Karpenter provisions the right node type only if the taint on the provisioned node matches the toleration on the pending pod. A wildcard toleration gives Karpenter no signal to prefer GPU over general-purpose instances.

What's New in Kubernetes v1.35: Numeric Toleration Operators

Kubernetes v1.35, released in December 2025, introduced numeric toleration operators (Gt and Lt) as an alpha feature. Before v1.35, tolerations matched taints on exact key/value pairs or key existence only. Numeric operators let you express threshold-based placement logic: a pod that needs more than 24GB of GPU memory carries tolerations: [{key: "gpu-memory", operator: "Gt", value: "24"}], which matches any node where the gpu-memory taint value exceeds 24. This eliminates the need to maintain a separate taint for every GPU tier and to give pods a specific toleration for each.

This is alpha in v1.35. The API surface may change before beta promotion. Don't build production placement logic on Gt/Lt operators until they stabilize, but the feature solves a real combinatorial-taint-proliferation problem for GPU-heavy clusters.
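The comparison itself is simple. The sketch below models it as this article describes it (Gt matches when the taint value exceeds the toleration value, and Lt is the mirror image); it is not the scheduler's implementation, and the choice to treat non-integer values as a non-match is an assumption of this example.

```python
def numeric_tolerates(operator: str, toleration_value: str, taint_value: str) -> bool:
    """Gt: taint value > toleration value. Lt: taint value < toleration value."""
    try:
        want, have = int(toleration_value), int(taint_value)
    except ValueError:
        # Assumption for this sketch: non-integer values never match.
        return False
    if operator == "Gt":
        return have > want
    if operator == "Lt":
        return have < want
    raise ValueError(f"not a numeric operator: {operator}")


# Toleration {key: "gpu-memory", operator: "Gt", value: "24"} against three GPU tiers.
for taint_value in ("16", "24", "40"):
    print(taint_value, numeric_tolerates("Gt", "24", taint_value))
# 16 False
# 24 False
# 40 True
```

Because Gt is a strict comparison, a node tainted gpu-memory=24 is not matched by value: "24"; a pod that should accept that tier would need a lower threshold.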

Placement Rules That Protect GPU Capacity & Bin-Packing Efficiency

See how Sedai uses application-aware workload placement to continuously protect GPU capacity, reduce bin-packing drift & eliminate hidden toleration failures before they impact production.

How Sedai Optimizes Within Toleration Boundaries

The Challenge: When Optimization Tools Don't Know What Tolerations Mean

Most cloud-cost and rightsizing tools treat the cluster as a flat pool of CPU and memory. They recommend shrinking pods, consolidating onto fewer nodes, or moving workloads to cheaper instance families, without reading the taint/toleration map.

The result is the same failure mode in three flavors: a "savings" recommendation that violates placement intent (moving an inference pod off a GPU node it actually needs), one that can't apply because the toleration doesn't permit the move, or one that applies cleanly but breaks SLOs at the next traffic spike. Each pattern traces back to the same root cause: the optimization layer doesn't read the taint/toleration map before generating recommendations.

Sedai's Approach: Application-Aware Optimization That Respects Placement Intent

The taint/toleration topology in your cluster expresses placement intent. Any optimization layer that ignores it isn't safe to run in production. Sedai treats that topology as a first-class input before making any autonomous decision.

Sedai provides autonomous, application-aware optimization, built on a fundamentally different model from threshold-based tools. The distinction between autonomy and automation is that automation fires on static rules, while autonomous optimization reads real application behavior & adapts. For Kubernetes specifically, Sedai treats the taint/toleration topology as a constraint every optimization decision (rightsizing, node-pool selection, bin-packing) must satisfy. Sedai reads the four golden signals (latency, errors, traffic & saturation) through each cluster's native control plane, then applies Smart Rollback: incremental change, post-change verification against pre-change baselines, & automatic reversal on regression.

The Outcome: Real Customer Savings at Production Scale

Palo Alto Networks used Sedai to save $3.5 million on back-end services while keeping real-time anomaly response intact. Across all customers, Sedai has executed over 25 million autonomous actions in production with zero incidents. Clean tolerations widen Sedai's safe action space; misconfigured ones narrow it. Book a demo to see Sedai run in your environment.

How Teams Run Cleaner Workload Placement at Scale

Palo Alto Networks ran a mixed back-end at scale where placement intent mattered as much as resource limits. Sedai operated within those constraints to deliver autonomous savings without taking workloads off their intended capacity. 

"Sedai has helped us save millions of dollars by optimizing & managing our own back-end services. But most importantly, what Sedai has done very well is allow us to respond in real time when anomalies are detected." 

— Suresh Sangiah, Senior Vice President of Engineering, Palo Alto Networks.

Where Workload Placement Becomes a Cost Decision

Taints and tolerations are not a routing layer. They are the configuration that decides whether bin-packing is even possible, whether GPU spend is contained or sprawled & whether the optimization layer above can make sound decisions or just generates noise.

The platform team that audits tolerations quarterly, treats NoExecute as a hard isolation primitive rather than a knob & names taint keys for workloads (not orgs) buys back optimization headroom that no rightsizing tool can recover after the fact. A blanket toleration that matches every taint in the cluster is a tax on every optimization decision that follows.

Placement intent is upstream of every cost decision. Get the hygiene right, and the optimization layer earns its keep. Get it wrong, and even an autonomous platform is bound by the action space your tolerations permit. The Kubernetes resource optimization guide covers resource sizing and pool selection; this article covered the placement rules that gate both.

FAQs About Kubernetes Taints and Tolerations

What Is the Difference Between a Kubernetes Taint and a Toleration?

A taint is a key/value/effect rule on a node that repels pods. A toleration is a matching declaration on a pod spec that lets the pod bypass that repulsion. The taint sets the node-side constraint; the toleration is the pod-side permission. A toleration permits placement on a tainted node but doesn't force it. The scheduler still scores all eligible nodes to pick the best fit.

What Are the Three Taint Effects, and When Should You Use Each?

NoSchedule is a hard constraint: new pods without a matching toleration are rejected. Use it for GPU nodes, spot pools & dedicated tenant capacity. PreferNoSchedule is a soft hint: the scheduler avoids the node but uses it under pressure. NoExecute adds eviction: already-running pods that don't tolerate the taint are evicted immediately, and pods that tolerate it with tolerationSeconds are evicted once that period expires. Use it only for hard isolation or node decommissioning; it is the strictest enforcement level.

What Are the Default Taints Kubernetes Applies Automatically?

Kubernetes applies node condition taints automatically. The six most common are node.kubernetes.io/not-ready, unreachable, memory-pressure, disk-pressure, unschedulable & network-unavailable. A pod with operator: Exists and no key tolerates all of them, meaning it can be scheduled onto memory-pressured or cordoned nodes and stays on not-ready or unreachable nodes longer than intended. Always review wide-scope tolerations against the full well-known taint list before deploying to a production cluster.

How Do Taints and Tolerations Affect Bin-Packing and Cost?

Misconfigured tolerations let the scheduler place pods on nodes they shouldn't occupy. GPU nodes fill with non-GPU workloads, spot nodes carry latency-sensitive services & general-purpose nodes go underutilized, shrinking the scheduler's room to consolidate. Cost attribution breaks too: when API pods run on GPU nodes, that spend is no longer attributable to the ML team. Clean taints and tight tolerations are the prerequisite for any bin-packing improvement.

Can a Pod Have Multiple Tolerations?

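A pod can carry any number of tolerations, each evaluated independently. To schedule on a tainted node, the pod must tolerate every NoSchedule and NoExecute taint on that node. If a node has two such taints and the pod tolerates only one, the pod is rejected. This additive model means adding a new NoSchedule taint to a node pool is safe: pods without the new toleration are excluded from future placement, even if they matched existing taints, and running pods are left alone. The two caveats are NoExecute taints, which evict running pods that don't tolerate them, and blanket tolerations, which match the new taint automatically.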

How Does Sedai Use Taints and Tolerations in Its Optimization?

Sedai reads the cluster's taint/toleration topology as a first-class input before making any optimization decision. Every autonomous action (rightsizing, node-pool selection, consolidation) is constrained to the action space the toleration map permits. Sedai does not move pods to nodes they can't tolerate. This is the practical difference between autonomous, application-aware optimization and threshold-based automation that fires without reading cluster topology.

Sources

  1. CNCF, CNCF Annual Survey 2025 (2026): https://www.cncf.io/wp-content/uploads/2026/01/CNCF_Annual_Survey_Report_final.pdf
  2. Kubernetes, Taints and Tolerations: Scheduling and Eviction (2025): https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
  3. Kubernetes, Kubernetes v1.35: Numeric Toleration Operators (2026): https://kubernetes.io/blog/2026/01/05/kubernetes-v1-35-numeric-toleration-operators/
  4. Kubernetes, Well-Known Labels, Annotations, and Taints (2025): https://kubernetes.io/docs/reference/labels-annotations-taints/
  5. Google SRE Book, Monitoring Distributed Systems: The Four Golden Signals (2016): https://sre.google/sre-book/monitoring-distributed-systems/
  6. AWS, Amazon EKS Best Practices: Scheduling (2025): https://docs.aws.amazon.com/eks/latest/best-practices/scheduling.html