Most teams know which FinOps maturity stage they're in. What they don't know is why they can't leave it.
The FinOps Foundation's crawl-walk-run model describes what each stage looks like. That definitional work is useful. But practitioners don't need a better description of the stages. They need to understand the specific technical blockers that prevent moving between them.
These blockers are technical, not cultural. Culture and buy-in matter, but they cannot overcome a tooling gap. Each transition stalls because of a precise engineering gap: wrong data at crawl, wrong signal at walk, and no continuity at run.
- Crawl: The Attribution Gap
- Walk: The Trust Problem
- Run: The Continuity Problem
- Beyond the FinOps Maturity Model: Optimization as a System Property
Crawl: The Attribution Gap
Billing exports and Cost Explorer allocate spend to accounts and services. They do not attribute spend to the pods, containers, or functions actually consuming resources.
Tag coverage is incomplete. Not because teams are careless, but because tagging discipline is hard to enforce across engineering teams with different deployment patterns. When tags are inconsistent, attribution is approximate. When attribution is approximate, prioritization is guesswork.
Namespace-level metrics are missing. A team can see that an EKS cluster costs $40,000 per month, but cannot determine which workloads or application teams are responsible for that spend. Without workload-level attribution, you cannot have a credible conversation with engineering about where to focus.
"Good enough" means: 80% of controllable cluster spend is traced to the controller (Deployment, StatefulSet, Job) level, refreshed daily. That threshold is where you stop guessing and start profiling. Reaching it requires tooling beyond billing exports, because billing APIs do not expose workload-level metrics by default.
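To make that 80% threshold concrete, here is a minimal sketch, in Python with invented pod names, controller names, and cost figures, of rolling per-pod cost up to the controller level and measuring what fraction of spend is actually attributable:

```python
from collections import defaultdict

# Hypothetical per-pod daily cost records, as an attribution pipeline
# might assemble them. "owner" is the controller (Deployment,
# StatefulSet, Job) behind the pod; None means unattributable.
pod_costs = [
    {"pod": "api-7f9c-1", "owner": "Deployment/api", "cost": 41.20},
    {"pod": "api-7f9c-2", "owner": "Deployment/api", "cost": 39.80},
    {"pod": "etl-28471", "owner": "Job/nightly-etl", "cost": 18.50},
    {"pod": "debug-pod", "owner": None, "cost": 6.10},
]

def controller_attribution(records):
    """Roll pod-level cost up to controller level and report the
    attributable fraction of total spend."""
    by_controller = defaultdict(float)
    unattributed = 0.0
    for r in records:
        if r["owner"] is None:
            unattributed += r["cost"]
        else:
            by_controller[r["owner"]] += r["cost"]
    total = unattributed + sum(by_controller.values())
    coverage = sum(by_controller.values()) / total if total else 0.0
    return dict(by_controller), coverage

costs, coverage = controller_attribution(pod_costs)
print(costs)
print(f"attributable: {coverage:.0%}")  # compare against the 80% bar
```

In a real pipeline, the owner would be resolved from each pod's ownerReferences chain, and per-pod cost derived from node pricing apportioned by resource requests; the rollup logic stays this simple.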
Walk: The Trust Problem
Teams at walk have data. They cannot act safely on it.
The blocker: wrong signal. Average CPU and memory utilization tells you what a workload consumes under normal conditions. It does not tell you what happens at the 95th or 99th percentile of traffic. It does not account for seasonality, batch job bursts, or the behavior of a service hours after a new deploy.
A cold start on a fresh deploy or a batch job spike ten minutes after a config change can cascade into latency breaches or error spikes that a month of average-based optimization never saw coming. A configuration that looks safe at mean utilization can push latency past SLO thresholds under peak load. That's why every rightsizing action still requires human review, a change window, and a rollback plan.
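A toy example shows how the mean hides exactly the behavior that breaks a rightsizing decision. The numbers are invented: a workload idling around 0.4 cores with a brief nightly batch burst:

```python
import statistics

# Synthetic CPU samples (cores): 300 one-minute readings, mostly a
# steady baseline with a short burst from a nightly batch job.
samples = [0.4] * 280 + [0.45] * 15 + [1.8] * 5

def percentile(data, p):
    """Nearest-rank percentile; good enough for this illustration."""
    ordered = sorted(data)
    k = max(0, min(len(ordered) - 1, round(p / 100 * len(ordered)) - 1))
    return ordered[k]

mean = statistics.fmean(samples)
p99 = percentile(samples, 99)

# A mean-driven rightsizing pass might set the CPU limit to
# mean * 1.3 "for headroom" -- and still sit far below the real peak.
proposed_limit = mean * 1.3
print(f"mean={mean:.2f}, p99={p99:.2f}, limit={proposed_limit:.2f}")
print("limit covers p99?", proposed_limit >= p99)  # False
```

The mean says the workload needs about 0.43 cores; the p99 says 1.8. Any limit derived from the mean gets throttled the moment the batch job fires, which is why the signal, not the approval process, is the thing to fix.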
The approval loop is the symptom. The blocker is the absence of the right signal.
For that to change, the system making the recommendation needs to observe application behavior before acting. Not average utilization: traffic patterns, seasonal variance, and p99 latency. As covered in The Hard Truth: FinOps Inform Doesn't Pay the Bills, a recommendation is not a result. The execution gap is what keeps teams at walk.
Run: The Continuity Problem
Teams at run execute optimization. The blocker: they execute it once, then move on.
Rightsizing becomes a project, not a practice. A team runs an exercise, captures savings, and closes the ticket. Then a single Friday afternoon deploy (a code change, a replica addition, a new data dependency) invalidates weeks of manual work.
The half-life of a one-time rightsizing exercise is measured in weeks, not months. Three months later, the application has been redeployed multiple times, traffic patterns have shifted, and the configuration that was right-sized is now wrong-sized again. McKinsey found that only 10% of cloud transformations achieve full value. Optimization durability and continuous re-evaluation are the core challenges separating leaders from laggards in this space.
Continuous re-evaluation requires a different technical capability than periodic review. It requires the system to observe workload behavior after every deploy and reassess resource configuration against current behavior, not the snapshot from the last optimization pass. Most teams at run have tooling that supports review-and-act. What they lack is tooling that observes and re-evaluates continuously.
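A minimal sketch of what that re-evaluation trigger could look like, with invented numbers and a hypothetical tolerance parameter:

```python
# Hypothetical drift check a continuous optimizer might run after each
# deploy: compare current observed usage against the baseline snapshot
# the last rightsizing pass was computed from.

def needs_reevaluation(baseline_p99, observed_p99, tolerance=0.20):
    """Flag re-evaluation when p99 usage drifts more than `tolerance`
    (relative) from the snapshot the current config assumed."""
    if baseline_p99 <= 0:
        return True  # no trustworthy baseline: re-evaluate
    drift = abs(observed_p99 - baseline_p99) / baseline_p99
    return drift > tolerance

# Snapshot from the last optimization pass vs. behavior after a deploy
# that added a new data dependency (all figures invented).
print(needs_reevaluation(baseline_p99=0.6, observed_p99=0.62))  # False
print(needs_reevaluation(baseline_p99=0.6, observed_p99=1.1))   # True
```

The point is the trigger, not the math: re-evaluation fires on behavioral shift, after every deploy, rather than on a calendar cycle.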
For a broader look at what optimization tactics look like at this stage, Top 17 FinOps Cloud Optimization Strategies for 2026 covers the full range. The pattern across all of them is the same: episodic optimization fails because the system it optimized has since changed.
The difference between a team that optimizes and a team that stays optimized is this: the first treats optimization as an event. The second has made it part of how the system operates.
Beyond the FinOps Maturity Model: Optimization as a System Property
What distinguishes teams that reach "optimized" is not better tooling selection at the start. It is treating optimization as an ongoing operational function. Cloud spend is not a state to be achieved. It is a variable that shifts with every change in workload behavior.
Closing the loop between workload observation, configuration change, and outcome validation requires a system that acts continuously. That loop cannot run on human approval cycles at scale, at least not for every change. Beyond Recommendations: The Case for Autonomous Cloud Optimization makes this case in full.
Each blocker has a technical solution.
- At crawl-to-walk, the answer is workload-level attribution using application-aware observability, connecting resource consumption to the namespace, service, and workload generating it.
- At walk-to-run, the blocker is the signal. The solution is observing application behavior (traffic patterns, seasonality, p99 latency) before acting. Safety verification moves from post-deployment (rollback) to pre-deployment (signal-driven decision).
- At run-to-optimize, the blocker is episodic action. The solution is continuous re-evaluation triggered by behavioral shifts, not calendar cycles. Re-evaluation after every deploy keeps configuration aligned to current workload behavior, not the snapshot from yesterday's optimization pass.
Sedai: Removing Each Blocker
Sedai removes each blocker through workload-level attribution, signal-aware decisioning, and continuous re-evaluation.
- Crawl→Walk: Application-aware observability traces resource consumption to workloads, enabling controller-level attribution without perfect tagging.
- Walk→Run: Observation of traffic patterns and p99 latency before acting, with incremental, reversible changes.
- Run→Optimize: Continuous re-evaluation after every deploy, keeping configuration aligned to current behavior, not yesterday's snapshot.
