
The Seven Failure Modes of a Stuck Delivery System

27 Oct 2025 · 9 min

Every stuck team says the same things. Standups mention blockers. Retrospectives surface communication problems. Leadership adds oversight. Nothing improves.

The conversations are the same because the symptom — slow delivery — is the same. But the data underneath looks different depending on where the failure actually lives. A team in a rework loop and a team with an approval bottleneck both have long cycle times and frustrated engineers. They need different first moves.

The seven failure modes below are not exhaustive. They are the recognisable patterns that appear most consistently — each with a data signature distinct enough to identify, and a first intervention specific enough to act on.

Why the first move matters

All seven failure modes share the same root mechanism: a queue forms because throughput at some stage is insufficient relative to the inflow. The mechanism is shared; the cause differs.
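
To make the mechanism concrete, a toy simulation (all numbers invented): when six items arrive at a constrained stage each week and it can only finish four, the queue grows by two per week, whichever failure mode caused the shortfall.

```python
# Toy model: a queue grows whenever inflow exceeds throughput,
# regardless of which failure mode caused the shortfall.
# All numbers here are invented for illustration.
inflow_per_week = 6      # items arriving at the constrained stage
throughput_per_week = 4  # items the stage can actually finish

queue = 0
for week in range(1, 9):
    queue += inflow_per_week
    queue -= min(queue, throughput_per_week)
    print(f"week {week}: queue = {queue}")
# The queue grows by 2 per week. Only raising throughput or cutting
# inflow changes the slope; everything else rearranges the queue.
```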

Hiring testers into a rework loop does not fix the rework loop. It gives you more testers watching defective code cycle back to development. Adding a WIP limit to an approval bottleneck does not fix the approval bottleneck. It reduces the queue in front of a gate that is still just as slow.

The right first move is the one that addresses the actual cause of the queue in question. Getting this right requires knowing which failure mode you are in.

[Figure: seven thumbnail charts, one per failure mode. Intake Flood: WIP rising. Batch Factory: long dev phase. Rework Loop: multi-modal distribution. Approval Bottleneck: one stage dominates. Dependency Gridlock: blocked rising. Context-Switch Swamp: high WIP, low output. Speciation Creep: unstable, no cause.]
Visual fingerprints of the seven failure modes. Each has a distinct cycle time or WIP pattern that distinguishes it from the others — the basis for the correct first intervention.

The seven modes

Intake flood. More work starts than finishes each week. WIP rises. Items age across all stages simultaneously — no single stage stands out, the whole system is congested. The data signature is a rising WIP trend and a flat or declining throughput. The first move is a hard limit on new starts. Stop taking on new work until the WIP comes down. This is politically uncomfortable and structurally necessary.
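
A minimal sketch of how to check for this signature, assuming you can export a start date and (where applicable) a finish date per item; the data and helper below are illustrative, not a real tracker export:

```python
# Sketch: detect intake flood as a persistently positive net WIP
# change per week (starts minus finishes). Dates are invented.
from datetime import date

items = [  # (started, finished); finished is None if still in flight
    (date(2025, 9, 1), date(2025, 9, 10)),
    (date(2025, 9, 2), None),
    (date(2025, 9, 8), None),
    (date(2025, 9, 9), date(2025, 9, 30)),
    (date(2025, 9, 15), None),
    (date(2025, 9, 16), None),
]

def week(d):
    return d.isocalendar()[:2]  # (ISO year, ISO week)

starts, finishes = {}, {}
for started, finished in items:
    starts[week(started)] = starts.get(week(started), 0) + 1
    if finished:
        finishes[week(finished)] = finishes.get(week(finished), 0) + 1

for wk in sorted(set(starts) | set(finishes)):
    print(wk, "net WIP change:", starts.get(wk, 0) - finishes.get(wk, 0))
# Week after week of positive net change, with flat finishes, is the
# intake flood signature.
```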

Batch factory. Items are large. The development stage dominates cycle time. Releases are infrequent. The distribution is wide but the long tail is in dev, not review or test. Completion events are clustered — quiet for weeks, then several items finishing at once. The first move is to break items smaller. The improvement in cycle time from splitting a two-week item into four three-day items is larger than most process changes can deliver.
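
The arithmetic behind that claim, with illustrative numbers: one ten-working-day item delivers nothing until day ten, while four three-day slices done one after another finish at days 3, 6, 9, and 12, so average time-to-value drops to 7.5 days even though total elapsed time grows.

```python
# Illustrative arithmetic: average time-to-value for one large item
# vs. the same scope split into sequential slices. Durations invented.
def avg_time_to_value(durations):
    finish_times, elapsed = [], 0
    for d in durations:
        elapsed += d
        finish_times.append(elapsed)
    return sum(finish_times) / len(finish_times)

print(avg_time_to_value([10]))          # one 10-day item -> 10.0 days
print(avg_time_to_value([3, 3, 3, 3]))  # four 3-day slices -> 7.5 days
```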

Rework loop. Items cycle between development and review or test repeatedly. The cycle time distribution is multi-modal — a cluster around typical completions and a heavy tail of items that took three or four times as long. The data shows items revisiting "in dev" after having been in "in review." The first move is to address the definition of ready or the review quality — items are entering review before they are ready to pass it.

[Diagram: Dev → Review → Test → Done, with reject → rework arrows returning items from Review (×1, +3d per loop) and Test (×2, +2d per loop) back into development.]
The rework loop: items rejected at review return to development, accumulating 3–5 days of cost per cycle. A multi-modal cycle time distribution is the data signature of this pattern.
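
One way to surface the loop in data, assuming per-item status histories can be exported (the stage names and histories below are illustrative):

```python
# Sketch: count backward transitions (e.g. "in review" -> "in dev")
# in each item's status history. Histories are invented.
STAGE_ORDER = {"in dev": 0, "in review": 1, "in test": 2, "done": 3}

histories = {
    "ITEM-1": ["in dev", "in review", "in dev", "in review", "in test", "done"],
    "ITEM-2": ["in dev", "in review", "in test", "done"],
}

for item, history in histories.items():
    loops = sum(
        1
        for prev, curr in zip(history, history[1:])
        if STAGE_ORDER[curr] < STAGE_ORDER[prev]
    )
    print(item, "rework loops:", loops)
# A cluster of items with zero loops plus a heavy tail of items with
# one or more is what produces the multi-modal distribution.
```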

Approval bottleneck. One stage dominates age. Development, test, and deployment are healthy. Review, sign-off, or release gate is where items sit. The age distribution is narrow for most stages and wide for one. The queue depth at that stage is consistently high relative to others. The first move is targeted at the specific gate: add capacity, reduce the scope of what requires approval, or change the routing.
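
A quick check for this signature, assuming you can export each item's current stage and how long it has sat there (items and ages below are illustrative):

```python
# Sketch: per-stage age and queue depth, to spot a single dominant
# gate. Data is invented.
from statistics import median

ages = [  # (current stage, days in that stage)
    ("dev", 3), ("dev", 4), ("review", 14), ("review", 18),
    ("review", 11), ("test", 2), ("deploy", 1),
]

by_stage = {}
for stage, days in ages:
    by_stage.setdefault(stage, []).append(days)

for stage, days in by_stage.items():
    print(f"{stage}: median age {median(days)}d, queue depth {len(days)}")
# One stage with a much higher median age and deeper queue than the
# rest is the approval-bottleneck signature.
```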

Dependency gridlock. Items are blocked waiting for something external — another team, a platform service, a sign-off from outside the delivery system. Blocked item count rises while active item health is normal. The first move is to make the dependencies visible and explicit: which items are blocked, what they are blocked on, and who controls the blocker. Then address the upstream dependency — either decouple, pre-fetch, or negotiate a service agreement.
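
A sketch of the visibility step, assuming blocked items can be tagged with what they wait on and who controls it (records below are illustrative):

```python
# Sketch: group blocked items by blocker and owner so the biggest
# upstream dependency is obvious. Records are invented.
blocked = [
    {"item": "ITEM-7",  "blocked_on": "platform API", "owner": "Platform team"},
    {"item": "ITEM-9",  "blocked_on": "platform API", "owner": "Platform team"},
    {"item": "ITEM-12", "blocked_on": "security sign-off", "owner": "Security"},
]

report = {}
for rec in blocked:
    report.setdefault((rec["blocked_on"], rec["owner"]), []).append(rec["item"])

for (blocker, owner), items in sorted(report.items(), key=lambda kv: -len(kv[1])):
    print(f"{blocker} ({owner}): {len(items)} blocked -> {', '.join(items)}")
# The blocker holding the most items is the first one to decouple,
# pre-fetch, or negotiate.
```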

Context-switch swamp. Many parallel tracks. High per-person WIP. Each developer is working on four or five items simultaneously. Throughput is low not because any stage is slow but because no one is focused long enough to complete anything. The data shows many items in early stages, few in late stages, and long elapsed time per item despite low stage time. The first move is to reduce parallel tracks, not add capacity. More people working on more things in parallel makes this worse.
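
Per-person WIP is straightforward to compute from in-progress assignments; a minimal sketch with invented names:

```python
# Sketch: per-person WIP from current in-progress assignments.
# Names and counts are invented.
from collections import Counter

in_progress = ["ana", "ana", "ana", "ana", "ana",
               "ben", "ben", "ben",
               "cho", "cho", "cho", "cho", "cho"]

for person, count in Counter(in_progress).most_common():
    print(f"{person}: {count} items in parallel")
# Four or five items per person, combined with many items in early
# stages and few in late ones, is the context-switch-swamp signature.
```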

Speciation creep. The actual workflow has diverged from the stated workflow. Items follow paths not captured in the board structure. Statuses are used inconsistently. Process behaviour chart analysis shows instability without any obvious assignable cause — the system behaves differently from week to week without a clear external reason. The first move is to audit actual versus stated workflow: shadow a few items through their real journey and compare it to what the board shows. The discrepancy is the problem.
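
The audit can be partly automated, assuming status histories are available: compare the transitions items actually made against the transitions the board allows. The stated workflow and histories below are illustrative.

```python
# Sketch: diff observed transitions against the stated workflow.
# Stated edges and histories are invented.
STATED = {("dev", "review"), ("review", "test"), ("test", "done")}

histories = [
    ["dev", "review", "test", "done"],
    ["dev", "test", "done"],            # skipped review
    ["dev", "review", "dev", "done"],   # rework, then skipped test
]

observed = set()
for history in histories:
    observed.update(zip(history, history[1:]))

for edge in sorted(observed - STATED):
    print("undocumented transition:", edge)
# Every edge the board does not model is a speciation-creep
# discrepancy worth shadowing by hand.
```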

[Chart: age in stage (days, 0–20d scale) for Dev, Review, Test, Deploy under three failure modes. Intake Flood: 8d, 6d, 7d, 5d. Approval Bottleneck: 3d, 16d, 2d, 1d, with the 16d in review highlighted. Rework Loop: 12d, 5d, 8d, 1d.]
Where age accumulates differs by failure mode. Intake Flood spreads across all stages. Approval Bottleneck concentrates in Review. Rework Loop loads Dev and Test simultaneously.

Insight

Each failure mode has exactly one thing to fix first. The mistake is applying the wrong first move because the symptoms look similar.

Identifying yours

Five diagnostic questions narrow the field quickly.

Is WIP rising week over week? If yes, you are likely in intake flood or context-switch swamp.

Does one stage consistently have more aged items than others? If yes, look at approval bottleneck or rework loop.

Are items cycling back to earlier stages? If yes, rework loop.

Are blocked items rising while active items are healthy? If yes, dependency gridlock.

Does the board accurately reflect how work actually moves? If no, speciation creep.
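
The five questions above reduce to a rough triage; a sketch, not a classifier, with yes/no answers as inputs:

```python
# Sketch: the five diagnostic questions as a triage function.
# The mapping follows the questions above; treat the output as a
# shortlist to verify against the data signatures, not a verdict.
def triage(wip_rising, one_stage_aged, items_cycle_back,
           blocked_rising, board_matches_reality):
    candidates = []
    if not board_matches_reality:
        candidates.append("speciation creep")
    if items_cycle_back:
        candidates.append("rework loop")
    if blocked_rising:
        candidates.append("dependency gridlock")
    if one_stage_aged and not items_cycle_back:
        candidates.append("approval bottleneck")
    if wip_rising:
        candidates.append("intake flood or context-switch swamp")
    return candidates or ["no clear signature: check batch size"]

print(triage(wip_rising=True, one_stage_aged=False, items_cycle_back=False,
             blocked_rising=False, board_matches_reality=True))
```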

Most teams are in exactly one of these modes. A few are in two simultaneously, usually intake flood plus one other. The sequence matters: intake flood is always addressed first, because flooding a system that has another failure mode makes both worse.

Every stuck team feels the same. The data shows seven different problems.