Systems & Transformation

Aggregation Lies: What Your Portfolio Dashboard Isn't Telling You

4 Dec 2025·7 min

The portfolio average held steady at 12 days through a quarter in which one team's cycle time ballooned to 22 days and another's improved to 5. The deterioration and the improvement cancelled each other in the average. The dashboard showed no change. Leadership saw no change. The conversation that needed to happen didn't happen.

This is not a story about a badly designed dashboard. It is a story about what averages do when applied to heterogeneous populations — and they do it reliably, every time, by definition.

The mathematics of cancellation

An average describes a population by its central value. When the population is homogeneous — all teams working on similar things at similar scales with similar processes — the central value is meaningful. When the population is heterogeneous, the central value describes a point that may not correspond to any actual team.

Two teams with cycle times of 5 days and 22 days have an average of 13.5 days. Neither team operates at 13.5 days. The average is accurate and useless simultaneously.

The failure becomes more expensive when the heterogeneity involves opposing trends. A team improving from 18 to 12 days moves the average by the same amount as a team deteriorating from 10 to 16 days, in the opposite direction. In a portfolio of four teams where the other two hold steady, these two cancel. The portfolio average barely moves, and with equal-sized teams it does not move at all. The dashboard is quiet. One team is in trouble and one is succeeding, and neither fact surfaces.
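The cancellation can be checked with a few lines of arithmetic. This is a minimal sketch: the two moving teams use the figures from the paragraph above, while teams C and D and their values are hypothetical, and every team is assumed to carry equal weight.

```python
from statistics import mean

# Mean cycle time in days per team. A and B follow the figures in the text;
# C and D are hypothetical teams that do not change during the quarter.
before = {"A": 18, "B": 10, "C": 13, "D": 11}
after = {"A": 12, "B": 16, "C": 13, "D": 11}

portfolio_before = mean(list(before.values()))
portfolio_after = mean(list(after.values()))

# Per-team change: negative means faster, positive means slower.
deltas = {team: after[team] - before[team] for team in before}

print("portfolio average:", portfolio_before, "->", portfolio_after)
print("per-team change:", deltas)
```

The portfolio line is identical before and after, while the per-team changes show a 6-day improvement and a 6-day deterioration.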

A fast team and a slow team combine into a distribution that looks unremarkable. Neither team's situation is visible in the combined view.

Three failure patterns

Opposing tails. The clearest case: one high-performing team and one struggling team average to a number that looks unremarkable. As long as the fast team's gains roughly match the slow team's losses, the average stays flat no matter how far apart the two drift. The average is actively concealing the problem.

Composition drift. The mix of work types across the portfolio changes: more large features this quarter, more bugs last quarter. Average cycle time shifts not because any team's performance changed but because the work composition changed. Leadership interprets it as a performance signal when it is a structural one. Decisions made on this signal misfire in a predictable way: they respond to a performance change that never happened.

Size weighting. When the average is computed across work items, a large team with slow cycle time dominates it. Small teams with fast cycle times contribute little to the weighted mean. The fast teams are invisible in portfolio discussions. The slow large team shapes the narrative even when it is improving.
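Composition drift and size weighting can both be reproduced with simple arithmetic. The sketch below is illustrative only: the work types, item counts, team names, and cycle times are all made up, and the portfolio average is assumed to be computed per work item.

```python
from statistics import mean

# --- Composition drift ---
# Hypothetical cycle time (days) per work type, identical in both quarters.
CYCLE_TIME = {"bug": 3.0, "feature": 20.0}

def portfolio_average(mix):
    """Item-weighted average cycle time for a count of items per work type."""
    total_items = sum(mix.values())
    return sum(CYCLE_TIME[kind] * count for kind, count in mix.items()) / total_items

last_quarter = {"bug": 80, "feature": 20}  # hypothetical item counts
this_quarter = {"bug": 40, "feature": 60}

drift_before = portfolio_average(last_quarter)
drift_after = portfolio_average(this_quarter)

# --- Size weighting ---
# Hypothetical teams: (items completed, mean cycle time in days).
teams = {"large": (120, 20.0), "small_1": (15, 5.0), "small_2": (15, 6.0)}

# Average over work items: the large team contributes 120 of 150 items.
item_weighted = sum(n * ct for n, ct in teams.values()) / sum(n for n, _ in teams.values())
# Average over teams: each team counts once, regardless of size.
mean_of_teams = mean([ct for _, ct in teams.values()])

print(f"composition drift: {drift_before:.1f} -> {drift_after:.1f} days")
print(f"item-weighted: {item_weighted:.1f} days, mean of team means: {mean_of_teams:.1f} days")
```

With these numbers the average roughly doubles, from 6.4 to 13.2 days, although no work type got slower. The item-weighted figure (17.1 days) sits close to the large team's 20 days, while the mean of team means is about 10.3 days.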

The portfolio average (bold) looks stable. Behind it, Team A is deteriorating and Team B is improving. Neither signal is visible in the aggregate.

What to look at instead

The replacement is not a better average. It is the individual team distributions, surfaced separately.

Each team's cycle time distribution — shown as a scatter, histogram, or percentile band — tells you whether that team is stable, improving, or deteriorating. Four small charts showing individual team health are more informative than one combined metric, because they can be different from each other.
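One way to produce a per-team percentile band is sketched below. The choice of the 50th and 85th percentiles is an assumption made for illustration, not something the approach requires, and the team names and samples are hypothetical.

```python
from statistics import quantiles

def percentile_band(cycle_times):
    """Return (p50, p85) in days for one team's completed items.

    The p50/p85 pair is an illustrative choice. At least two data
    points are needed.
    """
    # n=100 yields 99 cut points; index 49 is the 50th percentile,
    # index 84 is the 85th.
    cuts = quantiles(cycle_times, n=100, method="inclusive")
    return cuts[49], cuts[84]

# Hypothetical cycle times (days) for completed items, one list per team.
samples = {
    "Team A": [4, 5, 5, 6, 7, 5, 4, 6],
    "Team B": [9, 14, 22, 30, 12, 25, 18, 27],
}

bands = {team: percentile_band(times) for team, times in samples.items()}

for team, (p50, p85) in bands.items():
    print(f"{team}: p50={p50:.1f} days, p85={p85:.1f} days")
```

Each team gets its own line, so a tight band and a wide one stay visibly different instead of being merged into one number.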

The portfolio view should answer: which teams are healthy, which are not, and is the unhealthy state getting better or worse? A single average cannot answer these questions. Individual distributions can.

The operational challenge is real: a portfolio of 15 teams produces 15 distributions. The practical solution is to define a stability threshold and surface the teams outside it: not an average, but a status. Teams within normal operating range don't need attention. Teams outside it do.
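A status rule of this kind might look like the sketch below. The definition of "stable" used here (recent 85th percentile within 25% of the team's own baseline) is an assumption chosen for illustration, as are the function names and the data. A real threshold would need to be tuned per portfolio.

```python
from statistics import quantiles

def p85(cycle_times):
    """85th percentile of a list of cycle times, in days."""
    return quantiles(cycle_times, n=100, method="inclusive")[84]

def team_status(baseline, recent, tolerance=0.25):
    """Compare a team's recent p85 with its own baseline p85.

    `tolerance` is a hypothetical threshold: relative moves larger than
    this in either direction are reported instead of "stable".
    """
    base, now = p85(baseline), p85(recent)
    change = (now - base) / base
    if change > tolerance:
        return "deteriorating"
    if change < -tolerance:
        return "improving"
    return "stable"

# Hypothetical (baseline, recent) cycle times in days for three teams.
portfolio = {
    "Team A": ([10, 11, 9, 12, 10, 11], [20, 22, 19, 24, 21, 23]),
    "Team B": ([14, 15, 13, 16, 15, 14], [5, 6, 5, 4, 6, 5]),
    "Team C": ([8, 9, 8, 10, 9, 8], [8, 9, 9, 10, 8, 9]),
}

statuses = {team: team_status(base, recent) for team, (base, recent) in portfolio.items()}
needs_attention = [team for team, status in statuses.items() if status != "stable"]

print(statuses)
print("surface these:", needs_attention)
```

Only the teams outside their own normal range are surfaced, so the view scales to 15 teams without collapsing them into one number.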

Aggregation is only safe when both team homogeneity and work type similarity are high. In most portfolios, neither precondition holds.

Warning

An average across heterogeneous teams is not a summary. It is noise with a number attached — and the noise will cancel exactly the signals you most need to see.

When aggregation is legitimate

Aggregating cycle time across teams is valid when the preconditions hold: teams are working on similar types of work, at similar scales, through similar processes. In that case the teams are drawing from approximately the same distribution and the average describes that shared distribution reasonably well.

The preconditions are worth checking explicitly before trusting any aggregate. They are rarely examined and frequently violated.
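A rough first check can be automated. The sketch below only tests whether team medians are close to each other, using a hypothetical ratio limit of 1.5. It says nothing about work type or process similarity, which still need a human look, and the function name and data are illustrative assumptions.

```python
from statistics import median

def medians_look_similar(team_samples, max_ratio=1.5):
    """Return True if the slowest team's median is within `max_ratio`
    of the fastest team's median.

    `max_ratio` is an arbitrary illustrative limit, not an established rule.
    """
    medians = [median(times) for times in team_samples.values()]
    return max(medians) / min(medians) <= max_ratio

# Hypothetical cycle times (days) per team.
similar_teams = {"A": [9, 10, 11, 12], "B": [10, 11, 12, 13], "C": [8, 10, 11, 12]}
mixed_teams = {"A": [4, 5, 5, 6], "B": [18, 22, 24, 26]}

print(medians_look_similar(similar_teams))
print(medians_look_similar(mixed_teams))
```

A failed check is a reason to stop trusting the aggregate. A passed check is only a reason to go on and examine the other preconditions.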

The portfolio average was fine. The portfolio was not fine. These two facts are compatible.