Process Optimization

Only by increasing flow through the constraint can overall throughput be increased – Eliyahu M. Goldratt, The Goal, 1984

Turning this around, we get the corollary that optimizations made to areas other than the constraint only serve to worsen the constraint.

Agile practices, and particularly Scrum and SAFe, have been widely adopted in large enterprises over the last decade or so, spawning entirely new departments dedicated to locking down the process. Agile Centers of Excellence (CoEs) produce materials explicitly describing every aspect of their company’s specific flavour of Agile, and have coaches dedicated to enforcing it across the teams.

In my opinion, this goes against the very essence of agile, but that’s a topic for another day.

The challenge I see with centralized process control like Agile CoEs is that they can only optimize the process for the departments under their influence, and many enterprises have separate departments for testing, deployments (change management), and tooling. These departments have distinct reporting hierarchies outside of the Agile CoE’s jurisdiction.

As such, the Agile CoE focuses on the areas it can control: requirements and development. It completely ignores what happens after the code is written because it has little or no authority over that part of the value stream.

Two-week sprints are pretty standard in most large enterprises, but if any of the downstream phases take more than two weeks, then the Theory of Constraints tells us that those phases become the bottleneck. In Lean Manufacturing terms, we’ll have inventory (code) piling up at the door faster than the trucks can deliver it.
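To make the arithmetic concrete, here is a minimal sketch with invented numbers. It assumes development finishes one batch of work every two-week sprint and that a single downstream test-and-deploy phase needs three weeks per batch. The function name and both durations are my own assumptions for illustration, not measurements from any real organization.

```python
def undelivered_batches(weeks, sprint_weeks=2, downstream_weeks=3):
    """Count batches produced but not yet delivered after `weeks` weeks.

    Hypothetical model: one batch per sprint, one downstream phase that
    handles a single batch at a time.
    """
    queue = 0        # batches waiting at the door
    in_progress = 0  # weeks of work left on the batch being tested/deployed
    produced = 0
    delivered = 0
    for week in range(1, weeks + 1):
        # The downstream phase works on its current batch during this week.
        if in_progress > 0:
            in_progress -= 1
            if in_progress == 0:
                delivered += 1
        # Development hands over a batch at the end of every sprint.
        if week % sprint_weeks == 0:
            produced += 1
            queue += 1
        # An idle downstream phase picks up the next waiting batch.
        if in_progress == 0 and queue > 0:
            queue -= 1
            in_progress = downstream_weeks
    return produced - delivered


print(undelivered_batches(12))  # 3
print(undelivered_batches(24))  # 5
```

Under these assumptions the pile keeps growing for as long as the sprints keep running. If the downstream phase took two weeks instead of three, the same model would settle at a single batch in flight.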

Large enterprises try to deal with this by parallelizing the test/deploy phase with multiple test environments, each tied to separate “release branches” and “release trains” managed by Release Engineers. The management overhead caused by this approach cannot be overstated.

Bugs found late in the release cycle have to be backported to multiple release branches, which introduces the possibility of errors, requires that the bug fix be retested on each “release train”, and severely limits the development team’s ability to refactor the codebase. This last point bears repeating: these long-running release branches result in multi-day merge hells for the developers if any of the branches have significantly diverged. To avoid merge hell, the developers stop refactoring so that the code stays as similar as possible on all branches. This is a Bad Thing.

Despite the fact that the time and cost of the release process dwarf the development effort, large enterprises focus on squeezing an extra 1% of productivity out of their sprint teams, which exacerbates the downstream test and deploy problems.

Agile CoEs are incentivized to manage the metrics with nice linear burndown charts, consistent velocity metrics, and teams that meet their Sprint commitments. They are not rewarded for actually delivering software into production.

When project timeline projections show a delayed delivery date, the first reaction is to lean on the development teams to go faster even though there is very clear evidence that the development process is not the constraint in the system. Producing features faster results in more delays due to the bottleneck in testing and deployment, and the management overhead caused by it, not to mention the subsequent degradation of the codebase caused by a lack of refactoring.

Ironically, reducing development capacity would improve the delivery velocity.

If you want to deliver software faster and cheaper, you have to take a good hard look at the end-to-end delivery process, and follow the simple continuous improvement loop:

  1. Identify the system’s constraint.
  2. Decide how to address the system’s constraint.
  3. Subordinate everything else to the above decision.
  4. Repeat.
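Step 1 of the loop can be sketched in a few lines. The example below treats delivery as a serial pipeline whose throughput is capped by its slowest stage. The stage names come from the list of phases in this article, but the weekly capacities are invented, and `find_constraint` and `system_throughput` are hypothetical helpers rather than part of any tool.

```python
def find_constraint(capacity):
    """Return the (stage, items_per_week) pair with the lowest capacity."""
    return min(capacity.items(), key=lambda pair: pair[1])


def system_throughput(capacity):
    """A serial pipeline delivers no faster than its slowest stage."""
    return min(capacity.values())


# Hypothetical weekly capacities (work items per week), for illustration only.
capacity = {
    "funding": 6,
    "requirements": 5,
    "design": 5,
    "build": 4,
    "test": 2,
    "deploy": 3,
}

print(find_constraint(capacity))        # ('test', 2)

# Doubling build capacity does nothing for end-to-end throughput.
faster_build = {**capacity, "build": 8}
print(system_throughput(faster_build))  # 2

# Raising test capacity moves the constraint to deploy, so the loop repeats.
faster_test = {**capacity, "test": 4}
print(find_constraint(faster_test))     # ('deploy', 3)
```

In this toy model, speeding up the build stage changes nothing downstream, while relieving the test stage simply exposes the next constraint. That is why the loop ends with "Repeat".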

The boundaries of the system must include every step in creating software: the funding process, requirements, design, build, test, and deploy. All departments involved have to be measured against the same enterprise-wide goal of delivering quality software sooner.

The key to making this work is to avoid the blame game. The metrics have to be unbiased and, critically, not judged. They cannot be used as targets. The metrics are the facts of the case and they are undisputed. The goal is to improve the system as a whole, so any local optimizations are done with that mindset.

Similarly, the process mapping has to be honest about all the actual steps in the process. We also exclude metrics when the process has been subverted through escalation or politicking because we want to improve the default process, not the exceptions.
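As a rough sketch of what excluding the exceptions could look like, the snippet below computes a median cycle time over work items that went through the default process only. The record layout, the `escalated` flag, and the numbers are all assumptions made up for this example.

```python
from statistics import median


def default_process_cycle_time(items):
    """Median cycle time in days, ignoring items that bypassed the default process."""
    normal = [item["days"] for item in items if not item["escalated"]]
    return median(normal)


# Hypothetical work items; "escalated" marks a release pushed through by exception.
items = [
    {"id": "A", "days": 40, "escalated": False},
    {"id": "B", "days": 5, "escalated": True},
    {"id": "C", "days": 55, "escalated": False},
    {"id": "D", "days": 48, "escalated": False},
]

print(default_process_cycle_time(items))        # 48
print(median([item["days"] for item in items])) # 44.0
```

With the escalated item included, the process looks faster than it is for everyone who cannot pull strings, which is exactly the distortion we want to keep out of the numbers.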

Fix the institutional problems that inhibit software delivery and you’ll dramatically improve your organizational efficiency and employee satisfaction, and become a high-performing enterprise.