Alfred Nobel created the Nobel Prizes to recognize the individuals who had most benefited mankind. Nobel was an entrepreneur who also invented dynamite, which he intended to make mining safer.
The DORA metrics were created with the most positive intent. They gave development teams a mechanism to demonstrate that they were improving their ability to deliver software into production. Like dynamite, used with the right intent, they are a powerful and beneficial tool.
The right tool in the right place for the right people
The DORA metrics help a team understand how effective it is at getting finished code into production. Deployment frequency is a measure of the transaction cost of getting code into production: if the transaction cost is low, teams can afford to deploy more frequently; if it is too high, they have to wait until they have more value to deliver. The lead time for changes indicates the level of automation, in particular the maturity of the test automation. The change failure rate shows whether changes succeed or need to be fixed. Finally, the mean time to recover shows how quickly the team can react to fix a problem in production. All of the DORA metrics are incredibly valuable at the team level, helping a team identify improvements to its process and diagnose whether they have been successfully implemented. For a team that wants to improve, the DORA metrics are the right tool in the right place for the right people.
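To make the four metrics concrete, here is a minimal sketch of how a team might compute them from its own records. All of the data structures and numbers are hypothetical, and real tooling would pull this from a deployment pipeline and an incident tracker; this is an illustration of the definitions, not a reference implementation.

```python
from datetime import datetime
from statistics import mean

# Hypothetical records: (commit_time, deploy_time, caused_failure)
deployments = [
    (datetime(2024, 3, 1, 9),  datetime(2024, 3, 1, 14), False),
    (datetime(2024, 3, 2, 10), datetime(2024, 3, 3, 11), True),
    (datetime(2024, 3, 4, 8),  datetime(2024, 3, 4, 16), False),
]

# Hypothetical production incidents: (detected_time, resolved_time)
incidents = [(datetime(2024, 3, 3, 11), datetime(2024, 3, 3, 13))]

period_days = 7  # the window being measured

# Deployment frequency: how often code reaches production.
deployment_frequency = len(deployments) / period_days

# Lead time for changes: commit-to-production time, averaged.
lead_time_hours = mean(
    (deploy - commit).total_seconds() / 3600
    for commit, deploy, _ in deployments
)

# Change failure rate: the share of deployments that caused a failure.
change_failure_rate = sum(1 for *_, failed in deployments if failed) / len(deployments)

# Mean time to recover: detection-to-resolution time, averaged.
mttr_hours = mean(
    (resolved - detected).total_seconds() / 3600
    for detected, resolved in incidents
)

print(f"Deployment frequency: {deployment_frequency:.2f} per day")
print(f"Lead time for changes: {lead_time_hours:.1f} hours")
print(f"Change failure rate: {change_failure_rate:.0%}")
print(f"Mean time to recover: {mttr_hours:.1f} hours")
```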
Most cars today are what we used to call automatics, meaning that the changing of gears is automatic. The opposite of an automatic car was a manual (in the UK) or stick (in the USA). In a manual car, the driver decided when to change gear and changed it by hand: pressing down on the clutch pedal, changing gear, and releasing the clutch. Much of learning to drive was learning how to perform this action smoothly and with little thought. To assist drivers, the dashboard of a manual car had two large dials… the speedometer and the rev counter. These two dials represented the most important information that the driver of a manual car needed to know. The rev counter would show the driver when they needed to change gear if they could not hear it or feel it from the state of the car. The rev counter was so important that it was given half of the valuable dashboard. In an automatic car, the car changes the gears itself, and as a result the rev counter is no longer considered useful to the driver. There is nothing the driver can do if the revs are too high or too low, other than take the car to a mechanic and have it "tuned". The rev counter on an automatic would be confusing, and so the designers removed it.
Outside of the team, the DORA metrics are like a rev counter to the driver of an automatic car: important, but not their problem, and there is nothing they can do apart from tell the team to go faster.
The wrong metric in the wrong place for the wrong people
Last year a large global consultancy with the ear of many business and technology executives, and little to no knowledge of technology, told executives that the DORA metrics were a great way to measure developer productivity. They told the drivers of an automatic car to focus on the rev counter. The business and technology executives were delighted, realising that there was nothing they could do to improve the DORA metrics. This meant that they could not be held accountable for failing to improve the technology investment process, and they could blame the technology teams for failing to deliver. The DORA metrics do not make sense to executives.
DORA works when a single team is delivering value to a customer. When more than one team is involved in delivering value to the customer, the value of the DORA metrics is reduced… just like the rev counter. In most large organisations it is difficult, if not impossible, for every team to deliver value to a customer independently; several teams are needed. Some organisations require teams to deliver value independently, and that requirement becomes more important than satisfying the most valuable need of the customer. Amazon is famous for creating a technology stack where teams could build valuable software independently, but it took years of investment by skilled leaders and skilled developers, something many organisations have not been able to replicate.
The reality is that there are many things business and technology executives can do to improve the technology investment process, though in practice they tend to make them worse. These changes do not show up in the DORA metrics, but they do show up in "cross-team" metrics like the "Lead time* from starting an investment until it realises value for the customer".
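As a rough illustration of the difference in end points, consider the toy timeline below. The dates are entirely hypothetical; the point is only that the cross-team metric keeps the clock running until customers realise value, long after a production-centred measure would have stopped it.

```python
from datetime import date

# Entirely hypothetical timeline for one investment.
investment_started = date(2024, 1, 8)    # funding agreed, work begins
deployed           = date(2024, 4, 2)    # software reaches production
value_realised     = date(2024, 7, 15)   # customers actually use the feature

# A production-centred measure stops the clock at deployment...
to_production = (deployed - investment_started).days
# ...the cross-team metric keeps it running until value is realised.
to_value = (value_realised - investment_started).days

print(f"Investment to production: {to_production} days")
print(f"Investment to customer value: {to_value} days")
```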
The biggest problem with DORA outside of the team is that it focuses on getting software into production; it does not focus on delivering value to customers. This simple difference has huge implications, and results in a metric that is valuable to leaders but toxic to the "failureship", as they would now have some accountability.
The wrong metric for cross team development
There are many toxic aspects to the DORA metrics when used in a cross-team delivery context:
- DORA ignores the value delivered to the customer in a cross-team delivery. DORA can mean that teams focus on getting software into production even if no value is delivered, so the teams get to "tick the box" of delivery even if the software does not work from the customer's perspective.
- "Getting to production" rather than "getting to value" means that teams adopt "Latent Code Pattern" strategies like feature flags (see the sketch after this list). Whilst feature flags are a fantastic mechanism for removing the need to synchronise releases over short periods of time, they introduce huge operational risk if used over extended periods of time. Teams find it hard to understand "what production is" if there are more than one or two latent code patterns in use. The Knight Capital disaster gives an indication of the dangers of not having tight control over "what is production?".
- Comparing one team to another requires the same definition of DORA: the same start point, the same end point, and the same amount of work. This means teams are forced to adopt standard processes to enable comparison, even though the process may add unnecessary overhead for some teams. If DORA metrics are used only as a single-team metric to show improvement, rather than to compare teams, then each team can have its own slightly different variant according to its context.
- DORA metrics are often built into "strategic" tool sets, making them easier to measure. Teams that have not migrated to the "strategic" tool set may be discouraged from adopting it, because the DORA metrics would show them in an unfavourable light.
- Change failure, and the associated risk, is understated in a cross-team environment. Whilst an unused piece of code sitting in production may not generate a failure, the same code may result in a change failure when another team delivers its part of the value puzzle and a failure is detected. Furthermore, the failure is attributed to the wrong team. For example, team A delivers ten changes to production which do not result in a failure. Team B delivers code months later that uses six of the team A changes, and a failure occurs. Team B delivers more code months later that uses three more of the team A changes, and another failure occurs. Team B has two change failures, and team A may by now have disbanded. There is still another team A change sitting in production, unused and untested.
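The feature-flag point above deserves a concrete picture. Below is a minimal sketch of the "Latent Code Pattern": the flag store, flag name, and pricing functions are all hypothetical, not any specific feature-flag library's API. The latent path ships to production long before any customer exercises it, which is exactly why "what is production?" becomes hard to answer as flags accumulate.

```python
# Hypothetical in-process flag store; real systems use a feature-flag service.
FLAGS = {"new_pricing_engine": False}  # deployed to production, but dormant

def flag_enabled(name: str) -> bool:
    return FLAGS.get(name, False)

def legacy_pricing(order: dict) -> float:
    return order["qty"] * order["unit_price"]

def new_pricing(order: dict) -> float:
    return order["qty"] * order["unit_price"] * 0.9  # illustrative change

def calculate_price(order: dict) -> float:
    if flag_enabled("new_pricing_engine"):
        return new_pricing(order)   # latent path: in production, unexercised
    return legacy_pricing(order)    # the path customers actually run

# With n long-lived flags there are 2**n possible "productions"; short-lived
# flags avoid synchronised releases, long-lived ones multiply the risk.
print(calculate_price({"qty": 3, "unit_price": 10.0}))
```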
In a cross-team delivery, lead time for changes, change failure rate, and deployment frequency should all use delivery of customer value as the end point, not deployment of software into a production environment. Unfortunately, this is unlikely to be adopted by most organisations with cross-team deliveries, as business and technology executives have a huge impact on, and thus responsibility for, the metric.
In a cross-team environment, using delivery of customer value as the end point results in the following responsibilities for business and technology executives:
- Ensure customer value is delivered. This is easier than it sounds (see the sketch after this list). If a customer uses a feature, and comes back and continues to use the feature, then the feature has delivered value to the internal or external customer. If the customer uses the feature once but never comes back (day-one churn), then the customer has a need but the feature does not meet that need for some reason. If the customer never uses the feature, then either they cannot find it, or it does not satisfy a need for them.
- Ensure teams collaborate and communicate. Often teams are working in a feature factory or ticket-processing hell with no awareness of the value of their work. Business and technology executives should ensure that teams are working collaboratively together to deliver value to customers.
- Limit work in progress. Business and technology executives should ensure that teams are not being forced to do too much work in any period. Teams that act as a constraint on the organisation's ability to deliver should have a clear priority order for items, consistent across the organisation and agreed by ALL stakeholders.
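The "comes back and continues to use it" heuristic in the first point above can be read straight off a usage log. The sketch below is a toy classification with hypothetical customers and dates; a real product would measure this in its analytics stack.

```python
from datetime import date, timedelta

# Hypothetical usage log: customer -> dates on which they used the feature.
usage = {
    "alice": [date(2024, 3, 1), date(2024, 3, 5), date(2024, 3, 9)],
    "bob":   [date(2024, 3, 2)],   # used it once, never came back
    "carol": [],                   # never used it at all
}

def value_signal(dates: list[date]) -> str:
    """Classify a customer's usage per the heuristic above."""
    if not dates:
        return "never used: cannot find it, or it meets no need"
    if max(dates) - min(dates) <= timedelta(days=1):
        return "day-one churn: a need exists, but the feature misses it"
    return "retained: the feature is delivering value"

for customer, dates in usage.items():
    print(f"{customer}: {value_signal(dates)}")
```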
Obviously, business and technology executives should also ensure that items delivering value to customers deliver the appropriate business value to the organisation… but that does not relate to the discussion of DORA metrics.
In a cross-team delivery environment, the DORA metrics can lead to harmful behaviour if they are used to compare teams. The DORA metrics should only be used within the team, to help it improve its processes. Just like story points, they are an optional tool that teams can use, but no one outside the team should use them to judge a team or to compare teams. Anyone suggesting the use of DORA metrics or story points outside of the team clearly does not know what they are saying, and should be shunned for the safety of the organisation.
* Experts will realise that I'm referring to Cycle Time rather than Lead Time. Non-experts do not know the difference between Lead Time and Cycle Time, and do not care. This post is targeted at non-experts.