The quest to measure the productivity of tech teams and organizations has occupied the minds of tech and business leaders since the very early days of software engineering.
This quest has passed through many stages, from naive attempts like counting lines of code to complex methods like Function Point Analysis or simplifications like team velocity.
These days, many tech leaders are still finding that the method they’ve chosen has too many flaws to measure productivity successfully. You need a reliable way to do it, not only to measure the productivity of your tech organization but also to quantify your problems and measure how fixing them impacts the productivity of your tech team. If you don’t have a reliable way to do this, how do you know that you are solving your biggest problems? And if you do solve them, how will you quantify the impact?
In this blog article, I’ll share one approach which you can use to define, measure, and improve productivity. It can be applied to individual teams, as well as to the whole engineering organization.
Why do we need to measure productivity?
First of all, let’s answer the very simple question of why we need to measure productivity.
The most obvious reason is that this way, you can quantify improvements in your tech organization, as well as detect deteriorations. If you don’t measure productivity, how can you justify, for example, the transformation from horizontal to cross-functional teams? If a new trend emerges tomorrow, will you reorganize your tech teams yet again?
While many tech people might know that moving to cross-functional teams, for example, has many benefits, your business counterparts might not be aware of these benefits, so it is vital for you to be able to easily quantify the result of such improvements and demonstrate it to non-tech people, too.
I used the following approach in a fast-growing tech company which I had to transform. It was formed, as were many other startups at that stage, around technical silos. The central product of the company was a mobile application. The engineering teams building it were split into several mobile teams and backend, QA, UX design, and data science teams.
People in the company were, by and large, very smart, and many of them proactively raised a lot of problems that, according to them, were hindering the productivity of the tech team. These problems were primarily (i) cross-team dependencies and long waiting times (mostly because product features were split between the tech silos) and (ii) uneven workloads: some of the teams had huge batches of work, while others were “starving”. For example, the QA team was badly overloaded with bugs, while the mobile engineering team frequently had to wait.
The heads of engineering and the managers of the teams were very smart people, who immediately recognized the need to transform into cross-functional feature teams, envisioning how this would resolve many of the problems that their teams currently had.
You might think that my task should have been quite easy – just reorganize the teams. While that was the general direction, I had to make sure that we executed the transformation in the right way, solving the right problems. Let’s go into the actual steps.
Identifying problems to solve is not enough. Quantify them!
I had to get a good understanding of the specific problems in the organization, many of which people had already listed. After that came the more difficult part: I wanted to quantify the problems, because you need to understand how big they are. This way, you’ll be able to tackle the biggest problems first. For example, if you have many defects, then this is definitely a problem. But how do you know whether it is the biggest problem in your tech organization? Aren’t there other issues which are bigger and more important?
In addition to making sure that you’re tackling your biggest pain points, quantifying the problems allows you to measure how you are improving the situation throughout your transformation journey.
In general, quantifying problems is a very powerful weapon. If you aren’t doing it, you might be challenged by these questions:
- Why are you transforming into vertical teams? If, tomorrow, there is a new fad, will you abandon feature teams and follow the new trend?
- What problems are you actually trying to solve?
- How do you know that the problems you’re tackling are the biggest ones?
- What indicators will you use to monitor whether you are solving the problems? How will you know that you are improving?
So how did I quantify the problems? I looked at the tech organization as a system and conducted a system-level analysis. Let’s see how I did this.
Depict the current state
My initial step was to draw a simple process flow diagram to depict how the tech organization was working at the time. No improvements, no improvisation – I just depicted the tech teams and their interactions as they were.
I drew all the processes: the different steps of specification preparation, with inputs coming from marketing, the support team, the CEO, etc.; the different interactions between the technical silos, each of which was listed as a separate process; the interaction with the support team; and so on. A good piece of advice is to start your process diagram from the customer and work backward. This way, you start with the customer value, and then you see how the different processes contribute to creating that value.
Perform a system analysis
The next step was to perform a system analysis. The process diagram looked good on paper, so what was the problem? In order to start identifying the problems, I had two options.
The first was to analyze the processes within the system. One way is to analyze value-added time vs. non-value-added time, which gives you a metric for the efficiency of each process. You’d be surprised how many companies have processes with less than 1% value-added time. In this case, I decided not to do this, for these primary reasons:
- This analysis is valuable in manufacturing, but its application in tech companies has limited value because one minute of work for employee A is not as valuable as one minute of work for employee B. Improving value-added time still has merit, but you can’t compare the value between different employees or teams.
- If you have smart people who are experts in their domain, they’ll figure out how to improve the process without your help. For example, if you have software engineers who are smart, skilled, and experienced, and they deploy their code manually, don’t you think one of the first things they’ll try to improve is the manual deployment? Or if they test their changes manually before deploying, don’t you think they’ll try to automate the tests and the CI/CD process? Most certainly they will. Also, if you want to perform a system analysis, reviewing each process is not your top priority. You might identify a process which can be improved, for example, by increasing the value-added time percentage. But are you sure this process is your bottleneck? What happens if the system-level bottleneck sits right after the process you just improved? The batch of unfinished work in front of it will grow even bigger, and your system performance will deteriorate.
My suggestion is not to start with analyzing and improving the individual processes. You should do this only after you’ve analyzed the whole system, found your bottlenecks, and identified metrics to help you identify whether you have improved the system.
The second option was to analyze the whole system, which I did in the following way. I wanted to identify and quantify the batches of work in progress between each process cell. This gives you a lot of information; for example, identifying the biggest batch in the system helps you find your bottleneck. But how do you identify batches in a tech organization? You typically have three types of batches: software code, defects, and specifications. The problem is that each of these batches has different states. For example, the CEO has an idea for a new feature, meets the Head of Product, and tells her about the idea. This is already one form of specification. Then the Head of Product describes the idea briefly in some project management system and assigns a task to the Senior Product Manager to further research and evaluate the idea. When this is done, the specification transforms again into another, much more detailed state. The transformation continues as the specification travels through the system. The same happens with defects and software: they are not static but frequently transition into other states.
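To make this concrete, here is a minimal sketch of counting the batch sitting in each process cell from a ticket export. The queue names and items are hypothetical, not the company’s actual data:

```python
from collections import Counter

# Hypothetical export from a project management tool: each work item
# together with the queue (process state) it currently sits in.
items = [
    {"id": 1, "queue": "QA: fixed, unverified"},
    {"id": 2, "queue": "QA: fixed, unverified"},
    {"id": 3, "queue": "QA: fixed, unverified"},
    {"id": 4, "queue": "Backend: in progress"},
    {"id": 5, "queue": "Mobile: code review"},
]

# Batch size per process cell = number of items waiting in that queue.
batch_sizes = Counter(item["queue"] for item in items)

# The largest batch points at the likely system-level bottleneck.
bottleneck_queue, bottleneck_size = batch_sizes.most_common(1)[0]
print(bottleneck_queue, bottleneck_size)  # QA: fixed, unverified 3
```

The same counting works for any of the three batch types (specifications, code, defects), as long as each state is represented as a distinct queue.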
Define the right KPIs to measure the improvement
In theory, you have three types of batches, but in practice, they have many different states. How do you identify and quantify them?
If you’re thinking of going to each process group and asking them to do this manually, don’t – it will be super disruptive for the team. Also, remember that you don’t need this data only once – you want to track it continuously to make sure you’re improving your system.
So what did I do? I played one of the magic cards in my pocket. Many companies use project management tools to run their Kanban, Scrum, or other processes. This company used Kanban, and the queues were configured in the tool. Guess what: the Kanban queues were almost a 100% match with the processes within the company. And this was no coincidence but quite normal. For example, the mobile dev team worked through its queues, which depicted the different states of the implemented code, and then handed over to the backend team, whose own Kanban queues tracked its implementation and integration work. The backend queue was the input to the QA team, which moved items through its queues in turn. The Product Managers had several queues reflecting the different states of the features before handover to the engineering teams. Every process step depicted in my process flow diagram had a corresponding queue in the project management system.
Why is this good? Because most of these systems have very good out-of-the-box analyses. I used a Cumulative Flow Diagram, for example, to see the batch sizes between the processes. This gave me immediate, super clear visibility into the bottleneck of the system – the QA team. The batch of fixed but unverified defects was huge. Also, there was an enormous pile of implemented features which had not been deployed into the product because they, too, had not yet been verified by the QA team. This was also a very easy way for me to see the imbalance between the silos – the bigger the batch sizes and the older the average age of the items in the batch, the more out of balance the involved processes were. The Average Age report in the project management system was also very helpful for that analysis.
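The Average Age report boils down to simple date arithmetic. A minimal sketch, with hypothetical item IDs and dates rather than the actual data from the tool:

```python
from datetime import date

# Hypothetical data: the day each item entered its current queue.
today = date(2024, 3, 15)
entered_queue = {
    "BUG-101": date(2024, 1, 10),
    "BUG-102": date(2024, 2, 1),
    "FEAT-7": date(2024, 3, 1),
}

# Age of an item = how long it has been sitting in its current queue.
ages_days = [(today - entered).days for entered in entered_queue.values()]
average_age = sum(ages_days) / len(ages_days)
print(f"Average item age: {average_age:.1f} days")  # 40.7 days
```

The older the average age of a queue, the stronger the signal that the process consuming it is out of balance with the process feeding it.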
After obtaining those metrics, I wanted to set up very simple KPIs to demonstrate whether we had improved the system as a whole.
You might be familiar with cycle time and lead time. Cycle time is the time needed to process one item in your system – from the start of processing to the moment the item is with the customer. Lead time is the time from the moment the request for the item is received to the moment it is with the customer. So lead time is cycle time plus the time the request waits before processing begins.
Cycle time is an indicator of the throughput of your system. Throughput is the number of units your system produces in a given period of time. The shorter the cycle time, the higher the throughput.
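These definitions translate directly into date arithmetic. A minimal sketch with hypothetical timestamps for a single work item:

```python
from datetime import date

# Hypothetical milestones for one work item.
requested = date(2024, 1, 2)   # request for the item received
started = date(2024, 1, 20)    # processing began
delivered = date(2024, 2, 3)   # item is with the customer

cycle_time = (delivered - started).days    # processing start -> customer
lead_time = (delivered - requested).days   # request -> customer
wait_time = (started - requested).days     # time the request waited

# Lead time is cycle time plus the waiting time before processing.
assert lead_time == cycle_time + wait_time

# Throughput over a period: units delivered per unit of time.
items_delivered = 12
weeks = 4
throughput = items_delivered / weeks
print(cycle_time, lead_time, throughput)  # 14 32 3.0
```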
Again, the project management system had a Control Chart, which showed the cycle time and lead time for the product. This was an excellent indicator of the effect of a process change on the team’s productivity. I had visibility into the historical performance of the system and its productivity and had the tool to monitor its improvement with the process changes that we were about to make.
As a recommendation, once you start implementing that first type of improvement to your system – monitoring and improving the throughput – also extend your system analysis to other areas. For example, I would suggest you monitor standard deviations as well, which will help you improve individual processes and manage special causes in your system. Yes, many project management systems already have this analysis in place.
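If your tool doesn’t expose this, the standard-deviation check is easy to sketch yourself. In this hypothetical example, cycle times from a stable baseline period set the control limits, and recent items beyond roughly three standard deviations are flagged as potential special causes:

```python
from statistics import mean, stdev

# Hypothetical cycle times (in days) from a stable baseline period.
baseline = [5, 6, 4, 7, 5, 6, 5, 6, 5, 6]

avg = mean(baseline)
sd = stdev(baseline)
upper_limit = avg + 3 * sd  # a typical control-chart limit

# Recent deliveries: anything above the limit is likely a special
# cause worth investigating, not normal variation of the system.
recent = [5, 21, 6]
special_causes = [t for t in recent if t > upper_limit]
print(special_causes)  # [21]
```

Whether that 21-day item is a one-off incident or a sign the system’s limits themselves must change is exactly the question the deviation analysis is meant to raise.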
Once you really understand your system dynamics, you can start improving individual processes. By this stage, you should know which processes should be improved, why, and by how much.
Productivity improvement approach
I analyzed the whole system and equipped myself with good metrics to monitor it. Now I was ready to start applying the improvements.
One suggestion here. Many books or gurus suggest the following three-step approach:
- Analyze the current state.
- Depict the future north star state.
- Prepare a plan to reach it.
I don’t recommend this because, quite often, the gap between the current state and the future state is too big, and the plan has to be huge, containing many phases and steps. People won’t necessarily understand all of them, and some steps might solve problems which are not yet visible but hidden behind other problems which are currently exposed. Something else I don’t like is that this approach assumes a one-and-only future desired state. I would rather recommend that you accept that there is no perfect organization and set up a process for continuously reevaluating and improving it. Business changes, your tech teams scale, technology advances. If everything changes, how can you expect one north star to fit all possible scenarios? That’s quite naïve. Another drawback is that when you present such a big improvement plan, people automatically see it as a big and disruptive change (“Oh, yet another reorg…”). If you tackle a few top problems in your organization instead, people don’t see this as an org update but as a very focused effort to resolve very specific problems in a very clear way.
In my case, I identified the top problems in the system. They were visible to everybody, so there was no hesitation or resistance to what we were doing. As I said at the beginning of this post, these problems had even been raised by the employees in the company. I just helped to quantify them, ensure that they were really the top problems in the company, and set a framework to monitor their resolution.
Once we’d solved them, we identified the next big problems. As they, in turn, were now visible to everyone, it was easy to engage people in the org design updates to solve them.
Repeat and improve your analysis as you go
As your tech organization gets better and better, your analysis will also become better and better. For example, if you identify critical processes within your system that are complex and need improvement, you might do process analysis by optimizing value-added vs. non-value-added time or by inspecting the lean wastes within the process and eliminating them.
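The value-added efficiency mentioned here is simple arithmetic. A minimal sketch with hypothetical numbers:

```python
# Value-added efficiency = value-added time / total elapsed time.
# Hypothetical example: a feature gets 6 hours of actual hands-on
# work while spending 10 calendar days inside the process.
value_added_hours = 6
total_elapsed_hours = 10 * 24  # 10 calendar days

efficiency = value_added_hours / total_elapsed_hours
print(f"Process efficiency: {efficiency:.1%}")  # 2.5%
```

Numbers like this are why sub-1% efficiency is so common: most of an item’s life is spent waiting, not being worked on.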
Also, as you become better, you will start to inspect deviations and analyze them. The throughput of the system dropped in the last release – is this normal variation for the current design of our system, or is there a special cause which we should analyze and fix? If the drop falls within the system’s normal limits, how can we improve the system itself? If the special cause is in a positive direction, how can we apply its learnings to the rest of the system?
You should apply this analysis both to your overall system and to the individual processes.
It’s important to note that this transformation ensures that your system is efficient. But that doesn’t mean that your system is delivering the right thing. You might deliver the wrong thing very efficiently. In parallel to all these changes to improve your efficiency, you must make sure that your teams are formed around the right customer domains and that they solve the right customer problems. This is more of a product management topic; feel free to check here if you want to learn more about it.
Just a reminder to all tech leaders that whatever you do, it must be easily explainable to non-tech people. You must be able to quantify the positive impact of your transformation and show clearly the business benefits. If you communicate a “50% improvement in our efficiency,” it will be easy for them to understand that with the same resources, you are delivering 50% more features.