Last month, I wrote about the value of good program managers (PMs). Some people liked the column (mostly PMs). Some people hated it (folks with bad PMs). However, the most common response was that Microsoft has too many PMs. Can you have too much of a good thing? Heck yeah!
Why is having too many PMs a bad thing?
- Because PMs are useless? No, PMs are our secret weapon. Read last month’s column, PM: Secret weapon or wasted headcount?
- Because there is a magic ratio of PMs to developers (12:1, as some suggested)? No, this isn’t a theme park. Magic is for fantasy novels and Las Vegas shows.
- Because PMs spend their time creating work for others? Yup, that’s it exactly. PMs orchestrate work for their teams. Having too many PMs yields too much work.
Having too many PMs will facilitate too many ideas, too many meetings, and too many designs and scenarios. The same is true for other roles. Having too many developers will lead to unaligned, untested, and unnecessary features. Having too many testers will generate unsupported, unexpected, and unreliable tests, or worse yet, more test harnesses. How do you right-size your team roles? It’s time for a little TOC—theory of constraints.
The weakest link
As I mentioned in De-optimization, the Theory of Constraints (TOC) tells us that the fastest way a project team can accomplish anything is constrained by the slowest step. Say your PM team can spec an average of six features in a month, your development team can code two features in a month, and your test team can validate three features in a month. There’s not much point in your PM team going full speed, is there?
Of course, you can speed up the dev team by removing interruptions, doing pragmatic implementation design, and driving quality upstream (all of which I’ve covered in prior columns). However, at the end of the day, you’ve got to pace the PM and test teams to the dev team. Sure, you want buffers (like extra specs) to account for variability between features and different phases of the project, but you never want the PM and test teams to outpace the dev team.
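To make the arithmetic concrete, here is a minimal sketch of the example above. The rates are the hypothetical ones from my example, and the function names (`bottleneck`, `spare_capacity`) are made up for illustration. It only shows that the team ships at the pace of its slowest step, and how much the other steps could overproduce if nothing held them back.

```python
# Hypothetical team rates from the example above, in features per month.
rates = {"pm": 6, "dev": 2, "test": 3}


def bottleneck(step_rates):
    """Return the (step, rate) pair for the slowest step, the constraint."""
    step = min(step_rates, key=step_rates.get)
    return step, step_rates[step]


def spare_capacity(step_rates):
    """Capacity each step has beyond what the slowest step can absorb."""
    _, pace = bottleneck(step_rates)
    return {step: rate - pace for step, rate in step_rates.items()}


drum, pace = bottleneck(rates)
spare = spare_capacity(rates)
print(f"The drum is {drum}; the team ships {pace} features per month.")
print(spare)
```

With these numbers, PM could write four more specs each month than dev can ever implement. That surplus is the work you want the rope to hold back.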
The TOC strategy of controlling pace to match the slowest step is called Drum-Buffer-Rope.
- Dev is the drum beat that paces the work, because in this case dev is the slowest step.
- Buffer is the extra work (like specs) available for dev in case implementing a feature happens to be easier than usual. You always want the drum beating.
- Rope is the constraint on PM and test to keep them on pace. Without the rope, developers are flooded with work and requests, causing them to take shortcuts, drop quality, and actually lower overall productivity with the subsequent rework, bug fixing, and discarded work. Watch Lucy and Ethel in the candy factory for a quick demonstration.
I know we’re not a factory—many aspects of software development are creative in nature. Of course, we’re not artists either. We have schedules, deadlines, dependencies, and sync points. People do get overwhelmed, take short cuts, lower quality, and throw away unused specs, code, and tests. That’s wasted effort and time that we can’t get back at the end of projects when we desperately need extra effort and time. With too much rope, we hang ourselves.
Do we just start firing PMs and testers? In my example, that would certainly constrain PM and test to not flood dev with extra work and requests. However, we do need PM and test at some level (see Undisciplined—What’s so special about specialization?). How many PMs and testers do we need? How do we avoid wasting time and effort?
At the time you are writing specs, code, or tests, it’s not obvious which ones will never ship; at least it isn’t when using traditional Microsoft methods and multiyear ship cycles. Even with an annual ship cycle, a waterfall-ish software development model can’t tell you in advance which features will ship.
The Feature Crew lean approach from Office helps avoid unused work. In Feature Crews, PM, dev, and test team members tie themselves to one feature at a time until it’s completely tested and integrated. They can’t get far ahead of each other. Versions of Scrum and Extreme Programming (XP) work the same way.
However, Feature Crews and Agile methods like Scrum and XP don’t explicitly tell you how many PMs and testers you need (if any). They also don’t tell you how to avoid waiting on your discipline peers or avoid going to excessive planning meetings. Isn’t there a simple approach that embraces traditional roles and doesn’t rely on magic or dogma? Glad you asked.
There’s a signpost up ahead
Over a year ago, I started introducing my teams to Kanban. Four of them are currently using it. Two teams switched from Scrum, and two teams switched from a variety of Microsoft methods. All of them are happy they changed.
Kanban tells you how many of each role you need and shows your whole team exactly where and when folks are waiting or stuck. Kanban translates roughly from Japanese as “signboard.” You post all your work items on a board near where your team sits. Here’s a photo of one of my team’s boards.
The board has columns for each step a feature goes through, left to right. Typical steps, like the ones in the photo, are breakdown, implement, and validate. (I know it’s hard to read, but that’s intentional given this is a public blog.) Each step has two columns (active and done) and a work-in-progress (WIP) limit. There’s also a column on the far left which holds the backlog of work items in rough priority order from top to bottom.
As you can see in the photo, the work items are simply post-it notes (you could use index cards or whatever). Team members move the notes from left to right at a daily 10- to 15-minute standup, based on each step’s completion rules at the bottom. (The PM updates the work-item tracking system to match the Kanban at the end of each day.) To see a simplified Kanban in action, click here.
The slogans at the top state Kanban’s core values: 1. Visualize your work. 2. Limit WIP. Visualizing work helps everyone see progress and blockages without planning meetings or retrospectives. Limiting WIP, well, there’s the rope.
Many Scrum and XP teams use signboards with post-it notes. Kanban differs in three important ways:
- Steps are divided into active and done columns. This keeps items from flooding the next step. Instead, they are held in the done column of their current step as a buffer.
- There are posted completion rules for each step. Some Scrum and XP teams also define done, but Kanban clearly denotes and defines each step’s transition from active to done.
- There are limits on the number of items that may be in each step. These limits act as the rope, as described in the next section. The rope regulates flow and identifies blockages without requiring planning meetings or retrospectives.
Together, these differences make Kanban a so-called “pull model.” Work is pulled from the right when needed, rather than pushed from the left, as is the case with many Scrum and XP boards. Pull models are generally more efficient because you only produce what you need.
WIP it good
Each post-it note represents work in progress until it makes its way to the final done column. Imagine you had far too many PMs and didn’t employ a drum-buffer-rope strategy. Developers would be overwhelmed with specs for unimplemented features.
If you had a post-it note for each feature, the Kanban would be loaded with yellow notes waiting for the implementation step. There’s no way the dev team would ever implement them all. Likewise, if you had too many developers or too few testers, the Kanban would be overflowing with post-it notes in the implementation-done column waiting for validation.
Unfinished features overloading the Kanban represent waste: work that is waiting and will likely never be finished. It’s visual and obvious. The trick to limiting waste is to limit the work in progress, so you need WIP limits for each step. Each WIP limit constrains the total number of items that can be in a step, counting its active and done columns together. The limits form the rope that aligns all the steps to the pace of the drum (the slowest step). The done columns are the buffers that ensure the drum always has work to do.
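If it helps to see the rope as logic rather than post-it notes, here is a minimal sketch of a board with WIP limits. The `Step` and `Board` classes and their methods are hypothetical names for illustration, not part of any tool. I assume the limit counts a step's active and done items together, and that items finishing the last step leave the board as shipped.

```python
class Step:
    """One Kanban step with an active column, a done column, and a WIP limit."""

    def __init__(self, name, wip_limit):
        self.name = name
        self.wip_limit = wip_limit
        self.active = []
        self.done = []

    def has_room(self):
        # Assumption: the limit covers active and done items combined.
        return len(self.active) + len(self.done) < self.wip_limit


class Board:
    def __init__(self, backlog, steps):
        self.backlog = list(backlog)  # highest priority first
        self.steps = steps
        self.shipped = []

    def pull(self, index):
        """Pull one item into a step from the column to its left, if allowed."""
        step = self.steps[index]
        source = self.backlog if index == 0 else self.steps[index - 1].done
        if not source or not step.has_room():
            return False  # nothing to pull, or the rope holds the step back
        step.active.append(source.pop(0))
        return True

    def finish(self, index):
        """Move a step's oldest active item to its done column (or ship it)."""
        step = self.steps[index]
        if not step.active:
            return False
        item = step.active.pop(0)
        if index == len(self.steps) - 1:
            self.shipped.append(item)
        else:
            step.done.append(item)
        return True


board = Board(
    [f"F{i}" for i in range(1, 11)],
    [Step("breakdown", 2), Step("implement", 4), Step("validate", 3)],
)

# An eager PM tries to break down the entire backlog at once.
broken_down = 0
while board.pull(0):
    board.finish(0)
    broken_down += 1
print(broken_down, "items broken down before the WIP limit blocks the PM")

# Only when dev pulls an item does the PM get room to continue.
dev_pulled = board.pull(1)
pm_unblocked = board.pull(0)
```

In this sketch the PM stops after two items, because both sit in the breakdown done column waiting for dev. Breakdown resumes only after a developer pulls one, which is the pull model at work.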
As a glorious byproduct, reducing WIP reduces cycle time—the time it takes to complete features from start to finish. As I described in Cycle time—the soothsayer of productivity, this means you can react more quickly to changes in plan or competition (less lead time required) and have a shorter feedback and improvement loop with people using or testing your products.
You can also track bugs on the Kanban if you wish. Typically, they are tracked in a separate big row apart from the main flow (a separate swim lane), often with their own separate WIP limits. Of course, you can also simply track bugs in your bug tracking system. Those systems are fairly similar to Kanban, especially when you limit the number of active bugs people or teams are allowed.
I’ve seen both approaches for bugs work. For tiny bugs that take an hour to fix, leaving them in the bug tracking system is fine. For larger bugs, it’s good to see their impact and flow visually on the Kanban.
With either approach, the key is that bugs take precedence over new work items. After all, bugs represent rework. Since you work in priority order, fixing prior work must be more important than any new work. The message to developers is that you’ve got to get it right before you get to move on.
When something’s going wrong
To see WIP limits in action, let’s say you’ve got too many PMs. They quickly take features from the backlog and break them down into spec’d work items. The breakdown step quickly reaches its WIP limit (six in the photo). Now the PMs aren’t allowed to break down more features. What do they do? They can analyze blocking issues and resolve them, help implement features, do customer research or prototyping, work on a different project, find another job, or whatever. What they can’t do is overwhelm developers.
Let’s say you’ve got too many testers. They quickly take completed features from the implementation-done column and validate them. Soon the implementation-done column is empty. What do they do? They can analyze blocking issues and resolve them, help implement features, work on improving automation or tools, work on a different project, find another job, or whatever. What they can’t do is overwhelm developers.
Notice this works for any step or role, since the slowest step can vary by feature. Problems are visual and obvious immediately, not after a retrospective. Replanning can be done continuously, without requiring special meetings (using whatever estimation method you prefer—my favorite is planning poker). It’s all made possible by limiting work in progress in a way everyone can see at all times.
We still do formal retrospectives on demand with Kanban. We use them to analyze and take action on particularly troubling issues or recurring problems.
Something in your size
How do you set the right WIP limits? How do you know the right number of PMs, developers, and testers? Easy—start with the drum, the slowest step. Typically, it’s development.
You always want the drum beating, so make the WIP limit for implementation equal to the number of developers plus one or two for a buffer. How many developers should you have? Take the number of features you need to deliver and divide it by the number of features per month each developer can write (based on past average performance). Then divide by the number of months you have for development. That gives you the number of developers you need and the corresponding implementation WIP limit.
You then set the WIP limits for the PM and test steps to match their throughput to the dev throughput. Let’s use the hypothetical average rates in my original example, but make them per person: a PM specs six features per month, a developer implements two features per month, and a tester validates three features per month. (Actual averages depend on your approach and the type and size of your typical features.)
Say you need three developers to implement all your features on time. Dev throughput would be six features per month. To make PM and test throughput match, you need one PM and two testers. The WIP limits should be two for specs, four for implementation, and three for validation (the number of folks plus one for a buffer). Since you replan continuously, you can adjust your WIP limits as needed. Just don’t increase limits over and over again to avoid being blocked. That’s cheating. Analyze and fix the problem instead.
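The sizing steps above can be sketched in a few lines. The function name `size_team` is hypothetical, and so are the workload figures in the usage (36 features in six months), which I picked only because they yield the three developers in my example. The per-person rates are the hypothetical ones from the text.

```python
import math


def size_team(features, months, rates, drum="dev", buffer=1):
    """Size each role to the drum's pace and derive WIP limits.

    rates: features per month that one person in each role can handle.
    Returns (staff, wip_limits), both keyed by role.
    """
    # Start with the drum: enough people to deliver every feature on time.
    drum_staff = math.ceil(features / months / rates[drum])
    pace = drum_staff * rates[drum]  # drum throughput, features per month

    # Give every other role just enough people to keep up with the drum.
    staff = {
        role: drum_staff if role == drum else math.ceil(pace / rate)
        for role, rate in rates.items()
    }
    # WIP limit is the number of folks plus a small buffer.
    wip_limits = {role: count + buffer for role, count in staff.items()}
    return staff, wip_limits


rates = {"pm": 6, "dev": 2, "test": 3}
staff, wip_limits = size_team(features=36, months=6, rates=rates)
print(staff)
print(wip_limits)
```

Rounding up means a role can end up with a little spare capacity when the numbers don't divide evenly. The WIP limit, not the headcount, is what keeps that spare capacity from flooding the next step.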
In the Kanban photo above, the breakdown WIP limit is six. That’s because that team doesn’t use many hefty PM specs. Instead, the team is co-located and uses real-time design discussions, white boards, and iteration as they go. As a result, the breakdown step is simply to break down backlog items into small work items—a quick operation. The WIP limit of six ensures there are always six items ready for the implementation.
The team’s one PM does write specs for particularly large or tricky areas that span a bunch of work items. He also updates the work items in TFS daily to match the Kanban, facilitates design, meets constantly with customers, and does other things great PMs do.
Doing it well
You could match PM and test throughput to dev throughput without WIP limits or Kanban. Good teams do this intuitively to avoid wasted effort. They “right-size” their teams and work collaboratively to succeed together. Kanban just makes this process visual, simple to arrange, and easy to track. It also makes blockages and problem areas readily apparent.
Kanban is no substitute for high-level planning, release management, and syncing milestones across teams and divisions. It doesn’t replace cross-team scenario-focused engineering or the need for senior PMs to drive key cross-team scenarios.
What Kanban does do is tell you at the team level how many people you need and how best to use their time on a daily basis. It makes the flow of value to the customer continuous and minimizes time and effort lost to team planning, waiting, or work that doesn’t ship.
Making the most of the time you have is often the difference between a panicked product you’re ashamed to ship, but too exhausted to block, and a timely product customers love. Too much of a good thing can cause tremendous trouble. Stay focused, set the right pace at the right size, and reap the benefits of a team in flow.
Many problems can disrupt the flow of a team. Kanban makes those problems visible. Some tips that solve common issues:
- If one step is constantly slowing down other steps, increase the number of people on it, try new approaches to make the step faster, or decrease the WIP limits for the other steps.
- If one step commonly has items in its done column for more than a day, lower the WIP limit for that step.
- If you’ve got two or more steps in a row that the same people always do on the same items, combine those steps and simplify your Kanban. You don’t need lots of steps in order for Kanban to work.
For more information on Kanban, check out Agile Project Management with Kanban and Kanban anti-patterns.
Our organization uses Scrum but has a chronic problem of carrying over many stories sprint-to-sprint. It is rare to see a completed sprint. Based on TOC, this would suggest our Dev team is overwhelmed and distracted (we are). Based on our story-carryover problem, is this a good enough argument to convince upper management that we should consider Kanban?
Somehow this oversimplifies what the PM role should do in today's world. Their job does not end with writing a spec for a feature nor getting it out for the first time. Anyone can do that. The harder part is to figure out whether it did the job it is supposed to do based on market/customer metrics. The even harder part is also to order the investments and juggle things around so that teams can make progress in matrixed organizations and "dynamic" priorities. One way to constrain PM is to hold them accountable till the feature earns the right to become a part of the product. 10 years back we shipped stupidity and comfortably invested in compatibility with stupidity. These days it is not true. Over time the product grows all kinds of calcifications and barnacles which are competitive threats to responding to real challenges. You also did not cover how the dev team is supposed to maintain 100 features in month 1, 150 features after month 2, 200 features after month 3.