Time to weigh in on a development debate for the ages. No, not where to put the curly braces (they go on their own lines). The debate is, “What the heck are developers?” Are we like designers and artists, creative people who need time to think and imagine? Is development a craft, and are we craftspeople? Or, as our titles imply, are we “software engineers”? That last suggestion really makes people bristle. Get over it; the issue has been decided for us.
Oh, there are plenty of people who still think that this is an open debate. I’ve followed the developer aliases, seen the websites, and heard the arguments. Heck, I’ve made the arguments, claiming forcefully, “We’re developers, not engineers.” (See Pushing the envelopes: Continued contention over dev schedules).
Developing software is a creative process. It’s an unpredictable process, dealing with custom components that have poorly understood properties. Many believe that there’s no hope of applying engineering practices to this process in our lifetimes. They contend that we are at worst hacks, cowboys, cowgirls, and reckless amateurs, and at best creative craftspeople. Well, treating software development as a craft is no longer adequate.
Craft a desk, engineer a car
Don’t get me wrong, I love crafts. There’s nothing like a sturdy hand-crafted table and chairs, an elegant hand-crafted timepiece, or even a well-designed and crafted home in which you can raise your family. I just don’t want to drive my car over a well-crafted bridge. I don’t want someone to stick a well-crafted pacemaker in my chest. And I don’t want to rely on well-crafted software to run my business, protect my assets, or direct my actions. I want well-engineered software for these tasks and so do our customers.
So, what separates a hack from a craftsperson and a craftsperson from an engineer? A hack learns as he goes, acts then thinks, cleans up his mess after someone’s stepped in it. Sound familiar? In contrast, a craftsperson studies, plans, uses the best practices and tools, and takes pride in her work. This describes the best of us who develop software. But craftsmanship doesn’t quite reach engineering status because you still don’t know what you are going to get. Craftsmanship lacks certainty and predictability; you make your best estimate instead of knowing.
It’s what you know
Engineering, on the other hand, is all about knowing instead of guessing. It’s about measuring, predicting, controlling, and optimizing. An engineer doesn’t wonder about things, he looks them up. An engineer doesn’t estimate, she calculates. An engineer doesn’t hope, he knows. This doesn’t mean that engineering lacks creativity or innovation, just that there are known boundaries to safe behavior that must be enforced to achieve reliable results.
But we all know that software is unpredictable. How can we possibly apply structured engineering practices to software? The secret is so obvious; it kills me that I didn’t see it sooner. Don’t try to predict the software, predict the people who make it. The constant in software development isn’t the software, it’s the developer. People are creatures of habit and our habits are predictable. That realization may not sound profound, but it changes everything.
To thine own self be true
It may hurt to think that you are predictable, but you are. A little introspection will reveal that truth. You make the same mistakes over and over again. You take about the same amount of time every time to write certain types of functions or objects. You even write them by using about the same amount of code. It’s scary but true, and more importantly, measurable and predictable. Holy sh-t.
Okay, so I didn’t believe it either at first, but then I spent a couple of weeks measuring myself programming and graphing the results. Like the other 4,500 programmers who tried this before me, I’m an open book. For any given type of function or class, I take roughly the same amount of time, write the same number of lines, make the same number of mistakes of the same kind, and take about the same amount of time to fix them based on type. This insight is embarrassing, but powerful.
Eric Aside The two weeks I spent measuring myself were part of a course on the Personal Software Process (PSP) from the Software Engineering Institute (SEI). PSP is part of an engineering team approach to software development called the Team Software Process (TSP). I was impressed with their demonstrated results and the theory behind them. My team tried TSP for a while. I’ll talk more about how it went shortly.
If you just draw a diagram of the classes that you need to write, you can know with measurably high confidence how long it will take, how many lines you’ll write, how many bugs you’ll have and of what type. You can also know how many bugs will surface in design review, code review, by the compiler, in unit test, and by the test team.
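In code terms, that kind of estimate is just arithmetic over your own history, in the spirit of PSP-style proxy-based estimation. Here’s a minimal sketch; the class types, rates, and numbers are invented for illustration, not real measurements:

```python
# Hypothetical per-class-type averages from your own past work:
# (hours per class, lines per class, defects per class).
HISTORY = {
    "data_model": (4.0, 150, 3.0),
    "ui_view":    (6.5, 220, 5.5),
    "service":    (9.0, 310, 7.0),
}

def predict(planned_classes):
    """Sum historical per-type averages over the planned class list."""
    hours = lines = defects = 0.0
    for class_type in planned_classes:
        h, l, d = HISTORY[class_type]
        hours += h
        lines += l
        defects += d
    return {"hours": hours, "lines": int(lines), "defects": defects}

plan = ["data_model", "data_model", "service", "ui_view"]
print(predict(plan))  # {'hours': 23.5, 'lines': 830, 'defects': 18.5}
```

The point isn’t the arithmetic; it’s that the inputs come from measuring yourself, so the outputs come with known error bars.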
What’s in a number
So what? So this: If you know how many bugs you’ll find, you can know to keep looking, you can know when you’ve looked enough, you can know how many other people should look and what they should look for. You can say, with confidence, we’ve found and fixed 99.9999% of the bugs, period. In other words, you can know instead of guess. Congratulations, you’re an engineer.
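“Knowing when you’ve looked enough” can be as simple as comparing what you’ve found against your personal injection rate. A toy sketch, with made-up numbers standing in for your real measured rate:

```python
# Toy version of "know when to keep looking": if your measured history
# says you inject a certain number of defects per KLOC, you can estimate
# how many are still hiding. The rate and counts are illustrative.
def remaining_defects(loc, personal_rate_per_kloc, found_so_far):
    expected = loc / 1000 * personal_rate_per_kloc
    return max(0.0, expected - found_so_far)

# 2,000 lines at a personal rate of 50 defects/KLOC -> expect 100 total.
# Having found 88, you should expect about 12 more before you stop looking.
print(remaining_defects(2000, 50, 88))  # 12.0
```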
Okay, what’s the cost? How much crud do I need to track to make high-confidence predictions? Here’s the full list of measurements that you must collect:
· Work time spent between checkpoints. This is the amount of time that you actually spent doing work between each checkpoint. (Checkpoint examples include design complete, design review complete, code written, code review complete, code builds cleanly, unit tests run clean, and code checked in.)
Eric Aside If you use Kanban, which you should, this data is trivial to collect. You can derive the data directly from the Kanban board by using a cumulative flow diagram (a burn-up chart). It’s simply the time each story takes to move through each Kanban bin.
· Rework time on check-pointed work. This is the amount of time spent reworking stuff you “finished” at an earlier checkpoint, along with a one-line description of what happened and some categorization so that you can reference it later. (This is typically a task like design changes after design complete or code changes for bug fixes or whatever.)
Eric Aside This is also fairly easy to collect from a Kanban board if you track story rework separately from story linear flow. Corey Ladas covers this in his LeanSoftwareEngineering.com article “Accounting for bugs and rework.”
· Number of lines of code you added, deleted, or changed. This one is obvious and can be automated easily.
That’s it. It’s all information that you can get with a decent timer and notepad, although teams are working on tools to make it even easier. You must be consistent for accurate results, but with only these data points, you get more information than you could have dreamed possible.
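The “timer and notepad” really is this simple. A minimal sketch of the first measurement, logging checkpoint timestamps and deriving the time between them; the checkpoint names and record shapes are assumptions for illustration:

```python
# Log checkpoints as (name, timestamp) pairs, then compute the elapsed
# time between consecutive checkpoints. A rework log would be the same
# idea with a description and category attached to each entry.
from datetime import datetime

log = []  # chronological list of (checkpoint_name, timestamp)

def checkpoint(name, when):
    log.append((name, when))

checkpoint("design complete",      datetime(2024, 3, 1, 10, 0))
checkpoint("code written",         datetime(2024, 3, 1, 14, 30))
checkpoint("code review complete", datetime(2024, 3, 1, 16, 0))

def intervals(entries):
    """Elapsed hours between consecutive checkpoints."""
    return [
        (b[0], (b[1] - a[1]).total_seconds() / 3600)
        for a, b in zip(entries, entries[1:])
    ]

print(intervals(log))  # [('code written', 4.5), ('code review complete', 1.5)]
```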
For instance, you can answer questions like, “How much time did I spend doing real work this week?”; “How many times did we have to change the API, and how long did that take?”; “What percentage of bugs is found in code review?”; “How does the percentage of code review bugs found relate to the time spent in code review?”; “What kinds of bugs are mostly found early vs. late?”; “What kind of bugs take the most time to fix, when are they introduced, and when are they found?”
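Answering those questions is straightforward once the data exists. Here’s one of them, “What percentage of bugs is found in code review?”, computed over a hypothetical defect log; the log format and counts are invented for illustration:

```python
# A tiny defect log: where each bug was found and what kind it was.
defects = [
    {"phase_found": "design review", "type": "interface"},
    {"phase_found": "code review",   "type": "logic"},
    {"phase_found": "code review",   "type": "syntax"},
    {"phase_found": "unit test",     "type": "logic"},
    {"phase_found": "test team",     "type": "timing"},
]

def percent_found_in(phase, log):
    """Percentage of all logged defects found in the given phase."""
    found = sum(1 for d in log if d["phase_found"] == phase)
    return 100.0 * found / len(log)

print(percent_found_in("code review", defects))  # 40.0
```

The other questions are the same pattern: a filter and a ratio over the three measurements you collected.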
It’s their habits that separate them
So, what’s the catch? The data is only good for one person. Everyone’s habits are different; you can’t compare my data to yours and have a meaningful discussion. This is actually a good thing because managers shouldn’t be using data for comparisons anyway. As I discussed in my article More than a number—Productivity, when managers use data against people, the measures get gamed.
Although data can’t be shared or compared for individuals, it can be aggregated for teams. This is a manager’s dream come true. You can do all this prediction and quality management on the team level with little effort and extremely high levels of accuracy. Because aggregated data drives toward the mean, the results for teams are no less accurate than they are for each individual. You could manage 100 people and be able to predict completion dates and bug counts to the level of accuracy of a single individual. Yowza!
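Why does aggregation help rather than hurt? Independent per-person estimation errors partially cancel when you sum them, so the team total has a smaller relative error than any individual’s estimate. A rough simulation of that effect, with made-up numbers standing in for real team data:

```python
# Simulate summing independent per-person estimates: each person's
# actual work varies around a 100-hour estimate with a given relative
# error. The team total's relative error shrinks as the team grows.
import random

random.seed(42)

def relative_error_of_team_total(team_size, per_person_error=0.25, trials=10000):
    """Average |actual - estimate| / estimate for a summed team estimate."""
    total_err = 0.0
    for _ in range(trials):
        estimate = 100.0 * team_size
        actual = sum(random.gauss(100.0, 100.0 * per_person_error)
                     for _ in range(team_size))
        total_err += abs(actual - estimate) / estimate
    return total_err / trials

print(round(relative_error_of_team_total(1), 2))    # roughly 0.2
print(round(relative_error_of_team_total(100), 2))  # roughly 0.02
```

That’s the manager’s dream in one function: the 100-person prediction is about ten times tighter than the one-person prediction, not ten times looser.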
Think big to get small
Okay, what’s the punch line? Maybe engineering software is possible to some extent. Maybe you can predict code size, bug counts, development time, and so on starting from just a swag at the list of objects in your system. How does that data translate to results? Teams both inside and outside Microsoft that used data like this to target and control their bug counts have lowered their defect rates from the typical 40–100 bugs per thousand lines of code to 20–60 bugs per million lines of code. In other words, the typical Microsoft team of 15–20 devs that produces 3,000–5,000 test bugs per year would instead produce 3–5 test bugs per year.
And yes, those low bug rates include integration bugs. People bicker and moan about how complicated bugs can be in our big software systems. It’s true; you do find a large percentage of bugs during integration. But what kind of bugs are they after you find them? Are they wildly complex timing bugs with weird unpredictable multithreaded interactions? Maybe one or two of them are, but the remaining thousands of bugs are brain-dead trivial parameter mix-ups, syntax errors, missing return value checks, or even more commonly, design errors that should have been found before a line of code was written.
Good to great
Of course, the way great teams control their bug counts is by using design reviews, code reviews, tools like PREfast, and unit testing. However, these methods alone only make a developer a craftsperson, dropping bugs by a factor of 10 or so, not by a factor of 1,000. The drop isn’t that large because you have to guess where and how to apply your craft; you don’t know. By investing in a small number of measurements and taking advantage of your own habits, you can know. You can graduate into being an engineer and earn that factor of 1,000 improvement.
That’s a big and necessary step for delivering the quality and reliability that our customers demand. You must also discern the requirements and create a detailed design that meets them, but those amusing subjects will have to wait until next time. For now, dropping your bug counts to around 10 per year would be a nice start.
Eric Aside So how did my team’s experiment with TSP go? Did we achieve a 1,000 times reduction in bugs (rework)? Not exactly. To be fair, my team isn’t a typical team and we didn’t stick with TSP long enough to get reliable results. The problem wasn’t the methodology, though it was unnecessarily burdensome and rigid at times. The problem was the tools—they were unusable and unreliable, and they didn’t scale to large teams.
Since that time, we’ve focused on process changes that are easier to adopt but that still make a huge difference. These include Scrum, TDD, planning poker and Delphi estimation, inspections, the basics of design and code reviews, unit testing, code analysis, and, most recently, the adoption of Kanban and scenario-focused engineering.