September 1, 2001

Programmer Velocity

Authors: Andrew Fuqua, Greg Houston, Matt Di Iorio

Each team may define programmer velocity differently. But within a team, every engineer must use the same definition – their velocities must take the same things into account. If not done consistently, tracking and load balancing will not work and the Iteration and Release plans will be unreliable. If you have not yet started using XP, this paper is not for you – read it after reading the XP trilogy (see bibliography). If you are using XP, read on – this is how one team in ISS defined Programmer Velocity. This is what the XP literature calls a "local adaptation". More recently, teams in ISS use team velocity and do not track programmer velocity. But back when we did, this is how we did it.

Ideal Engineering Days

Before discussing velocity, we must define Ideal Engineering Days (IEDs). An IED is when everyone just leaves you alone to program. Unhook the phone; close Outlook, ICQ and AIM; cancel meetings; and put up a do not disturb sign.

Naturally, we never get 5 full IEDs in a week. You do have to go to meetings. You do have to check email and voicemail. And you do have to help others. In our case, the Events team, helping others takes the form of pair programming. If your team is not pairing, helping others takes the form of peer reviews, code walk-throughs, integration testing, and chalk talks.

Still, you might think that the more IEDs you can get in per week the better. Not necessarily – You may be neglecting something. You might not be spending enough time helping others, refactoring bad code and writing tests. If that's the case, your code will be difficult to maintain and your team will suffer.

We estimate tasks and stories in Ideal Engineering Days. We might also call these XPUs (eXtreme Programming Units). Ideally, stories should be broken down into tasks that will each take ½ to 2 IEDs.

Programmer Velocity

Ok, so we estimate in IEDs. What about all those meetings, phone calls, emails and interruptions? What about peer-reviews or pair programming? How do we account for those, and how do we deal with inaccurate estimates? Programmer Velocity must take all of that into account. Reread this paragraph to see what must be included in your velocity.

In the last iteration, I completed my tasks (those that I signed up for and am responsible for), originally estimated at four IEDs. It was a two-week iteration. So, I spent about three days each week pairing with others on their tasks, writing documents like this, attending meetings and fixing defects. In this example, my Velocity is 4.

Velocity is simply the original estimate in IEDs for the tasks you owned and completed last iteration. This is the definition used in the green "Planning eXtreme Programming" book. If I completed tasks originally estimated at five IEDs, then my velocity is five. It doesn't matter how long it actually took. What you count is the original estimate of the stories/tasks actually completed.

Each programmer does a different number of IEDs per week. Everyone's velocity is different.

•  Velocity takes into account how much "other stuff" you do. The more "other stuff" you do, the lower your velocity.
•  Velocity is also a measure of how over-conservative or ultra-aggressive your estimates are.
•  Velocity is measured. After each iteration, look to see how many IEDs you originally estimated for the tasks you completed. If you picked up more tasks during the iteration, add the original estimate for those into your velocity. If you did not complete all of the tasks you signed up for, do not add the estimates for those into your velocity. Remember; only count IEDs for tasks you led. Don't guess – measure it.
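The measurement described above can be sketched in a few lines of Python. This is only an illustration: the Task record, its field names, and the sample tasks are assumptions made up for the example, not something our team used.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str             # hypothetical label, for illustration only
    owner: str            # the programmer who led (owned) the task
    estimate_ieds: float  # the ORIGINAL estimate, never the actual time spent
    completed: bool       # was it finished within the iteration?


def programmer_velocity(tasks, programmer):
    """Sum the original estimates of the tasks this programmer led and completed."""
    return sum(
        t.estimate_ieds
        for t in tasks
        if t.owner == programmer and t.completed
    )


# Made-up iteration: one of my tasks was not finished, one task belongs to someone else.
iteration = [
    Task("parser", "me", 2.0, True),
    Task("report", "me", 1.0, True),
    Task("logging", "me", 0.5, False),            # not completed, so not counted
    Task("importer", "someone else", 1.5, True),  # not led by me, so not counted
]

print(programmer_velocity(iteration, "me"))  # 3.0
```

Note that actual time spent appears nowhere in the calculation; only original estimates of completed, owned tasks are summed.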

You may want to compute velocity over as few as one or as many as three iterations. If your velocity is stable or steadily moving in one direction, use "yesterday's weather" [Planning Extreme Programming, pages 33-34, 90, 117-118]: Your velocity for this iteration is your measured velocity from last iteration. If your velocity is unstable, first make it stable. If you can't, then measure velocity over more iterations.
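As a rough sketch of that choice: with a window of one iteration you get "yesterday's weather"; with a larger window you smooth an unstable velocity. Taking a plain average over the window is an assumption here (the text above does not say how to combine several iterations), and the function name is made up for the example.

```python
def planning_velocity(history, window=1):
    """Velocity to plan with: the mean of the last `window` measured velocities.

    history: measured velocities, oldest first.
    window=1 is "yesterday's weather"; averaging a larger window is an
    assumption of this sketch.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    if not history:
        raise ValueError("no measured velocity yet")
    recent = history[-window:]
    return sum(recent) / len(recent)


print(planning_velocity([3.5, 3.0]))                 # 3.0
print(planning_velocity([4.0, 2.0, 3.0], window=3))  # 3.0
```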

There is no ideal value for velocity. However, if your velocity is more than 5 in a 10-day iteration, consider whether you are spending enough time refactoring, testing and helping others.

Managers, note: Velocity cannot be used to measure programmer productivity or value. "It's a result, not a control variable." [c2 wiki] It would be inappropriate to compare programmers by their velocity.

How to use an Individual's Velocity

After programmers sign up for and estimate tasks, the tracker balances everyone's workload. Programmers that have too much work give some tasks to other programmers. Balancing activity goes like this:

Total IEDs estimated for this iteration (my tasks)
VS.
My velocity.

Move tasks from programmers that have signed up for too many IEDs to those that have too few. Also, add or remove stories from the iteration to fit the actual size of the iteration.
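The comparison the tracker makes can be sketched as follows. The function name, the data layout, and the sample numbers are assumptions for illustration; a whiteboard works just as well.

```python
def load_report(signed_up, velocity):
    """For each programmer, return signed-up IEDs minus velocity.

    A positive number means overloaded (give tasks away);
    a negative number means there is room to take tasks on.
    """
    return {
        name: sum(signed_up.get(name, [])) - velocity[name]
        for name in velocity
    }


# Hypothetical sign-up sheet: estimates in IEDs for the tasks each person leads.
signed_up = {"me": [2.0, 1.5, 1.5], "Fred": [1.0]}
velocity = {"me": 3.5, "Fred": 2.5}

print(load_report(signed_up, velocity))  # {'me': 1.5, 'Fred': -1.5}
```

Here moving one 1.5 IED task from "me" to Fred balances both programmers.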


A common mistake is to divide the number of days in the iteration by 2 "because we're pairing." Don't do this. If you have a 15-day iteration, do not divide 15 by 2 and say there are only 7 days. There are 15. Pairing is accounted for in everyone's velocity. If you multiply the task estimates by velocity AND divide days by two, you are overcompensating (double counting).

Example: Fred's velocity is 2.5. Fred should sign up as lead for 2.5 IEDs worth of tasks. These are tasks that he "owns" and is responsible for. Do not count tasks he intends to do as a partner this iteration – that's accounted for in his velocity.

Example: My velocity is 3.5. I sign up for tasks. My estimate comes out at 5 IEDs. I'm overloaded, so I give a 1.5 IED task to someone else. Now I'm balanced. During the iteration, I get behind and need to defer a ½ IED task to a later iteration. At the end of the iteration, I measure my new velocity. Since I didn't complete all of my tasks, my new velocity is 3 – the original estimate for those tasks I did finish.

Example: Chet's velocity is 6. So is Joe's. Chet and Joe correctly sign up for 6 IEDs for this iteration. Halfway through the iteration, the tracker notices that Chet is falling behind. Chet agrees to give one of his tasks, a 1.5 IED task, to Joe, who is making great progress on his tasks. At the end of the iteration, Chet completed tasks originally estimated at 4.5 IEDs. Joe completed tasks originally estimated at 7.5 IEDs. Now, Chet's new velocity is 4.5 and Joe's is 7.5.

When tracking, I prefer to only track and balance a programmer's primary tasks – I don't try to balance work as pairs and I don't sign up as pairs. I like to keep it light. However, signing up and tracking as pairs actually gets people to pair up, so it may be worth the extra effort for your team to track and balance pairing activity as well.

Miscellany

Engineers should pair with others as much as others pair with them. For example, if Sue signs up to lead 5 IEDs, she should pair with someone else for around 5 IEDs. This will happen naturally if the tracker balances the workload.

If Sue doesn't help others with their tasks as much as others help her, some other team member has to pick up the pairing slack. This inflates Sue's velocity and deflates someone or everyone else's. This is not a problem as long as Sue is consistent – otherwise velocities will be unstable and less useful.

Special assignments, such as writing essays like this, can also impact your velocity. If a pair will be working on the assignment and if you can estimate it like a programming task, then treat it just like any other task in the iteration. If not, budget some of your time during the iteration for the special assignment – treat that time as unavailable. At the end of the iteration, measure your velocity as if you had fewer calendar days than everyone else.

Variations

Your team does not have to use the velocity definition presented above. You can come up with your own. I used to use a variation that used lots of messy division. It was based on the (now deprecated) Load Factor math from the white eXtreme Programming Explained book. I don't recommend it.

Here is another approach you could use. If your team thinks and estimates in terms of average size stories or tasks, then your team may want to define programmer velocity in terms of points per iteration, where 1 point is an average size unit of work. For example, I might be able to do one point per iteration, whereas Charlie might crank out 1.5. My velocity would be 1 and Charlie's would be 1.5. Since tasks are not all the same size, Charlie can take on a task that is 50% larger than average, whereas I should work on an average one-pointer or two ½-pointers.

Pick an approach that works well for your team.

Story Velocity v. Task Velocity

Above, I defined velocity as the original estimate in IEDs for the tasks you owned and completed last iteration. You also have a story-velocity. Your team may decide to track Story Velocity or both Story and Task Velocities. If you decide to only track one, Story Velocity is more useful in that it can be used for both Release and Iteration planning. Task velocity is only useful for Iteration Planning. You can't do Release planning without your story velocity. (You cannot predict your end-date if you don't know your Story velocity.) I personally believe this and it's also what Ron Jeffries said at XP Universe 2001.

Why wouldn't the two velocities be the same? Because stories are estimated at a higher (and less accurate) level of granularity. Also, story estimation is done less often and may have been done with different assumptions or different team members. Story estimation should be done by the whole team as a group, but some teams only have a sub-set of the team estimate the stories (not a good idea). Conversely, task estimates are done more often and are usually done by only 1 or 2 programmers. They are done at the start of the iteration in which they'll be implemented, which is the most accurate time to estimate.

Use Task velocity in Iteration Planning to see if the team or any team member is over-committed for the iteration.

Use Story velocity in Release Planning to get a rough idea of what stories are in what iteration.

Update BOTH velocities at the end of EVERY iteration.
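To illustrate the Release Planning use, here is a minimal sketch that slots prioritized stories into iterations using the team's story velocity. The greedy fill rule, the function name, and the sample stories are all assumptions for the example; real release planning is a conversation with the customer, not a calculation.

```python
def plan_release(stories, story_velocity):
    """Fill iterations in priority order, up to story_velocity IEDs each.

    stories: list of (name, estimate_in_ieds), highest priority first.
    A story larger than the velocity still gets an iteration of its own.
    """
    if story_velocity <= 0:
        raise ValueError("story velocity must be positive")
    iterations, current, used = [], [], 0.0
    for name, estimate in stories:
        if current and used + estimate > story_velocity:
            iterations.append(current)  # this iteration is full
            current, used = [], 0.0
        current.append(name)
        used += estimate
    if current:
        iterations.append(current)
    return iterations


# Hypothetical backlog and a measured story velocity of 5 IEDs per iteration.
backlog = [("login", 3), ("search", 2), ("export", 4), ("audit", 1)]
print(plan_release(backlog, 5))  # [['login', 'search'], ['export', 'audit']]
```

The number of iterations in the result is the rough end-date prediction; rerun it whenever the measured story velocity changes.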

Team Velocity v. Individual Velocity

In the definition above I said "you or your team". Some teams track Individual velocity (more paperwork, more granularity, more accurate balancing). Other teams track Team velocity (less work, lightweight).
It's up to your team to decide whether to use Team or Individual Velocity.

Bibliography

Kent Beck, eXtreme Programming Explained, Addison-Wesley, 2000. (The white book.)

Ron Jeffries, Ann Anderson, Chet Hendrickson, eXtreme Programming Installed, Addison-Wesley, 2000. (The purple book.) Pages 64 and 65 are especially useful.

Kent Beck, Martin Fowler, Planning eXtreme Programming, Addison-Wesley, 2000. (The green book.)

February 1, 2001

Refactoring

Authors: Chris Simpkins, Andrew Fuqua

A few months into development our director paid us a visit. Being a wise veteran of the software industry, she was concerned about how well our system tolerated changes in requirements and scope. Many projects handle such changes poorly. Without hesitation, we confidently explained that our system could not only tolerate a significant amount of change, but that we could keep the system working continuously while making the changes. How could we be so bold? Are we hypercompetent überprogrammers? Or foolhardy script-kiddies who don't fully understand the question we were answering? The key to our confidence lies in our consistent application of refactoring. If you'd like to learn more about refactoring, and perhaps become half as good a programmer as us in the process, read this brief overview, then visit the resources listed at the end of this paper.

What Is It?

Refactoring is changing code you've already written without breaking anything. More formally, it is changing existing code to improve its internal structure while preserving its external behavior. You are already refactoring, though you may not know it: You find a method that needs another parameter to do its job. You find a class which should be split in two. Making these kinds of changes is what refactoring is all about.

Often we make such changes haphazardly, perhaps even ripping up the entire system with the intention of putting it back together in an improved form. Often we break something along the way. But we don't have to. We can introduce discipline into this process. Like with design patterns, we can use a set of common "refactorings" – commonly needed changes to code. Each "refactoring pattern" describes the applicability of the refactoring and a proven step-by-step process for it. We don't have to re-invent the process of changing code every time we do it; we can follow a recipe developed and refined by masters of the art.

What's So Special About It?

"You said we're already doing it – why formalize?" The "father" of refactoring once wondered the same thing. Martin Fowler developed the idea of formalized refactoring after watching Kent Beck make large changes in a system by applying a series of very small changes.

Beck would break a large change into smaller steps and follow a painstaking process of making one change, running his tests to make sure everything still worked, and then repeating this "change, test" process until all the steps were done.

At that time Fowler's own method of refactoring was the slash-and-burn method described earlier – tossing up the code and putting it back together again – with all the debugging headaches that entails.

Fowler noticed that Beck was much faster and more successful at making changes using his incremental approach. The key to understanding why Beck's approach is better is realizing that smaller changes are much easier to understand and control. If you make large changes and the system breaks, you have to do a lot of sleuthing to find exactly what broke the system. But if you make very small changes, testing your system after each one, you know exactly what caused the problem and it's easy to undo the change.
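A tiny illustration of one "change, test" step, using an extract-function refactoring. The invoice functions and the numbers are invented for this sketch; the point is that the same check passes before and after the change, so external behavior is preserved.

```python
def invoice_total_before(prices, tax_rate):
    # Original code: summing and taxing are tangled together.
    total = 0.0
    for price in prices:
        total += price
    return total + total * tax_rate


def subtotal(prices):
    # Step 1 of the refactoring: the summing logic extracted into its own function.
    return sum(prices)


def invoice_total_after(prices, tax_rate):
    # Same external behavior, clearer internal structure.
    net = subtotal(prices)
    return net + net * tax_rate


def check(total_fn):
    """The test run after every small step; it raises if behavior changed."""
    assert total_fn([10.0, 20.0], 0.5) == 45.0
    assert total_fn([], 0.5) == 0.0


check(invoice_total_before)  # passes before the change
check(invoice_total_after)   # still passes after it
```

If the second check failed, the only suspect would be the one small step just taken, which is easy to undo.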

A major tenet of software engineering is breaking large problems into chunks manageable by the human brain. Refactoring applies this principle to the process of changing existing code.

Can I Use It on my Project?

Yes, if you've satisfied a couple of prerequisites. The biggest prerequisite is that you have automated unit tests in place for the code you're changing. This is vital. You must be able to assert that your changes didn't break anything.

Another prerequisite is having some sort of source code control in place. A major benefit of refactoring in small steps is that it allows you to abort the process if you find it's not working. Source control allows you to do this by returning your code to its pre-change state.

How Will It Help me on my Project?

How many software projects have you worked on where the finished product was exactly as envisioned in the initial requirements? All software systems change during the course of development, and a large part of a team's success lies in its ability to deal with change. Refactoring will make your code better, allow you to spend less time on up front design, and give you confidence to make changes. To be successful, we must embrace change – not fear it. It's a fact of life. You have two choices: adapt, or die. Refactoring helps you adapt.

I'm Convinced! (or Just Curious) Where Can I Learn More?

Fowler, Martin. Refactoring: Improving the Design of Existing Code. Reading, Mass.: Addison-Wesley, 1999. This is the book on refactoring.

Martin Fowler's Refactoring web site contains a wealth of information, including the PhD thesis that started the formal Refactoring movement, a constantly expanding catalog of refactorings, and links to several Refactoring resources.

A refactoring plug-in for (X)Emacs. Yet another reason we should all be using the One True Editor (tm) :-)

Continuous Integration

Authors: Matt Di Iorio, Andrew Fuqua, Charlie Hubbard

Have you ever been on a project that practiced big-bang integration – wait until all the features are done, then throw it all together and see what happens? It usually turns out just like the name implies – a totally chaotic, primordial soup of software. Just getting the code to compile is an amazing feat. Then come the bugs. How painful it is to harness the power of nature to integrate your code!

Or say you have a daily build. But for some reason your daily build is broken daily. So you spend half the day tracking down the slacker who checked in the bad code.

"Ah, but my daily build never breaks", you say. Great! But have you ever spent more than a day tracking down one bug? I thought so.

Our team has overcome these problems using simple techniques: We check in frequently. And we build and test continuously.

"We measure success one build at a time"

Our automated build and test process runs continuously – not daily. Well, it actually runs every half-hour, but that's close enough to continuous for me.

By building and testing continuously, we find it trivial to figure out what broke the build. Given the short time span between builds, the system is usually built with only one developer's changes at a time. If it broke, we readily know why.

By checking in frequently, we avoid file contention. This means never having to merge two sets of changes.

By doing both of these, we never need more than a few minutes to integrate a piece of code into the system. The code in MKS is always buildable – it always works – every minute of every day. I can't stress this enough: it is never in a broken state. That certainly makes integration easy. And since we check in frequently, we never have much to integrate.

XP's practices build on each other. That's what makes the process so strong. Like XP, continuous integration's practices are multiplicative. Continuous integration draws its strength from good unit tests and frequent check-ins: The tests are more valuable because they are run more often. The frequent builds are more valuable because by running the tests they do more work each time.

Our House

You will get some benefit just by increasing check-in and build/test frequency. We discovered that some additional things we do make continuous integration even better. You may have different needs and different practices. But here is what we do.

Farewell Make

We broke our love affair with Make. She was never a very accommodating partner – always wanting things to be perfect, never compromising. I wanted spaces; She wanted tabs. We just couldn't go on.

We started using a tool called Ant. It's an extensible, cross-platform build tool written in Java. When we discovered that it uses XML for the build file, we just had to use it. Ant pulls from MKS, compiles the code, builds and signs the jar files, runs the tests, and lets us know if anything failed. A Windows Scheduled Task kicks it off every 30 minutes.

One cool Ant feature is the ability to write "build listeners". We wrote a listener to yell at us when the build fails. To notify the developers we use a "net send" command. We wanted to make it as intrusive as possible so the developers couldn't ignore failures.
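The idea behind a failure listener can be sketched independently of Ant. The Python below is not Ant's API and not our build file; the step names, the run_build function, and the listener callbacks are assumptions invented to show the shape: run the steps in order, stop at the first failure, and tell every listener about it.

```python
def run_build(steps, listeners):
    """Run build steps in order; on the first failure, notify every listener.

    steps: list of (name, callable); a step signals failure by raising.
    listeners: callables taking one message string (a stand-in for "net send").
    Returns True if every step succeeded, False otherwise.
    """
    for name, step in steps:
        try:
            step()
        except Exception as exc:
            for notify in listeners:
                notify(f"BUILD FAILED at '{name}': {exc}")
            return False
    return True


def compile_code():
    pass  # placeholder for a real compile step


def run_tests():
    raise RuntimeError("2 tests failed")  # simulated failure for the demo


alerts = []  # a listener that just collects messages
ok = run_build([("compile", compile_code), ("test", run_tests)], [alerts.append])
print(ok, alerts)
```

Any callable can be a listener, so a loud one (a pop-up, a page, a siren) can be swapped in without touching the build steps.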

Developers need to build and test on their workstations before checking in their changes. Our IDE allows us to build and test. But we also wanted each developer to be able to build with the same process as on the "official" build machine. This required us to modify our architecture but it helps keep the build running smoothly.

"Just the facts, ma'am"

Once you start running an automated build, you've got to see the results. Our results are on a web page. Prominently shown at the top is the Success / Failure message. Then come the details. If the build was not successful, you can see which files didn't compile or which tests failed.

When the build does fail, it only takes us a couple minutes to find and fix the problem. (Remember all the benefits I mentioned above?) In keeping with the spirit of "continuous", we don't wait 20 minutes for the next regularly scheduled build. We can kick off an unscheduled build at any time from our build page.

Our build page even has a cool plot of the number of unit tests run each day. That helps us focus on writing more tests.

Keep it up

It's always good to be on the lookout for things that you can do to make your life easier (or safer). For example, Mike Slifcak suggested we build the database from scratch before running all the tests. This ensures our production code and our install code don't get out of sync. Now we have continuous integration of the install and production code. See how much you can automate.

Our main goal with the build process was to get the most value by doing the least amount of work. There were lots of things we could have added to our build process, like automatically logging all the changes between builds. But that wasn't necessary and would only give us more to maintain. Stay light!

Links

http://www.martinfowler.com/articles/continuousIntegration.html
Martin Fowler's article on continuous integration that got us started.

http://jakarta.apache.org/ant/
Here's the great Ant.

Automated Unit Testing

Authors: Greg Houston, Andrew Fuqua, Tom Gagnier


Did you hear the one about the programmer who wouldn't rewrite some bad code because he was afraid of breaking something else? Since the product couldn't be easily enhanced, innovation was stifled and the company's market value tanked. Our protagonist spent the next two years maintaining that same ugly code. It's not a funny story, but it happens all the time. It doesn't have to. We can rewrite code without fear. We don't have to be the butt of that sad joke. All we need are fully automated unit tests to validate the behavior of our classes and methods.


"Okay, fine," you may be thinking, "but where do these tests come from? They don't exactly fall from the sky." No, they don't. Programmers write tests before they write the production code. Start right now – Write tests for each new method. It's easy and it has done wonders for innumerable projects.

But Wait, There's More!

A.K.A., The Benefits


You cannot safely repair or rewrite poor code unless you have tests to prove that things still work. If you do have tests, you can restructure, replace, rewrite until your heart is content. So unit testing enables Refactoring, which keeps working code from degrading as changes are made. When you have unit tests, you have confidence to change any code, even code written by someone else. If the tests pass after making your change, you know you didn't break anything. If the tests worked before, but don't work now, you know you broke it! It's black and white. This enables shared ownership of the code.

Wow! Now we have clean, refactored, tested code and confident programmers. But aren't refactoring and extra testing going to slow us down? NO! Developers actually program faster with fewer defects when using Unit Tests: First, clean, refactored, tested code is easier (read: quicker) to maintain. Second, the tests either pass or fail. The programmers get immediate feedback. They aren't left wondering whether they've tested everything. Third, when tests fail, the code is still fresh in mind – the bugs are easier to find and fix. Fourth, the tests are run every time the code is changed. A programmer is not going to repeat by hand all manual tests ever performed on a class. It's not possible and if it were it would be too time consuming. Happily, automated tests can be run numerous times every day. Finally, you know when you are done because all the tests run!

But wait! That's not all. Automated Unit Testing actually identifies design mistakes. When a design involves too much complexity or coupling, unit tests are difficult to write. By restructuring the design to make it testable we accomplish goals of good design: Low Coupling, and High Cohesion.

All this for one LOW PRICE!

Benefits Outweigh Costs

Write Unit Tests for everything that could possibly break, using automated tests that must run perfectly all the time. Writing the test code adds to the initial overhead of writing the production code. And, you MUST maintain the tests just as you maintain production code. Thankfully, once written, tests rarely need to change. As explained earlier, automated tests pay for themselves in programmer speed and code quality. With a little practice, you'll be able to write tests quickly.

TESTIFY!

Did you actually do this or is this just theory?

Yes, we actually did and do this. Here's a little testimony from three of ISS's internal projects (code names are used).

Stingray and Diablo

The SAFEsuite Events team started writing automated unit tests in the Stingray project (Events Controlled Release). On that effort, we only wrote tests for the Java Applet portion of the system. At first, our team had difficulty. In a short time, we developed habits and techniques to easily incorporate unit testing into development. When shipped, the Applet had over 100 tests. The experience was so positive that our current project (Diablo) uses unit tests throughout the entire system. The Engineers now rely on unit tests. The benefits have been demonstrated. We only want to work with code that has unit tests. Unit tests have enabled the team to go faster, simplify design, measure progress, and share ownership.

Magellan

The Magellan team started writing automated tests in early to mid 2000, after the project was well underway. They, at first, struggled to internalize test writing. But even under extreme deadline pressure, they learned to test and saw benefits. Most importantly, they stuck with it even after the original unit-test evangelist had left the team. Now, they have over 70 tests. They're hooked – They'll never write code without writing tests.

But First, the Requirements…

You can't expect these benefits with just any old wad of test code. There are a few qualities the tests must possess:

Easy to run – Programmers are lazy. They need to be able to press a single button to run the tests (all or a subset). Make it easy and developers will run them frequently. Make it hard, and no one will run them at all. Easy and Automated means the tests should not require any user input.

Self-checking – Programmers are lazy. They don't want to (and won't) examine a bunch of output to determine if anything failed. The test should check its own results and simply report "Pass" or "Fail".

Fast – Programmers are impatient. They'll be running many tests several times per hour. The tests must be quick. Tests that take too long will have to be optimized. We often use stubs: sometimes to fake I/O or provide a quicker access method; sometimes to short-circuit certain initializations.
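Here is a small sketch of a test with all three qualities, written with Python's standard unittest module rather than JUnit purely for brevity. The greeting function and the StubClock class are invented for the example; the stub plays the role described above, replacing something slow or awkward (here, the real clock) with a quick fake.

```python
import unittest


class StubClock:
    """Stands in for a real clock so the test is fast and repeatable."""

    def __init__(self, hour):
        self.hour = hour

    def current_hour(self):
        return self.hour


def greeting(clock):
    """Production code under test: it depends only on the clock it is handed."""
    return "Good morning" if clock.current_hour() < 12 else "Good afternoon"


class GreetingTest(unittest.TestCase):
    # Self-checking: each test asserts its own result; nobody reads output.
    def test_morning(self):
        self.assertEqual(greeting(StubClock(9)), "Good morning")

    def test_afternoon(self):
        self.assertEqual(greeting(StubClock(15)), "Good afternoon")
```

Saved as a module, this runs with a single command (for example `python -m unittest`), needs no user input, and reports only pass or fail.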

Getting Started


The easiest way to get started is to grab someone who has done this and ask them to show you.

We use an off-the-shelf framework to manage the execution and reporting: JUnit for Java, CppUnit for C++. These excellent free frameworks are available online at xprogramming.com/software.htm. These frameworks make it easy to write and automate the Unit Tests.

Create unit tests when writing new code, fixing bugs, and refactoring. Don't worry about adding tests to existing code. Start out writing tests for new code and for any changes you make in existing code. You'll gradually add tests for the old logic as you make enhancements and fix defects. When making changes, first write tests for the parts of the system the changes will affect. The objective is to test everything that could possibly break. Use judgment; there is no need to go nuts. Use risk to drive which tests to implement. You should concentrate on where the risk is. [Refactoring; Martin Fowler; p. 101] There is a point of diminishing returns with testing – when you think you've done enough, stop!

IF and only IF your team is not writing enough tests, visibly focus on testing by tracking the total number of unit tests each day. We have code you can use that counts the tests and makes a cool plot (Figure 1).


Unit Testing the GUI

Don't. Don't test the GUI by firing UI events, scripting user input, scraping screens etc. Instead, use design patterns that separate the visuals from the logic behind the GUI. The rule is "Do No Processing In Your GUI Code." Some patterns that are popular are Model-View-Controller (MVC) [See getting_started2, search java.sun.com for MVC or read Krasner, Pope, A cookbook for using the model-view-controller user interface paradigm in Smalltalk-80, Journal of Object-Oriented Programming, 1(3):26-49, August/September 1988] and Application Façade [appfacades.pdf].

Write Unit Tests that exercise the logic of the system. With MVC, the tests would exercise the Model. For an Application Façade, the tests would exercise the Façade or the classes within the Façade.
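As a sketch of what exercising the Model looks like: the model below holds the state and rules of a login form, and the test drives it directly with no widgets, events, or screens involved. LoginModel, its fields, and the eight-character rule are assumptions invented for the example.

```python
import unittest


class LoginModel:
    """Holds form state and validation rules; knows nothing about widgets."""

    def __init__(self):
        self.username = ""
        self.password = ""

    def can_submit(self):
        # Hypothetical rule: a non-blank username and a password of 8+ characters.
        return bool(self.username.strip()) and len(self.password) >= 8


class LoginModelTest(unittest.TestCase):
    def test_empty_form_cannot_submit(self):
        self.assertFalse(LoginModel().can_submit())

    def test_valid_form_can_submit(self):
        model = LoginModel()
        model.username = "sue"
        model.password = "correct-horse"
        self.assertTrue(model.can_submit())

    def test_short_password_cannot_submit(self):
        model = LoginModel()
        model.username = "sue"
        model.password = "short"
        self.assertFalse(model.can_submit())
```

The GUI code is then reduced to copying field values into the model and enabling the button when can_submit() says so, which leaves very little in the view that could break.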

Testing the visual portion of the interface is better left for humans. This is one area where we need the expertise that can only be provided by a skilled Quality Assurance team.

Debugging with Unit Tests

Defects indicate a missing unit test. Before coding a fix, develop a test that exposes the defect. Once it's fixed, your job is done – all the tests pass so Quit, Enough Done! (You will not have to run lots of tests by hand.) After fixing a defect, it is important to take a moment to think about the possibility of other missing unit tests. Write all those tests and make them pass. You'll quickly get good at this.
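A minimal sketch of that sequence, using an invented averaging function. The defect (a crash on an empty list) and the decision that an empty list should average to 0.0 are assumptions made up for the example.

```python
def average_buggy(values):
    # The code as reported: crashes with ZeroDivisionError on an empty list.
    return sum(values) / len(values)


def check_average(average_fn):
    """Written BEFORE the fix: the second assertion exposes the defect."""
    assert average_fn([2, 4]) == 3.0
    assert average_fn([]) == 0.0  # assumed requirement for the empty case


def average(values):
    # The fix, written only after the test above was seen to fail.
    if not values:
        return 0.0
    return sum(values) / len(values)


try:
    check_average(average_buggy)
    print("defect not exposed")
except ZeroDivisionError:
    print("test exposes the defect")

check_average(average)  # passes once the fix is in
```

Seeing the new test fail first is what proves it really covers the defect; once it passes, it stays in the suite so the bug cannot quietly return.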

Unit Tests and Functional Tests


Don't confuse "white-box" Unit Testing with "black-box" Functional Testing. Functional tests verify the entire end-to-end operation of each feature. Typically functional tests relate directly to system requirements (MRD). We still need automated functional tests (customer acceptance tests). And we still need a QA team for its unique perspective and abilities. Unit Tests are very small tests on individual classes and methods of the system. They don't require the entire system to execute. A test may cover several classes or methods but the smaller and more specific the better. Since unit tests are written by development / for development and since all tests must pass before any changes are checked in, QA will never see a failed unit test.

Further Reading

Refactoring: Improving the Design of Existing Code (Addison-Wesley Object Technology Series); Addison-Wesley Pub Co; by Martin Fowler et al.; Chapter 4.
Chapter 4 gives a great introduction and how-to on Unit Testing. "The programming side of XP is all about being ready for the next requirement; refactoring is how you do it. Martin catalogs over 70 refactorings, the key steps in transforming a program to improve its structure while preserving its function. Refactoring is a core practice in XP, and this is the text." From Extreme Programming Installed

Extreme Programming Installed (The XP Series); Addison-Wesley Pub Co; by Ron Jeffries et al.; Ch 13 and 34.
Says the same thing we've said here, but they say it much more eloquently.

Extreme Programming Explained (The XP Series); Addison-Wesley Pub Co; by Kent Beck; Ch 18.
The book that started it all.

C2.com wiki on Unit Tests http://c2.com/cgi/wiki?UnitTests
Online community of Unit Test practitioners.

XP and Unit Tests http://www.xprogramming.com/Practices/PracUnitTest.html and http://www.xprogramming.com/publications/software_testing.htm
The home of XP on the net.

JUnit http://www.junit.org/
A framework for running Unit Tests.