February 1, 2001

Refactoring

Authors: Chris Simpkins, Andrew Fuqua

A few months into development our director paid us a visit. Being a wise veteran of the software industry, she was concerned how well our system tolerated changes in requirements and scope. Many projects handle such changes poorly. Without hesitation, we confidently explained that our system could not only tolerate a significant amount of change, but that we could keep the system working continuously while making the changes. How could we be so bold? Are we hypercompetent überprogrammers? Or foolhardy script-kiddies who don't fully understand the question we were answering? The key to our confidence lies in our consistent application of refactoring. If you'd like to learn more about refactoring, and perhaps become half as good a programmer as us in the process, read this brief overview, then visit the resources listed at the end of this paper.

What Is It?

Refactoring is changing code you've already written without breaking anything. More formally, it is changing existing code to improve its internal structure while preserving its external behavior. You are already refactoring, though you may not know it: You find a method that needs another parameter to do its job. You find a class which should be split in two. Making these kinds of changes is what refactoring is all about.

Often we make such changes haphazardly, perhaps even ripping up the entire system with the intention of putting it back together in an improved form. Often we break something along the way. But we don't have to. We can introduce discipline into this process. Like with design patterns, we can use a set of common "refactorings" – commonly needed changes to code. Each "refactoring pattern" describes the applicability of the refactoring and a proven step-by-step process for it. We don't have to re-invent the process of changing code every time we do it; we can follow a recipe developed and refined by masters of the art.

What's So Special About It?

"You said we're already doing it – why formalize?" The "father" of refactoring once wondered the same thing. Martin Fowler developed the idea of formalized refactoring after watching Kent Beck make large changes in a system by applying a series of very small changes.

Beck would break a large change into smaller steps and follow a painstaking process of making one change, running his tests to make sure everything still worked, and then repeating this "change, test" process until all the steps were done.

At that time Fowler's own method of refactoring was the slash-and-burn method described earlier – tossing up the code and putting it back together again – with all the debugging headaches that entails.

Fowler noticed that Beck was much faster and more successful at making changes using his incremental approach. The key to understanding why Beck's approach is better is realizing that smaller changes are much easier to understand and control. If you make large changes and the system breaks, you have to do a lot of sleuthing to find exactly what broke the system. But if you make very small changes, testing your system after each one, you know exactly what caused the problem and it's easy to undo the change.

A major tenet of software engineering is breaking large problems into chunks manageable by the human brain. Refactoring applies this principle to the process of changing existing code.

Can I Use It on my Project?

Yes, if you've satisfied a couple of prerequisites. The biggest prerequisite is that you have automated unit tests in place for the code you're changing. This is vital. You must be able to assert that your changes didn't break anything.

Another prerequisite is having some sort of source code control in place. A major benefit of refactoring in small steps is that it allows you to abort the process if you find it's not working. Source control allows you to do this by returning your code to its pre-change state.

How Will It Help me on my Project?

How many software projects have you worked on where the finished product was exactly as envisioned in the initial requirements? All software systems change during the course of development, and a large part of a team's success lies in its ability to deal with change. Refactoring will make your code better, allow you to spend less time on up front design, and give you confidence to make changes. To be successful, we must embrace change – not fear it. It's a fact of life. You have two choices: adapt, or die. Refactoring helps you adapt.

I'm Convinced! (or Just Curious) Where Can I Learn More?

Fowler, Martin. Refactoring: Improving the Design of Existing Code. Reading , Mass. : Addison-Wesley, 1999. This is the book on refactoring.

Martin Fowler's Refactoring web site contains a wealth of information, including the PhD thesis that started the formal Refactoring movement, a constantly expanding catalog of refactorings, and links to several Refactoring resources.

A refactoring plug-in for (X)Emacs. Yet another reason we should all be using the One True Editor (tm) :-)

Continuous Integration

Authors: Matt Di Iorio, Andrew Fuqua, Charlie Hubbard

Have you ever been on a project that practiced big-bang integration – wait until all the features are done, then throw it all together and see what happens? It usually turns out just like the name implies – a totally chaotic, primordial soup of software. Just getting the code to compile is an amazing feat. Then come the bugs. How painful it is to harness the power of nature to integrate your code!

Or say you have a daily build. But for some reason your daily build is broken daily. So you spend half the day tracking down the slacker who checked in the bad code.

"Ah, but my daily build never breaks", you say. Great! But have you ever spent more than a day tracking down one bug? I thought so.

Our team has overcome these problems using simple techniques: We check-in frequently. And we build and test continuously.

"We measure success one build at a time"

Our automated build and test process runs continuously – not daily. Well, it actually runs every half-hour, but that's close enough to continuous for me.

By building and testing continuously, we find it trivial to figure out what broke the build. Given the short time span between builds, the system is usually built with only one developer's changes at a time. If it broke, we readily know why.

By checking in frequently, we avoid file contention. This means never having to merge two sets of changes.

By doing both of these, we never need more than a few minutes to integrate a piece of code into the system. The code in MKS is always buildable – it always works – every minute of every day. I can't stress this enough: it is never in a broken state. That certainly makes integration easy. And since we check-in frequently, we never have much to integrate.

XP's practices build on each other. That's what makes the process so strong. Like XP, continuous integration's practices are multiplicative. Continuous integration draws its strength from good unit tests and frequent check-ins: The tests are more valuable because they are run more often. The frequent builds are more valuable because by running the tests they do more work each time.

Our House

You will get some benefit just by increasing check-in and build/test frequency. We discovered that some additional things we do make continuous integration even better. You may have different needs and different practices. But here is what we do.

Farewell Make

We broke our love affair with Make. She was never a very accommodating partner – always wanting things to be perfect, never compromising. I wanted spaces; She wanted tabs. We just couldn't go on.

We started using a tool called Ant. It's an extensible, cross-platform build tool written in Java. When we discovered that it uses XML for the build file, we just had to use it. Ant pulls from MKS, compiles the code, builds and signs the jar files, runs the tests, and lets us know if anything failed. A Windows Scheduled Task kicks it off every 30 minutes.

One cool Ant feature is the ability to write "build listeners". We wrote a listener to yell at us when the build fails. To notify the developers we use a "net send" command. We wanted to make it intrusive as possible so the developers couldn't ignore failures.

Developers need to build and test on their workstations before checking in their changes. Our IDE allows us to build and test. But we also wanted each developer be able to build with the same process as on the "official" build machine. This required us to modify our architecture but it helps keep the build running smoothly.

"Just the facts, ma'am"

Once you start running an automated build, you've got to see the results. Our results are on a web page. Prominently shown at the top is the Success / Failure message. Then come the details. If the build was not successful, you can see which files didn't compile or which tests failed.

When the build does fail, it only takes us a couple minutes to find and fix the problem. (Remember all the benefits I mentioned above?) In keeping with the spirit of "continuous", we don't wait 20 minutes for the next regularly scheduled build. We can kick off an unscheduled build at any time from our build page.

Our build page even has a cool plot of the number of unit tests run each day. That helps us focus on writing more tests.

Keep it up

It's always good to be on the lookout for things that you can do to make your life easier (or safer). For example, Mike Slifcak suggested we build the database from scratch before running all the tests. This ensures our production code and our install code don't get out of sync. Now we have continuous integration of the install and production code. See how much you can automate.

Our main goal with the build process was to get the most value by doing the least amount of work. There were lots things we could have added to our build process, like automatically logging all the changes between builds. But that wasn't necessary and would only give us more to maintain. Stay light!

Links

http://www.martinfowler.com/articles/continuousIntegration.html
Martin Fowler's article on continuous integration that got us started.

http://jakarta.apache.org/ant/
Here's the great great Ant.

Automated Unit Testing

Authors: Greg Houston, Andrew Fuqua, Tom Gagnier

Did you hear the one about the programmer who wouldn't rewrite some bad code because he was afraid of breaking something else? Since the product couldn't be easily enhanced, innovation was stilted and the company's market value tanked. Our protagonist spent the next two years maintaining that same ugly code. It's not a funny story, but it happens all the time. It doesn't have to. We can rewrite code without fear. We don't have to be the butt of that sad joke. All we need are fully automated unit tests to validate the behavior of our classes and methods.

"Okay, fine," you may be thinking, "but where do these tests come from? They don't exactly fall from the sky." No, they don't. Programmers write tests before they write the production code. Start right now – Write tests for each new method. It's easy and it has done wonders for innumerable projects.

But Wait, There's More!

A.K.A., The Benefits

You can not safely repair or rewrite poor code unless you have tests to prove that things still work. If you do have tests, you can restructure, replace, rewrite until your heart is content. So unit testing enables Refactoring , which keeps working code from degrading as changes are made. When you have unit tests, you have confidence to change any code, even code written by someone else. If the tests pass after making your change, you know you didn't break anything. If the tests worked before, but don't work now, you know you broke it! It's black and white. This enables shared ownership of the code.

Wow! Now we have clean, refactored, tested code and confident programmers. But isn't refactoring and extra testing going to slow us down? NO! Developers actually program faster with fewer defects when using Unit Tests: First, clean, refactored, tested code is easier (read: quicker) to maintain. Second, the tests either pass or fail. The programmers get immediate feedback. They aren't left wondering whether they've tested everything. Third, when tests fail, the code is still fresh in mind – the bugs are easier to find and fix . Fourth, the tests are run every time the code is changed. A programmer is not going to repeat by hand all manual tests ever performed on a class. It's not possible and if it were it would be too time consuming. Happily, automated tests can be run numerous times every day . Finally, you know when you are done because all the tests run!

But wait! That's not all. Automated Unit Testing actually identifies design mistakes . When a design involves too much complexity or coupling, unit tests are difficult to write. By restructuring the design to make it testable we accomplish goals of good design: Low Coupling, and High Cohesion.

All this for one LOW PRICE!

Benefits Outweigh Costs

Write Unit Tests test for everything that could possibly break, using automated tests that must run perfectly all the time. Writing the test code adds to the initial overhead of writing the production code. And, you MUST maintain the tests just as you maintain production code. Thankfully, once written, tests rarely need to change. As explained earlier, automated tests pay for themselves in programmer speed and code quality. With a little practice, you'll be able to write tests quickly.

TESTIFY!

Did you actually do this or is this just theory?

Yes, we actually did and do this. Here's a little testimony from three of ISS's internal projects (code names are used).

Stingray and Diablo

The SAFEsuite Events team started writing automated unit tests in the Stingray project (Events Controlled Release). On that effort, we only wrote tests for the Java Applet portion of the system. At first, our team had difficulty. In a short time, we developed habits and techniques to easily incorporate unit testing into development. When shipped, the Applet had over 100 tests. The experience was so positive that our current project (Diablo) uses unit tests throughout the entire system. The Engineers now rely on unit tests . The benefits have been demonstrated. We only want to work with code that has unit tests. Unit tests have enabled the team to go faster, simplify design, measure progress, and share ownership .

Magellan

The Magellan team started writing automated tests in early to mid 2000, after the project was well underway. They, at first, struggled to internalize test writing. But even under extreme deadline pressure, they learned to test and saw benefits. Most importantly, they stuck with it even after the original unit-test evangelist had left the team. Now, they have over 70 tests. They're hooked – They'll never write code without writing tests.

But First, the Requirements…

You can't expect these benefits with just any old wad of test code. There are a few qualities the tests must possess:

Easy to run – Programmers are lazy. They need to be able to press a single button to run the tests (all or a subset). Make it easy and developers will run them frequently. Make it hard, and no one will run them at all. Easy and Automated means the tests should not require any user input.

Self-checking – Programmers are lazy. They don't want to (and won't) examine a bunch of output to determine if anything failed. The test should check its own results and simply report "Pass" or "Fail".

Fast – Programmers are impatient. They'll be running many tests several times per hour. The tests must be quick. Tests that take too long will have to be optimized. We often use stubs: sometimes to fake I/O or provide a quicker access method; sometimes to short-circuit certain initializations.

Getting Started

The easiest way to get started is to grab someone who has done this and ask them to show you.

We use an off-the-shelf framework to manage the execution and reporting: JUnit for Java, CppUnit for C++. These excellent free frameworks are available online at xprogramming.com/software.htm. These frameworks make it easy to write and automate the Unit Tests.

Create unit tests when writing new code, fixing bugs, and refactoring . Don't worry about adding tests to existing code. Start out writing tests for new code and for any changes you make in existing code. You'll gradually add tests for the old logic as you make enhancements and fix defects. When making changes, first write tests for the parts of the system the changes will affect. The objective is to test everything that could possibly break. Use judgment; there is no need to go nuts. Use risk to drive which tests to implement. You should concentrate on where the risk is. [Refactoring; Martin Fowler; pp 101] There is a point of diminishing returns with testing – when you think you've done enough, stop!

IF and only IF your team is not writing enough tests, visibly focus on testing by tracking the total number of unit tests each day. We have code you can use that counts the tests and makes a cool plot (Figure 1).

Unit Testing the GUI

Don't. Don't test the GUI by firing UI events, scripting user input, scraping screens etc. Instead, use design patterns that separate the visuals from the logic behind the GUI. The rule is "Do No Processing In Your GUI Code." Some patterns that are popular are Model-View-Controller (MVC) [See getting_started2, search java.sun.com for MVC or read Krasner, Pope, A cookbook for using the model-view-controller user interface paradigm in Smalltalk-80, Journal of Object-Oriented Programming , 1(3):26-49, August/September 1988] and Application Façade [appfacades.pdf].

Write Unit Tests that exercise the logic of the system. With MVC, the tests would exercise the Model. For an Application Façade, the tests would exercise the Façade or the classes within the Façade.

Testing the visual portion of the interface is better left for humans. This is one area where we need the expertise that can only be provided by a skilled Quality Assurance team.

Debugging with Unit Tests

Defects indicate a missing unit test. Before coding a fix, develop a test that exposes the defect. Once it's fixed, your job is done – all the tests pass so Quit, Enough Done! (You will not have to run lots of tests by hand.) After fixing a defect, it is important to take a moment to think about the possibility of other missing unit tests. Write all those tests and make them pass. You'll quickly get good at this.

Unit Tests and Functional Test

Don't confuse "white-box" Unit Testing with "black-box" Functional Testing. Functional tests verify the entire end-to-end operation of each feature. Typically functional tests relate directly to system requirements (MRD). We still need automated functional tests (customer acceptance tests). And we still need a QA teams for their unique perspective and abilities. Unit Tests are very small tests on individual classes and methods of the system. They don't require the entire system to execute. A test may cover several classes or methods but the smaller and more specific the better. Since unit tests are written by development / for development and since all tests must pass before any changes are checked in, QA will never see a failed unit test.