UncleBob's Three Rules of TDD



This was one of the three TDD cards I posted while at Object Mentor. A more attractive set of hand-drawn cards then appeared (with full attribution) at Brian DiCroce's site a little while after. I still wish I'd drawn them first. The others have been presented here: FIRST principles and our site's most popular card to date, the Red, Green, Refactor card.

These three laws originated with Robert "Uncle Bob" Martin, who has provided such a wonderful write-up that there is no value I can add other than shrinking the sentences to fit on a card. Bob is a great guy and a solid techie. And by solid, I mean SOLID.

By the way, I've submitted a paper to Agile2009 to do a little session on doing "in a flash" teaching sessions with index cards. If it's accepted, I'll see you there.

F.I.R.S.T



Source: Brett Schuchert, Tim Ottinger

In my Object Mentor days Brett and I were looking at ways to improve some class materials on the topic of unit testing. I noticed that our list of properties almost spelled FIRST. We fixed it.

We refer to these as the FIRST principles now. You will find these principles detailed in chapter 9 of Clean Code (page 132). Brett and I have a different remembrance of the meaning of the letter I and so I present for your pleasure the FIRST principles as I remember them (bub!). He had it right the first time so we will cut him some slack.

The concepts are very simple, and of course achieving them once you've gone off the track can be very hard. Always better to start with rigor here, and maintain it as you go.

Fast: Tests must be fast. If you hesitate to run the tests after a simple one-liner change, your tests are far too slow. Make the tests so fast you don't have to consider them.

A test that takes a second or more is not a fast test, but an impossibly slow test.

A test that takes a half-second or quarter-second is not a fast test. It is a painfully slow test.

If the test itself is fast, but the setup and tear down together might span an eighth of a second, a quarter second, or even more, then you don't have a fast test. You have a ludicrously slow test.

Fast means fast.

A software project will eventually have tens of thousands of unit tests, and team members need to run them all every minute or so without guilt. You do the math.
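Here is one way to do that math, as a small Python sketch. The suite size (20,000 tests) and the one-minute target are assumed figures chosen for illustration, and the function names are made up for this example.

```python
def per_test_budget_ms(test_count: int, run_seconds: float) -> float:
    """Average time each test may take if the whole suite must finish in run_seconds."""
    return run_seconds * 1000.0 / test_count


def suite_seconds(test_count: int, ms_per_test: float) -> float:
    """Total run time for a suite in which every test takes ms_per_test."""
    return test_count * ms_per_test / 1000.0


# Assumed figures, for illustration only.
budget = per_test_budget_ms(test_count=20_000, run_seconds=60)
print(f"Budget per test: {budget:.1f} ms")

# The same suite if every test were a "quarter-second" test.
slow = suite_seconds(test_count=20_000, ms_per_test=250)
print(f"Suite at 250 ms per test: {slow / 60:.0f} minutes")
```

Under these assumptions each test gets about 3 milliseconds, and a suite of quarter-second tests would run for well over an hour.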

Isolated: Tests isolate failures. A developer should never have to reverse-engineer tests or the code being tested to know what went wrong. Each test class name and test method name with the text of the assertion should state exactly what is wrong and where. If a test does not isolate failures, it is best to replace that test with smaller, more-specific tests.

A good unit test has a laser-tight focus on a single effect or decision in the system under test. And that system under test tends to be a single part of a single method on a single class (hence "unit").

Tests must not have any order-of-run dependency. They should pass or fail the same way in suite or when run individually. Each suite should be re-runnable (every minute or so) even if tests are renamed or reordered randomly. Good tests interfere with no other tests in any way. They impose their initial state without aid from other tests. They clean up after themselves.
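A minimal sketch of what isolation can look like, written as plain Python test functions. The Cart class is a hypothetical system under test invented for this example. Each test builds its own state, checks one decision, and carries a name and assertion message that say what went wrong.

```python
class Cart:
    """Hypothetical system under test."""

    def __init__(self):
        self._prices = []

    def add(self, price: float) -> None:
        if price < 0:
            raise ValueError("price must not be negative")
        self._prices.append(price)

    def total(self) -> float:
        return sum(self._prices)


def test_new_cart_total_is_zero():
    cart = Cart()  # fresh state, nothing inherited from another test
    assert cart.total() == 0, "a new cart should total zero"


def test_total_is_sum_of_added_prices():
    cart = Cart()
    cart.add(2.50)
    cart.add(4.00)
    assert cart.total() == 6.50, "total should be the sum of added prices"


def test_add_rejects_negative_price():
    cart = Cart()
    try:
        cart.add(-1)
    except ValueError:
        return  # the expected outcome
    raise AssertionError("adding a negative price should raise ValueError")
```

Because no test reads anything another test wrote, they can run in any order, alone or together, and give the same answer.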

Repeatable: Tests must be able to be run repeatedly without intervention. They must not depend upon any assumed initial state, and they must not leave any residue behind that would prevent them from being re-run. This is particularly important when one considers resources outside of the program's memory, like databases and files and shared memory segments.

Repeatable tests do not depend on external services or resources that might not always be available. They run whether or not the network is up, and whether or not they are in the development server's network environment. Unit tests do not test external systems.
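As a rough illustration, the sketch below keeps two kinds of outside resource out of the tests. FixedRates is an in-memory stand-in for a network service, and the file test works inside a temporary directory from Python's standard tempfile module. The function and class names are assumptions made up for this example.

```python
import tempfile
from pathlib import Path


def convert(amount: float, currency: str, rate_source) -> float:
    """Hypothetical code under test: converts using whatever rate source it is given."""
    return amount * rate_source.rate_for(currency)


class FixedRates:
    """In-memory stand-in for a rate service that would normally live on the network."""

    def __init__(self, rates):
        self._rates = dict(rates)

    def rate_for(self, currency: str) -> float:
        return self._rates[currency]


def save_total(path: Path, total: float) -> None:
    """Hypothetical code under test: writes a total with two decimals."""
    path.write_text(f"{total:.2f}\n")


def test_convert_uses_supplied_rate():
    rates = FixedRates({"EUR": 0.5})  # no network involved
    assert convert(10.0, "EUR", rates) == 5.0


def test_save_total_writes_two_decimals():
    # The directory is created fresh and deleted on exit, so no residue remains.
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "total.txt"
        save_total(target, 5.0)
        assert target.read_text() == "5.00\n"
```

Run either test a hundred times in a row, with the network cable unplugged, and the result should not change.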

Self-validating: Tests are pass-fail. No agency must examine the results to determine if they are valid and reasonable. Authors avoid over-specification so that peripheral changes do not affect the ability of assertions to determine whether tests pass or fail.
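A small sketch of the difference, using a hypothetical describe_order function. The first function prints and leaves a person to judge the output. The second decides for itself, and asserts only the fact it cares about so that a change to the summary wording does not break it.

```python
def describe_order(order_id: int, items: list) -> dict:
    """Hypothetical function under test."""
    return {
        "id": order_id,
        "item_count": len(items),
        "summary": f"Order {order_id}: {len(items)} item(s)",
    }


def not_really_a_test():
    # Prints a value and leaves a human to decide whether it looks right.
    print(describe_order(7, ["tea", "milk"]))


def test_item_count_matches_number_of_items():
    result = describe_order(7, ["tea", "milk"])
    # Asserts only the fact under test; the "summary" text is free to change.
    assert result["item_count"] == 2
```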

Timely: Tests are written at the right time, immediately before the code that makes the tests pass. It may seem reasonable to take a more existential stance, that it does not matter when they're written as long as they are written, but this is wrong. Writing the test first makes a difference.

Testing post-facto requires developers to have the fortitude to refactor working code until they have a battery of tests that fulfill these FIRST principles. Most will take the expensive shortcut of writing fewer, fatter tests. Such large tests are not fast, have poor fault isolation, require great effort to make them repeatable, and tend to require external validation. Testing provides less value at higher costs. Eventually developers feel guilty about how much time they're spending "polishing" code that is "finished" and can be easily convinced to abandon the effort.

Automating Tasks



A naive team (or boss) will want to start their agile process by researching, purchasing, and installing all sorts of automated, web-based tools. This is odd when you consider that a major value of Agile is to prefer human interactions over tools and processes. Agile begins with people.

In order to help break the tendency toward automation madness, Jeff and Tim offer these tips:


Do strive to automate build, test, and install processes. You don't really want to press a button on a web page to get the tests to run. You want the testing built into your IDE or watching your source directories for changes. The dream is to always "just know" if you've broken anything, and to make releases a non-event. That's hard to do when your processes require manual intervention. Automate early and often.

Automate dull, tedious, or repetitive work because it makes your day seem long. Prefer to acquire rather than build, because the fun work of automating the dull work has a siren's song. Get it working, make it invisible, and get on with the work. Don't fear shell scripts, IDE automation scripts, or the like. But don't be fascinated with them either; automation by individuals (or pairs) still has to pay off in increased productivity for the whole team.

Automate to free time for creative, intellectual work. Automating is not something you do to seem more professional, to impress your peers, or to earn some kind of "automation points" with superiors. The point is to not have to spend time doing trivial and boring work. Rather than getting good at switching between multiple files and cut-n-pasting code, one should eliminate the wasted motion through design or automation. Then one can take on more interesting, difficult work and complete it within an iteration boundary.

Never automate evolving processes. If you standardize a process that isn't settled in, you will either freeze the process prematurely, or you will be playing catch-up as the process continues to change. Alternatively, if you must automate an evolving process, be prepared to evolve your automation regularly or discard it and take over manually.

Do not automate in order to avoid human interaction. Building "avoidance technology" is contrary to the first principles of the agile manifesto. It is never our goal to avoid interaction with the Customer or between the QA and Programmers. Instead, consider what techniques (low-tech or otherwise) will draw all the participants closer together or leave it alone.

Prefer physical artifacts with coercive immediacy over virtual artifacts stored somewhere in software. A corkboard loaded with 3x5 index cards will have more influence on a team's actions than a set of virtual cards in a web server somewhere. Manual processes with physical items have been a mainstay of agile development. Low-tech, high-touch processes can be quite satisfying.

Software serves the team, never the reverse. Nobody comes to work just to feed the machines. When the automation software demands tedious and/or meaningless efforts from the team, is it worth keeping? Automation is supposed to make it possible for us to do more interesting work; it is not supposed to increase tedium. When software gets in the way, the next retrospective action should be obvious.

All developers owe a debt to those who build great tools like Eclipse, Vim, Emacs, etc. Great tools tend to have a wonderful transparency. In a good editor, the code seems to almost dance and shape itself before your eyes. Great build tools like Paster and Maven take a lot of the effort out of building deployable modules. These things are wonderful. On the other hand, someone has to turn out the business applications during the day job, and for those hours of the day it is important that the application is the star of the day and the tools melt into the background.

Coding Standards



Coding standards? That tired, old chestnut? Who talks about coding standards these days?

You do. You are talking about coding standards when you see someone has their tabs set wrong, when you complain about bracing, when you ask someone why they put their spaces where they do, when you add comments and someone else says delete them, ...

The need for a common style guide was glossed over in Collective Code Ownership, in the following paragraph:
The team has a single style guide and coding standard not because some arrogant so-and-so pushed it down their throat, but rather the team adopts a single style so that they can freely work on any part of the system. They don't have to obey personal standards when visiting "another person's" code (silly idea, that). They don't have to keep more than one group of editor settings. They don't have to argue over K&R bracing or ANSI, they don't have to worry about whether they need a prefix or suffix wart. They can just work. When silly issues are out of the way, more important ones take their place.

So what kind of document is the code standard? The authors have seen plenty of large, complex, detailed guides that strive to be comprehensive. Who has the time? Perhaps the answer should be to look for the least documentation one can afford. How much is too much? How much is too little?

The first recommendation is that a team should standardize to avoid waste. In this case, "waste" includes rework, arguments, and work stops while issues are settled. In this regard, it is better to have even a bad decision than a variety of opinions. If we find that we are enduring waste, then we add a line to the standard. Again, we have to build collective ownership and any choice is better than arguing the same points repeatedly.

The team should start with an accepted community standard if one can be found. If there are multiple, choose one of them. For the Python community, PEP 8 is a wonderful starting point. One may look at a ubiquitous tool's default styling as a community standard (community of tool users), since it seldom pays to fight your tools. Notice that public style guides tend not to be the smallest document one can afford, but the goal is really to make development time productive and non-contentious (with regard to silly issues) so even a longer document may be successful.

If there is some particular contentious difference of opinion, then the team should time-box the argument. They might choose to debate for an hour, agree, and write it down. Some teams don't time-box the argument, and others do not reach resolution. Do not let contentious issues remain, or they will remain contentious. Once agreed and recorded, the issue should not be revisited. Nor should a pair partner allow his partner to spend iteration time arguing over decisions already made. Nobody has to love the decision, but they should admit to a fight well fought and a final answer that should not get in the way of getting real work done well and often.

Ideally, the standard should be minuscule. A standard-by-example will have a better chance of being concise and obvious. Therefore, the standard should be mostly code and fit on one page. This is especially true of the initial version of the team's style guide. To some degree, arguments and uncertainty will lead to the accumulation of additional guidelines but it is wise to start small. Our working documents should be like our code in the sense that they are as small, simple, and unambiguous as possible.
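To make "mostly code" concrete, here is a sketch of what a standard-by-example might look like for a Python team that starts from PEP 8. Every choice and name in it is an assumption for illustration; your team's answers will differ, and that is the point of writing them down.

```python
"""Team style guide, by example. Anything not shown here follows PEP 8."""

MAX_RETRIES = 3  # constants: UPPER_SNAKE_CASE


class InvoiceLine:  # classes: CapWords, no prefix or suffix warts
    def __init__(self, description: str, cents: int):
        self.description = description  # attributes: snake_case, no type prefixes
        self.cents = cents


def total_cents(lines):  # functions: snake_case, four-space indents
    return sum(line.cents for line in lines)
```

A dozen lines like these settle tabs, naming, and warts without a page of prose.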

The style guide is an agile artifact. It is subject to corrective steering. A team should revisit the standard every iteration until no one cares any more. Iteration time is too valuable for these arguments, but retrospective time exists to help the team eliminate waste and turbulence. If further discussion of style points will help smooth the coming iteration, then it should be brought up.

The rule for simplifying any system is to obviate and then remove steps, processes, and instructions. If the code is written to standard, then the code ultimately becomes the standard and the standard becomes redundant. It is perfectly reasonable for a mature team to discard their documented style guide and keep the style. You will know the style guide is unnecessary when all new code looks like the existing code. And it all looks good.

Top 10 Agile Failure Factors



In working with teams attempting to be agile, we tend to see them make many of the same "mistakes" over and over. This list captures some of the most common pitfalls we've seen teams stepping into.

Please take a moment to help refine this card by completing a short poll. It'll be open for a week, and we'll feed the results back into this list--both to help prioritize it and possibly swap out some factors for others. If you want to challenge any of the elements, you can of course add your thoughts in a comment.

Courage


Sources: Beck, Kent. Extreme Programming Explained, Addison Wesley, 2000; Ottinger, Tim; Langr, Jeff.


When I first started teaching XP, I remember having more than enough to say about the other three XP values (communication, feedback, simplicity). Courage? Uh... well, you know, you need it to be able to stick to your guns with respect to the other three values. Moving on...

So I particularly like this card for its specifics on when you need to be courageous. It also highlights the opposite dependency of courage on the other three values: An environment where you're not communicating, acting simply, or generating lots of feedback is going to make you feel like Tommy (or Helen Keller) piloting a helicopter.



  • To make architectural corrections - It takes little courage to put a model on paper and make everyone sign their name in blood, insisting that we cannot change anything going forward. We're all courageous about the distant future (which we might never be a part of).

  • To throw away tests and code - "I worked on that mess all day long, are you kidding?" Often the best results are produced by discarding a poor solution and reworking it. Even short-term, this takes far less time than most people think, and long-term it usually returns many times over the modest rework.

  • To be transparent, whether favorable or not - It's so easy to hide in your cube, or to use a long, drawn out development period that makes it seem like real progress is being made. Yes, short iterations can make it obvious that you know nothing about things like getting the software successfully integrated in a timely fashion.

  • To deliver complete, quality work in the face of time pressure - Push back. If your manager tells you, "never mind with the testing, just get it out the door," push back. "We don't have time to pair or review this iteration." Push back.

  • To never discard essential practices - Ditto, from the last one. If your manager says, "you're not allowed to refactor the code," push back. You will pay for omission of essential activities, to the point that having done them at all in the first place will have seemed like a waste of time. (In other words, if you're going to sling code like a cowboy, just do it and stop pretending you're agile or at all professional.)

  • To simplify code at every turn - Does this really take courage? Fortitude, maybe. Perhaps the courage here is learning to accept that your code probably does stink, and that you can almost always improve upon it.

  • To attack whatever code the team fears most - The usual reaction is to tread lightly. What's particularly interesting is that treading lightly can often lead to the worst possible design choices. For example, I need 3 lines of alternate behavior in the middle of a 2000 line method. My fear leads me to believe that the safest thing is to copy the entire method and change the three lines in the copy. Courage would allow me to refactor to a template method pattern or some other solution that resulted in only three additional lines of code.

  • To take credit only for complete work - The business can't use software that's 99.99% complete. If it doesn't do what the customer asked for, it's not complete. The courage here is to accept that incomplete work delivers no value, and admitting incomplete work must be reflected on the very visible plan.
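For the "attack whatever code the team fears most" item, here is a toy sketch of the template method idea in Python. ReportBuilder stands in for the long, scary method, and all of the names are hypothetical. The varying step is pulled out into format_row, so the alternate behavior becomes a small override instead of a copy of the whole method.

```python
class ReportBuilder:
    """Stands in for the long method; build() is the template."""

    def build(self, rows):
        lines = ["HEADER"]
        for row in rows:
            lines.append(self.format_row(row))  # the one step that varies
        lines.append(f"TOTAL {len(rows)}")
        return "\n".join(lines)

    def format_row(self, row):
        return f"{row['name']}: {row['amount']}"


class AuditReportBuilder(ReportBuilder):
    """The alternate behavior lives in one small override, not a copied method."""

    def format_row(self, row):
        return f"{row['name']}: {row['amount']} (entered by {row['user']})"
```

Getting a real 2000-line method into this shape takes many small, tested steps, which is where the courage comes in.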



Courage is exceedingly important with respect to communication in an agile environment, so there are many more specific elements that we could add here. It requires courage to speak up about the challenges we face on a development effort. Retrospectives, for example, are not at all useful without the XP value of courage.

Red-Green-Refactor



Drawing: Tim Ottinger

Photo: Libby Ottinger

Cleanup: Jeff L.


We show flash cards to students in order to help them completely ingrain a concept, to the point where they don't think about something, they just know it. The classic flash card presents a student with a vocabulary word or a math expression for which we expect almost immediate recognition and response. I show you the Spanish vocabulary word "acuario" and you blurt out "fish tank!"


So, red-green-refactor. By definition, TDD says write the tests first. They should fail, since you've not built the functionality that the tests specify, and a GUI test tool will show red at this point. That's useful feedback that tells you to write just enough code to get all existing tests to pass; the GUI test tool shows us green. Finally, you can ensure that the code has an optimal design during a refactoring step, since you have tests to give you the confidence to change things. Spend a few minutes, improve the design, and re-run your tests, which should all still be green. The entire cycle should take about 5 minutes on average, and no more than 10 minutes.


If you've been doing TDD for more than a few days--I mean really doing it and not writing code first and then sneaking in a few tests afterward and then telling people "I got muh TDD's done"--this cycle should be starting to sink in. Those of us who've done TDD a little longer--maybe a few months--don't think twice about it. Red-green-refactor feels like the natural way for us to build software.



  • Red - A common mistake for newbies to TDD is to gloss over the need to see the test fail first. It's part of keeping on course; the red is extremely valuable feedback that tells us our assumptions still hold true. Once in a while, they won't, and you'll have saved yourself a lot of time by finding that out immediately.

  • Green - Keep rough track of how long it takes you to derive a solution that passes all your tests. If you're taking 5-10 minutes on average, or more, start figuring out ways to take smaller steps.

  • Refactor - You should always take advantage of the refactoring step. Even if you added perfect code (that doesn't duplicate any other code in the system), treat the refactoring step as "mandatory bonus time." Poke around the area you're changing. Get rid of a warning. Rename a test. Improve the readability of an existing method. Follow the boy scout rule: Make things a little better when you leave than they were when you arrived. Note that the refactor step doesn't necessarily represent a single green test run: You should look to decompose refactoring efforts into even tinier steps, getting a rapid succession of green bars before moving on.
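One pass through the cycle might look like the sketch below. The shipping rule and every name in it are made up for illustration; the comments mark which step each piece belongs to.

```python
# Step 1 (red): these tests are written first. Run them before shipping_cents()
# exists and they fail with a NameError, which confirms the tests can fail.
def test_orders_under_fifty_dollars_pay_flat_shipping():
    assert shipping_cents(order_cents=4_999) == 500


def test_orders_of_fifty_dollars_or_more_ship_free():
    assert shipping_cents(order_cents=5_000) == 0


# Step 2 (green): the first passing version was deliberately blunt:
#     def shipping_cents(order_cents):
#         if order_cents < 5000:
#             return 500
#         return 0

# Step 3 (refactor): same behavior, with the magic numbers given names.
FREE_SHIPPING_THRESHOLD_CENTS = 5_000
FLAT_SHIPPING_CENTS = 500


def shipping_cents(order_cents: int) -> int:
    if order_cents < FREE_SHIPPING_THRESHOLD_CENTS:
        return FLAT_SHIPPING_CENTS
    return 0
```

The refactoring here is tiny on purpose: one rename-style change, one green run, then on to the next test.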


Writing Characterization Tests




Source: Feathers, Michael. Working Effectively With Legacy Code (WELC), Prentice Hall PTR, 2005.


Test-after development (TAD) is tough; dependencies and non-cohesive code can make for a significant challenge. That's why Michael Feathers' book Working Effectively With Legacy Code is so useful.


Does working with legacy code really require a different set of skills than doing TDD? Well, sprout method and sprout class are the first avenues you should always explore, and they are strictly about test-driving the new code. But most of the other WELC techniques require less test driving and more digging about, trying different things. Some of the techniques even border on being too clever--they're about problem solving, and you do what you gotta do!


An important step in modifying existing code is understanding what it does before you make any changes. Feathers says characterization tests are like putting a vise around code: You want to pin it down before you attempt changing it, so that you know if it slips a bit when you do. So in a sense, we are back to test-first.
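A characterization test might look something like this sketch. The legacy_discount function is a made-up stand-in for untested legacy code, quirks included. The expected values are whatever the code actually returned when run, not what a specification says it should return.

```python
def legacy_discount(quantity, customer_type):
    """Stand-in for untested legacy code; hypothetical, including its quirks."""
    if customer_type == "gold":
        rate = 15
    elif quantity > 10:
        rate = 10
    else:
        rate = 0
    if quantity > 100:
        rate = rate + 5
    return rate


def test_characterize_legacy_discount():
    # Each expected value was copied from an actual run of the code.
    assert legacy_discount(1, "regular") == 0
    assert legacy_discount(11, "regular") == 10
    assert legacy_discount(1, "gold") == 15
    assert legacy_discount(11, "gold") == 15  # surprising: no volume bonus for gold here
    assert legacy_discount(101, "gold") == 20
    assert legacy_discount(101, "regular") == 15
```

If one of these assertions fails after your change, the code slipped in the vise, and you get to decide whether that was intended.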


How many characterization tests do you need to write? You can read the WELC chapter on effects analysis, and start to get scared that you're going to have to write a lot of tests. Or you can use confidence as a guideline, writing as many tests as you think you need to understand the code you're about to change.


The flash card describes what are pretty obvious steps, but they back one of the underlying themes in Feathers' book: be methodical, be safe.

Story Format



Well, it had to happen sooner or later: a card that bothers me quite a bit. Mike Cohn's book Agile Estimating and Planning, a book I recommend for just about anyone working on an agile project, popularized this story card format.


"As an actor, I want to accomplish some goal, for some reason"--very similar to what we did with use cases in the early 90s. And back then, we spilled time bickering over format and wording, so I give credit to whoever came up with a template to help sidestep those wasteful arguments.


The problem is that the story card ain't the thing, it's a placeholder and "promise for more communication," per Ron Jeffries. The card could have one word on it, and that would be sufficient. Bob Koss often told a story about a story for the U.S. Navy regarding how to adjust large gun firing based on sonic feedback. The story card said simply "boom splash!" That was enough to initiate talking about it, and no one forgot who the story was for, what it meant, or why it existed over the course of the iteration.


I realize I'm railing against a potentially useful flash card. As a tool, the format template can help us considerably in improving our collections of stories. Some of the same lessons from use cases apply: Thinking about the actors involved (the "as a" people) can help trigger the introduction of important stories that might be missed. The phrase "I want to" reinforces that stories are goals for customers of the system, and not just technical pipe dreams. And "so that," well, as we write out dozens of stories in release planning, we'll want to remind ourselves of the rationale behind certain stories (but not all of them!). And sometimes this "why" can trigger other useful considerations.


To be fair to Cohn, he says many similar things in his book. But you know how people are. I've already encountered spreadsheets and other software tools that rigidly insist on this format. Never mind that the actor is the same for every last story, or that the reasons are pretty obvious for most stories, or that typing all that stuff is just redundant crud that we would stamp out if it were in our code.


Remember: The cards we present as part of the Agile In a Flash project are tools, not gospel. They're here to help prod you if you get stuck, to give you ideas, and to give you guidance. In most cases, you should follow them unless you have darn good reasons--and even then, you should follow them until you know why the rules exist, and only then should you consider taking your own path.

Breaking Down Larger Stories



Source: Jeff Langr, Tim Ottinger; also, Cohn, Michael. Agile Estimating and Planning.

One ideal for agile development would be to be able to deliver a "done done" story every day. We get a lot of resistance on this thought, and we push right back. Our goal is not to insist on the ideal, but instead to get you to move toward that goal, and not to defend stories that we think are too large. "Too large?" There's no consensus. Anything taking over a half iteration undoubtedly should be scrutinized.


One way to deliver stories more frequently is to collaborate a bit more. Instead of one developer working a large story alone (sadly common), consider adding another developer or two or three. If they can help deliver the story sooner without inordinate overhead costs (including coordination efforts to avoid stepping on each other's toes), go for it.


The other route is to split stories. Breaking down larger stories is an often challenging proposition, but the more you do it, the easier it gets. Next time you have a large story, step through the list on the card.


The most challenging stories are "iceberg" stories. These are stories where the customer sees only a small impact, but the algorithmic or data complexity required to implement the story is large.


Sometimes a split isn't worth it. For example, you look to consider an alternate case as a separate story, but the developer says, "well, it's only going to take me ten minutes to implement that."


A key thing to think about when looking to split stories is the tests.


  • Defer alternate paths / edge cases - If anything, thinking about how to split stories around alternate cases will force you to make sure you've captured all of them!

  • Defer introducing supporting fields - If the story involves user input of good-sized chunks of data, support a few key and/or significant fields and introduce the remainder later.

  • Defer validation of input data - Demonstrate that you can capture the information; add the ability to prevent the system from accepting invalid data later.

  • Defer generation of side effects - For example, creating a downstream feed when the user updates information.

  • Stub other dependencies - "Fake it until you make it."

  • Split along operational boundaries (e.g. CRUD) - This is directly from Cohn's book. Note that you don't necessarily have to have "C" (Create) done before you implement "U" (Update).

  • Defer cross-cutting concerns (logging, error handling) - This generates an interesting challenge around an important point--devising acceptance tests that verify the addition of robustness concerns.

  • Defer performance or other non-functional constraints - Sometimes it's possible to bang out an implementation using an overly simplistic algorithm. A similar challenge: How do you devise acceptance criteria for the performance improvement?

  • Verify against audit trail that demonstrates progress - I'm not enamored with this one, but sometimes there are no other obvious solutions. Sometimes this will expose more implementation details than you would like. Perhaps these tests can disappear once the entire story gets completed.

  • Defer variant data cases - Will it simplify the logic if we don't have to worry about special data cases? Or more complex data variants? For example, it's probably a lot easier to devise a delivery scheduling system that supports only one destination.

  • Inject dummy data - In some cases, data availability or volume can be a barrier to full implementation of a story.

  • Ask the customer! - You may be pleasantly surprised!
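To picture the "stub other dependencies" split, here is a rough Python sketch. The credit check and order desk are hypothetical names. The first story delivers ordering against a stub that approves everyone, and a later story swaps in the real check behind the same method.

```python
class StubCreditCheck:
    """Placeholder for a credit service that a later story will integrate."""

    def is_approved(self, customer_id: str) -> bool:
        return True  # fake it: approve everyone until the real check arrives


class OrderDesk:
    """Code delivered by the first story; it depends only on is_approved()."""

    def __init__(self, credit_check):
        self._credit_check = credit_check

    def place_order(self, customer_id: str, item: str) -> str:
        if not self._credit_check.is_approved(customer_id):
            return "rejected"
        return f"accepted: {item}"


desk = OrderDesk(StubCreditCheck())
print(desk.place_order("c-1", "book"))
```

The acceptance tests for the first story pass against the stub; the follow-up story adds tests for the rejection path with the real dependency.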



Well, this list isn't necessarily complete, and needs a bit of work. What other story splitting mechanisms have you used successfully?