Essential Unit Test Cards

What are the essential unit test cards? Some of you have asked for these to be assembled into one place, so here we are.
With this deck of cards, you have enough information to understand how to go about TDD, how to work around some problems, and why you should bother. Throw in the Wikipedia article and its links, a nice xUnit framework for your language, and you should be able to competently carry out TDD on your current project. If not, let us know what TDD cards you really need but don't have.

Plan-Do-Check-Act


Font: Daniel Black

Thanks to Igor Czechowski for suggesting this card.

At the core of agile are short cycles of Plan-Do-Check-Act (PDCA). These steps are also what it means to be scientific in approach, at least per the definition of science that says you are following the scientific method: hypothesize, experiment, evaluate. Those who say agile isn't disciplined have not made this connection.


Plan-Do-Check-Act is echoed in agile practices, particularly TDD. The Plan step is about "making the expected output the focus," per Wikipedia. Writing a test that first fails captures your plan. After observing test failure, Do means you write enough code to make the test pass, and Check tells you to verify the actual results against the expected output. If there are differences you must Act to determine their cause and correct your implementation (or sometimes your expectations). In any case, you must also Act by observing the changes to the environment--the rest of the system--and "determine where to apply changes that will include improvement," which can mean doing some incremental refactoring.
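
As a minimal sketch of one such cycle, assuming a hypothetical total_price function (not from the card) and Python's standard unittest module: the test is written first and states the expected output (Plan), the function is the least code that makes it pass (Do), and running the test compares actual against expected (Check). Any mismatch or design smell found at that point is what you Act on.

```python
import unittest


def total_price(unit_price, quantity):
    """Do: the simplest code that satisfies the test below (hypothetical example)."""
    return unit_price * quantity


class TotalPriceTest(unittest.TestCase):
    def test_multiplies_unit_price_by_quantity(self):
        # Plan: the expected output is stated before the code exists,
        # so this test fails until total_price is written.
        expected = 30
        # Check: compare the actual result against the expectation.
        self.assertEqual(expected, total_price(unit_price=10, quantity=3))
```

Saved in a file, this can be run with python -m unittest. Deleting the body of total_price and rerunning shows the "red" state the cycle starts from.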


The iterative-incremental development core of agile also follows the cycle:


  • Plan - iteration planning/definition of acceptance tests

  • Do - day-to-day iteration execution

  • Check - verification of results using acceptance tests

  • Act - retrospectives and subsequent planning



As with many of the best modern ideas for quality control, PDCA in part comes from Dr. W. Edwards Deming. While Deming credited Walter Shewhart for the original concept of PDCA, Deming gets credit for popularizing the cycle.

Naming Fail or Comment Fail?

Names changed to protect the innocent:

/// Adds Sessions which fit in specified date-time range
private void ReadSessions() {

Abbreviations



Font is Mechanical Pencil

Source: Vadim Suvorov, Tim Ottinger

When can we use abbreviations as names in our source code? Can we ever use abbreviations as variable names? Vadim and I explored this issue, and Vadim in his orderly way of thinking enumerated these principles. I'm sure that not everyone will find these to his liking, but I think these principles are well-reasoned and sufficient. I think these nest nicely into my naming rules in general, though my preference is to avoid any kind of encodings.

  • Shared, not Personal: the abbreviation should not be something the author has invented, and which other programmers will not recognize on sight.

  • Consistently Used: the abbreviation is not punned, so that it means one thing in one context and another thing entirely in a different context. Note that a very short abbreviation has a greater likelihood of collision (fn = function or filename or ...?).

  • Must Be Justified: If the programmer is to use abbreviations, then he should have clear reasons why the abbreviation is required. For instance, the abbreviation may help the reader see the unique part of the name without being distracted by context warts (prefix or suffix). My addition here concerns parallel names such as Persistant.User v. Domain.User: if only one name is present, then no prefix is justifiable. My partner in this enterprise may not agree (likely with well-considered reason).

  • Special Latitude Given for Domains: in solution domains, some abbreviations are common and it is beneficial for the programmers to know them. If I worked with military jet software and didn't know IFF, or in education and didn't understand ILT, or if I worked in accounting and didn't grasp AP or AR then I would be less effective when communicating with the Business/Customer.
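
A small illustration of the criteria above, offered as a sketch only. Every function and parameter name here is hypothetical; AR and AP are the accounting abbreviations mentioned in the list (accounts receivable and accounts payable).

```python
# Personal and punned: does "fn" mean function or filename? The reader must guess.
def proc(fn, d):
    return fn(d)


# Spelled out: nothing to guess, and no cost at compile time or runtime.
def apply_transform(transform, record):
    return transform(record)


# Domain abbreviations that the business already uses are fair game.
def ar_minus_ap(ar_total, ap_total):
    """Receivables minus payables; AR and AP are shared accounting terms."""
    return ar_total - ap_total
```

The first and second functions behave identically; only the second tells the reader what it expects to receive.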



To the extent that your team chooses to use abbreviations, we recommend these criteria for your consideration. Clean naming is one of the most important factors in writing understandable code, and has no negative effect upon compilation or runtime speed, and so is very precious to me. Yet, in the appropriate context, I am open to sacrificing the "no encodings of any kind" rule to appropriate use of well-reasoned abbreviations, with the caveats given above.

Planning Poker (R)



Source: Wikipedia
Font: AndrewScript 1.6

Planning poker, trademarked by Mike Cohn, is a modernizing of a 50+-year-old estimating process known as Wideband Delphi. Estimating is not far from the dark arts, and attempts to make the process serious and exacting are ill advised. James Grenning devised planning poker as a quick and entertaining way to come to consensus on estimating stories in agile. I've found it can help dramatically minimize the tedium of estimating through a large stack of stories.


A typical point scale might be 1, 2, 3, 5, 8, 13. Resist larger scales--toss the higher-value cards. You might replace them with one card that says "too big."


You might want to include a few additional cards: 0, ? ("I don't have a clue"), and infinity. The value of adding a 0 card is debatable (nothing is free, and even if development is "free," testing a story is never free), but you may find some usefulness in having it: Sometimes, completing one story automatically includes another. Or, a story might simply represent a milepost achieved.


The Wikipedia article provides good detail on the steps involved, but I highly suggest you make your own rules and stick to them. The section on anchoring is particularly useful: part of the reason James devised planning poker was to counteract the heavy influence on estimates coming from one individual. Make sure that when people divulge their card selection, they aren't watching and waiting on certain other individuals to show theirs first!
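
The simultaneous reveal can be sketched as a small function. This is an illustration under assumptions, not a rule from the card: the names POINT_SCALE and reveal are hypothetical, the scale is the one suggested above, only numeric cards are handled, and asking the holders of the lowest and highest cards to explain themselves is one common convention you may or may not adopt.

```python
POINT_SCALE = (1, 2, 3, 5, 8, 13)  # scale suggested above; larger cards tossed


def reveal(votes):
    """Summarize one round of simultaneously revealed cards.

    votes maps each participant to the card they chose. Returns
    (estimate, speakers): the agreed estimate and an empty list if everyone
    matched, otherwise None plus the participants holding the lowest and
    highest cards, who explain their reasoning before the team votes again.
    """
    for name, card in votes.items():
        if card not in POINT_SCALE:
            raise ValueError(f"{name} played {card!r}, which is not on the scale")
    low, high = min(votes.values()), max(votes.values())
    if low == high:
        return low, []
    speakers = sorted(name for name, card in votes.items() if card in (low, high))
    return None, speakers
```

For example, reveal({"Ann": 3, "Bo": 3, "Cy": 3}) returns (3, []), while reveal({"Ann": 2, "Bo": 5, "Cy": 13}) returns (None, ["Ann", "Cy"]). Because all votes arrive in one dictionary, nobody's card can be influenced by seeing someone else's first.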


Before starting the meeting, figure out how long you'd like to spend estimating. If your backlog of stories looks pretty good, and there's a good understanding of the project by most people in the room (obviously not always the case), you might find that 5 minutes per story works well. Appoint a facilitator who can keep time and help keep the estimating session on track.


If the product you're building is less well-known to the participants, this process will take considerably longer, maybe 10-15 minutes per story. Do the planning poker estimates, regardless, and plan on doing them again during a quicker second meeting. If you feel like you are bogging down on a story, and understanding of it is not "critical path," set it aside, and plan to come back to it after other stories are visited.


For a backlog of not-well-understood stories, you will probably want a couple of sessions. Some stories will need to be set aside to be split or researched offline. Some stories will need to be revisited by the customer. One of the best things to do is give people time to go off and think about things (and having at least one night between sessions is always a good idea).


Still, you want to avoid investing too much time in estimation. The more time you invest, the higher the expectation that the numbers coming out of it are anywhere near perfect. Estimates are guesses!


Instead, the estimation meeting is best seen as a way to ensure that we have good, appropriately sized stories that are fairly well understood by everyone involved. The consensus mechanism in planning poker will quickly let you know if this is not the case. Getting confidence from a good ballpark project plan is almost a bonus!

Principles of Package Cohesion


Post: Tim & Jeff
Source: Uncle Bob
Font: Segoe Print

Coupling and cohesion are the two most important principles guiding the quality of an object-oriented class design. Most programmers learned about these principles in their first week of exposure to OO. The rise of TDD has helped reinforce the value of low coupling and high cohesion (although many programmers still unconsciously and even consciously resist truly small, cohesive classes and methods).

Uncle Bob teaches us that these core OO principles apply equally to packages. The Agile in a Flash card for Principles of Package Coupling covers the dependency side of the dynamic duo--how do you structure packages so as to improve the coupling relationships between them? This card covers the other side--how should you compose a package? What is the definition of "cohesive," as it applies to packages?

  • The Reuse-Release Equivalence Principle (REP) - The REP tells us that classes should be packaged together because they are used together. This seems obvious, but many package structures are instead based around ideas like "functional areas," "architectural layers," or "originating team." The result? Users are inconvenienced, for example, by having to recite a litany of import lines at the top of each file. REP tells us to consider the destination instead of the source or even the structure of the code itself. It urges us to group things together for user convenience.

    An entire system shoehorned into a single package/library would comply with this principle. But if the rate of change in the library was not extremely low, it would suffer from the problems addressed by the remaining principles.

  • The Common-Reuse Principle (CRP) - This almost seems like a restatement of the REP, but the emphasis here is on limiting the impact to consumers. Imagine an API package with two sets of reusable class clusters. A programmer might choose to consume only one reusable component, but changes to the other component necessitate redistribution of the entire package. An unwanted release unnecessarily burdens the consuming programmer, who must re-integrate and re-test the entire library in order to stay current. Where the REP leads us to conglomerate, the CRP leads us to split packages apart. This tension between the principles leads us to find the level of granularity that provides greatest convenience (least negative impact) to users.

    In spirit, the CRP is a package-level restatement of the Interface Segregation Principle, which says to keep interfaces small and focused for similar reasons.

  • The Common-Closure Principle (CCP) - The CCP is an application of the Open-Closed Principle at the next level up. This principle suggests that you should group your classes around the impact of change. A change should optimally impact only one package; as many other packages as possible should be closed to that change. This varies from the other two principles as it recommends grouping classes around the way the code is maintained, rather than the way it is used. Modules with a high rate of change might need to be grouped by closure rather than by use. Highly stable packages might be better conglomerated (REP).
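
One rough way to look for CRP trouble is to compare what a package ships with what each consumer actually uses. The sketch below is an assumption-laden illustration, not a tool Uncle Bob describes: the function name, the input shapes, and the example class names are all hypothetical.

```python
def partial_consumers(package_classes, usage):
    """Return consumers that use some, but not all, classes of a package.

    package_classes: set of class names shipped together in one package.
    usage: dict mapping consumer name -> set of class names it actually uses.
    A consumer that touches only part of the package still has to absorb
    every release of it, which is the burden the CRP warns about.
    The result maps each such consumer to the sorted classes it does not use.
    """
    result = {}
    for consumer, used in usage.items():
        used_here = used & package_classes
        unused = package_classes - used_here
        if used_here and unused:
            result[consumer] = sorted(unused)
    return result
```

With a package of {"CsvReader", "CsvWriter", "PdfRenderer"}, a consumer that uses only the two Csv classes is reported with ["PdfRenderer"], hinting that the PDF cluster might belong in its own package. Whether to split is still a judgment call against the REP.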

The principles of package cohesion are not absolute. Sometimes a packaging may satisfy all three principles at once, but often the principles compete. The development team will have to make trade-offs in order to find a balance that works. In general, smaller (and even smaller) packages provide the best chance for adherence to the CCP and CRP, although the REP says there is a limit to how small you will want to go.

Following these principles will require occasional re-packaging, which is upsetting to many users. However, correcting less-than-optimal packaging is a single deep cut that can halt the "death of a thousand paper cuts" caused when changes ripple across packages, or when users of a package have to deal with frequent irrelevant updates.

TDD Antipatterns


(font is brianne's hand)

James Carr enlisted a group of fellow travelers to define a list of TDD Antipatterns, errors in judgement common in TDD practice. He provided an initial list for us to base our work on, and did a very fine job of filtering out the duplicates and near-duplicates, providing catchy names, and writing up the result. Note that his list is longer than ours since we have a terseness constraint. Read the full list.

I wish I had thought of it first.

  • The Liar is a test that runs, but does not test what it claims to test. It could be named after a class, but actually be testing another. The test might be called ShouldNotThrowExceptionsForPositiveValues, but actually use the natural numbers less than 5. Liars give a false sense of security.
  • Excessive Setup is common when the architecture is badly coupled and mocking is not well-used. This is often evidence of insufficient pre-factoring. A little dependency injection and a little interface use can go a long way. This is also common when programmers give in to the urge to test software "in context." One hopes that when they get to the assert they'll remember what scenario they were testing.
  • The Giant is a single test that tests more than a single scenario. It may have excessive setup, but then follow with a large number of manipulations and assertions. It may test entire subsystems. Adding a new assertion requires a programmer to reverse-engineer his way through the Giant to find an assertion insertion point, and requires the programmer to exercise care to not leave side-effects that will cause the last half of the Giant to fail. Giants are ticklish and hard to understand.
  • The Mockery is a real piece of work. A pair/team/programmer has actually replaced the system under test with a test double. The test proves only that the mocks worked as expected.
  • Generous Leftovers are left behind by tests that don't clean up after themselves and cause later tests to fail when run in the suite but not when run in isolation. The leftovers are typically in static memory, on disk, in a database (for shame!) or some other persistent store. When leftovers are found, it's often puzzling whose they are, and what they are.
  • Local Hero is a test that runs well, and tests a system well, but only passes on the author's machine or network. Being environmentally-sensitive, such tests fail for peers and CI systems. Typical failures involve hardcoded paths, locally-installed libraries, and OS-specific assumptions.
  • The Loudmouth is a test that produces copious output. It is like the boor at the party who thinks every trivial event in his life is worthy of an epic tale. At one time, the loudmouth's story may have been worth hearing, but now it's just idle chatter that gets in the way of a real conversation with more interesting guests/tests.
  • The Secret Catcher is a test that seems to do nothing at all, but is secretly (implicitly) depending on any errors to produce exceptions. The fact that the code executed without exception is expected to evidence that the code works. These tests can eventually be reverse-engineered by every programmer on the team. Failure of the test always results in code spelunking, an unpopular pastime.
  • The Hidden Dependency is a test that secretly requires setup that is not present in the test itself. Perhaps it requires a certain data setup script, or a change in a configuration file, or another test to have already run with generous leftovers. Hidden Dependency tests are a special kind of evil.
  • The Stranger is a perfectly good test in a perfectly wrong place. It is not a liar, it's just not testing the same system (SUT) that the other tests are testing. Strangers don't cause problems except when you're looking for tests for class X and they're hiding among the tests for class Y.
  • Success Against All Odds is a test that will always pass, no matter what. Due to a series of missteps, the test won't fail even if the code is wrong. Often these turn into complicated indirect versions of "assert true == true" or "false == false". These can be a variation on Mockery or The Liar. It is likely that a test with Success Against All Odds was written as a green test, without ever having seen the "red" part of the Red->Green->Refactor loop.
  • The Slow Poke is a test that takes too long to run. If you have 15000 tests, and you want them to run in less than 45 seconds, you have a budget of 0.003 seconds per test. A five-second test will cause some irritation. A few 10-second tests will make people think twice about running the tests at all. A test that takes a minute is unlikely to ever be run by a programmer. Imagine what hell awaits the author of some of the three-to-ten minute monstrosities that exist in the wild! In a TDD shop, slow pokes cannot be tolerated. Note that many slow pokes are also Giants with Excessive Setup. Be warned.
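
The Slow Poke arithmetic is easy to automate. The sketch below assumes you can get per-test durations from your test runner as a name-to-seconds mapping; both function names are hypothetical. Note that the budget is an average, so a test over budget is a candidate for attention rather than an automatic failure.

```python
def per_test_budget(total_seconds, test_count):
    """Average time each test may take if the whole suite must fit the limit."""
    return total_seconds / test_count


def slow_pokes(durations, budget_seconds):
    """Return names of tests whose duration exceeds the budget, slowest first."""
    over = [(seconds, name) for name, seconds in durations.items()
            if seconds > budget_seconds]
    return [name for seconds, name in sorted(over, reverse=True)]
```

per_test_budget(45, 15000) gives the 0.003 seconds quoted above, and feeding that budget to slow_pokes along with measured durations lists the offenders in the order you would want to fix them.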
Tim recommends that readers follow the link to the original article, which covers more territory than the index card can allow.
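
Two of the antipatterns are easy to show side by side with their remedies. This is a sketch using Python's standard unittest and tempfile modules; parse_age is a hypothetical function under test, invented for the illustration.

```python
import os
import tempfile
import unittest


def parse_age(text):
    """Hypothetical code under test: parse a non-negative age."""
    age = int(text)
    if age < 0:
        raise ValueError("age must not be negative")
    return age


class ParseAgeTest(unittest.TestCase):
    def test_secret_catcher(self):
        # Antipattern: no assertion; "passing" only means nothing was raised.
        parse_age("42")

    def test_parses_digits_to_int(self):
        # Better: the expectation is stated, so a wrong result fails loudly.
        self.assertEqual(42, parse_age("42"))

    def test_rejects_negative_age(self):
        # Expected exceptions are asserted explicitly too.
        with self.assertRaises(ValueError):
            parse_age("-1")


class NoLeftoversTest(unittest.TestCase):
    def setUp(self):
        # Each test gets its own scratch directory...
        self.scratch = tempfile.TemporaryDirectory()
        # ...and cleanup is registered up front, so it runs even if the test fails.
        self.addCleanup(self.scratch.cleanup)

    def test_writes_report(self):
        path = os.path.join(self.scratch.name, "report.txt")
        with open(path, "w") as handle:
            handle.write("ok")
        self.assertTrue(os.path.exists(path))
```

The Secret Catcher passes here, which is exactly the problem: it would keep passing if parse_age returned the wrong number. The scratch-directory pattern avoids Generous Leftovers by never writing to a shared location in the first place.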

What is missing?

We have dozens more cards we can write and post, so we're not running short on ideas even though we can be short on time to get them all written up. Still, I wonder, if you could pick the next card for me, what would you want it to be about? Is there an area we've not addressed at all? Topics of interest to you that might be interesting to others? Guest authors we should contact? I'm all ears.

Pairing Workstation Configuration


(font is SD Marker still)

Teams beginning to use pair-programming often struggle because of poor workstation configuration as they attempt to use their individual programming space. There is more to it than merely adding a second chair. Take a moment to review Pair Programming Smells and recall that there are physical limitations as well as psychological limitations to overcome.

  • Chairs sit comfortably side-by-side so developers can have equal access. If either person is physically limited from grabbing the keyboard and mouse, then the setup is wrong. Pair programming is about sharing the editing of code together. It is necessary that both have equal access. Beware corners: if you have a monitor in a desk/cubicle corner, then necessarily one person has better access than the other and pairing breaks down.
  • Add an extra monitor (or two!), preferably a nice, thin LCD or plasma screen. An extra monitor can be placed where the pair members can both see it equally well. A thin screen doesn't need a desk corner. Also, all of the annoying popups (mail, chat, etc.) can be placed on the monitor where the code is not displayed, where they can be ignored. Finally, it is useful to run a countdown timer on one screen while programming on the other, to enable pomodoro-like techniques. Pairing is best done in time boxes with regular breaks.
  • Get some USB keyboard(s) for pairing. You should have a comfortable keyboard with a long cord. If one of you requires/prefers an ergonomic keyboard, then carrying around their favorite keyboard is a reasonable concession.
  • One mouse per keyboard is best. This is again about equal access for editing.
  • Get docking stations for notebook computers because shoving the computer back and forth is a pain, and because docking stations can give you extra USB ports and VGA ports. There really is not a lot of cost involved, and it really helps keep the pairing "fair".
  • Pens, scrap paper, and index cards should be in reach. You might be surprised how often you need to take a note now and come back as soon as the code is passing all the tests again. With two people, there are more ideas to choose from, and each learns tricks from the other. Writing things down allows each partner to process it when he is not pairing.
  • IDE/editor of choice with shared configuration is a contentious bit. We need to use the same tools if we are to have equal access for editing. In some teams, the emacs/vim/eclipse/scite wars will erupt, but it is better if the team chooses and all members learn to get by in the chosen environment. If they won't decide which editor to use, how are they going to make group decisions about design and architecture? It is better to learn to use an "inferior" editor than to have to wrestle the editing environment from each other several times a session. When it comes to coding standards and standard editors, it is a good practice to "be a sport" and choose to make the team work better even if it will cost you a little in the short term.
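
The countdown timer mentioned in the monitor item needs nothing fancy. Here is a minimal sketch, assuming a terminal on the second screen; the function name and the closing message are made up for the example, and the tick and show parameters exist only so the timer can be exercised without waiting.

```python
import time


def countdown(total_seconds, tick=time.sleep, show=print):
    """Count down in one-second steps, showing the remaining time as MM:SS.

    tick and show are injectable so the timer can be tested without waiting.
    """
    for remaining in range(total_seconds, 0, -1):
        minutes, seconds = divmod(remaining, 60)
        show(f"{minutes:02d}:{seconds:02d}")
        tick(1)
    show("Break time: stand up and swap the keyboard")
```

Calling countdown(25 * 60) runs a 25-minute time box, the interval commonly used with the Pomodoro Technique; pick whatever length suits the pair.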


Pair programming is not a utopian practice. Some people don't enjoy it at first, and some never warm up to it. Regardless, it makes the code better. If your team wants to make the code better through pair-programming, then it is important that their attempts are not hobbled by a poor workstation configuration. For a little bit of money, a leader can make a big difference in the way a company puts out software.

SMART goals


Font: Sterofidelic

The best lists never die. The SMART mnemonic has been around for at least half a century, per Wikipedia. SMART is similar to INVEST, but provides criteria for evaluating goals and objectives instead of stories.


SMART has generated many different word expansions over the years. Sometimes M is meaningful, manageable, or even motivational(!). Our Agile in a Flash card uses the supposedly preferred words. But as a result of this inconsistency, I struggled yesterday to recall the best choice for the letter R. "Realistic?" No, that's too close in meaning to attainable. A quick search revealed relevant (duh!). I supplanted my mild self-annoyance (at my inability to remember the better choice) with elation at the prospect for a new, relevant agile card!


In the context of agile, I've found SMART goals to be useful when discussing action items to come out of retrospectives. True, there's no reason these couldn't be treated just like INVEST stories. But I've found that selling something half-a-century-tried-and-true can be a little easier with some crowds.


Here are my thoughts about the relevance of the SMART criteria with respect to retrospectives.


  • Specific - Vague promises of improvement usually don't generate results. Think of the 5 W's: who, what, when, where, and why. Instead of "we'll try to get stories done earlier in the iteration," how about "the developers will deliver at least one story every two to three days to QA, who will complete testing of them within a day of delivery, so that we can ensure stories are 'done done' by iteration end." (And don't forget that "try" is a word you want to banish.)

  • Measurable - Attainment of our specific example goal might be validated by answering some questions: What was the average number of stories completed within two to three days? How many stories did not complete in this time? You can think of iterations as fixed time periods in which to run experiments; you can express a hypothesis that validates or disproves the value of each experiment by capturing relevant data.

  • Attainable - It's important that a team can check off completed goals, to reinforce the sense of achievement. Obviously goals that your team can complete in an iteration best meet this criterion, but you don't want to have only short-term goals. There's nothing wrong with long-term goals; just make sure there's a way to measure incremental progress.

  • Relevant - Too many trivial goals can give a bloated sense of achievement. Shortening daily stand-up meetings by limiting them to five minutes might seem beneficial, but does it really change anything? What's the real problem? Don't hesitate to attempt dramatic changes, and don't hesitate to think outside the box that pseudo-agile dogmatists might otherwise paint you in.

  • Time bound - As with stories, many teams let goals creep past iteration boundaries. "We just need a little more time." Set up the experiment, define completion and success criteria, and grade the experiment: it was either completed or abandoned, and the hypothesis either held true or was disproved.


Get SMART today!