Test SCUM


Font is "Pen of Truth"

Source is Ben Rady and Rod Coffin of Improving Works, given in an agile presentation at Agile2009.

In sharp contrast to the FIRST principles of good unit tests, Rady and Coffin give us these properties of completely bad (scummy) unit tests.

  • Slow: We discussed how important speed is in unit tests. I find that when tests take more than 40 seconds, people have to decide to run them, rather than feeling free to run them immediately after every small step. Rady's Infinitest runs tests automatically and analyzes the tests to selectively exclude those which have no (discernible) transitive dependency on the system under test, just to eke out some extra speed. While it is preferable to run all the tests all the time, in many circumstances culling the unrelated tests will give back some speed.
    Tests are generally slow because they provide insufficient insulation from the environment. They tend to spend a lot of time waiting on clocks, file systems, databases, web services, or the like. Slow tests are best rewritten with isolation from the environment.
    How fast is fast? I find that I can tolerate a usefully large battery of tests if each of them takes about .002 seconds. I can tolerate any number that starts with a dot and two zeroes. I might be able to tolerate a very few with only one zero immediately after the dot. More than that is "slow". It doesn't take many multi-second tests before developers feel reluctance to run them.
  • Confusing: We have emphasized that good tests isolate errors and their causes. A confusing test is one that fails to isolate a specific error. This may be because it has too many things going on, or because it is simply unreadable. It is preferable that the combination of the test class name, the test method name, and the assertion gives all the information a developer needs in order to correct a misstep. In a confusing test, those three items are not enough, and even the entire body of the test may not shed light on the failure mode. Confusing tests are best rewritten as smaller, clearer tests.
  • Unreliable: Tests should be repeatable: they should always fail the same way, or always pass for the same reasons. Tests that run only in isolation or on certain days or times of day are not reliable tests. One of the best ways to judge reliability is to run tests repeatedly. Note that overspecification often causes tests to fail in unexpected and non-useful ways.
  • Missing: The tests that you don't run provide you no benefit, and quite a bit of harm. Tests help you to design your production code, and also help you to tell if you are making decisions that break the system in unexpected ways. Not having the tests means that you just don't know. Maybe your design is iffy in ways you hadn't noticed. Maybe you've broken other code with your last line of code. Maybe the behavior you hand-verified last hour is no longer occurring as you expected it to. The tests you did not write are the ones that are really holding back your productivity.
While we could list other unwanted qualities of tests, these seem to describe the most painful sins.
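
To make the "isolation from the environment" advice concrete, here is a minimal sketch of a test that never waits on a real clock. SessionTimer and FakeClock are hypothetical names invented for this illustration, not anything from Rady and Coffin's talk; the only assumption is that the code under test accepts its clock as a parameter.

```python
import time


class SessionTimer:
    """Hypothetical class: reports whether a session has outlived its limit."""

    def __init__(self, limit_seconds, clock=time.monotonic):
        # The clock is injected so that tests never wait on real time.
        self._limit = limit_seconds
        self._clock = clock
        self._started = clock()

    def expired(self):
        return self._clock() - self._started >= self._limit


class FakeClock:
    """Test double that returns whatever time the test has set."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


def test_session_expires_after_limit():
    clock = FakeClock()
    timer = SessionTimer(limit_seconds=1800, clock=clock)
    assert not timer.expired()
    clock.now += 1800  # "wait" thirty minutes without sleeping
    assert timer.expired()
```

Production code keeps the default time.monotonic clock, while the test covers a thirty-minute timeout without ever sleeping.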

12 Principles for Agile Software Development


Font (once again): MechanicalPencil

The canonical source for these principles is the agile manifesto website.

These are the basic agile principles, abbreviated to fit onto a 3x5 card without requiring the reader to hold a magnifying glass. They are "sound bites" and not the whole story. Each of these principles could launch (or already has launched) myriad blogs and articles, and indeed many of the other Agile in a Flash cards touch on these principles. We could easily build a distinct card for each, though that would inflate the size of our eventual card deck quite a lot.

It has long been Tim's perspective that Agile is about having a short reach, so we are allowing that point of view to color our summary of agile principles.
  1. Satisfy the customer through early and continuous delivery--We shorten the distance between requirements gathering and customer feedback. The period is shorter because we plan less change at a time, and in return we get more opportunities to steer the software in a direction the customer appreciates. Notice that the principle actually says "continuous delivery", not just "quarterly" or "bimonthly." Early and continuous delivery is not about working faster, only about working sooner.
  2. Welcome changing requirements, even late in development--We shorten the distance between conceiving and implementing an important change. We don't have to wait for a redesign of the whole system, or for the next system to be built. This is not to say that no features will be delayed; feature requests are frequently reordered or even dropped. The agile difference is that such changes take effect sooner.
  3. Deliver working software frequently--We shorten the distance between the system-as-designed and the system-as-built. Both will evolve as we learn more about the system as it should be built. We also shorten the distance between planning and delivery, giving more opportunity to improve the efficiency and effectiveness of our work.
  4. Business people and developers work together daily--We shorten the physical distance between a question and its answer. Moving the customer to a different building, area, room, or even to a cubicle just around the corner dramatically reduces the number of questions we ask the customer. With collocation, the business and technology sides learn to better understand each other and to make more mutually-beneficial decisions.
  5. Build projects around motivated individuals--We shorten the distance between intent and action. The goal is to build an agile team of skilled professionals who care about all the business concerns including schedule and content, and who will work in a highly-engaged and result-oriented way. Such individuals need no babysitting and very little direction. Such a team will refuse to be blocked, and will produce their best work at all times. Management does not have to spend time on motivational issues or the "cat herding" so common to less-motivated teams.
  6. Convey information via face-to-face conversation--We shorten the time between a question and its answer. Agile teams work in bullpens because it makes it much easier to ask questions and offer suggestions. Things that should be communicated get communicated, not forgotten, diluted, or otherwise insufficiently communicated. While there are many attempts to go agile without co-locating team members, we are not aware of any which have the kind of success which is typical in co-located teams.
  7. Working software is the primary measure of progress--We shorten the distance between thinking we're done and knowing we're done. The team should be judged by the product it produces, not by the rate of typing or number of degrees, or hours worked per month, or how quickly the members walk from the parking lot, or how quietly they work from their individual cubicles. A good team frequently produces quality software the customer wants. All other measures are subordinate or irrelevant.
  8. Maintain a constant pace indefinitely--We shorten the distance between productive bursts. This doesn't mean that management sets a large minimum hour limit for developers to endure months or years at a time. Excessive overtime cannot continue indefinitely without severely impacting quality. Instead, we choose a pace that allows us to go home tired and satisfied in the evening, and then return fresh and ready to rock in the morning. For normal people with families and bodily needs, that pace will not be 90 hours a week, and it probably won't even be 50 hours a week. We are never impossibly far from our life-sustaining relationships and activities.
  9. Give continuous attention to technical excellence--We shorten the distance between implementation and ideal. An agile developer is never more than a few minutes away from the last time all the tests passed. Collaborating classes are not at the opposite ends of long chains of contains/references/inherits relationships (e.g. "train wrecks"). Developers need not wait to clean up redundant or confusing code. If working code is the measure of agility, then excellent code must be defined as code that accepts changes gracefully. An agile team takes steps to ensure that code gets better with each iteration.
  10. Simplify: maximize the amount of work not done--We shorten the distance between comprehension and completion. We eschew things that don't matter. If we're less encumbered by unhelpful tasks and unwanted features, we've shortened the reach to useful work. We also attempt to simplify code (reducing the amount of reverse-engineering other programmers must do). Agile programmers tend to follow the rules of simple design.
  11. Teams self-organize--We shorten the distance between need and action. We don't wait around to be told who does what. We do what needs to be done, not waiting for direction or supervision. We attack problems with fervor, mitigating risks and clearing obstacles.
  12. Teams retrospect and tune their behaviors--We shorten the distance between introspection and adaptation. Improvement is never far away. Each iteration, we explicitly find ways to improve process simplicity, code quality, technical excellence, and predictability of results. We analyze problems and obstacles, and look for root causes and their solutions. We actively plan to use better techniques, tools, and process flows. We act on the plan in the subsequent iteration, without delay.

These principles are fairly simple in concept, but are profoundly deep in practice. If you are transitioning to an Agile work style or are looking for ways to improve your current Agile practice, we suggest you begin again with the principles espoused here.

Essential Unit Test Cards

What are the essential unit test cards? Some of you have asked for these to be assembled into one place, so here we are:
With this deck of cards, you have enough information to understand how to go about TDD, how to work around some problems, and why you should bother. Throw in the Wikipedia article and its links, a nice xUnit framework for your language, and you should be able to competently carry out TDD on your current project. If not, let us know what TDD cards you really need, but don't have.

Plan-Do-Check-Act


Font: Daniel Black

Thanks to Igor Czechowski for suggesting this card.

At the core of agile is short cycles of Plan-Do-Check-Act (PDCA). These steps are also what it means to be scientific in approach, at least per the definition of science that says you are following the scientific method: hypothesize, experiment, evaluate. Those who say agile isn't disciplined have not made this connection.


Plan-Do-Check-Act is echoed in agile practices, particularly TDD. The Plan step is about "making the expected output the focus," per Wikipedia. Writing a test that first fails captures your plan. After observing test failure, Do means you write enough code to make the test pass, and Check tells you to verify the actual results against the expected output. If there are differences you must Act to determine their cause and correct your implementation (or sometimes your expectations). In any case, you must also Act by observing the changes to the environment--the rest of the system--and "determine where to apply changes that will include improvement," which can mean doing some incremental refactoring.
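
As a rough illustration of how one TDD cycle lines up with those steps, here is a small sketch using Python's unittest. The leap_year function and its tests are made up for this example; in real TDD the tests would be written (and seen to fail) before the function body existed.

```python
import unittest


def leap_year(year):
    # Do: just enough code to satisfy the expectations captured below.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


class LeapYearTest(unittest.TestCase):
    # Plan: each test states an expected output before the code is written.
    def test_year_divisible_by_four_is_leap(self):
        self.assertTrue(leap_year(2024))

    def test_century_is_leap_only_when_divisible_by_400(self):
        self.assertFalse(leap_year(1900))
        self.assertTrue(leap_year(2000))


def check():
    # Check: run the tests to compare actual results with the expected output.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(LeapYearTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Act is the part no code can show: if check() reports a failure you find the cause and fix it, and if it passes you look for the next small refactoring.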


The iterative-incremental development core of agile also follows the cycle:


  • Plan - iteration planning/definition of acceptance tests

  • Do - day-to-day iteration execution

  • Check - verification of results using acceptance tests

  • Act - retrospectives and subsequent planning



As with many of the best modern ideas for quality control, PDCA in part comes from Dr. W. Edwards Deming. While Deming credited Walter Shewhart for the original concept of PDCA, Deming gets credit for popularizing the cycle.

Naming Fail or Comment Fail?

Names changed to protect the innocent:

/// Adds Sessions which fit in specified date-time range
private void ReadSessions() {

Abbreviations



Font is Mechanical Pencil

Source: Vadim Suvorov, Tim Ottinger

When can we use abbreviations as names in our source code? Can we ever use abbreviations as variable names? Vadim and I explored this issue, and Vadim in his orderly way of thinking enumerated these principles. I'm sure that not everyone will find these to his liking, but I think these principles are well-reasoned and sufficient. I think these nest nicely into my naming rules in general, though my preference is to avoid any kind of encodings.

  • Shared, not Personal: the abbreviation should not be something the author has invented, and which other programmers will not recognize on sight.

  • Consistently Used: the abbreviation is not punned, so that it means one thing in one context and another thing entirely in a different context. Note that a very short abbreviation has a greater likelihood of collision (fn = function or filename or ...?).

  • Must Be Justified: If the programmer is to use abbreviations, then he should have clear reasons why the abbreviation is required. One such reason is that the abbreviation helps the reader see the unique part of the name without being distracted by context warts (prefix or suffix). My addition here concerns parallel names such as Persistant.User v. Domain.User: if only one name is present, then no prefix is justifiable. My partner in this enterprise may not agree (likely with well-considered reason).

  • Special Latitude Given for Domains: in solution domains, some abbreviations are common and it is beneficial for the programmers to know them. If I worked with military jet software and didn't know IFF, or in education and didn't understand ILT, or if I worked in accounting and didn't grasp AP or AR then I would be less effective when communicating with the Business/Customer.
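
Here is a small sketch of how these criteria might play out in code. The accounting example and every name in it are invented for illustration; which abbreviations count as "shared" is something your own team has to decide.

```python
# Hypothetical accounting example; every name here is illustrative.

def ar_balance(invoices):
    """Accounts receivable: total still owed to us across the given invoices."""
    # "ar" is a shared domain abbreviation that the business side uses daily.
    return sum(invoice["amount"] - invoice["paid"] for invoice in invoices)


def load_invoices(filename, parse_line):
    """Read one invoice per line from filename, using the parse_line function."""
    # Spelled out on purpose: "fn" could mean either the filename or the function.
    with open(filename) as lines:
        return [parse_line(line) for line in lines]
```

The first function leans on domain latitude (AR), while the second avoids a short abbreviation that would collide with itself within a single signature.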



To the extent that your team chooses to use abbreviations, we recommend these criteria for your consideration. Clean naming is one of the most important factors in writing understandable code, and has no negative effect upon compilation or runtime speed, and so is very precious to me. Yet, in the appropriate context, I am open to sacrificing the "no encodings of any kind" rule to appropriate use of well-reasoned abbreviations, with the caveats given above.

Planning Poker (R)



Source: Wikipedia
Font: AndrewScript 1.6

Planning poker, trademarked by Mike Cohn, is a modernizing of a 50+-year-old estimating process known as Wideband Delphi. Estimating is not far from the dark arts, and attempts to make the process serious and exacting are ill advised. James Grenning devised planning poker as a quick and entertaining way to come to consensus on estimating stories in agile. I've found it can help dramatically minimize the tedium of estimating through a large stack of stories.


A typical point scale might be 1, 2, 3, 5, 8, 13. Resist larger scales--toss the higher-value cards. You might replace them with one card that says "too big."


You might want to include a few additional cards: 0, ? ("I don't have a clue"), and infinity. The value of adding a 0 card is debatable (nothing is free, and even if development is "free," testing a story is never free), but you may find some usefulness in having it: Sometimes, completing one story automatically includes another. Or, a story might simply represent a milepost achieved.


The Wikipedia article provides good detail on the steps involved, but I highly suggest you make your own rules and stick to them. The section on anchoring is particularly useful: part of the reason James devised planning poker was to counteract the heavy influence on estimates coming from one individual. Make sure that when people divulge their card selection, they aren't watching and waiting on certain other individuals to show theirs first!
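
If it helps to see the mechanics, here is a minimal sketch of resolving one round. It assumes one common house rule (when the cards disagree, the lowest and highest voters explain their thinking and everyone votes again); that rule, the play_round function, and the participant names are our illustration, not an official procedure.

```python
DECK = (1, 2, 3, 5, 8, 13)  # the point scale suggested above


def play_round(votes):
    """Resolve one round; votes maps each participant's name to the card played.

    All votes are collected before anything is revealed, which is what keeps
    one person's number from anchoring everyone else's.
    """
    for name, card in votes.items():
        if card not in DECK:
            raise ValueError(f"{name} played {card!r}, which is not in the deck")
    low = min(votes.values())
    high = max(votes.values())
    if low == high:
        return {"estimate": low, "discuss": []}
    # No consensus: ask the outliers to explain, then vote again.
    outliers = sorted(name for name, card in votes.items() if card in (low, high))
    return {"estimate": None, "discuss": outliers}
```

For example, votes of 2, 5, and 13 produce no estimate and send the 2 and the 13 off to explain themselves before the next round.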


Before starting the meeting, figure out how long you'd like to spend estimating. If your backlog of stories looks pretty good, and there's a good understanding of the project by most people in the room (obviously not always the case), you might find that 5 minutes per story works well. Appoint a facilitator who can keep time and help keep the estimating session on track.


If the product you're building is less well-known to the participants, this process will take considerably longer, maybe 10-15 minutes per story. Do the planning poker estimates, regardless, and plan on doing them again during a quicker second meeting. If you feel like you are bogging down on a story, and understanding of it is not "critical path," set it aside, and plan to come back to it after other stories are visited.


For a backlog of not-well-understood stories, you will probably want a couple of sessions. Some stories will need to be set aside to be split or researched offline. Some stories will need to be revisited by the customer. One of the best things to do is give people time to go off and think about things (and having at least one night between sessions is always a good idea).


Still, you want to avoid investing too much time in estimation. The more time you invest, the higher the expectation that the numbers coming out of it are near perfect. Estimates are guesses!


Instead, the estimation meeting is best seen as a way to ensure that we have good, appropriately sized stories that are fairly well understood by everyone involved. The consensus mechanism in planning poker will quickly let you know if this is not the case. Getting confidence from a good ballpark project plan is almost a bonus!

Principles of Package Cohesion


Post: Tim & Jeff
Source: Uncle Bob
Font: Segoe Print

Coupling and cohesion are the two most important principles guiding the quality of an object-oriented class design. Most programmers learned about these principles in their first week of exposure to OO. The rise of TDD has helped reinforce the value of low coupling and high cohesion (although many programmers still unconsciously and even consciously resist truly small, cohesive classes and methods).

Uncle Bob teaches us that these core OO principles apply equally to packages. The Agile in a Flash card for Principles of Package Coupling covers the dependency side of the dynamic duo--how do you structure packages so as to improve the coupling relationships between them? This card covers the other side--how should you compose a package? What is the definition of "cohesive," as it applies to packages?

  • The Reuse-Release Equivalence Principle (REP) - The REP tells us that classes should be packaged together because they are used together. This seems obvious, but many package structures are instead based around ideas like "functional areas," "architectural layers," or "originating team." The result? Users are inconvenienced, for example, by having to recite a litany of import lines at the top of each file. REP tells us to consider the destination instead of the source or even the structure of the code itself. It urges us to group things together for user convenience.

    An entire system shoehorned into a single package/library would comply with this principle. But if the rate of change in the library was not extremely low, it would suffer from the problems addressed by the remaining principles.

  • The Common-Reuse Principle (CRP) - This almost seems like a restatement of the REP, but the emphasis here is on limiting the impact to consumers. Imagine an API package with two sets of reusable class clusters. A programmer might choose to consume only one reusable component, but changes to the other component necessitate redistribution of the entire package. An unwanted release unnecessarily burdens the consuming programmer, who must re-integrate and re-test the entire library in order to stay current. Where the REP leads us to conglomerate, the CRP leads us to split packages apart. This tension between the principles leads us to find the level of granularity that provides greatest convenience (least negative impact) to users.

    In spirit, the CRP is a package-level restatement of the Interface Segregation Principle, which says to keep interfaces small and focused for similar reasons.

  • The Common-Closure Principle (CCP) - The CCP is an application of the Open-Closed Principle at the next level up. This principle suggests that you should group your classes around the impact of change. A change should optimally impact only one package; as many other packages as possible should be closed to that change. This varies from the other two principles as it recommends grouping classes around the way the code is maintained, rather than the way it is used. Modules with a high rate of change might need to be grouped by closure rather than by use. Highly stable packages might be better conglomerated (REP).

The principles of package cohesion are not absolute. Sometimes a packaging may satisfy all three principles at once, but often the principles compete. The development team will have to make trade-offs in order to find a balance that works. In general, smaller (and even smaller) packages provide the best chance for adherence to the CCP and CRP, although the REP says there is a limit to how small you will want to go.
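
To see the CRP trade-off in miniature, here is a toy model, not a real dependency analyzer. The needless_updates function, the class names, and the client names are all hypothetical; the sketch simply asks which consumers are forced to take a release for a change to a class they never use.

```python
def needless_updates(package, changed_class, usage):
    """Return the clients that must take a new release of package even though
    they never use changed_class.

    package: set of class names shipped together.
    usage: dict mapping each client name to the set of class names it uses.
    """
    if changed_class not in package:
        raise ValueError(f"{changed_class} is not part of this package")
    burdened = []
    for client, used in usage.items():
        depends_on_package = bool(used & package)
        if depends_on_package and changed_class not in used:
            burdened.append(client)
    return sorted(burdened)


# Hypothetical consumers of a library holding two unrelated class clusters.
USAGE = {
    "billing": {"CsvReader", "CsvWriter"},
    "dashboard": {"Chart", "Axis"},
}
ONE_BIG_PACKAGE = {"CsvReader", "CsvWriter", "Chart", "Axis"}
CHART_PACKAGE = {"Chart", "Axis"}
```

With everything in ONE_BIG_PACKAGE, a change to Chart burdens billing; once the chart classes ship separately as CHART_PACKAGE, the same change burdens no one who does not use them.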

Following these principles will require occasional re-packaging, which is upsetting to many users. However, correcting less-than-optimal packaging is a single deep cut that can halt the "death of a thousand paper cuts" caused when changes ripple across packages, or when users of a package have to deal with frequent irrelevant updates.

TDD Antipatterns


(font is brianne's hand)

James Carr enlisted a group of fellow travelers to define a list of TDD Antipatterns, errors in judgement common in TDD practice. He provided an initial list for us to base our work on, and did a very fine job of filtering out the duplicates and near-duplicates, providing catchy names, and writing up the result. Note that his list is longer than ours since we have a terseness constraint. Read the full list.

I wish I had thought of it first.

  • The Liar is a test that runs, but does not test what it claims to test. It could be named after a class, but actually be testing another. The test might be called ShouldNotThrowExceptionsForPositiveValues, but actually use the natural numbers less than 5. Liars give a false sense of security.
  • Excessive Setup is common when the architecture is badly coupled and mocking is not well-used. This is often evidence of insufficient pre-factoring. A little dependency injection and a little interface use can go a long way. This is also common when programmers give in to the urge to test software "in context." One hopes that when they get to the assert they'll remember what scenario they were testing.
  • The Giant is a single test that tests more than a single scenario. It may have excessive setup, but then follow with a large number of manipulations and assertions. It may test entire subsystems. Adding a new assertion requires a programmer to reverse-engineer his way through the Giant to find an assertion insertion point, and requires the programmer to exercise care to not leave side-effects that will cause the last half of the Giant to fail. Giants are ticklish and hard to understand.
  • The Mockery is a real piece of work. A pair/team/programmer has actually replaced the system under test with a test double. The test proves only that the mocks worked as expected.
  • Generous Leftovers are left behind by tests that don't clean up after themselves and cause later tests to fail when run in the suite but not when run in isolation. The leftovers are typically in static memory, on disk, in a database (for shame!) or some other persistent store. When leftovers are found, it's often puzzling whose they are, and what they are.
  • Local Hero is a test that runs well, and tests a system well, but only passes on the author's machine or network. Being environmentally-sensitive, such tests fail for peers and CI systems. Typical failures involve hardcoded paths, locally-installed libraries, and OS-specific assumptions.
  • The Loudmouth is a test that produces copious output. It is like the boor at the party who thinks every trivial event in his life is worthy of an epic tale. At one time, the loudmouth's story may have been worth hearing, but now it's just idle chatter that gets in the way of a real conversation with more interesting guests/tests.
  • The Secret Catcher is a test that seems to do nothing at all, but is secretly (implicitly) depending on any errors to produce exceptions. The fact that the code executed without exception is expected to evidence that the code works. These tests can eventually be reverse-engineered by every programmer on the team. Failure of the test always results in code spelunking, an unpopular pastime.
  • The Hidden Dependency is a test that secretly requires setup that is not present in the test itself. Perhaps it requires a certain data setup script, or a change in a configuration file, or another test to have already run with generous leftovers. Hidden Dependency tests are a special kind of evil.
  • The Stranger is a perfectly good test in a perfectly wrong place. It is not a liar, it's just not testing the same system (SUT) that the other tests are testing. Strangers don't cause problems except when you're looking for tests for class X and they're hiding among the tests for class Y.
  • Success Against All Odds is a test that will always pass, no matter what. Due to a series of missteps, the test won't fail even if the code is wrong. Often these turn into complicated indirect versions of "assert true == true" or "false == false". These can be a variation on Mockery or The Liar. It is likely that a test with Success Against All Odds was written as a green test, without ever having seen the "red" part of the Red->Green->Refactor loop.
  • The Slow Poke is a test that takes too long to run. If you have 15000 tests, and you want them to run in less than 45 seconds, you have a budget of 0.003 seconds per test. A five-second test will cause some irritation. A few 10-second tests will make people think twice about running the tests at all. A test that takes a minute is unlikely to ever be run by a programmer. Imagine what hell awaits the author of some of the three-to-ten minute monstrosities that exist in the wild! In a TDD shop, slow pokes cannot be tolerated. Note that many slow pokes are also Giants with Excessive Setup. Be warned.
Tim recommends that readers follow the link to the original article, which covers more territory than the index card can allow.
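
As one concrete example from the list, here is a sketch of avoiding Generous Leftovers. REGISTRY and register are hypothetical stand-ins for whatever shared state your production code keeps; the technique shown is simply resetting that state around every test using unittest's setUp and addCleanup.

```python
import unittest

REGISTRY = {}  # hypothetical module-level state shared by production code


def register(name, value):
    REGISTRY[name] = value


class RegistryTest(unittest.TestCase):
    def setUp(self):
        # Start from a known state rather than trusting the previous test,
        # and schedule a cleanup so this test leaves nothing behind either.
        REGISTRY.clear()
        self.addCleanup(REGISTRY.clear)

    def test_register_adds_entry(self):
        register("a", 1)
        self.assertEqual({"a": 1}, REGISTRY)

    def test_registry_starts_empty(self):
        # Passes in any order, alone or in the suite.
        self.assertEqual({}, REGISTRY)
```

Without the setUp, the second test would pass in isolation and fail whenever the first test happened to run before it.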

What is missing?

We have dozens more cards we can write and post, so we're not running short on ideas even though we can be short on time to get them all written up. Still, I wonder, if you could pick the next card for me, what would you want it to be about? Is there an area we've not addressed at all? Topics of interest to you that might be interesting to others? Guest authors we should contact? I'm all ears.