Thursday, Jul 19, 2012

Blame the Code not the Test

"This test is too coupled to the implementation."

This complaint is commonly leveled when, on account of test double setup, you have spec code that looks a lot like the subject's ("SUT's") implementation code. I hear it most often in cases where the subject has little responsibility beyond passing a value from dependency A to dependency B and returning the result.

Because isolation tests specify not only the externally observable behavior of the subject but also the subject's contracts with its collaborators, it should be no surprise that isolation testing brings complex interactions with collaborators to the forefront in a way that an integrated test would not.

Before concluding that test doubles are to blame for tests made painful by this fact, it helps to remember that the production code is just as likely to be the root cause of the pain, and that the painful test is merely a symptom. That is to say, instead of responding to pain in the test by changing how you test, you could use it as an opportunity to change the design so that the code does more than manually pass some value from dependency A to B (to C to D).

Say we have this code:

Invoice =
  build: ->
    count = CountCalculator.calculate()
    price = PriceCalculator.calculate(count)
    withTax = TaxCalculator.calculate(price)
    withShipping = ShippingCaclulator.calculate(withTax)

The above code was demanded by this spec:

describe "Invoice", ->
  Given -> window.CountCalculator = calculate: jasmine.createSpy().andReturn(1)
  Given -> window.PriceCalculator = calculate: jasmine.createSpy().when(1).thenReturn(2)
  Given -> window.TaxCalculator = calculate: jasmine.createSpy().when(2).thenReturn(3)
  Given -> window.ShippingCaclulator = calculate: jasmine.createSpy().when(3).thenReturn(4)

  When -> @result = Invoice.build()

  Then -> expect(@result).toEqual(4)

Ouch. Painful test. But it does assert the behavior that we want (even the order we want!). And it does so without any backdoor manipulation of the SUT; it may feel like backdoor manipulation because of the spec's intimate arrangement of everything the SUT will be doing, but this is more of an indictment of the SUT for not doing anything interesting or useful than an indication of wrongdoing on the test's part. [Note: in this contrived example, that nuance might be lost because we're effectively spying on global functions instead of using dependency injection, but I didn't want to risk confusion by making the code snippet any longer than necessary.]
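
For what it's worth, here is a rough sketch of what the dependency-injected variant hinted at in the note might look like. It's plain JavaScript rather than CoffeeScript so it runs as-is, and `buildInvoice` and `stubReturning` are names I've made up for illustration (the latter is a hand-rolled stand-in for a spy's `when(...).thenReturn(...)`), not part of the original example.

```javascript
// Hypothetical dependency-injected variant of Invoice.build: the calculators
// are passed in as arguments instead of being looked up as globals.
function buildInvoice({ count, price, tax, shipping }) {
  const units = count.calculate();
  const subtotal = price.calculate(units);
  const withTax = tax.calculate(subtotal);
  return shipping.calculate(withTax);
}

// Hand-rolled conditional stub (an assumption, standing in for a spy library):
// returns `output` only when called with `input`, otherwise undefined.
function stubReturning(input, output) {
  return { calculate: (arg) => (arg === input ? output : undefined) };
}

const result = buildInvoice({
  count: { calculate: () => 1 },
  price: stubReturning(1, 2),
  tax: stubReturning(2, 3),
  shipping: stubReturning(3, 4),
});

console.log(result); // 4
```

The arrangement is just as intimate as in the spec above, but nothing global is replaced: the test only hands the subject its collaborators through the front door.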

A reaction I often hear to a test like this focuses squarely on the use of test doubles, and the argument goes something like, "because the test resembles the implementation so closely, test doubles shouldn't be used." Usually, the person advancing the argument will advocate a coarser, less-isolated test in response to this pain. That would probably work fine, and it would probably result in a more readable spec than the one above, but unfortunately it would not take advantage of the greatest strength of isolation testing: alerting you to design problems by inflicting acute pain.

Instead, I'd prefer to address the pain by searching for a way to make the implementation less rote, since it has no behavior of its own other than calling its collaborators in series. Once that responsibility has been generalized, new isolation tests will emerge and hopefully be simpler (and, as a side effect, resemble their production code less).

For example, the code above can be refactored like this (the spec still passes):

ChainsCalls =
  chain: (dependencies, method) ->
    _(dependencies).inject((memo, dependency) ->
      dependency[method](memo)
    , undefined)

Invoice =
  build: ->
    ChainsCalls.chain([CountCalculator, PriceCalculator, TaxCalculator, ShippingCaclulator], "calculate")
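
To see the chaining idea work end to end without Underscore or any spies, here is a minimal plain-JavaScript sketch. It swaps `_.inject` for the built-in `Array.prototype.reduce`, and the four calculator objects (with their made-up arithmetic) are hypothetical stand-ins of my own, not the real collaborators.

```javascript
// Sketch of ChainsCalls using the built-in Array.prototype.reduce in place of
// Underscore's inject. Each dependency receives the previous one's result.
const ChainsCalls = {
  chain(dependencies, method) {
    return dependencies.reduce(
      (memo, dependency) => dependency[method](memo),
      undefined // the first dependency is called with undefined
    );
  },
};

// Hypothetical stand-ins for the calculators (values chosen arbitrarily).
const count = { calculate: () => 3 };
const price = { calculate: (units) => units * 10 };
const tax = { calculate: (subtotal) => subtotal + 2 };
const shipping = { calculate: (withTax) => withTax + 5 };

const Invoice = {
  build() {
    return ChainsCalls.chain([count, price, tax, shipping], "calculate");
  },
};

console.log(Invoice.build()); // 37 (3 -> 30 -> 32 -> 37)
```

Note that the order of the array is now the only place the sequence is expressed, which is exactly the bit of knowledge `Invoice` is left responsible for.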

And after this refactor, we can tidy up our tests so that they mirror the production units:

describe "ChainsCalls", ->
  Given -> @a = go: jasmine.createSpy().andReturn("foo")
  Given -> @b = go: jasmine.createSpy().when("foo").thenReturn("bar")
  When -> @result = ChainsCalls.chain([@a,@b], "go")
  Then -> expect(@result).toBe("bar")

describe "Invoice", ->
  Given -> window.ChainsCalls = chain: jasmine.createSpy()
  Given -> @dependencies = [CountCalculator, PriceCalculator, TaxCalculator, ShippingCaclulator]
  Given -> ChainsCalls.chain.when(@dependencies, "calculate").thenReturn("yay")
  When -> @result = Invoice.build()
  Then -> expect(@result).toEqual("yay")

Obviously the above example is more contrived than most similar situations you'll run into. The point is that when you experience pain while using test doubles, it doesn't necessarily mean that the test doubles or the practice of using them is to blame.

One thing I've learned after several years of trying to become competent at isolation testing: when your tests inflict pain, if you're unwilling or unable to change the design of your production code, then the extra effort to perform rigorous isolation testing (as opposed to relying on fewer, less granular tests that exercise more of your code) is wasted.

This sentiment is often on my mind when building an application hand-in-hand with a framework. When your code serves a framework (say, Rails) and the framework causes pain in your isolation tests (like ActiveRecord might), you're often stuck in a bind. You can't reasonably respond to your application's tests' pain by dramatically refactoring the frameworks on which you depend. As a result, the prudent thing to do is often to try to isolate as much of your application code from the framework as you can and to forfeit isolation testing where it's too painful (say, where your objects must extend from framework classes).
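
As a loose illustration of that split, here is a small JavaScript sketch. Everything in it is hypothetical: `FrameworkModel` is a toy stand-in for whatever base class a framework forces you to extend (it is not Rails, ActiveRecord, or any real library), and `invoiceTotal` and `InvoiceRecord` are names invented for the example.

```javascript
// Toy stand-in for a framework-provided base class (hypothetical).
class FrameworkModel {
  constructor(attributes) {
    this.attributes = attributes;
  }
  get(name) {
    return this.attributes[name];
  }
}

// Application logic kept free of the framework: cheap to test in isolation.
function invoiceTotal(quantity, unitPrice, taxRate) {
  return quantity * unitPrice * (1 + taxRate);
}

// Thin subclass the framework demands. It only unpacks attributes and
// delegates, so it can be left to coarser, integrated tests.
class InvoiceRecord extends FrameworkModel {
  total() {
    return invoiceTotal(
      this.get("quantity"),
      this.get("unitPrice"),
      this.get("taxRate")
    );
  }
}

const record = new InvoiceRecord({ quantity: 2, unitPrice: 50, taxRate: 0.5 });
console.log(record.total()); // 150
```

The interesting behavior lives in `invoiceTotal`, where isolation tests are painless; the subclass is where I'd stop insisting on them.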

If, like me, you're disinclined towards frameworks that rely on users extending framework classes, you can add this justification to your list of reasons why.
