Saturday, Oct 23, 2010

How I Write Java These Days

Over the last year, I’ve made an effort to better identify the styles, idioms, and smells I encounter when reading and writing new Java code. [And, already, a takeaway point! To some of my more successfully sheltered rubyist friends, it may be sorry news to hear that there continues to be new Java code written.]

In any case, I’ve made a concerted effort to internalize habits that I find valuable and to develop a reflex to resist those which I do not. This will be my first effort at documenting either, so I’ll just express them in terms of how my Java looks these days, replete with code snippets as needed.

My real goal in documenting any of this publicly is to have a page that I can easily cite whenever I begin working with a new colleague to develop software using Java. I presume it will either help him (or her) identify the path of least argumentation or, alternatively, provide a platform by which to scheme to the ends of changing my mind toward a better way of doing something. Either of those outcomes are welcome. To everyone else, you too may find something worth either adopting or calling me out on.

As if this post required fewer than a dozen other caveats, I will mention that much of what I say here may only apply to Java development, and may even then only apply to greenfield Java, as opposed to existing code without tests.

1. Achieve isolation with low-friction, generated test doubles

My highest priorities when I write a unit test are achieving utter isolation, testing the front door first, minimizing the cost of creation, maximizing readability, and ensuring that change remains cheap.

In response to the first of those priorities, my current mode of operation is to use Mockito to isolate my SUT.

I’ve become more careful to use the term test double appropriately as opposed to calling every generated double a “mock”, as most early Java “mocking” frameworks were actually born from the school of endotesting where they actually mean for mock objects to blow up in your face every time your SUT interacts with them in a way not anticipated by your test. Tools that force developers to specify so much low-value setup for each test case has naturally led to a backlash in support of “classical TDD” and witty presentations about how “mocks suck.” Mockito, despite its name, simply generates the tabula rasa of test doubles until you choose to stub some behavior. I appreciate Dan North’s take on the issue.

Here’s an example of how some of my test setup might look if I were building a salesman selling shards ‘o’ glass pops, using Mockito:

What do I like about Mockito over its alternatives? Well,

Its authors hold great opinions that foster the user’s learning about isolation testing for TDD
Setup is minimal enough to make class creation cheap, but not so invisible as to be indistinguishable from magic
No goofy record-playback metaphor to inhibit arrange-act-assert
Even in large projects, custom test-scope classes are rarely necessary, keeping change cheap

Another benefit: it discourages folks from wasting brain cycles determining whether to achieve isolation from every single depended-on-component or settle for pseudo-functional tests, because Mockito’s annotation-based test setup is actually easier than instantiating all the SUT’s dependencies oneself.

2. Test the front door first (and then nail all the other doors shut)

I treat my test as the first customer of my class, and as a result I never-not-ever provide my test with a means of backdoor manipulation. As a result, I always test-drive the responsibility of the class through its (often only one) public method(s).

An example smell I frequently see indicating that someone doesn’t recognize this rule, is to find a protected or package-private method in the SUT whose visibility was obviously only increased for the benefit of making a test easier to set up. Note the package-private helper method here:

I’ve even seen code that apparently agrees that modifying a method’s visibility on the SUT for this purpose is a bad idea, but then takes a turn in an even harder-to-refactor direction by using reflection to test a private method. There simply aren’t enough unicorns left in the world to stand by and let that happen.

3. Don’t write static methods

The first day I worked with Rich Dammkoehler earlier this year, he laid down the law: no static methods. I normally don’t appreciate broad dictates telling me not to use a major language construct, but this one makes sense to any practitioner of TDD that craves unit test isolation. It’s also an easier pill to swallow if you plan to lean on a dependency injection framework like Guice or Spring, since nabbing a singleton instance of a class isn’t much harder than referencing it statically.

Why are static methods bad? Google testing blog did a great job outlining why static methods make isolation testing hard.

4. Wrap those static methods that you do need

When I was writing the jasmine-maven-plugin, I started by failing to test(!) numerous methods that relied on static methods provided by a FileUtils class. As my guilt increased, I moved onto using PowerMock, as icky an experience as that can be.

If I had it to do again, I would have wrapped whatever I needed from the FileUtils class with plain ol’ instance methods, so my PowerMock usage could have been limited to a single place, rather than spread like a plague along with my usage of the static method. I imagine that I’d specify such a wrapper as follows:

5. JavaScript is code too, so test-drive it

I’ve been singing this song for a while, but at this point in history—where applications that differentiate themselves do so with an exemplary front-end user experience, as opposed to immaculate backend code—placing an emphasis on well-crafted JavaScript is critical to success.

To any web developer, I strongly recommend adopting Jasmine, which is delightfully expressive and is starting to reach critical mass. To Java web developers in particular, take a look at my jasmine-maven-plugin, which you can use to get started with JavaScript testing extraordinarily quickly (and if extraordinary isn’t fast enough, try playing with the plugin’s jasmine-archetype).

If you require yet more exhortation to start testing-driving your JavaScript, check out my recent presentation on JavaScript Craftsmanship.

6. Convention > Configuration (with annotations) > Configuration (with XML)

I almost hesitate to advocate using annotations, because the Java crowd still doesn’t seem to quite grok the lesson of convention over configuration, but whenever the choice comes down to annotations or XML, I’ll choose the locality and readability of annotations every time.

The primary discomfort I’ve witnessed among developers regarding annotations is that they exist in this murky ether, halfway between code and configuration. That ambiguity led Mike Busch and I to test-driving the placement of Java annotations (and we had success writing custom Hamcrest matchers to specify @Action and @Result in Struts apps). I’m thinking about publishing a library of annotation-based matchers for common situations; let me know if you’re interested in pairing with me to make that happen.

7. I agree, getter/setter code sucks, but keep your members private anyway.

Believe me, I understand the underlying sentiment when I see code that looks like this:

I hesitate to abide this for a couple reasons. First, non-private members are usually counter-conventional. Second, especially for classes intending to encapsulate the state of something, the ubiquity of the JavaBeans spec has led numerous libraries to require your objects be JavaBeans compliant.

This is not to mention that raising the visibility of your member fields makes an even bigger mess of a class’s encapsulation than blindly generating a getter & setter for every field (an act that’s also usually unwarranted).

8. Bury getters & setters

As alluded to above, accessor and mutator methods are perhaps the least informative code in any given class, so why place them prominently in your class’s listing, bogging down your reader?

I regularly find myself moving them to the bottom of a class, getting them out of the way for the benefit of others.

9. Never write another assertEquals()

Vanilla JUnit assertions like this one are still nearly ubiquitous in new Java code:

How can you blame the author above for forgetting that the first parameter of assertEquals is supposed to be the expected (as opposed to actual) value? I certainly can’t. It’s hardly apparent that the correct ordering is this:

Even if the above method signature were well known to all of us, it still wouldn’t be particularly expressive. Hamcrest offers a much more obvious, readable style of assertion—and manages to retain that style even for asserting arbitrarily complex facts by way of custom matchers. I struggle to imagine anyone finding assertEquals to be more clear than this:

10. Write readable tests that follow Arrange-Act-Assert

The arrange-act-assert pattern is greatly helpful in encouraging minimal, readable test cases. I’ll typically try to minimize the number of “arrange” lines, keep my “act” to a single line, and limit myself to a single “assert” line when I can. I also separate each phase with a newline to make my intentions dead-obvious.

Using this as a convention leads to terrifically simple tests, like this one:

In fact, I may find myself writing verbatim “assertThat(result,is(expected))” numerous times a day. It may be repetitive, but I’d take that over obscure tests any day.

11. Write humane matchers to describe your most common assertions

Say you’re testing a Struts application, and you very frequently find yourself testing that an action method has the side effect of adding an “action message” for the view. Sure, you could write something like this test, ad nauseam:

Or, with the one-time cost of remembering how to create one, you could write a slightly more meaningful matcher like this:

And each test case could become even more readable:

12. Packages don’t really matter

Spending significant time noodling about what belongs in which package is usually a low-value activity, and is more often mere friction getting in the way of my productivity (and happiness). If I put a class in a package and you don’t like it, move it. I mean no offense—you may just care more about this than I do.

13. Even if packages matter to you, they shouldn’t matter to your code

Code that breaks when it moves from one package to another is an obnoxious type of coupling enabled by the Java language, and is almost always the result of either aimless or too-clever designs.

I don’t mind at all if a project’s package naming & hierarchy is your carefully-tended zen garden, but I have yet to witness a design benefit from restricting a member’s visibility to only classes with the same package name. As a result, I usually interpret package-private anything as a code smell that something went wrong somewhere—most frequently, it will have been an attempt to provide a backdoor for a test, a malady discussed here elsewhere.

14. No acronym is too good to be camelCased

I’ll pick on the DTO moniker here, as it’s one of my least favorite vestiges of EJB.

Even if I were to suffix a class with such an acronym, I’d name it in camelCase, like “GlassPopDto”.

Why? Because some future actor on that class always arrives later to the scene, and nobody wants to have to parse what a “GlassPopDTOBuilder” is supposed to do.

15. Life is easier when you name each dependency after its class

Note that I’d never advocate an honest-to-god variable be named something completely uninformative like:

However, I do always name my class’s dependencies after the class that represents its contract. This makes the code less surprising and DI autowiring more idiot-proof. [I name the dependencies identically in my tests, too, so that I can refer to them without guessing or remembering some arbitrary prefix (like “mockedBoxOfGlassPops”).]

16. Ditch interfaces that will only ever have one implementation

The Spring framework encouraged and really popularized coding to interfaces. That had its benefits at the time, but the lasting result was inevitable: a lonely debate over whether every single class on the planet should be named “DefaultInterfaceName” or “InterfaceNameImpl”.

I’ve come to believe that enforcing this platitude violates YAGNI. Unless some really awesome framework requires you to do otherwise (and what awesome framework would?), test-drive a class first and extract interfaces only once they’ll facilitate some useful purpose. My experience tells me that they rarely will.

17. Don’t override equals() & hashCode() without reason

I think it’s a natural reflex to look at a newly minted class and think to yourself, “now I’ll just override equals() and hashCode() and it will be all-grown-up.” I’ve learned to avoid this temptation, for a few reasons:

Both methods inspire some really hard-to-read, error-prone code
It can be painful to test-drive all the edge cases of the contract without leaning on an external library to do it for you
It has to be maintained as the class changes
It’s probably not really necessary

The last bullet—that overriding these methods is usually not necessary—is the best reason not to override them. I rarely find myself in situations where I have multiple instances of a class that are logically equivalent and also likely to bump into one another. And when I do find myself in such a situation, the objects probably have unique identifiers that I can compare, like when I’m sorting out who gets the nuts between these two squirrels:

And even when I am in a situation where the logical equivalence of two objects matters, the means of determining that logical equivalence is often only relevant to a single user of the class, and therefore it wouldn’t be appropriate to universalize that logic by storing it in the class itself. A separate Comparator is often a better solution in these situations.

18. The Statist vs. Behaviorist debate is a false dichotomy

Too often I see a team room draw a line in the sand between advocates of state verification and those favoring behavior verification. This strikes me as a bit of a butter battle sometimes, as both mechanisms are useful (necessary?) for facilitating clean production code.

I typically default to verifying state changes, but it’s often the case that a method is better specified via verification of its interactions with its dependencies. This is especially true when the side effect would be so hard to assert that you’re tempted to violate command-query separation and make a command function return a value only for the benefit of a test, when the API otherwise wouldn’t benefit from it.

19. Useless layers are useless

If a Java project has a Widget model object, it’s a pretty safe bet that someone is going to create a WidgetService at some point. And what would be the WidgetService’s single responsibility? Its name certainly doesn’t do anything to tell us.

Having seen a few widget services in my time, all I can infer is that

It’s likely to be annotated as @Transactional, and
It will probably soon become a dumping ground of numerous, unrelated responsibilities.

I don’t know the cause for this. Whether it’s the proliferation of dogma in the Java community, a side effect of all the domain-knowledge-free cookie-cutter Java EE projects, the SOA buzzword’s popularizing of the word “service”, or something else entirely, I would hope that any rational developer would agree that a class doesn’t need to come into existence until it has something to do. Making sure an arbitrary layer is sufficiently populated isn’t a nearly good enough reason.

20. Class extension should not be used to mask complexity

Even though few people’s code consistently adheres to the single-responsibility principle, most people do seem to have a natural aversion to excessively gargantuan classes. Unguided, this aversion results in class complexity being conveniently swept under the nearest rug. One such rug that is often used to fantastic effect is class extension.

I’ll keep picking on Struts here. Take this snippet from ActionSupport, from which Struts action classes may extend:

Any time that your ActionSupport extension wants to resolve a message key for i18n, your test class is stuck dealing with the complexity of a real TextProvider instance provided by the TextProviderFactory. This is because no setter exists on ActionSupport, so you can’t easily replace it with a test double.

So in this case you’re left with one of several unsavory options, including (for the sake of making this point): negotiate with a real TextProvider instance or stub out the getText() methods by overriding the SUT.

In the case of the former, your test loses its isolation of the SUT. In the latter, you’re stuck dealing with instantiating the SUT with an anonymous override, like this one:

So now, thanks to some convenience methods tucked away in a super class, we’re stuck resorting to a form of backdoor manipulation—and worse, by having overridden our SUT, we can never truly test the real thing. Our test starts its life out as a lie.

Hardly seems worth the convenience of sweeping the shared convenience methods under the rug of a super class.

21. Static analysis can make a great WTF detector

Static code analysis is no golden hammer, but using Sonar has demonstrated to me that it can have great benefits. If you’ve never seen Sonar, here’s a public instance analyzing Apache CXF.

Of course, automated code analysis that yields no red flags does not necessarily indicate clean, quality, or even working code. However, properly tuned code analyses that yield a bevy of violations can certainly grab your attention and quickly place it on potential problem areas.

I view static analysis as a way to get fast feedback. It makes me aware of sore spots, usually before they lead to bugs or (depending on their underlying severity) spiral out of control. Even if the report is only used to facilitate the ongoing curation of code as it’s being developed, surely it stands to reason that generating and reading a Sonar dashboard report once a week would require less effort than scouring a team’s every commit by hand.

It’s with that last image (that of painstakingly auditing a team’s work) in mind that I can appreciate Sonar as a handy tool that supports a sort of trust, but verify workflow.

And there you have it! As I wrote this entry in a single straight pass, I’m sure there is still plenty of room for improvement. I hereby solicit your feedback and corrections, whether it’s by tweeting at me or e-mailing me.