I’ve been looked at strangely for saying that I’m not a big fan of unit testing. Since I don’t have time to write any code this week, I thought I’d formalise this a little. Let me start by saying, in big bold font, unit testing is really helpful and you should definitely do it.
Unit testing is a way of formally asserting that your code works. In R, we have the great testthat package, which allows your to say things like:
…and you can be damn sure that your new factorial function will return the correct value when it’s given a value of 3. When working on big projects with lots of people it’s a great way of making sure nobody breaks someone else’s code; if all the tests pass at the end of your edits, everything is fine. It’s great for stuff only you are doing too – my shiny database must be right because all the tests assert test data is loaded correctly.
However, quantity is not quality when it comes to testing. Testing a function that computes a factorial twenty times to make sure it fails when given phylogenetic trees and character strings is not the same as testing to make sure the function works. Test the things that matter first – if someone’s going to pass your function a character string and they need a specific error for that, there’s probably nothing you can do. Achieving complete coverage of tests is no use if all of your checks are nonsense.
I think the most important thing in programming is writing good code, and that many have fetishised unit testing to the point where it’s no longer helpful for maintaining good code. So, in the spirit of being helpful, here are some general practices I think would make everything better:
- Check the interactions.
At a certain level, your code isn’t going to fail because you missed a comma somewhere, it will fail because two functions will interact in a way you hadn’t predicted (…and therefore couldn’t have written a test for). Check how the individual components of your program interact with one-another, and then (go nuts) write some tests for that.
- Test your assumptions.
If you know what will break your program, either fix it, or write a check for that conditions and make it fail gracefully. Everyone has confirmation bias – if you’re not finding errors when you thoroughly check your code with different input values, then you’re either the best programmer in the world or you’re not being honest with yourself. Find bugs and squash them, don’t write lots of line of tests that make you feel better about the bugs you’re pretending you don’t have.
- Write DRY code.
Don’t Repeat Yourself. You wouldn’t have to write so many damn tests if you abstracted and re-factored your code once in a while. Modularise everything so that you are quicker writing new things. The whole purpose of computers is to make our lives easier – grab discrete modules of code you know already work (and are tested), and use them to get your current task done.
- Listen to feedback.
When I leave a bug report, it’s because I’m hoping to help someone. Even if I’m being stupid, can you say in all honesty that you couldn’t make your program a little clearer? I often leave bug reports because I want to draw the authors’ attentions to things that could cause problems further down the line; these aren’t bugs, they’re just things that might be helpful. Such ‘bug’ hunters probably like you and your work, so chill out!
- You’re not a developer.
You’re a scientist. So you’re probably not going to be able to write meaningful tests and then write a program that fulfills them, because you’re exploring and treading new ground. Don’t worry about it – just make sure your code works. And yes, that probably means some unit tests.
Most of these pieces of advice involve unit testing. There’s a good reason for this: when done correctly, unit testing is helpful. When fetishised, it’s incredibly dangerous, because people think they can churn out more and more tests and so any kind of spaghetti code and all will be fine. I promise you, it won’t.