Phylogenetics in Julia! Not R, sorry…

…and now for something completely different. The Monty Python foot, taken from ecogirlcosmoboy

When I first read about Julia, I dismissed it as a nice idea that was coming too late into an already crowded niche of programming languages. Fast forward almost two years and Doug Bates (of mixed effects models fame) is using it, and even Wired is talking about it. So I decided to give it a try by making a phylogenetic library; don’t get too excited because it only sort-of loads a Newick phylogeny and not much else.

My first impressions were very good. It is fast – my code is dreadful (I wrote it with a few beers) and even with recursion it’s usable. It’s also very readable; despite a few quirks, it’s probably easier to read than R. The features that impressed me (in vague order of increasing nerdiness) are:

  • Easy to use multiple processors
  • Package management is all linked into GitHub, so there’s no messing around with CRAN-like central repositories (…yes, that could also be bad)
  • The typing system is well-done and saved me time. If I say a function only takes a phylogeny, it only takes a phylogeny, and is very vocal about it without me writing checking code
  • Types are defined explicitly, and there aren’t three kinds of class (R…!) to mess around with
  • Emacs (through ESS) already gives me a good coding environment. I couldn’t get JuliaStudio to work on my Ubuntu computer, but I’m not a fan of RStudio anyway so it was no biggie
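
To make the point about the typing system concrete, here is a minimal sketch. The `Phylogeny` type and `ntips` function are hypothetical names made up for illustration, not what my library actually defines, and the syntax is that of current Julia (`struct`; back in 0.2 the keyword was `type`).

```julia
# Hypothetical type, for illustration only: it just wraps a Newick string.
struct Phylogeny
    newick::String   # e.g. "(A,B,(C,D));"
end

# Naive tip count: tips are separated by commas, so count them and add one.
# Assumes no quoted labels that themselves contain commas.
ntips(tree::Phylogeny) = count(==(','), tree.newick) + 1

tree = Phylogeny("(A,B,(C,D));")
println(ntips(tree))

# Handing the function anything that isn't a Phylogeny fails loudly,
# without a single line of hand-written checking code.
try
    ntips("(A,B,(C,D));")
catch err
    println(err isa MethodError)
end
```

The only "check" is the `::Phylogeny` annotation on the argument; Julia finds no matching method for a plain string and complains for me.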

That said, there are kinks, and while it’s only at version 0.2 some are surprising given it’s over two years old. You’re going to find yourself on developer discussion pages quite quickly because the help files are still being built. You can’t delete or redefine variables or types, which is a nightmare when you’re experimenting. I’ve had a rather vocal falling out with Julia’s regular expression matching (matchall, not match, is your friend), the debugger hasn’t been touched in a while (…but it works), and there’s no standardisation of graphics yet. More fundamentally, Julia needs to take statistical formulae seriously; the expression notation is sort of alright, but not every function listens to it. I want to be able to plot(y ~ x), damnit!
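
For the regular expression gripe, a rough sketch of what tripped me up: `match` hands back only the first hit, so pulling every tip label out of a Newick string needs the "all matches" route. This sketch is written for current Julia, where `matchall` is gone and `eachmatch` does that job (an assumption about your version; on 0.2 it was `matchall`). The pattern and the example string are made up for illustration.

```julia
newick = "(A:1.0,B:2.5,(C:0.3,D:0.7):1.2);"

# match returns only the first hit (or nothing if there is none).
first_hit = match(r"[A-Za-z]+", newick)
println(first_hit.match)

# eachmatch iterates over every non-overlapping hit.
labels = [m.match for m in eachmatch(r"[A-Za-z]+", newick)]
println(labels)
```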

Bottom line: it’s not ready yet, it made me grow a few grey hairs, and The Queen isn’t dead so long live R. However, take a quick look at any package up on GitHub, and you’ll find something you won’t see on any R packages – there’s no C code. I’m tired of people talking about how great it is that you can easily use C++ in R – I learnt R specifically because I wanted to move away from C++. If you have to solve your problem using another language, maybe you weren’t using the right language to begin with. I don’t think Julia will supplant R any time soon, but I am going to keep plugging away at phylogenies in it; I want to do some big simulations and I really don’t want to return to C++. I doubt I’m the only one.