Dataframes in Ruby: always double-check online

D’oh! If only I had checked beforehand! Courtesy of Simpson Crazy; apparently a hand-traced image and so OK for copyright…!

I absolutely love the Ruby programming language; I wouldn’t necessarily say I’m very good at it (or any language for that matter), but I always smile as I type ‘irb’ at a console. I find the language expressive and the naming conventions easy to use, and there are none of the silly indentation issues you find with Python. So, when faced with a solo project, I of course chose Ruby, and when I couldn’t find a reasonable data.frame gem (a gem being the Ruby equivalent of a package) I saw an opportunity, not a problem!

Behold! data_frame was born! Marvel! At how it’s very similar to a hash but with only a few extra features. Gaze adoringly! At how it can load CSV and Excel (xls and xlsx) files! Scream in shock! When you discover an identically named package already available on RubyGems that happens to be much nicer (albeit without the Excel features). D’oh! If only I’d Googled more thoroughly earlier!
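
For the curious, the "hash with a few extra features" idea can be sketched in a handful of lines. This is only an illustration and not the API of either data_frame gem: the TinyFrame class and its from_csv, nrow and ncol methods are names made up for this post, and it leans solely on Ruby's standard csv library (so no Excel support here).

```ruby
require "csv"

# Hypothetical sketch of a hash-backed data frame.
# NOT the actual data_frame gem API; all names here are invented.
class TinyFrame
  attr_reader :columns

  # columns: a hash mapping column name => array of values
  def initialize(columns = {})
    @columns = columns
  end

  # Build a frame from CSV text whose first row holds the column names.
  # Values stay as strings because no converters are requested.
  def self.from_csv(text)
    table = CSV.parse(text, headers: true)
    cols = table.headers.each_with_object({}) do |name, acc|
      acc[name] = table[name] # CSV::Table#[] with a header returns that column
    end
    new(cols)
  end

  # Column lookup, just like a hash.
  def [](name)
    @columns[name]
  end

  # Number of rows, taken from the first column (0 for an empty frame).
  def nrow
    first = @columns.values.first
    first ? first.length : 0
  end

  # Number of columns.
  def ncol
    @columns.length
  end
end

frame = TinyFrame.from_csv("name,age\nHomer,39\nMarge,36\n")
```

Storing whole columns under their names keeps lookups like frame["age"] as cheap as a hash access, which is roughly the appeal of the approach; anything fancier (type conversion, row filtering, Excel loading) would need to be layered on top.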

On a more positive note, I found the new GitHub-Zenodo integration really convenient for getting a citable DOI, and I’ll definitely be using that for all projects in the future. Moreover, making a gem (documentation and all) and getting everything ready took a single afternoon with a relaxed glass or two of wine. That was starting from scratch, mind you, and included the time taken to re-install Ruby, get everything into the right gem format, figure out jeweler, and get everything online. I somehow can’t imagine having the same experience working with R…