Automatic tools for improving R packages

On Tuesday I gave a talk at a meetup of the R users group of Barcelona. I got to choose the topic of my talk, and decided I’d like to expand a bit on a recent tweet of mine. There are tools that help you improve your R packages, and some of them are not famous enough yet in my opinion, so I was happy to help spread the word! I published my slides online but thought that a blog post would be nice as well.

Who is talking about the French Open?

I don’t think rOpenSci’s Jeroen Ooms can ever top the coolness of his magick package, but I have to admit the other things he’s developed are not bad at all. He’s recently been working on interfaces to Google’s Compact Language Detectors 2 and 3 (the latter being more experimental). I saw this cool use case and started thinking about other possible applications of the packages.

I was very sad when I realized it was too late to try and download tweets about the Eurovision Song Contest, but then I remembered there’s this famous tennis tournament going on right now, about which people probably tweet in various languages. I don’t follow the French Open myself, but it seemed interesting to find out which languages were the most prevalent, whether the results from the cld2 and cld3 packages agree with each other, and whether they agree with the language detection results from Twitter itself.

Which science is all around? #BillMeetScienceTwitter

I’ll admit I didn’t really know who Bill Nye was before yesterday. His name sounds a bit like Bill Nighy’s, and that’s all I knew. But well, science is all around, and quite often scientists on Twitter start interesting campaigns. Remember the #actuallylivingscientists, to whose animals I dedicated a blog post? This time, the Twitter campaign is the #BillMeetScienceTwitter hashtag, with which scientists introduce themselves to the famous science TV host Bill Nye. Here is a nice article about the movement.

Since I like surfing on Twitter trends, I decided to download a few of these tweets and to use my own R interface to the MonkeyLearn machine learning API, monkeylearn (part of the rOpenSci project!), to classify the tweets in the hope of finding the most represented science fields. So, which science is all around?

How not to make an evergreen review graph

In this post I am inspired by two tweets, mainly this one and also this one. Since the total number of articles published every year is increasing, no matter which subject you choose, the curve representing the number of articles as a function of publication year will probably look exponential, so one should not use such graphs to impress readers. At least I’m not impressed. I’m more amused by such graphs now that there’s a hashtag for them.

I shall use an rOpenSci package to get some data on the number of articles matching a query term, and to make a graph that’s not an evergreen review graph!

A tribute to Lucy D'Agostino McGowan's git commit emoji game

Do you know Lucy? She is a very talented biostatistics PhD candidate whom I had the chance to e-meet thanks to R-Ladies. One maybe superficial reason to admire her, on top of her other achievements, is her emoji game in git commits. Looking at Lucy’s git history (find her on GitHub), one wants to start using version control because she makes it look fun!

In this post, I will download many of Lucy’s git commit messages from GitHub’s API via the gh package, and have a look at the emojis she uses most frequently.