Lintr Bot, lintr's Hester egg

Remember my blog post about automatic tools for improving R packages? One of these tools is Jim Hester’s lintr, a package that performs static code analysis. In my experience it mostly helps identify overly long code lines and missing spaces, although it’s a bit more involved than that. In any case, lintr helps you maintain good code style, and as mentioned in that now old post of mine, you can add a lintr unit test to your package, which will ensure you don’t get lazy over time.
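
To make this concrete, here is a minimal sketch of what lintr reports on two deliberately sloppy lines. It assumes lintr 3.0 or later, where, as far as I know, lint() accepts code through its text argument; the code being linted is made up for illustration.

```r
library(lintr)

# Two deliberately sloppy lines (invented for this example):
# no spaces around `+`, and a line far longer than 80 characters.
long_line <- paste0("y <- c(", paste(rep("1", 60), collapse = ", "), ")")
code <- paste("x <- 1+2", long_line, sep = "\n")

# Only run the two linters that match the problems mentioned above.
lints <- lint(
  text = code,
  linters = list(line_length_linter(80L), infix_spaces_linter())
)

# One row per problem found, with its line number and message.
as.data.frame(lints)[, c("line_number", "linter", "message")]
```

In a package, the unit test itself can be as small as a testthat test that calls lintr::expect_lint_free().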

Now say your package has a lintr unit test and lives on GitHub. What happens if someone makes a pull request and writes looong code lines? Continuous integration builds will fail but not only that… The contributor will get to know Lintr Bot, lintr’s Hester (Easter) egg!

Lintr Bot, a lazy little thing?

Lintr Bot has a GitHub profile. I am actually responsible for its avatar, because when I discovered it a while back I offered to design one for Jim based on this free image. Now, Lintr Bot has a face but if you look at its GitHub activity, you might wonder, what does it even do? It’s perfectly fine for a human to have nearly no GitHub activity, but remember that Lintr Bot lives on GitHub only, so what does it do all day?

Well, Lintr Bot goes around and comments on Pull Requests in R packages, see for instance this comment. As mentioned earlier, a contributor who makes even a slight code style mistake will be reminded of it by Lintr Bot if the package has a lintr unit test. This is quite useful!

We might also wonder how often Lintr Bot actually works. After all, since the GitHub activity timeline doesn’t show comments on Pull Requests, we’re at a loss when it comes to estimating how active this bot is…

Lintr Bot’s year in review

That’s where my ghrecipes package, presented in my blog post about Bob Rudis’ R packages, will come into play. I’ve been working on a function called spy which can return either the Pull Requests or the Issues a user commented on during a given period of time. As creepy as it might sound, it could help you trace back all the desperate comments you wrote when debugging something last week for instance, without having had to track them in the first place. ghrecipes::spy is a work in progress, but let’s see what it can return for Lintr Bot!

Note that, as far as I understand the “search” functionality of the GitHub V4 API, there’s no way to filter for comments made within a date range, so I can only filter Pull Requests updated in a given range.

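# Pull Requests Lintr Bot commented on that were updated within the year
lintrbot_act <- ghrecipes::spy(user = "lintr-bot", type = "PullRequest",
                               updated_after = "2017-03-30",
                               updated_before = "2018-03-30")
# Keep only the Pull Requests that were also created within that year
lintrbot_act <- dplyr::filter(lintrbot_act,
                              as.Date(created_at) >= anytime::anydate("2017-03-30"))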

We get 228 pull requests in one year! Lintr Bot is definitely not a lazy little thing!
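
Why filter on created_at after searching on the update date? Here is a toy sketch in base R, with three made-up Pull Requests (none of them real): a Pull Request opened before the period can still show up in the search because it was updated later, and the second step drops it.

```r
# Toy data, invented for illustration: three Pull Requests with
# their creation and last-update dates.
prs <- data.frame(
  title = c("old PR, recent update", "new PR", "new PR, updated later"),
  created_at = as.Date(c("2016-11-02", "2017-06-15", "2018-01-20")),
  updated_at = as.Date(c("2017-05-01", "2017-06-20", "2018-03-01")),
  stringsAsFactors = FALSE
)

start <- as.Date("2017-03-30")
end <- as.Date("2018-03-30")

# Step 1: what a search on the update date would return.
updated_in_range <- prs[prs$updated_at >= start & prs$updated_at <= end, ]

# Step 2: drop Pull Requests opened before the period of interest.
created_in_range <- updated_in_range[updated_in_range$created_at >= start, ]

nrow(updated_in_range)
nrow(created_in_range)
created_in_range$title
```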

Here are 5 Pull Requests that got a visit from Lintr Bot.

| owner | repo | title | created_at | state | author | url | no_comments | id |
|---|---|---|---|---|---|---|---|---|
| PredictiveEcology | reproducible | mergeCache | 2018-03-29 18:29:33 | MERGED | eliotmcintire | | 2 | 18 |
| PredictiveEcology | SpaDES.core | change objectSize to objSize – name conflict with R.oo::objectSize | 2018-03-26 23:48:22 | MERGED | eliotmcintire | | 1 | 57 |
| PredictiveEcology | reproducible | memoise – loadFromRepo | 2018-03-26 17:41:25 | MERGED | eliotmcintire | | 3 | 17 |
| PredictiveEcology | quickPlot | thin fixes – bugfixes, SpatialPolygonsDataFrames | 2018-03-26 05:32:12 | MERGED | eliotmcintire | | 2 | 13 |
| ptl93 | AEDA | added getEps | 2018-03-25 17:12:44 | MERGED | MiGraber | | 5 | 39 |

Let’s have a look at the time series of comments.

library(ggplot2)
library(hrbrthemes)  # provides theme_ipsum()

# One vertical line per Pull Request, placed at its creation time
ggplot(lintrbot_act) +
  geom_segment(aes(x = created_at, xend = created_at),
               y = 0, yend = 1) +
  theme_ipsum(base_size = 16,
              strip_text_size = 16) +
  ggtitle("Lintr Bot's work events",
          subtitle = "Each vertical line indicates Lintr Bot's commenting on a pull request")

[Plot: Lintr Bot's work events, one vertical line per pull request Lintr Bot commented on, along the created_at axis]

So it seems that Lintr Bot works quite regularly, with some intense periods! I won’t look at the number of repos over time to tell whether the busy periods come from more activity in a few repos or from activity across more repos, but I’m curious to see who was helped by Lintr Bot.

library(magrittr)  # provides %>%

# Count Pull Requests per repo owner, keeping owners with at least 5
dplyr::group_by(lintrbot_act, owner) %>%
  dplyr::summarize(repos = toString(unique(repo)),
                   n = dplyr::n()) %>%
  dplyr::arrange(- n) %>%
  dplyr::filter(n >= 5)

| owner | repos | n |
|---|---|---|
| RTMC | tmc-rstudio, tmc-r-tester, tmc-r | 64 |
| mlr-org | mlr | 38 |
| PredictiveEcology | reproducible, SpaDES.core, quickPlot,, SpaDES.shiny | 26 |
| Azure | doAzureParallel | 16 |
| HealthCatalyst | healthcareai-r | 16 |
| neuropsychology | psycho.R | 11 |
| ropensci | drake | 11 |
| hbc | bcbioRNASeq, bcbioSingleCell | 5 |

I must say that besides the recently onboarded rOpenSci package drake by Will Landau, I don’t know any of these repos!

Hire Lintr Bot?

What about you? Would you let that cute little bot help you?

Note, a good package to improve code style automatically is styler by Kirill Müller and Lorenz Walthert, so if Lintr Bot has a lot to tell you, give styler::style_pkg() a try!
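
If you’d like to see what styler does before letting it loose on a whole package, here is a small sketch using styler::style_text() on a made-up one-liner. It assumes styler’s default tidyverse style; style_pkg() applies the same kind of restyling to the files of a package.

```r
library(styler)

# A made-up line with an `=` assignment and no spaces around operators.
messy <- "x=1+2"

# style_text() returns the restyled code without touching any file.
styled <- style_text(messy)
styled
```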