R package developer (and R-bloggers editor) Tal Galili just published the answers to a question many R users have asked: which are the most popular R packages? He wrote some R code to rank the top 100 packages by number of downloads. Here's the top 10:
The source data are the download logs from the RStudio CRAN mirror, which the RStudio team has helpfully made available in anonymized form for analysis. The Downloads column is the number of downloads of each package from January to May 2013 from that single CRAN mirror: it does not include downloads from the primary CRAN mirror or any of the 88 other CRAN mirrors. It just goes to show that there are at least 84,000 R users using this mirror alone (and most likely many more: not every R user installs a CRAN package). And as Ramnath Vaidyanathan shows in the map below, those R users are distributed all around the world:
If you're an R package author, you may be interested to know how often your package has been downloaded. Tal Galili provides some code to answer that question as well. For example, here are the daily downloads for the Revolution Analytics contributed package foreach:
R-Statistics blog: Top 100 R packages for 2013 (Jan-May)!
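If you'd like a feel for the kind of counting involved, here is a minimal sketch in Python (not Tal's R code, which is at the link above). It assumes each log row carries at least a date and a package column; the sample rows and the helper names read_rows, top_packages and daily_downloads are made up for illustration, and the real log files contain more columns and far more rows.

```python
import csv
import io
from collections import Counter

# Made-up sample rows for illustration only; real log files are much
# larger and carry additional columns.
SAMPLE_LOG = """date,package,ip_id
2013-01-01,ggplot2,1
2013-01-01,digest,1
2013-01-01,plyr,2
2013-01-02,ggplot2,3
2013-01-02,digest,3
2013-01-02,foreach,2
2013-01-03,foreach,4
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def top_packages(rows, n=10):
    """Return the n most-downloaded packages as (package, count) pairs."""
    counts = Counter(row["package"] for row in rows)
    return counts.most_common(n)


def daily_downloads(rows, package):
    """Return (date, count) pairs for one package, sorted by date."""
    counts = Counter(row["date"] for row in rows if row["package"] == package)
    return sorted(counts.items())


rows = read_rows(SAMPLE_LOG)
print(top_packages(rows, n=3))
print(daily_downloads(rows, "foreach"))
```

Note that every row counts as one download, so a package pulled in automatically as a dependency of another package is counted just like one a user asked for by name.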
Most of the top 10 make sense, but why is 'digest' #2?
Posted by: Kevin Wright | June 14, 2013 at 11:24
@Kevin, not sure. It's a dependency (check "reverse depends") for a number of popular packages like profr and testthat, which may explain it. I think RStudio may also install "helpr" on some platforms -- digest is a dependency for that package as well.
Posted by: David Smith | June 14, 2013 at 11:37
@Kevin If you look at the reverse dependencies of digest on http://cran.r-project.org/web/packages/digest/index.html, you will see several packages which use it, including the number 3 package, ggplot2, and knitr, which is integrated with RStudio. Anyone downloading any of those packages would also download their dependencies, which include digest.
Posted by: Brian Diggs | June 14, 2013 at 12:10
Someone should really have fixed this map before it spread. I know it's a small island at an inconvenient mapping location, but still, R was born in New Zealand!
Posted by: Baptiste | June 14, 2013 at 14:08
The fault is entirely mine. The screenshot was clipped incorrectly. The interactive map has NZ though.
Posted by: Ramnath | June 14, 2013 at 17:48
The time series graph is quite interesting. I am guessing it reveals the pattern of people's workload. Monday is slow, coming off the weekend; on Tuesday work picks up, and it peaks on Wednesday. On Thursday and Friday people start to prepare for the weekend and slow down. Nevertheless, there are some dedicated souls who work during Sat & Sun, or simply curious analysts who work on personal projects.
Posted by: Valentin | June 17, 2013 at 14:16
@Kevin Actually, the top 10 didn't make any sense to me. One should obviously take dependencies into account as David also suspected. I believe the top of this list is dominated by ggplot2 as it imports the packages plyr, digest, reshape2, proto, and scales which are all in the top 10 as well.
Posted by: Maarten-Jan Kallen | June 20, 2013 at 03:05