CRAN, the global repository of open-source packages that extend the capabilities of R, reached a milestone today. There are now more than 10,000 R packages available for download*.
(Incidentally, that count doesn't even include all the R packages out there. There are another 1,294 packages for genomic analysis in the Bioconductor repository, hundreds of R packages published only on GitHub, commercial R packages from vendors such as Microsoft and Oracle, and an unknowable number of private, unpublished packages.)
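If you'd like to check the CRAN count yourself, here is a minimal sketch in R. The helper name `count_packages` is hypothetical, and the mirror URL is an assumption (any CRAN mirror should do). Note that `available.packages()` by default lists only packages compatible with your R version and platform, so the number you see may differ a little from the total shown on the CRAN site.

```r
# Count the distinct packages in a package index matrix, such as the one
# returned by available.packages(). `count_packages` is a hypothetical helper.
count_packages <- function(index) {
  length(unique(index[, "Package"]))
}

# Fetch the current index from a CRAN mirror (needs a network connection).
# The mirror URL below is an assumption; substitute your preferred mirror.
if (interactive()) {
  cran <- available.packages(repos = "https://cloud.r-project.org")
  count_packages(cran)
}
```

The same helper should work on the index from any other CRAN-style repository, as long as the matrix has a `Package` column.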
Why so many packages? R has a very active developer community, who contribute new packages to CRAN on a daily basis. As a result, R is unparalleled in its capabilities for statistical computing, data science, and data visualization: almost anything you might care to do with data has likely already been implemented for you and released as an R package. There are several reasons behind this:
- R is the most popular language for data scientists — and it's been around for almost 20 years — and so by sheer force of numbers and time, R has more extensions than any other data science software.
- R is the primary tool used for statistical research: when new methods are developed, they're not just published as a paper — they're also published as an R package. That means R is always at the cutting edge of new methodologies.
- R was designed as an interface language — a means to present a consistent language interface for algorithms written in other languages. Many packages work by providing R language bindings to other open-source software, making R a convenient hub for all kinds of algorithms and methods.
- Last but certainly not least, the CRAN system itself is a very effective platform for sharing R extensions, with a mature system for package authoring, building, testing, and distribution. The R core team and in particular the CRAN maintainers deserve significant credit for creating such a vibrant ecosystem for R packages.
Having so many packages available can be a double-edged sword though: it can take some searching to find the package you need. Luckily, there are some resources available to help you:
- MRAN (the Microsoft R Application Network) provides a search tool for R packages on CRAN.
- To find the most popular packages, RDocumentation.org provides a leaderboard of packages by number of downloads. It also provides lists of newly-released and recently-updated packages.
- CRAN offers package Task Views, a directory of packages organized by topic area (such as Finance or Clinical Trials). MRAN and RDocumentation.org also provide searchable versions based on the CRAN task views.
- To find popular and active R packages on GitHub, see the Trending R repositories list.
- For curated news on updated and new R packages, check out the Package Picks by Joseph Rickert on RStudio's RViews blog, and also the Package Spotlights published with each Microsoft R Open release. Cranberries also provides a comprehensive uncurated feed of new and updated packages.
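You can also do a rough keyword search from within R itself. The sketch below is illustrative only: `search_packages` is a hypothetical helper, and it assumes a data frame with `Package`, `Title`, and `Description` columns, like the one returned by `tools::CRAN_package_db()` in recent versions of R. The keyword is treated as a regular expression.

```r
# Return the packages whose title or description matches a keyword.
# `db` is assumed to be a data frame with Package, Title and Description
# columns; `search_packages` is a hypothetical helper, not part of R.
search_packages <- function(db, keyword) {
  text <- paste(db$Title, db$Description)
  hits <- grepl(keyword, text, ignore.case = TRUE)
  db[hits, c("Package", "Title"), drop = FALSE]
}

# Download the CRAN package database (needs a network connection).
if (interactive()) {
  db <- tools::CRAN_package_db()
  head(search_packages(db, "clinical trial"))
}
```

This is a plain text match, so it won't rank results by popularity or relevance the way the sites above do.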
The rate of R package growth shows no signs of abating, either, as you can see from this chart (created using this script by Gergely Daróczi). (This chart includes packages released and subsequently withdrawn from CRAN, which is why it goes over 10,000.)
Know of any other resources for exploring R packages? Let us know in the comments.
*Actually, as of this writing, there are 9,999 packages on CRAN. So close! But it won't be long...