CRAN, the global repository of open-source packages that extend the capabilities of R, reached a milestone today. There are now more than 10,000 R packages available for download*.
(Incidentally, that count doesn't even include all the R packages out there. There are another 1,294 packages for genomic analysis in the Bioconductor repository, hundreds of R packages published only on GitHub, commercial R packages from vendors such as Microsoft and Oracle, and an unknowable number of private, unpublished packages.)
Why so many packages? R has a very active developer community, who contribute new packages to CRAN on a daily basis. As a result, R is unparalleled in its capabilities for statistical computing, data science, and data visualization: almost anything you might care to do with data has likely already been implemented for you and released as an R package. There are several reasons behind this:
- R is the most popular language for data scientists — and it's been around for almost 20 years — and so by sheer force of numbers and time, R has more extensions than any other data science software.
- R is the primary tool used for statistical research: when new methods are developed, they're not just published as a paper — they're also published as an R package. That means R is always at the cutting edge of new methodologies.
- R was designed as an interface language — a means to present a consistent language interface for algorithms written in other languages. Many packages work by providing R language bindings to other open-source software, making R a convenient hub for all kinds of algorithms and methods.
- Last but certainly not least, the CRAN system itself is a very effective platform for sharing R extensions, with a mature system for package authoring, building, testing, and distribution. The R core team and in particular the CRAN maintainers deserve significant credit for creating such a vibrant ecosystem for R packages.
Having so many packages available can be a double-edged sword though: it can take some searching to find the package you need. Luckily, there are some resources available to help you:
- MRAN (the Microsoft R Application Network) provides a search tool for R packages on CRAN.
- To find the most popular packages, RDocumentation.org provides a leaderboard of packages by number of downloads. It also provides lists of newly released and recently updated packages.
- CRAN offers package Task Views, a directory of packages by topic area (such as Finance or Clinical Trials). MRAN and RDocumentation.org also provide searchable versions based on the CRAN Task Views.
- To find popular and active R packages on GitHub, see the Trending R repositories list.
- For curated news on updated and new R packages, check out the Package Picks by Joseph Rickert on RStudio's RViews blog, and also the Package Spotlights published with each Microsoft R Open release. Cranberries also provides a comprehensive uncurated feed of new and updated packages.
The rate of R package growth shows no signs of abating or plateauing, either, as you can see from this chart (created using this script by Gergely Daróczi). (This chart includes packages released and subsequently withdrawn from CRAN, which is why it goes over 10,000.)
Know of any other resources for exploring R packages? Let us know in the comments.
*Actually, as of this writing, there are 9,999 packages on CRAN. So close! But it won't be long...
Howdy! A week or two ago I made a Shiny app for exploring R packages listed in CRAN's Task Views (with source at GitHub), which can be useful if you have licensing restrictions for your project.
Posted by: Bearloga | January 27, 2017 at 17:16
findFn (sos package)
Do a search for functions, which opens an interactive HTML page of results. The results may be sorted in a variety of ways and also link to help files for each function.
http://rfunction.com/archives/2525
Posted by: gd047 | January 27, 2017 at 21:51
I am a beginner R user. Since I started learning, I have wondered who verifies that these packages work correctly. Who checks whether the functions, particularly the complicated ones, are written correctly? What is the best way to tell whether a package is good or bad? One answer to my last question would be reading blogs; however, many people who don't know statistics use R just by imitating what others have done. So they are not able to see the bugs or mistakes, and they end up perpetuating existing ones.
Posted by: Bahar Patlar | January 29, 2017 at 02:11
@Bahar, this blog post on Good R Packages provides some advice.
Posted by: David Smith | January 30, 2017 at 08:19