For almost five years, the entire CRAN repository of R packages has been archived on a daily basis at MRAN. If you use CRAN snapshots from MRAN, we'd love to hear how you use them in this survey. If you're not familiar with the concept, or just want to learn more, read on.
Every day since September 17, 2014, we (Microsoft and, before the acquisition, Revolution Analytics) have archived a snapshot of the entire CRAN repository as a service to the R community. These daily snapshots have several uses:
- As a longer-term archive of binary R packages. (CRAN keeps an archive of package source versions, but binary versions of packages are kept for a limited time. CRAN keeps package binaries only for the current R version and the prior major version, and only for the latest version of the package.)
- As a static CRAN repository you can use like the standard CRAN repository, but frozen in time. This means changes to CRAN packages won't affect the behavior of R scripts in the future (for better or worse).
options(repos="https://cran.microsoft.com/snapshot/2017-03-15/")
provides a CRAN repository that works with R 3.3.3, for example, and you can choose any date since September 17, 2014.
- The checkpoint package on CRAN provides a simple interface to these CRAN snapshots, allowing you to use a specific CRAN snapshot by specifying a date, and making it easy to manage multiple R projects, each using a different snapshot.
- Microsoft R Open, Microsoft R Client, Microsoft ML Server and SQL Server ML Services all use fixed CRAN repository snapshots from MRAN by default.
- The rocker project provides container instances for historical versions of R, tied to an appropriate CRAN snapshot from MRAN suitable for the corresponding R version.
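To make the "frozen in time" idea concrete, here is a minimal sketch in R that builds a snapshot URL from a date and points the current session at it. The helper name `snapshot_repo` is hypothetical (it is not part of any package); it assumes only the URL pattern and the September 17, 2014 start date mentioned in the list above, and it takes a single date.

```r
# Hypothetical helper (not from any package): build the snapshot URL for a
# single date, following the URL pattern shown in the list above.
snapshot_repo <- function(date) {
  date <- as.Date(date)
  first_snapshot <- as.Date("2014-09-17")  # first daily snapshot, per this post
  if (is.na(date) || date < first_snapshot) {
    stop("snapshots are only available from ", format(first_snapshot))
  }
  sprintf("https://cran.microsoft.com/snapshot/%s/", format(date, "%Y-%m-%d"))
}

# Point this R session at the frozen repository. Later install.packages()
# calls would then look up packages in the snapshot for that date.
options(repos = c(CRAN = snapshot_repo("2017-03-15")))
getOption("repos")
```

This only sets the `repos` option and does not download anything. With the checkpoint package installed, calling `checkpoint("2017-03-15")` at the top of a project script should achieve a similar effect, and it also manages a separate package library for that project.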
MRAN and the CRAN snapshot system were created at a time when reproducibility was an emerging concept in the R ecosystem. Now, there are several methods available to ensure that your R code works consistently, even as R and CRAN change. Beyond virtualization and containers, you have packages like packrat and miniCRAN, RStudio's package manager, and the full suite of tools for reproducible research.
As CRAN has grown and changes to packages have become more frequent, maintaining MRAN has become an increasingly resource-intensive process. We're contemplating changes, like reducing the frequency of snapshots, or thinning the archive of snapshots that haven't been used. But before we do that, we'd like to hear from the community. Have you used MRAN snapshots? If so, how are you using them? How many different snapshots have you used, and how often do you switch to a new one? Please leave your feedback at the survey link below by June 14, and we'll use the feedback we gather in our decision-making process. Responses are anonymous, and we'll summarize the responses in a future blog post. Thanks in advance!