With the growing popularity of R, there is an associated increase in the popularity of online forums to ask questions. One of the most popular sites is StackOverflow, where more than 60 thousand questions have been asked and tagged as related to R.
On the same page, you can also find related tags. Among the top 15 tags associated with R, several are also packages you can find on CRAN:
- ggplot2
- data.table
- plyr
- knitr
- shiny
- xts
- lattice
It is very easy to install these packages directly from CRAN using the R function install.packages(), but this will also install all of these packages' dependencies.
This leads to the question: How can one determine all these dependencies?
It is possible to do this using the function available.packages() and then query the resulting object.
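As a minimal sketch of what "querying the resulting object" can look like: available.packages() returns a character matrix with one row per package and columns such as Depends and Imports. The toy matrix below stands in for that result so the example runs offline; the package names (appA, libB, libC, libE, testD) and the helpers parse_deps() and direct_deps() are invented for illustration.

```r
# Toy stand-in for the matrix returned by available.packages().
# All package names here are hypothetical.
db <- rbind(
  appA = c(Depends = "R (>= 3.0), libB", Imports = "libC (>= 1.2)", Suggests = "testD"),
  libB = c(Depends = NA, Imports = "libC, libE", Suggests = NA)
)

# Split one dependency field into bare package names.
parse_deps <- function(x) {
  if (is.na(x)) return(character(0))
  parts <- strsplit(x, ",", fixed = TRUE)[[1]]
  parts <- trimws(sub("\\(.*\\)", "", parts))  # drop version constraints
  parts <- parts[nzchar(parts)]
  setdiff(parts, "R")                          # R itself is not a package
}

# Direct (non-recursive) dependencies of a single package.
direct_deps <- function(db, pkg, fields = c("Depends", "Imports")) {
  unique(unlist(lapply(db[pkg, fields], parse_deps)))
}

direct_deps(db, "appA")
direct_deps(db, "libB")
```

With a network connection, the same helper could be pointed at the real matrix, for example db <- available.packages(). Note that this only finds direct dependencies; following them further needs a recursive step.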
But it is easier to answer this question using the functions in a new package, called miniCRAN, that I am working on. I have designed miniCRAN to allow you to create a mini version of CRAN behind a corporate firewall. You can use some of the functions in miniCRAN to list packages and their dependencies, in particular:
- pkgAvail()
- pkgDep()
- makeDepGraph()
I illustrate these functions in the following scripts.
Start by loading miniCRAN and retrieving the available packages on CRAN. Use the function pkgAvail() to do this:
    library(miniCRAN)
    pkgdata <- pkgAvail(repos = c(CRAN="http://cran.revolutionanalytics.com"), type="source")
    head(pkgdata[, c("Depends", "Suggests")])
    ##             Depends                                   Suggests
    ## A3          "R (>= 2.15.0), xtable, pbapply"          "randomForest, e1071"
    ## abc         "R (>= 2.10), nnet, quantreg, MASS"       NA
    ## abcdeFBA    "Rglpk,rgl,corrplot,lattice,R (>= 2.10)"  "LIM,sybil"
    ## ABCExtremes "SpatialExtremes, combinat"               NA
    ## ABCoptim    NA                                        NA
    ## ABCp2       "MASS"                                    NA
Next, use the function pkgDep() to get dependencies of the 7 popular tags on StackOverflow:
    tags <- c("ggplot2", "data.table", "plyr", "knitr", "shiny", "xts", "lattice")
    pkgList <- pkgDep(tags, availPkgs=pkgdata, suggests=TRUE)
    pkgList
    ##  [1] "abind"        "bit64"        "bitops"       "Cairo"
    ##  [5] "caTools"      "chron"        "codetools"    "colorspace"
    ##  [9] "data.table"   "dichromat"    "digest"       "evaluate"
    ## [13] "fastmatch"    "foreach"      "formatR"      "fts"
    ## [17] "ggplot2"      "gtable"       "hexbin"       "highr"
    ## [21] "Hmisc"        "htmltools"    "httpuv"       "iterators"
    ## [25] "itertools"    "its"          "KernSmooth"   "knitr"
    ## [29] "labeling"     "lattice"      "mapproj"      "maps"
    ## [33] "maptools"     "markdown"     "MASS"         "mgcv"
    ## [37] "mime"         "multcomp"     "munsell"      "nlme"
    ## [41] "plyr"         "proto"        "quantreg"     "RColorBrewer"
    ## [45] "Rcpp"         "RCurl"        "reshape"      "reshape2"
    ## [49] "rgl"          "RJSONIO"      "scales"       "shiny"
    ## [53] "stringr"      "testit"       "testthat"     "timeDate"
    ## [57] "timeSeries"   "tis"          "tseries"      "XML"
    ## [61] "xtable"       "xts"          "zoo"
Wow, look how these 7 packages expand to a list of 63 packages once dependencies and suggested packages are included (the 7 themselves plus 56 others)!
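The list grows this quickly because dependencies are followed recursively: each dependency brings in its own dependencies. The sketch below illustrates the difference between direct and recursive lookup using tools::package_dependencies() from base R on a small made-up package database. The package names are hypothetical, and the sketch follows only Depends, Imports and LinkingTo; it is not meant to reproduce the exact rules pkgDep() applies to Suggests.

```r
# Toy package database in the shape of available.packages() output.
# All package names are hypothetical.
db <- cbind(
  Package   = c("appA", "libB", "libC", "libE", "testD", "libF"),
  Depends   = c("R (>= 3.0), libB", NA, NA, NA, "libF", NA),
  Imports   = c("libC", "libC, libE", NA, NA, NA, NA),
  LinkingTo = NA_character_,
  Suggests  = c("testD", NA, NA, NA, NA, NA)
)
rownames(db) <- db[, "Package"]

strong <- c("Depends", "Imports", "LinkingTo")

# Direct dependencies only.
direct <- tools::package_dependencies("appA", db = db, which = strong)

# Dependencies of dependencies as well.
recursive <- tools::package_dependencies("appA", db = db, which = strong,
                                         recursive = TRUE)

direct$appA
recursive$appA
```

Here libE appears only in the recursive result, because appA reaches it through libB. The suggested package testD appears in neither, since Suggests is not among the fields being followed.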
You can visualise these dependencies as a graph by using the function makeDepGraph():
    p <- makeDepGraph(pkgList, availPkgs=pkgdata)
    library(igraph)
    plotColours <- c("grey80", "orange")
    topLevel <- as.numeric(V(p)$name %in% tags)
    par(mai=rep(0.25, 4))
    set.seed(50)
    vColor <- plotColours[1 + topLevel]
    plot(p, vertex.size=8, edge.arrow.size=0.5, vertex.label.cex=0.7, vertex.label.color="black", vertex.color=vColor)
    legend(x=0.9, y=-0.9, legend=c("Dependencies", "Initial list"), col=c(plotColours, NA), pch=19, cex=0.9)
    text(0.9, -0.75, expression(xts %->% zoo), adj=0, cex=0.9)
    text(0.9, -0.8, "xts depends on zoo", adj=0, cex=0.9)
    title("Package dependency graph")
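A dependency graph of this kind boils down to an edge list with one row per "package depends on package" relation, pointing from the dependent package to its dependency (as in xts -> zoo). The following is a rough base-R sketch of such an edge list, built with tools::package_dependencies() on a made-up package database. It is an illustration under those assumptions, not how makeDepGraph() is implemented, and the package names are hypothetical.

```r
# Toy package database; all package names are hypothetical.
db <- cbind(
  Package   = c("appA", "libB", "libC", "libE"),
  Depends   = c("R (>= 3.0), libB", NA, NA, NA),
  Imports   = c("libC", "libC, libE", NA, NA),
  LinkingTo = NA_character_
)
rownames(db) <- db[, "Package"]

# Direct dependencies of every package in the database.
deps <- tools::package_dependencies(db[, "Package"], db = db,
                                    which = c("Depends", "Imports", "LinkingTo"))

# One row per edge: 'from' depends on 'to'.
edges <- data.frame(
  from = rep(names(deps), lengths(deps)),
  to   = unlist(deps, use.names = FALSE),
  stringsAsFactors = FALSE
)

# Flag edges that start at a package from the initial list.
initial <- "appA"
edges$topLevel <- edges$from %in% initial
edges
```

A data frame in this shape can be handed to a graph library for plotting, for example igraph::graph_from_data_frame(edges), provided igraph is installed.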
So, if you want to install the 7 most popular R packages (according to StackOverflow), R may in fact download and install up to 63 different packages!
There's a function package_dependencies in the tools package which does something similar.
But you do need to be careful how you write this: for example, my rgl package is in your list of 63, but not in your graph. (It is suggested by knitr, it's not a true dependency. Not sure what the criteria were for the graph.)
Posted by: Duncan Murdoch | July 08, 2014 at 12:26
Why are there packages displayed in the graph that are not linked by arrows to an initial package (not displayed as dependencies)? Ex. Sp and Maptools
Posted by: Alex | July 10, 2014 at 00:47
Hi, Duncan
I should have mentioned in the post that pkgDep() is a wrapper around tools::package_dependencies where I set defaults for my application. I'll update the blog post to reflect this.
I am working on tracing the differences between the package list and the graph. The root cause has to do with different settings for recursion in the two functions. Once resolved, I'll publish the revision.
Andrie
Posted by: Andrie de Vries | July 14, 2014 at 06:58