by Andrie de Vries
Reproducible research has been integral to the ethos of R for many years. For example, literate programming lets you embed R into various report-writing systems. First there was Sweave, which lets you embed R into LaTeX to produce PDF or HTML documents. More recently, knitr and RMarkdown evolved, making it very easy to create HTML pages as well as other formats, including HTML5 presentations and even Word documents.
Sweave, knitr and RMarkdown all allow you to write in a markup language, then tag chunks of text as R code. These chunks then get evaluated and the output is inserted into the markup, finally to be "compiled" into the desired output format.
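To make the chunk idea concrete, here is a toy sketch in Python. It is not how Sweave or knitr are implemented. It only borrows Sweave's `<<>>=` and `@` chunk delimiters, and the names `SOURCE`, `CHUNK` and `weave` are made up for this illustration. Each chunk is assumed to hold a single Python expression, which is evaluated and replaced by the code followed by its result.

```python
import re

# Toy document: prose with Sweave-style chunks (<<>>= opens, @ closes).
SOURCE = """The sum is:
<<>>=
1 + 2
@
Done."""

# A chunk starts on a line containing only <<>>= and ends on a line containing only @.
CHUNK = re.compile(r"^<<>>=\n(.*?)\n@$", re.MULTILINE | re.DOTALL)


def weave(text):
    """Replace every chunk with its code followed by the evaluated result."""
    def run(match):
        code = match.group(1)
        result = eval(code, {})  # assumption: the chunk is one expression
        return f"> {code}\n{result}"
    return CHUNK.sub(run, text)


woven = weave(SOURCE)
print(woven)
```

The woven text could then be handed to whatever tool produces the final PDF or HTML, which is the "compile" step described above.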
Other languages have not had the same close integration with LaTeX and HTML, and have evolved different systems for interweaving code and text.
Notably, the Python world developed the IPython notebook system. Notebooks also let you write text, but you insert code blocks as "cells" into the notebook. A notebook is interactive, so you can execute the code in a cell directly, unlike LaTeX and knitr, where you essentially build the entire document to get the output.
Recently the notebook idea took on a much broader vision and scope, explicitly allowing languages other than Python to run inside the cells. Thus the Jupyter Notebook was born, a project initially aimed at Julia, Python and R (Ju-Pyt-e-R). In reality, Jupyter supports many other languages as well.
How do you use Jupyter?
Once Jupyter is up and running (installation instructions follow below), you interact with it on a web page.
The page itself is interactive, and you can designate each cell as either markdown or code. From the menu you can evaluate a single cell (Shift+Enter does the same) or the entire notebook.
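If it helps to picture what "evaluate a cell" versus "evaluate the notebook" means, here is a minimal model in Python. It is only a sketch: `ToyNotebook`, `add`, `run_cell` and `run_all` are hypothetical names, not Jupyter's API, and a real notebook sends code to a separate kernel process rather than calling `exec` in place. The point it illustrates is that code cells share state, while markdown cells are never executed.

```python
import contextlib
import io


class ToyNotebook:
    """A toy stand-in for a notebook: ordered cells plus shared state."""

    def __init__(self):
        self.cells = []      # each cell: {"type", "source", "output"}
        self.namespace = {}  # variables shared by all code cells

    def add(self, cell_type, source):
        self.cells.append({"type": cell_type, "source": source, "output": None})

    def run_cell(self, index):
        """Evaluate one cell and return whatever it printed."""
        cell = self.cells[index]
        if cell["type"] != "code":
            return None  # markdown cells are rendered, not executed
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(cell["source"], self.namespace)
        cell["output"] = buffer.getvalue()
        return cell["output"]

    def run_all(self):
        """Evaluate the entire notebook, top to bottom."""
        return [self.run_cell(i) for i in range(len(self.cells))]


nb = ToyNotebook()
nb.add("markdown", "# Demo")
nb.add("code", "x = 21")
nb.add("code", "print(x * 2)")
outputs = nb.run_all()
print(outputs)
```

Because `x` lives in the shared namespace, you could edit the last cell and re-run only that cell without rebuilding everything, which is the interactive workflow described above.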
Benefits of using Jupyter
Jupyter was designed to enable sharing of notebooks with other people. The idea is that you can write some code, mix some text with the code, and publish this as a notebook. Readers of the notebook can see the code as well as the actual results of running it.
This is a nice way to share little experimental snippets, and also to publish more detailed reports with explanations and full code sets. Of course, a variety of web services allow you to post just code snippets (e.g. GitHub Gist). What makes Jupyter different is that the service will actually render the code output.
One interesting benefit of using Jupyter is that GitHub magically renders notebooks. See, for example, the GitHub notebook gallery.
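Part of the reason results can be displayed without re-running anything is that a notebook file (`.ipynb`) is a JSON document that stores each code cell's outputs next to its source. The sketch below builds a deliberately minimal notebook in the version 4 layout using only Python's `json` module. The cell contents and the helper `stored_outputs` are made up for illustration, and a notebook saved by Jupyter itself carries more metadata than this.

```python
import json

# A minimal notebook in the version 4 JSON layout (illustrative content only).
notebook = {
    "nbformat": 4,
    "nbformat_minor": 0,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# A tiny report"]},
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "source": ["print(1 + 2)"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["3\n"]}
            ],
        },
    ],
}


def stored_outputs(nb):
    """Collect the saved stream output of every code cell, without running any code."""
    results = []
    for cell in nb["cells"]:
        if cell["cell_type"] != "code":
            continue
        text = "".join(
            "".join(out["text"])
            for out in cell["outputs"]
            if out["output_type"] == "stream"
        )
        results.append(("".join(cell["source"]), text))
    return results


# Round-trip through JSON text, as a viewer reading the file from disk would.
loaded = json.loads(json.dumps(notebook, indent=1))
print(stored_outputs(loaded))
```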
Jupyter itself is written in Python. Thus if you want to install Jupyter yourself, the process involves installing Python, followed by the Jupyter notebook modules, and finally activating the R kernel. If you want fine-grained control of the process, the Jupyter notebook page has instructions for doing this. (Install manually if you want to control the version of R to use, which R kernel to use, etc.)
However, if you can live with a somewhat brute-force approach, here are the instructions for a simple installation. You simply install the Miniconda distribution of Python. Miniconda is a minimal install of the Anaconda distribution of Python, a configuration of Python as a scientific computing platform. (The conda command in step 2 also installs R and the IRkernel.)
Step 1: install Miniconda
- Get and install Miniconda for Python 3 at http://conda.pydata.org/miniconda.html
- Important: install Python 3
Step 2: open an OS terminal window and run:
conda install -c r ipython-notebook r-irkernel
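After the install finishes, you may want to confirm that the new commands are visible from your terminal. Here is a small sketch using only Python's standard library. The list of command names is my assumption about what the steps above put on your PATH, and `find_tools` is a made-up helper, so adjust both to your setup.

```python
import shutil


def find_tools(names):
    """Map each command name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


# Assumed command names; adjust to match your own installation.
tools = find_tools(["conda", "ipython", "jupyter", "R"])
for name, path in tools.items():
    print(f"{name}: {path or 'not found'}")
```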
A selection of other blog posts:
Although there is not a lot of information in the blogosphere comparing Jupyter and knitr, here are a few blog posts I found that you may want to read:
- Interactive R notebooks with Jupyter and Sagemathcloud
- Yihui Xie on comparing knitr to IPython Notebook (2012)
- Literate programming, RStudio, and IPython Notebook
What is your view?
Do you think there is scope for Jupyter in the R world? Do notebooks offer something you can't do with knitr?
Tell us what you think in the comments.