Andrew Flowers, data journalist and contributor to FiveThirtyEight.com, announced at last week's RStudio conference the availability of a new R package containing data and analyses from some of the site's data journalism features: the fivethirtyeight package. (Andrew's talk isn't yet online, but you can see him discuss several of these stories in his useR! 2016 presentation.) While not an official product of the FiveThirtyEight editorial team, it was developed by Albert Y. Kim, Chester Ismay and Jennifer Chunn under their guidance. Their motivation for producing the package was to provide a resource for teaching data science:
We are involved in statistics and data science education, in particular at the introductory undergraduate level. As such, we are always looking for data sets that balance being:
- Rich enough to answer meaningful questions with, real enough to ensure that there is context, and realistic enough to convey to students that data as it exists “in the wild” often needs processing.
- Easily and quickly accessible to novices, so that we minimize the prerequisites to research.
The package includes data sets from dozens of data journalism stories, including stories about police killings in the USA, plane crashes, and even references to presidential candidates in hip-hop lyrics. There is also a complete worked analysis of the performance of movies that pass the Bechdel Test, presented as an R Markdown vignette.
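To get a feel for what is included, here is a minimal sketch for browsing the package's contents from an R session. It assumes the package is installed from CRAN under the name fivethirtyeight, and that the Bechdel Test data is exposed as a data set called `bechdel`; the data set name is an assumption here, so check the listing if it differs in your version.

```r
# One-time install, assuming the package is available on CRAN:
# install.packages("fivethirtyeight")

if (requireNamespace("fivethirtyeight", quietly = TRUE)) {
  # data(package = ...) returns a listing whose $results matrix has
  # one row per bundled data set, with "Item" and "Title" columns
  listing <- data(package = "fivethirtyeight")$results
  dataset_names <- listing[, "Item"]
  print(head(dataset_names))

  # Pull one data set into the session.
  # `bechdel` is an assumed name; pick any entry from dataset_names.
  if ("bechdel" %in% dataset_names) {
    bechdel <- fivethirtyeight::bechdel
    str(bechdel)
  }
} else {
  # Keep the script runnable even when the package is missing
  dataset_names <- character(0)
  message("Package 'fivethirtyeight' is not installed.")
}
```

The `requireNamespace()` guard is only there so the snippet fails gracefully; in an interactive session, `library(fivethirtyeight)` followed by the data set's name does the same job.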