If you have dense data on a continuous scale, an effective way of representing the data visually is to use a heatmap, where the values are represented by a color on a continuous scale. For example, this chart from a Wall Street Journal interactive feature (and mentioned in Tal Galili's useR!2016 talk) represents the number of measles cases in each US state and year by a colored square:
(Here's how to create that chart in R.) But note the scale at the bottom of the chart, which maps measles cases to a color on the rainbow. Here, we'll zoom in on it:
The scale you choose for a heatmap is very important, and has a major impact on how the viewer will interpret the data presented. This scale has been chosen with care: while most of the scale is red, very few of the data cells are red (because the distribution of measles cases is skewed, thanks in particular to the introduction of a vaccine in 1963). A naively chosen scale, such as one that spreads its colors evenly across the range of values, would wash out the data.
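To see why the breaks matter, here is a minimal base R sketch. It uses simulated right-skewed counts (made-up data, not the WSJ measles figures) and compares equally spaced color breaks with quantile-based breaks. Quantile breaks are just one possible way to handle skew, not necessarily how the WSJ scale was built, and the variable names are purely illustrative.

```r
# Simulated, right-skewed "case counts" (made-up data for illustration only)
set.seed(42)
cases <- matrix(rexp(50 * 40, rate = 1 / 100), nrow = 50)

n_colors <- 9
pal <- hcl.colors(n_colors, "YlOrRd")  # a sequential palette from base R (>= 3.6)

# Naive scale: equally spaced breaks across the full range of the data
linear_breaks <- seq(min(cases), max(cases), length.out = n_colors + 1)

# Skew-aware scale: breaks at equally spaced quantiles of the data
quantile_breaks <- quantile(cases,
                            probs = seq(0, 1, length.out = n_colors + 1),
                            names = FALSE)

# Share of cells that land in each color bin under each scale
linear_share   <- prop.table(table(cut(cases, linear_breaks,   include.lowest = TRUE)))
quantile_share <- prop.table(table(cut(cases, quantile_breaks, include.lowest = TRUE)))
print(round(linear_share, 3))
print(round(quantile_share, 3))

# Draw the same data with both scales, side by side
op <- par(mfrow = c(1, 2))
image(cases, col = pal, breaks = linear_breaks,   main = "Equally spaced breaks")
image(cases, col = pal, breaks = quantile_breaks, main = "Quantile breaks")
par(op)
```

With equally spaced breaks, a large share of the cells falls into the lowest bin, so most of the heatmap ends up in one or two colors. With quantile breaks, each color covers roughly the same number of cells.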
The actual colors you choose are important too. The physics, technology, and neuroscience behind the interpretation of colors are surprisingly complex, but this talk on the default color schemes used in Python's matplotlib does a great job of explaining them:
You can easily use the viridis color scales in R as well, thanks to the viridis package by Simon Garnier, which is available on CRAN. The package provides four heatmap color schemes, all carefully chosen for optimized perception and usefulness for color-impaired viewers.
You can find several examples of using the viridis color palettes in the package vignette, both for base R graphics (including raster) and ggplot2. To get started, just run install.packages("viridis") to install the package from CRAN.
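As a quick taste of what the vignette covers, here is a minimal sketch using datasets that ship with R and ggplot2 (volcano and faithfuld). It assumes the viridis and ggplot2 packages are already installed. The choice of datasets and of 256 colors is mine, for illustration only.

```r
library(viridis)  # provides viridis(), magma(), plasma(), inferno()
library(ggplot2)

# Base R graphics: generate a vector of hex colors and pass it to image()
pal <- viridis(256)
image(volcano, col = pal, main = "volcano, viridis palette")

# The other schemes work the same way
image(volcano, col = magma(256), main = "volcano, magma palette")

# ggplot2: add a viridis fill scale to a raster heatmap
p <- ggplot(faithfuld, aes(waiting, eruptions, fill = density)) +
  geom_raster() +
  scale_fill_viridis()
print(p)
```

Passing option = "magma" (or "plasma" or "inferno") to scale_fill_viridis() switches the ggplot2 scale to one of the other schemes.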
GitHub (Simon Garnier): viridis