Choropleth maps are a popular way of representing spatial or geographic data, where a statistic of interest (say, income, voting results or crime rate) is color-coded by region. R includes all of the necessary tools for creating choropleth maps, but Trulia's Ari Lamstein has made the process even easier with the new choroplethr package, now available on GitHub. With a couple of lines of code, you can easily convert a data frame of values coded by country, state, county or zip code into a choropleth like this:
The chart above shows the US zip codes with the highest per-capita incomes, based on data from the US Census Bureau's American Community Survey. The choroplethr package also includes an interface to ACS data, so if you know the data code you're looking for, you can create a choropleth of your favourite US demographic statistic with just a single line of code, like this:
choroplethr_acs(tableId="B19301", lod="zip")
You can read more information about the capabilities of the choroplethr package at the Trulia Tech+Design Blog. Many thanks to Trulia and Ari Lamstein for supporting the development of this useful package!
Trulia Tech+Design Blog: The choroplethr package for R
Technically speaking I don't think that example is a choropleth since the color doesn't fill the geographical shape.
Posted by: Tom | January 22, 2014 at 17:41
Tom, you might be right. If you click through to my blog post you will see that choroplethr attempts to render ZIP code level choropleths as scatterplots against an outline of the US. Zip-level choropleths are tricky for a number of reasons, and I mention what I think are some of the key issues in the blog post.
One thing to keep in mind is that it is virtually impossible to render the borders of zip codes on a national map because zip codes are so small in comparison to a standard image of the US. If you attempt to render them, the fill color would not be rendered at all because there is not even enough room to render the borders.
Feel free to pass along references to people who have done a better job of this in R, or to make a pull request to the project yourself. I would like to handle this case better, but felt that it would be interesting enough to people to release what I had.
Posted by: Ari | January 22, 2014 at 22:32
Ari, Thanks for your work! I can certainly see the challenge and I'm not sure a method exists to show that level of granularity at that extent. (Although I'm curious if you tried plotting with fill only and leaving the borders out altogether.)
That said, choropleth maps are not always the best solution, so offering the centroid map I think is a great idea. (And I don't think anyone will cry if you leave the package name the same :)
Posted by: Tom | January 23, 2014 at 19:44
Ari, thank you and everyone else involved in the development of the choroplethr package! As a GIScientist I'm always happy to see progress and new developments that enable people to use maps and spatial methodologies to analyze or visualize data - even more so in the context of free software!
Yet, being a geographer I also have to endorse Tom: what's shown in the map above is not a choropleth map. Choropleth maps are used to visualize attributes of areal units (in your case US zip codes) in their spatial context. What you're showing in the map is a visualization showing the zip codes represented by their area centroids (possibly) and assigned a graduated color symbology.
While there's nothing wrong with this (apart from being mislabeled "choropleth map"), the reason for choosing this methodology was (if I understood your answer to Tom's comment correctly) the fact that zip code areas are too small to show up on a map of the contiguous United States. As it turns out, the visualization you chose doesn't really solve that problem: just have a look at the northeastern states, where points are overlapping and obscuring each other.
Tom suggested one very good idea to fix the issue you mentioned about the polygon borders obscuring the color symbology of the polygon areas: just get rid of the borders. A way to fix your graduated color point symbology would be to either make the symbol size smaller (which would probably not solve the overlapping completely, but instead make points in sparsely populated areas like Nevada more difficult to see), or make the points half-transparent.
Still, since your package is called choroplethr and there is also no need (and no good reason) to visualize zip code information simplified to centroids, I suggest going with Tom's suggestion. Or, use a different spatial scale of aggregation. I'm not sure there's a point in showing data on a level as fine as zip codes on a map of the contiguous United States...
That said, I am not a fully-fledged programmer myself, but I would be happy to assist or participate where I can in the further development of your package. So if you see any need for a map lover's (nay: geek's) input, please let me know. As I said above, I'm thrilled to see geographic information and its use in visualizations becoming more and more common, and I'm glad to help where I can. This package looks like a great effort and a very good start! Thanks again for that.
Posted by: Konstantin | January 23, 2014 at 21:01
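The half-transparent point idea from the comment above can be tried in base R without any mapping package. The following is only a rough sketch: the coordinates and values are simulated stand-ins for zip code centroids (not census data), and the five quantile classes and the 0.3 alpha value are arbitrary choices for illustration, not what choroplethr does.

```r
# Simulated stand-in for zip code centroids: NOT real census data.
set.seed(42)
n <- 5000
pts <- data.frame(
  lon   = rnorm(n, mean = -75, sd = 2),
  lat   = rnorm(n, mean = 41, sd = 1.5),
  value = runif(n, min = 20000, max = 120000)
)

# Bin the values into five quantile classes, one colour per class.
breaks  <- quantile(pts$value, probs = seq(0, 1, length.out = 6))
classes <- cut(pts$value, breaks = breaks, include.lowest = TRUE, labels = FALSE)
pal     <- colorRampPalette(c("lightyellow", "darkred"))(5)

# adjustcolor() adds an alpha channel; alpha.f = 0.3 makes each point 30% opaque,
# so dense clusters build up colour instead of hiding the points underneath.
cols <- adjustcolor(pal[classes], alpha.f = 0.3)

plot(pts$lon, pts$lat, col = cols, pch = 16, cex = 0.6,
     xlab = "Longitude", ylab = "Latitude",
     main = "Simulated centroids with half-transparent symbols")
```

Lowering cex at the same time reduces overlap further, at the cost Konstantin mentions: isolated points in sparse areas become harder to see.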
Thanks to both Tom and Konstantin for your thoughtful comments. Just to clarify, choroplethr also renders state- and county-level choropleths, so I think that it is currently only the ZIP map which does not quite fit the name :)
Konstantin, help on the project would be greatly appreciated. I recently created a forum for the project which is currently just used for technical support. But perhaps we could continue this conversation there: https://groups.google.com/forum/#!forum/choroplethr
I am working from memory, but I think that a few months ago I tried to create ZCTA choropleths by downloading the ZCTA shapefile from the census here: http://www.census.gov/cgi-bin/geo/shapefiles2013/main. I think that it was 501MB zipped, and rendering it did not work well.
At that point I realized that I didn't understand shapefiles and mapping well enough to tackle the problem properly. Instead I signed up for a GIS class on Coursera: https://www.coursera.org/course/maps. It starts in April, and I was expecting to revisit this problem after completing the course.
Posted by: Ari | January 24, 2014 at 15:20
Dr. Robinson's Map MOOC is an awesome introduction to all things mapping and GIS - I highly recommend it!
Moving over to said forum website now.
Posted by: Konstantin | January 24, 2014 at 23:37
I had a similar problem with Canadian data at the postal code level (our "equivalent" to US zip codes).
I ended up creating Voronoi polygons for each postal code (using the deldir package) to cover the whole surface and filling each of them with color.
Posted by: Étienne Chassé St-Laurent | February 12, 2014 at 06:12
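The Voronoi approach described in the comment above might look roughly like the sketch below. It is not Étienne's actual code: the points and values are made up, the five equal-width colour classes are an arbitrary choice, and it assumes the deldir package is installed and that no two points share the same coordinates.

```r
# Sketch only: needs the deldir package (install.packages("deldir")).
if (requireNamespace("deldir", quietly = TRUE)) {
  # Made-up points standing in for postal code locations.
  set.seed(1)
  pc <- data.frame(x = runif(50), y = runif(50), value = runif(50))

  # Compute the tessellation, then extract one Voronoi tile per point.
  tess  <- deldir::deldir(pc$x, pc$y)
  tiles <- deldir::tile.list(tess)

  # Five equal-width classes, one colour per class.
  pal <- colorRampPalette(c("lightyellow", "darkred"))(5)
  cls <- cut(pc$value, breaks = 5, labels = FALSE)

  # Draw each tile filled with the colour of its generating point, no borders.
  plot(pc$x, pc$y, type = "n", asp = 1, xlab = "x", ylab = "y")
  for (tile in tiles) {
    polygon(tile$x, tile$y, col = pal[cls[tile$ptNum]], border = NA)
  }
}
```

The tiles are clipped to a rectangular window around the points, so for a real map they would still need to be intersected with the country outline.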
FYI, since receiving the amazing comments here about how choroplethr renders zip codes, I have invested quite a bit of time trying to figure out how to render that map as a true choropleth. I couldn't figure it out myself, but did get far enough to post what I had tried and what I want to accomplish on gis.stackexchange:
http://gis.stackexchange.com/questions/87160/how-can-i-create-a-choropleth-from-the-2010-census-zcta-shapefile-using-r-and-gg?stw=2
Hopefully someone there can provide a good solution to this which I can incorporate into the next version of choroplethr.
Posted by: Ari | February 20, 2014 at 10:11