Mike Driscoll showed his PitchFX application, which visualizes the performance of major league baseball pitchers. I was intrigued to learn that MLB tracks a host of statistics about each pitch thrown by each pitcher in pro baseball games: the speed of the pitch, the type of pitch (fastball, curveball, changeup, etc.) and the X-Y location where the pitch enters the batter's box. MLB has made these data available to the public, and Mike has created a neat application for visualizing them.
With PitchFX you can choose any pitcher from any team and get an instant analysis of his pitching style from the 2008 season. For example, here's the analysis for the Red Sox's Clay Buchholz:
Buchholz uses only four different pitches. The top row is a lattice chart showing the distribution of speeds for each type of pitch: fastballs are, uh, fast (80-85 mph and up); changeups are slower; sliders come at a range of speeds. But it's when you combine the pitch speed with the location that things really get interesting: that's what the bottom chart shows. Each dot is a pitch, plotted where it landed in the batter's box. The redder the dot, the faster the pitch (likewise, the bluer the dot, the slower the pitch); the darker the dot, the more pitches in that location. (Mike used the colorspace package to generate the range of colors in the plot.) Here you can see some interesting characteristics of Buchholz's pitching. Fastballs are high and left. Sliders come in two varieties: high above the plate but slow, or midspeed and to the right. Curveballs go low and slow, but occasionally hit the strike zone. Changeups are fairly localized, but the speed is unpredictable.
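To make the color encoding concrete, here is a minimal sketch of the idea, written in Python rather than R. It is not Mike's code: the sample pitches are made up, the function names are hypothetical, and the plain linear blue-to-red blend is a simple stand-in for the palettes the colorspace package generates.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample data: (pitch_type, speed_mph, x, y).
# These values are invented for illustration, not taken from MLB data.
PITCHES = [
    ("fastball", 93.0, -0.5, 3.1),
    ("fastball", 91.0, -0.3, 2.9),
    ("changeup", 80.0, 0.1, 2.0),
    ("curveball", 76.0, 0.2, 1.2),
    ("slider", 85.0, 0.6, 2.4),
]


def speed_to_rgb(speed, slowest, fastest):
    """Map a speed to an (R, G, B) tuple: slowest is pure blue, fastest pure red."""
    if fastest == slowest:
        t = 0.5  # no spread in speeds, so use the midpoint color
    else:
        t = (speed - slowest) / (fastest - slowest)
    t = min(1.0, max(0.0, t))  # clamp speeds outside the given range
    return (round(255 * t), 0, round(255 * (1 - t)))


def mean_speed_by_type(pitches):
    """Group pitches by type and average their speeds (one number per chart panel)."""
    speeds = defaultdict(list)
    for pitch_type, speed, _x, _y in pitches:
        speeds[pitch_type].append(speed)
    return {pitch_type: mean(values) for pitch_type, values in speeds.items()}


if __name__ == "__main__":
    all_speeds = [speed for _type, speed, _x, _y in PITCHES]
    lo, hi = min(all_speeds), max(all_speeds)
    for pitch_type, speed, x, y in PITCHES:
        print(pitch_type, (x, y), speed_to_rgb(speed, lo, hi))
    print(mean_speed_by_type(PITCHES))
```

The "darker means more pitches" effect is not modeled here; in a real plot it would typically come from drawing semi-transparent dots so that overlapping pitches build up.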
Mike also presented the technical details of how he created the PitchFX app. The charts are constructed in R, but R itself runs as a module of the Apache web server, thanks to rApache. When you choose an analysis in the PitchFX web application, the request is sent to rApache, which draws the data from a MySQL database for analysis in R. R creates a chart, which is then sent to the web browser for display on your screen. Thanks to Apache's caching functionality, charts are regenerated only when the underlying data change. [Update May 10: Mike helpfully provided an architecture diagram which makes things clearer, shown below.]
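The request flow described above (request, database query, chart, cached response) can be sketched in a few lines. This is an illustration under loud assumptions, not the PitchFX implementation: it is written in Python instead of R, uses an in-memory SQLite database as a stand-in for MySQL, returns a text summary in place of a rendered chart, and does its own caching in a dictionary, whereas the real app relies on Apache for that. The class name, table layout, and the row-count "data version" are all hypothetical.

```python
import sqlite3


def make_db():
    """Build a tiny in-memory database (stand-in for the MySQL store)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pitches (pitcher TEXT, pitch_type TEXT, speed REAL)")
    conn.executemany(
        "INSERT INTO pitches VALUES (?, ?, ?)",
        [("buchholz", "fastball", 92.0), ("buchholz", "curveball", 77.0)],
    )
    return conn


class ChartServer:
    """Hypothetical handler: query the data, render a chart, cache the result."""

    def __init__(self, conn):
        self.conn = conn
        self.cache = {}   # (pitcher, data_version) -> rendered chart
        self.renders = 0  # counts how often a chart was actually rebuilt

    def data_version(self, pitcher):
        # Assumption: the row count serves as a crude "has the data changed?" marker.
        row = self.conn.execute(
            "SELECT COUNT(*) FROM pitches WHERE pitcher = ?", (pitcher,)
        ).fetchone()
        return row[0]

    def render(self, pitcher):
        # Stand-in for the R plotting step: summarize the rows as text.
        rows = self.conn.execute(
            "SELECT pitch_type, speed FROM pitches WHERE pitcher = ? ORDER BY pitch_type",
            (pitcher,),
        ).fetchall()
        self.renders += 1
        return "; ".join(f"{pitch_type}: {speed:.0f} mph" for pitch_type, speed in rows)

    def handle(self, pitcher):
        # Re-render only if this pitcher's data changed since the cached copy was made.
        key = (pitcher, self.data_version(pitcher))
        if key not in self.cache:
            self.cache[key] = self.render(pitcher)
        return self.cache[key]


if __name__ == "__main__":
    server = ChartServer(make_db())
    print(server.handle("buchholz"))  # first request renders the chart
    print(server.handle("buchholz"))  # second request is served from the cache
    print("renders:", server.renders)
```

Keying the cache on both the pitcher and a data version is one simple way to get the "regenerate only when the data change" behavior; how the real app decides that a cached chart is stale was not covered in the talk summary.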
It's a really well put together application, and it makes it easy and fun to compare the styles of your favourite pitchers.
At the same meeting, John Oram showed an outstanding app, also using rApache, for visualizing environmental data in the Bay Area. More on that tomorrow.
Dataspora: PitchFX viewer