Welcome, Hacker News readers! From the comments there, it seems like Shazam may have been prototyped in Matlab, so you may be interested in this guide to R for Matlab users. If you liked that article, you might also like this post about reading data from a Google Spreadsheet into R, or to download REvolution R. Check out our monthly roundups for other articles of interest.
Shazam is an application on the iPhone (and available on other platforms, too) that accomplishes a seemingly amazing task. If you hear a song on the radio (perhaps as the backing tune to a commercial) or in a store or bar, and you can't recall the title and artist, Shazam will tell you what the song is. You hold the phone up to capture about 10 seconds of music, and just a few seconds later (after communicating with its remote servers) Shazam will identify the song (and give you the opportunity to buy it).
This seemed almost like magic to me. I'm a statistician: I've been working with large, messy data sets for years, and statistical methods can be wonderful at extracting signal from noise. But think what this application accomplishes: Shazam can recognize a short audio sample of music that has been broadcast, mixed with heavy ambient noise, captured by a low-quality cellphone microphone and subjected to voice codec compression, amongst other abuses. Not only that, but the actual song is identified from an arbitrary 10-second sample within a database of over 2 million songs in just a few seconds. I just couldn't figure out how such fuzzy matching could be done, with such high reliability, in such a short time.
Well, now I know, thanks to an article in Slate that linked to an explanation of the original paper describing the method. The key is in compressing the database of 2M tunes using spectrographic methods. Each song is processed to create a frequency-intensity chart over time from which (and this is the key) only the peaks are extracted. Each song is therefore converted into a "constellation map" of, essentially, the frequencies of the loudest notes over time:
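To make the idea concrete, here's a toy sketch in Python of how such a constellation map might be extracted. The function name, window size, and peaks-per-slice count are my own illustrative choices, not the parameters used by Shazam; a real system would use overlapping windows and more careful peak picking.

```python
import numpy as np

def constellation_map(signal, sample_rate, window_size=1024, peaks_per_window=3):
    """Reduce an audio signal to (time, frequency) 'stars'.

    A simplified sketch: split the signal into time slices, take the
    magnitude spectrum of each slice, and keep only the loudest
    frequency bins -- discarding everything else.
    """
    stars = []
    n_windows = len(signal) // window_size
    for i in range(n_windows):
        window = signal[i * window_size:(i + 1) * window_size]
        spectrum = np.abs(np.fft.rfft(window))
        # indices of the loudest frequency bins in this time slice
        top_bins = np.argsort(spectrum)[-peaks_per_window:]
        t = i * window_size / sample_rate
        for b in top_bins:
            freq = b * sample_rate / window_size
            stars.append((t, freq))
    return stars

# Demo: one second of a pure 440 Hz tone -- every star should
# land near 440 Hz, since that's the only loud frequency present.
sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
stars = constellation_map(tone, sr)
```

Note how drastic the compression is: each time slice of 1024 raw samples is reduced to just a handful of coordinate pairs.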
Judging from the chart above (taken from the paper), this creates about 3 data points (or "stars") per second of music in the database.
The same process is applied to the 10-second sample from the iPhone. Because of the background noise and other problems, some of the "stars" in the noisy sample may be missing, and some spurious ones added, but using a process of conceptual "astrogation" Shazam slides the sampled star chart across the much wider (in time) star charts of every song in the database, until a place in a song is found where there is a significant overlap of stars. This is the song you were looking for.
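The sliding-and-overlap idea can be sketched in a few lines of Python. This is a naive version for illustration (the function and parameter names are mine, not the paper's): for every song star and sample star at the same frequency, record the time offset that would align them; a true match produces a large pile-up of pairs agreeing on one offset, while spurious or missing stars just add scattered noise.

```python
from collections import Counter

def match_score(song_stars, sample_stars, time_res=0.1):
    """Slide the sample's star chart across a song's star chart.

    Stars are (time, frequency) pairs. Every frequency coincidence
    votes for the time offset that would align the two stars; the
    offset with the most votes is the best candidate alignment.
    """
    by_freq = {}
    for t, f in song_stars:
        by_freq.setdefault(f, []).append(t)
    offsets = Counter()
    for ts, f in sample_stars:
        for t in by_freq.get(f, []):
            # quantize offsets so nearly-equal values share a bin
            offsets[round((t - ts) / time_res)] += 1
    if not offsets:
        return 0, None
    best_bin, score = max(offsets.items(), key=lambda kv: kv[1])
    return score, best_bin * time_res

# Demo: take a 10-second slice of a toy "song" starting at t=5.0;
# the matcher should recover that 5-second offset.
song = [(i * 0.5, 100 + 7 * i) for i in range(40)]
sample = [(t - 5.0, f) for (t, f) in song if 5.0 <= t < 15.0]
score, offset = match_score(song, sample)
```

Because missing stars only lower the vote count and spurious stars vote for random offsets, the peak in the offset histogram survives a surprising amount of corruption.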
The actual matching is done with an ingenious hashing scheme (explained in this blog post), which means the astrogation task can be completed in just a few seconds. The complete details are in the paper. Unfortunately, the paper gives no indication of what software was used to develop the process (although its scatterplots do look decidedly R-like).
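The essence of the hashing trick is combinatorial: instead of indexing individual stars, each "anchor" star is paired with a few nearby later stars, and each pair (two frequencies plus the time gap between them) becomes a hash key. Below is a toy sketch of that idea; the function names, fan-out, and rounding are illustrative assumptions, not the paper's actual parameters.

```python
from collections import defaultdict, Counter

def hashes(stars, fan_out=3):
    """Pair each anchor star with a few later stars.

    Each pair yields a hash key (f1, f2, time gap) together with the
    anchor's time -- the combinatorial hashing idea, sketched crudely.
    """
    stars = sorted(stars)
    for i, (t1, f1) in enumerate(stars):
        for t2, f2 in stars[i + 1:i + 1 + fan_out]:
            yield (f1, f2, round(t2 - t1, 2)), t1

def build_index(songs):
    """Index every song's hashes: hash key -> [(song_id, anchor time)]."""
    index = defaultdict(list)
    for song_id, stars in songs.items():
        for h, t in hashes(stars):
            index[h].append((song_id, t))
    return index

def identify(index, sample_stars):
    """Look up the sample's hashes; vote on (song, time offset) pairs."""
    votes = Counter()
    for h, ts in hashes(sample_stars):
        for song_id, t in index.get(h, []):
            votes[(song_id, round(t - ts, 1))] += 1
    if not votes:
        return None
    (song_id, _offset), _count = votes.most_common(1)[0]
    return song_id

# Demo: two toy songs; the sample is a 10-second slice of song "a".
songs = {
    "a": [(i * 0.5, 100 + 7 * i) for i in range(40)],
    "b": [(i * 0.5, 90 + 11 * i) for i in range(40)],
}
sample = [(t - 5.0, f) for (t, f) in songs["a"] if 5.0 <= t < 15.0]
index = build_index(songs)
```

The payoff is speed: a hash lookup is a constant-time dictionary probe, so the database is never scanned linearly; only songs sharing hashes with the sample receive votes at all.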