In scientific discovery, the first three paradigms were experimental, theoretical and (more recently) computational science. A new book of essays published by Microsoft (and available for free download -- kudos, MS!) argues that a fourth paradigm of scientific discovery is at hand: the analysis of massive data sets. The book is dedicated to the late Microsoft researcher Dr Jim Gray, who pioneered the idea with the catchphrase: "It's the data, stupid". The basic idea is that our capacity for collecting scientific data has far outstripped our present capacity to analyze it, and so our focus should be on developing technologies that will make sense of this "Deluge of Data" (as this New York Times review of the book -- well worth a read -- calls it).
Dr Gray's call to arms was not to develop isolated, super-powerful supercomputers but "to have a world in which all of the science literature is online, all of the science data is online, and they interoperate with each other." This dream is already close to reality in some scientific domains like astronomy, where advanced instruments routinely generate petabytes of data available for public analysis. And with further developments in distributed and high-performance computing, with freely available large-scale data management tools like Hadoop, and with advanced open-source data-analysis tools like R rapidly adapting to the scales of these data sets, the fourth paradigm is certain to become a mainstream reality in other scientific domains as well.
Microsoft Research: The Fourth Paradigm: Data-Intensive Scientific Discovery
David, thanks for sharing the "free download" link. I caught the NYT article yesterday, but it didn't mention the free download part.
Posted by: Rama Ramakrishnan | December 16, 2009 at 10:01
Can't remember who said it, but just as Google seeks to organize the world's information, Elsevier and Wiley are mounting a valiant struggle to hide it :-)
Posted by: Tim | December 16, 2009 at 16:12
Great post David, thank you.
Posted by: Tal Galili | December 17, 2009 at 05:56
I agree. Information/data these days are really significant for advancement.
Posted by: Ruby at Science Summer Camp | March 01, 2011 at 21:53