February 22, 2013


Google "fake data science" or click on my name to read my comments about this book.

Vincent, not sure I agree with your assessment of this book. Hadoop/NoSQL is a part of data science to be sure, but not a *necessary* part. (You can do data science on many different data platforms, including small-data platforms.) Statistics *is* a necessary part though, and I wish more practitioners labelling themselves "data scientist" had a better grounding in the statistical basics. That's why I think this is a valuable book, especially given the price tag.

@Shailendra: The book can mislead people into thinking that data science = statistics + R. Data science also includes graph models and databases, processes for big data (read my article on the curse of big data to see why traditional statistics fail with big data), computer science, business analytics and more.

Statistics + R alone is not data science. It's like saying that gastronomy is French cuisine.

The book is copyrighted under the CC license, which restricts its use to non-commercial endeavors. Apparently no one at a company can look at the book.

As a complete beginner, I found this book quite interesting, but yes, it's more like a Stats + R programming book.
