The O'Reilly Data Scientist Survey for 2014 is out, with fresh data on the salaries and tools used by data scientists. Jon King has a summary of the results, but not much has changed since last year: median income is down very slightly ($100k in 2013 vs $98k in 2014), and the most popular analysis tools (excluding operating systems) remain — in rank order — SQL, Excel, R and Python.
Looking further down into the tails of the popular data analysis tools yields some surprising results, however:
The big surprise for me was the low ranking of NumPy and SciPy, two toolkits that are essential for doing statistical analysis with Python. In this survey and others, Python and R are often similarly ranked for data science applications, but this result suggests that about 90% of Python use is for data science tasks other than statistical analysis and predictive analytics (my guess: mainly data munging). From these survey results, it seems that much of the "deep data science" is done in R.
O'Reilly: 2014 Data Science Salary Survey
Great summary. Did you mean to write "SQL, Excel, & R" instead? Looks like Python is fourth.
Posted by: DD | December 10, 2014 at 13:29
I think a lot of respondents would have counted NumPy+SciPy in the python category, as they are just useful python packages. The R equivalent to these modules is its packages, but you didn't look at their use of any specific R package like party, randomForest or e1071.
I think a better approach would be to collapse the NumPy+SciPy categories into the python category for a more accurate representation.
Really interesting to see the associated salaries as well. If I were a less honest person I might push the line that if you use Excel, learning R will push your salary up ~10K!
Posted by: Jegar | December 11, 2014 at 01:14
Tableau is the shocker for me. Tableau is pretty easy to learn (likewise, Spotfire, which should be listed as well).
Posted by: Tom | December 11, 2014 at 08:01
Given how many catastrophes over the last few decades have been shown to be due to Excel muck-ups (both Excel internals and user "errors"), one might be happier if Excel salaries were negative?
Posted by: Robert Young | December 11, 2014 at 10:14
-- I think a lot of respondents would have counted NumPy+SciPy in the python category, as they are just useful python packages.
Perhaps, but the median and range for the latter are clearly higher, suggesting that those who report are doing something specific to the packages. I'll guess Wall Street number crunching.
Posted by: Robert Young | December 11, 2014 at 10:18
@DD, thanks -- I did indeed mean to write "SQL, Excel, R and Python". I updated the post accordingly.
Posted by: David Smith | December 11, 2014 at 10:41
where are these mythical R jobs?
Posted by: Tom | December 14, 2014 at 03:13