O'Reilly has just published the results of the 2013 Data Science Salary Survey, based on data collected from attendees of the O'Reilly Strata conferences in 2012 and 2013. There were some interesting results from the salary portion of the survey:
- data scientists at early-stage startups earned a median salary of US$130,000
- data scientists at public companies earned a higher median salary (US$110,000) than those at private companies (US$100,000)
- data scientists using primarily open-source tools earned a higher median salary (US$130,000) than those using proprietary tools (US$90,000)
On that last point, the tool usage section of the survey also held interesting results. Each respondent listed multiple tools that they used both in data roles and non-data roles, and the results are summarized below:
That SQL tops the list is no surprise: most data scientists need to access a database at some point. But of non-database tools, R is the most-used tool, closely followed by Python. From the survey report:
The preponderance of R and Python usage is more surprising —operating systems aside, these were the two most commonly used individual tools, even above Excel, which for years has been the go-to option for spreadsheets and surface-level analysis. R and Python are likely popular because they are easily accessible and effective open source tools for analysis.
It's also interesting to note that the "traditional" proprietary data analysis tools, SAS and SPSS, fall at the bottom of the list. This isn't a random sample by any means — the attendees at Strata are heavily weighted towards US-based startups — but it's certainly indicative of where the market for data analysis products is going. R is also the top-ranked data analysis tool in recent surveys by KDNuggets and Rexer Analytics.
You can download the full report (free registration required) from the O'Reilly website at the link below.
O'Reilly Media: 2013 Data Science Salary Survey
You need to strip out the Revolution URL fragment from the O'Reilly link. And, so far as the bias goes, it's also a heavily OS slanted group, so the lack of SPSS/SAS/BMDP/etc. should come as no surprise.
Posted by: Robert Young | January 16, 2014 at 08:15
Thanks for the comment Robert. (I fixed the problem with the link.)
Posted by: David Smith | January 16, 2014 at 11:10
As stated in the report "We should note that Strata attendees comprise a special group and do not form an unbiased sample of everyone who seriously works with data."
This is definitely a convenience sample which profiles the attendees of the conference more than it does profile all data scientists.
Posted by: Ralph Winters | January 19, 2014 at 06:03