
February 10, 2011

Listed below are links to weblogs that reference R overtakes SAS and Matlab in programming language popularity:

Comments


Please post more fried chicken stories. If you don't have any more then it is OK to post a fried Turkey story. Thank you!!!

I think that comment went straight over my head...

Hi David,
Fellow blogger Rick Wicklin from SAS here. I'm speaking for myself and as a statistician, not on behalf of my company.

A statistical question. Can you really draw a valid statistical conclusion about the relative popularity of the smaller languages in the Tiobe data? The data seem to have a lot of statistical variation.

The Tiobe ratings vary by up to 0.1% from month to month. For the languages ranked below #15 or so, the monthly variation is a substantial portion of their Tiobe rating. Consequently, you can't use the Tiobe ratings (for the #15--#50 languages) to conclude that "Language X is more popular than Language Y." To do so seems to ignore several basic statistical ideas, such as statistical variation, statistical significance, and the sensitivity of rankings to small changes in data.

Specifically, in January the languages ranked #15--#50 were all within 0.5% of each other. That's one-half of one percent! Rankings are very sensitive when the data have similar values: small variations from month to month can send any of these languages shooting up or down many places in the rankings.

For example, from January to February, nine out of the 46 languages (almost 20%) changed ranks by more than 10 places. Some, like SAS and MATLAB, moved down; others, such as Visual Basic .NET and Logo, moved many places up. Although Visual Basic .NET's rating changed by merely one-third of one percent, its RANK jumped 27 places, from #49 to #22! That's because 0.003 is a large RELATIVE change. I conclude that these monthly changes in rank aren't meaningful for the smaller languages on the Tiobe list.

I realize that this is primarily a blog to promote your company's software, but you are a professional statistician, in addition to being a marketing person, so it seems like the variation in the data would be a primary concern. If you're too busy to analyze the data, maybe I'll blog about these data in the future. Cheers, and best wishes.
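Rick's point about rank sensitivity can be illustrated with a small sketch. The ratings below are made up for illustration (36 evenly spaced values spanning 0.5 percentage points, standing in for ranks #15 through #50); they are not Tiobe data, and the names `lang00` to `lang35` are hypothetical.

```python
# Hypothetical ratings (in percent) for 36 tightly packed languages,
# standing in for ranks #15 through #50. These are NOT Tiobe data.

def ranks(ratings, first_rank=15):
    """Map each name to its rank; a higher rating gets a smaller rank number."""
    ordered = sorted(ratings, key=ratings.get, reverse=True)
    return {name: first_rank + i for i, name in enumerate(ordered)}

# Evenly spaced from 0.70% down to 0.20%: a spread of 0.5 percentage points.
n = 36
january = {f"lang{i:02d}": 0.70 - 0.50 * i / (n - 1) for i in range(n)}

# Bump one low-ranked language by one-third of a percentage point.
february = dict(january)
february["lang34"] += 0.33

before = ranks(january)["lang34"]
after = ranks(february)["lang34"]
print(f"lang34: #{before} -> #{after}, a jump of {before - after} places")
```

With ratings packed this closely, a change of a fraction of a percentage point moves the bumped language past many of its neighbors, which is the effect described for Visual Basic .NET.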

Hi Rick,

All good and thoughtful points -- thanks for the discussion. I totally agree that it's tough to extract the signal from the noise of the month-to-month variations in those rankings; that's why the year-on-year comparisons, where the differences swamp the monthly variations, are interesting. In the case of R, Tiobe didn't report the year-on-year comparison. It would be nice to have a look at the trends over longer timescales, but Tiobe doesn't publish the historical data for free. You can purchase it, but I don't quite have the disposable marketing budget you do. :)

Kindly,
David

I certainly would debate this. According to the published methodology, the basic query used is +"<language> programming".

Using Google I am getting

+"sas programming" = ~ 260,000 results
+"R programming" = ~ 154,000 results

Why the big difference?

Also, given that both SAS and R are in the lower half of the list under "Other programming Languages", and together account for 1% of the entire list, the variation from month to month can be extremely large, and I would not be surprised to see a lot of list movement.

-Ralph Winters
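As a rough sketch of the comparison Ralph describes: the snippet below builds the query strings in the format he quotes and compares the two hit counts he reports. The function name `tiobe_style_query` is hypothetical, the query format assumes the missing placeholder is the language name (as his two examples suggest), and the counts are copied from his comment rather than re-measured.

```python
def tiobe_style_query(language):
    """Build a query in the format quoted above: +"<language> programming"."""
    return f'+"{language} programming"'

# Approximate Google hit counts as reported in the comment (not re-measured).
reported_hits = {"sas": 260_000, "R": 154_000}

for language, hits in reported_hits.items():
    print(f"{tiobe_style_query(language)}: ~{hits:,} results")

# Ratio of the two reported counts.
ratio = reported_hits["sas"] / reported_hits["R"]
print(f"SAS/R hit ratio: {ratio:.2f}")
```

This only restates the reported numbers as a ratio; it says nothing about how Tiobe weights or normalizes hits across search engines.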

Correction: The basic query format is:

+"<language> programming"

-Ralph Winters

Breaking news: in the Tiobe index for March, R is overtaken by Matlab and SAS (and PL/SQL and... Logo!)

Tiobe = Joke


Actually, both R (which is S) and SAS are very old programming languages with usability and productivity issues that were present 25 years ago and are still present, because not a lot of fundamental research in statistical programming languages has been done over the past 2 decades.

Robert

The comments to this entry are closed.

