Tiobe Software ranks the popularity of programming languages based on references in search engines. While the methodology might be debated in terms of the absolute rankings it produces, it is quite interesting to see how the rankings fluctuate over time: Tiobe has produced a monthly report of rankings based on this methodology since 2001.
In the Tiobe Programming Community Index for February 2011, the top three slots are held by the general-purpose languages Java, C and C++. Domain-specific languages naturally fall farther down the list: in this month's report, R is ranked at #25, with Matlab at #29 and SAS at #30. What's interesting is the movement: Matlab is down from #19 a month ago (and #20 a year ago), whereas R is up from #26. But look at SAS, mentioned in the report's summary as having "lost much ground": it's down from #16 a month ago and #14 a year ago.
Tiobe Software: TIOBE Programming Community Index for February 2011
Please post more fried chicken stories. If you don't have any more then it is OK to post a fried Turkey story. Thank you!!!
Posted by: Gekkor McFadden | February 10, 2011 at 22:12
I think that comment went straight over my head...
Posted by: David Smith | February 11, 2011 at 09:04
Hi David,
Fellow blogger Rick Wicklin from SAS here. I'm speaking for myself and as a statistician, not on behalf of my company.
A statistical question. Can you really draw a valid statistical conclusion about the relative popularity of the smaller languages in the Tiobe data? The data seem to have a lot of statistical variation.
The Tiobe ratings vary by up to 0.1% from month to month. For the languages ranked greater than #15 or so, the monthly variation is a substantial portion of their Tiobe rating. Consequently, you can't use the Tiobe ratings (for the #15--#50 languages) to conclude that "Language X is more popular than Language Y." To do so seems to ignore several basic statistical ideas, such as statistical variation, statistical significance, and the sensitivity of rankings to small changes in data.
Specifically, in January the languages ranked #15--#50 were all within 0.5% of each other. That's one-half of one percent! Rankings are very sensitive when the data have similar values: small variations from month to month can send any of these languages shooting up or down many places in the rankings.
For example, from January to February, nine out of the 46 languages (almost 20%) changed ranks by more than 10 places. Some, like SAS and MATLAB, moved down; others, such as Visual Basic .NET and Logo, moved many places up. Although Visual Basic .NET's rating changed by merely one-third of one percent, its RANK jumped 27 places, from #49 to #22! That's because 0.003 is a large RELATIVE change. I conclude that these monthly changes in rank aren't meaningful for the smaller languages on the Tiobe list.
I realize that this is primarily a blog to promote your company's software, but you are a professional statistician, in addition to being a marketing person, so it seems like the variation in the data would be a primary concern. If you're too busy to analyze the data, maybe I'll blog about these data in the future. Cheers, and best wishes.
Posted by: Rick Wicklin | February 11, 2011 at 12:24
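The sensitivity of ranks to small rating changes described in the comment above can be illustrated with a short simulation. This is only a sketch under stated assumptions: the ratings are made up (36 hypothetical languages spread evenly over half a percentage point, standing in for the #15--#50 band), the noise is uniform within ±0.1 points, and none of it is actual Tiobe data. The function names are illustrative, and ranks are counted within this subset only.

```python
import random

def rank(ratings):
    """Return {name: rank}, where rank 1 is the highest rating (ties broken by name)."""
    ordered = sorted(ratings, key=lambda name: (-ratings[name], name))
    return {name: i + 1 for i, name in enumerate(ordered)}

def perturb(ratings, max_noise, rng):
    """Add uniform noise in [-max_noise, +max_noise] to each rating, floored at 0."""
    return {name: max(0.0, r + rng.uniform(-max_noise, max_noise))
            for name, r in ratings.items()}

def rank_shifts(before, after):
    """Return {name: change in rank}; a positive value means the language moved down."""
    rank_before, rank_after = rank(before), rank(after)
    return {name: rank_after[name] - rank_before[name] for name in before}

# Hypothetical ratings (in percentage points), NOT Tiobe data:
# 36 languages spaced evenly from 0.6 down to 0.1, i.e. all within 0.5 of each other.
N = 36
ratings = {f"lang{i:02d}": 0.6 - 0.5 * i / (N - 1) for i in range(N)}

rng = random.Random(2011)  # fixed seed so the run is repeatable
shifts = rank_shifts(ratings, perturb(ratings, 0.1, rng))

big_moves = sum(1 for s in shifts.values() if abs(s) > 10)
print(f"largest rank change: {max(abs(s) for s in shifts.values())} places")
print(f"languages moving more than 10 places: {big_moves} of {N}")
```

Rerunning with different seeds, or with a smaller `max_noise`, gives a feel for how much of the month-to-month reshuffling in a tightly packed band could come from noise alone.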
Hi Rick,
All good and thoughtful points -- thanks for the discussion. I totally agree that it's tough to extract the signal from the noise of the month-to-month variations in those rankings; that's why the year-on-year comparisons, where the differences swamp the monthly variations, are interesting. In the case of R, Tiobe didn't report the year-on-year comparison. It would be nice to have a look at the trends over longer timescales, but Tiobe doesn't publish the historical data for free. You can purchase it, but I don't quite have the disposable marketing budget you do. :)
Kindly,
David
Posted by: David Smith | February 11, 2011 at 12:40
I certainly would debate this. According to the published methodology, the basic query used is +" programming".
Using Google I am getting
+"sas programming" = ~ 260,000 results
+"R programming" = ~ 154,000 results
Why the big difference?
Also, given that both SAS and R are in the lower half of the list under "Other programming Languages", and together account for 1% of the entire list, the variability from month to month can be extremely large, and I would not be surprised to see a lot of list movement.
-Ralph Winters
Posted by: Ralph Winters | February 16, 2011 at 07:16
Correction: The basic query format is:
+"<language> programming"
-Ralph Winters
Posted by: Ralph Winters | February 16, 2011 at 07:18
Breaking news: in the Tiobe index for March, R is overtaken by Matlab and SAS (and PL/SQL and... Logo!)
Tiobe = Joke
Posted by: john | April 04, 2011 at 23:59
Actually, both R (which is S) and SAS are very old programming languages with usability and productivity issues that were present 25 years ago and are still present, because not a lot of fundamental research in statistical programming languages has been done over the past 2 decades.
Robert
Posted by: Robert Wilkins | June 08, 2011 at 16:33