
August 29, 2013

Comments


half the linear algebra books published by SIAM use linear algebra.
Well, they wouldn't have much of a book without it!

The Python parts are probably using NumPy, Python's matrix/array extension, not just base Python.

Don't forget about pbdR for going big. We've run benchmarks on terabytes of data with up to 50,000 cores.

I've almost completed the Coding the Matrix class, and my understanding is that Python was used for instructional purposes, not because it is the best tool for the job. The Matrix and Vector classes are built from scratch. Having previously studied only theoretical linear algebra, I've learned a lot from getting my hands "dirty".

With respect to R and linear algebra - my main issue is having to use the "drop=FALSE" line to prevent column and row matrices becoming vectors, for example, when you're filtering a data set and only one row is output. This was discussed by Radford Neal ("Design Flaws in R #2 — Dropped Dimensions") on his blog some time ago:-
http://radfordneal.wordpress.com/2008/08/20/design-flaws-in-r-2-%E2%80%94-dropped-dimensions/

Nothing that you said about R could not also be said about Python. Apart perhaps from point 2.

Python is a much cleaner and purer environment than R, though, which is why I'd choose it over R for linalg.

Great synopsis on R for linear algebra. But your comments about Python make it clear that you don't know much about how the language has increasingly been used in scientific computing for more than 10 years. The motivation came in many ways from frustrations with MATLAB, so you'll see some naming conventions and entire packages (e.g. matplotlib) influenced by MATLAB's conventions. If you read up on that history a bit (e.g. Wikipedia) then the fact that many CS and Math departments teach Python won't be as shocking.

There is a lot of momentum currently behind the use of Python in science. The combination of the NumPy, SciPy, and pandas modules is key to that. Companies like Enthought and Continuum Analytics are also helping. There are also several resources for a newbie: the tutorials and several talks from the Python conferences (PyCon, SciPy) are available as videos immediately after each conference (it would be really nice if materials from the useR conferences were available like this).

The question still remains whether R can do a lot of the same things, and your post is timely in that regard. Thanks for posting.

I actually took the Coursera course and I think Python was a much better alternative to R in this case.

1.) Most students came in with little to no programming experience.
2.) R's learning curve is just too steep, whereas Python is human-readable (I love R, but Python is more user-friendly).
3.) No other packages or modules were used; we built a Matrix and a Vector class and implemented all of the features of matrices with overloaded operators.

The purpose of the class was to teach linear algebra through building each basic step. Using a pre-built Matrix package would defeat the purpose and reduce the course to just learning terms rather than forcing yourself to understand the linear algebra well enough to be able to write it in code.
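For readers who have not seen this style of exercise, here is a minimal sketch of what "a Matrix and Vector class with overloaded operators" can look like in plain Python. It is not the course's actual code: the names `Vec` and `Mat`, the list-based storage, and the choice of which operators to overload are all assumptions made for illustration, and there are no dimension checks.

```python
class Vec:
    """A tiny dense vector (hypothetical; not the course's own class)."""

    def __init__(self, entries):
        self.entries = list(entries)

    def __add__(self, other):
        # Entry-wise sum; assumes both vectors have the same length.
        return Vec(a + b for a, b in zip(self.entries, other.entries))

    def __mul__(self, other):
        # Vec * Vec is the dot product; Vec * scalar scales every entry.
        if isinstance(other, Vec):
            return sum(a * b for a, b in zip(self.entries, other.entries))
        return Vec(a * other for a in self.entries)

    def __eq__(self, other):
        return isinstance(other, Vec) and self.entries == other.entries

    def __repr__(self):
        return f"Vec({self.entries})"


class Mat:
    """A tiny dense matrix stored as a list of row vectors (hypothetical)."""

    def __init__(self, rows):
        self.rows = [Vec(r) for r in rows]

    def __mul__(self, vec):
        # Matrix-vector product: one dot product per row.
        return Vec(row * vec for row in self.rows)


A = Mat([[1, 2], [3, 4]])
v = Vec([1, 1])
print(A * v)  # Vec([3, 7])
```

Writing `A * v` and having it dispatch to one dot product per row is the kind of thing you only get right once you understand the definition, which is the point made above.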

Explaining the algorithms of linear algebra through code was a lot more fun than just knowing how to run a certain module.

So, yes, R has a lot more going for it than Python for Linear Algebra. But knowing how to use a tool wasn't the purpose. You were supposed to really grok linear algebra.

I can second what Will said. I also took the course and we didn't use numpy or any other package, but had to code our own Matrix and Vector classes as a learning experience. Now that I am done with the class I would gladly choose to use numpy over my code since numpy is much more optimized and tested. I would be even more likely to use R since it has great support for LA applications. I have used R for a long time now, but the pandas package does look very interesting. Python seems like a really great language.

Prior to this course I took a machine learning course on Coursera that used Octave (or Matlab). Before I used R I liked Matlab better than S-Plus, but after using R for many years and then going back to Octave, I was so happy that I chose to learn R!

For books, have a look at Gentle's Matrix Algebra. I have his Computational Stats book, and it is well written for a daunting subject. I expect he's done something similar for LA. The Amazon reviews (not many, but it's not a Miley Cyrus topic) are all very positive. According to the ToC and the snippet of the Preface one can sample, he uses most of the stat packs, including R.

The comments to this entry are closed.

Got comments or suggestions for the blog editor?
Email David Smith.