
May 22, 2014



Aside from the inability of glm() to cope with very large datasets, another problem I've had with training GLMs is dealing with cases where the underlying optimization algorithm fails, possibly due to an inappropriate choice of initial parameters.

I've had some success with a technique that I saw on a blog somewhere: select a small subset of the data, train an initial model on that subset, and use its fitted coefficients as the starting values for the full GLM (glm() accepts these through its start argument). However, I've also noticed that bigglm has much more success in fitting these "difficult" models, possibly because its underlying algorithm is different.

So, that's another reason to choose bigglm over glm.
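
For anyone who wants to try the subset warm-start idea, here is a minimal sketch in base R. The simulated data, the variable names (dat, x1, x2, y), and the subset size of 500 are all assumptions made up for illustration; only glm() and its start argument come from R itself. It is not a guaranteed fix for convergence failures, just one way the technique might look.

```r
# Simulated logistic-regression data, purely illustrative (an assumption,
# not data from the original discussion).
set.seed(42)
n  <- 5000
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.2 * x1 - 0.8 * x2))
dat <- data.frame(y = y, x1 = x1, x2 = x2)

# Step 1: fit the model on a small random subset to get rough estimates.
idx       <- sample(n, 500)
fit_small <- glm(y ~ x1 + x2, family = binomial(), data = dat[idx, ])

# Step 2: refit on the full data, passing the subset coefficients as
# starting values through glm()'s `start` argument.
fit_full <- glm(y ~ x1 + x2, family = binomial(), data = dat,
                start = coef(fit_small))

# Check whether the full fit reports convergence.
fit_full$converged
```

The starting vector must have one entry per model coefficient, so the subset model and the full model need the same formula. If the subset is too small for some coefficient to be estimated, coef() returns NA for it and the values cannot be used as starting values.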

