
October 28, 2014

Comments


Regarding one of your last points: it was my understanding that contrasts should be set to e.g. contr.sum and NOT contr.treatment when fitting the model if you want Type III SS from the Anova() function in the car package.
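
A minimal sketch of that setup, for readers who want to try it. The data frame `d` and its columns `a`, `b` and `y` are hypothetical, invented purely for illustration; the call to car is guarded so the snippet still runs when the package is not installed.

```r
# Hypothetical unbalanced two-factor data, invented for illustration only.
set.seed(1)
d <- data.frame(
  a = factor(rep(c("a1", "a2"), times = c(7, 5))),
  b = factor(rep(c("b1", "b2", "b3"), length.out = 12))
)
d$y <- rnorm(nrow(d))

# Sum-to-zero coding for unordered factors, polynomial coding for ordered ones.
old <- options(contrasts = c("contr.sum", "contr.poly"))
fit <- lm(y ~ a * b, data = d)

# Type III tests via car, only if the package is available.
if (requireNamespace("car", quietly = TRUE)) {
  print(car::Anova(fit, type = 3))
}

options(old)  # restore the previous contrast settings
```

The contrasts in force when lm() is called are stored with the fit (see `fit$contrasts`), so restoring the global option afterwards does not change what Anova() sees.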

I always appreciate Terry's perspective, coming from a viewpoint of doing data analysis (as opposed to primarily building software). He said:

"R needs a general and well thought out post-fit contrasts function. Population averaged estimates could be one option of said routine, with the SAS population one possible choice."

I could not agree more whole-heartedly. This concept of post-fit contrasts analysis can be very difficult to do in R in general, futzing around with the design matrix and trying to remember if it is the first or last level of a factor that is set to zero, etc. However, the commercial package asreml-r has a remarkable "predict" function that makes it simple to do such post-fit processing to easily calculate lsmeans-type contrasts or whatever narrow-space, intermediate-space, broad-space, or weighted-according-to-population-proportions-space predictions your heart desires.

See the examples on the following page for how to do predictions with one line of asreml code or many lines of tedious coefficient manipulations (using MCMCglmm, but it would be similar with lm, nlme, etc).

http://www.inside-r.org/packages/cran/agridat/docs/stroup.splitplot

For some reason, that page is missing the primary source document:
Walter W. Stroup, 1989. Predictable functions and prediction space in the mixed model procedure. Applications of Mixed Models in Agriculture and Related Disciplines.

Hey David Smith, what happened to the "Source" section of the Rd files when they were translated to inside-r.org?


Kevin Wright

@Kevin, that looks like a bug at inside-r.org -- I don't think there was special consideration of the fields in dataset help files.

I have posted this comment on behalf of the author. (ed.)

The first comment (Gustaf Granath) is correct.

The first sentence after "a couple more things" should say
"...giving seriously incorrect answers unless sum-to-zero constraints were used for the fit (contr.sum)."

I was thinking one thing and typed another. As a footnote, the actual constraint is this: let Z be the design matrix for the reference population, i.e. the result of model.matrix() on the reference data set. If the column sums of Z are 0 for the factor of interest (the one we are testing), then the naive algorithm works. For a uniform reference distribution both contr.sum and contr.helmert suffice, but not for other reference distributions.
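
The column-sum condition can be checked directly in base R. This is only a sketch: the three-level factor `f`, the two reference data sets and the helper `factor_colsums()` are made up for illustration and do not come from the original post.

```r
# Column sums of the factor's columns in the reference design matrix Z.
# 'ref' is a reference data set with a factor column f; 'contr' names a
# contrast function such as "contr.sum". Both are illustrative assumptions.
factor_colsums <- function(ref, contr) {
  Z <- model.matrix(~ f, data = ref, contrasts.arg = list(f = contr))
  colSums(Z[, -1, drop = FALSE])  # drop the intercept column
}

# Uniform reference population: each level appears once.
ref_uniform <- data.frame(f = factor(c("a", "b", "c")))
factor_colsums(ref_uniform, "contr.sum")        # all zero
factor_colsums(ref_uniform, "contr.helmert")    # all zero
factor_colsums(ref_uniform, "contr.treatment")  # not zero

# Non-uniform reference population: level "a" is over-represented.
ref_skewed <- data.frame(f = factor(c("a", "a", "b", "c")))
factor_colsums(ref_skewed, "contr.sum")         # no longer all zero
factor_colsums(ref_skewed, "contr.helmert")     # no longer all zero
```

With the skewed reference neither coding gives zero column sums, which matches the remark that contr.sum and contr.helmert suffice only for a uniform reference distribution.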

OK, thanks for the clarification. It may be a good idea to edit the post so people don't get confused (more than they already are).

Bill Venables has a very nice essay about linear models and goes into detail about what is wrong with type III sums of squares in section 5. Here's a link:
http://www.stats.ox.ac.uk/pub/MASS3/Exegeses.pdf

Nicholas Lewin-Koh

Dear all,

Many thanks for this very good point.
Regarding the post-fit contrasts function, I have used the 'multcomp' package:
http://cran.r-project.org/web/packages/multcomp/multcomp.pdf

from Bretz, Hothorn and Westfall, associated with their 2010 book (http://www.crcpress.com/product/isbn/9781584885740).

I am not sure that it helps with the type III issue initially discussed here, but some functions in the package were very helpful to me, notably for handling user-defined contrasts.
If you have ever used it, I would be grateful to have your expert opinion on it.
Thanks again
Marc

@Kevin mentions asreml's prediction capability - the method is described in two papers: "Prediction in Linear Mixed Models", Welham et al, 2004 presents the basic ideas, while "An efficient computing strategy for prediction in mixed linear models" Gilmour et al, 2004 provides details of the math/computing involved.

In addition, there is a recently released package, 'predictmeans', available on CRAN which produces predictions from aov, lm, lme, lmer, glm, and other models, apparently using the approach described in the Welham et al. paper.

I've not used the package much, so cannot comment on its reliability, but I mention it in case it is of interest.
Cheers,
Alec

The comments to this entry are closed.