On Tue, 22 Aug 2006, Ted.Harding at nessie.mcc.ac.uk wrote:
> Hi Folks,
>
> I've encountered something I hadn't been consciously
> aware of previously, and I'm wondering what the
> explanation might be.
Try
> contr.helmert(letters[1:3])
  [,1] [,2]
a   -1   -1
b    1   -1
c    0    2
> contr.treatment(letters[1:3])
  b c
a 0 0
b 1 0
c 0 1
and note the difference in column names.
Those who made the decision to use those column names determined this.
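To see how those column names end up as coefficient names, here is a minimal sketch. The data frame d and its factor X are made up for illustration. The design-matrix columns are named by pasting the factor name onto the contrast column names, and column numbers are used when there are none:

```r
# Helmert contrasts carry no column names; treatment contrasts do.
h  <- contr.helmert(letters[1:3])
tr <- contr.treatment(letters[1:3])
helmert_cols   <- colnames(h)    # NULL
treatment_cols <- colnames(tr)   # "b" "c"

# Illustrative three-level factor (an assumption, not from the post).
d <- data.frame(X = factor(c("a", "b", "c")))

# Design-matrix column names under each coding.
mm_helmert <- colnames(model.matrix(~ X, data = d,
                                    contrasts.arg = list(X = "contr.helmert")))
mm_treatment <- colnames(model.matrix(~ X, data = d,
                                      contrasts.arg = list(X = "contr.treatment")))
print(mm_helmert)     # "(Intercept)" "X1" "X2"
print(mm_treatment)   # "(Intercept)" "Xb" "Xc"
```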
I agreed with them that labelling the second Helmert contrast here as 'c'
would be confusing, especially easy to confuse with treatment contrasts.
However, I thought the treatment contrasts should be labelled b-a and c-a.
We also had arguments about xc vs x.c vs x:c. AFAIR brevity won.
Once you know how it is done, it is easy to change the behaviour, of
course: just roll your own contrasts function with the colnames you want.
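As a sketch of what I mean, here is a wrapper that labels treatment contrasts as differences from the baseline. The name contr.treatment.diff is hypothetical (it is not part of R), and the data are simulated with an arbitrary seed. Passing the contrast matrix itself, rather than the function name, avoids any dependence on where the function is defined:

```r
# Hypothetical wrapper (not part of R): treatment contrasts whose columns
# are labelled "<level>-<baseline>" instead of just "<level>".
contr.treatment.diff <- function(n, base = 1, contrasts = TRUE, ...) {
  m <- contr.treatment(n, base = base, contrasts = contrasts, ...)
  if (contrasts) {
    lev <- rownames(m)
    colnames(m) <- paste0(lev[-base], "-", lev[base])
  }
  m
}

cm <- contr.treatment.diff(c("A", "B", "C"))
print(colnames(cm))   # "B-A" "C-A"

# Simulated data along the lines of the original example (seed is arbitrary).
set.seed(1)
d <- data.frame(X = factor(rep(c("A", "B", "C"), each = 12)))
d$Y <- rnorm(36, mean = rep(c(0, 2, 4), each = 12))

# Supply the contrast matrix directly for factor X.
fit <- lm(Y ~ X, data = d,
          contrasts = list(X = contr.treatment.diff(levels(d$X))))
print(names(coef(fit)))   # "(Intercept)" "XB-A" "XC-A"
```

The same approach works for Helmert contrasts: wrap contr.helmert() and set colnames() to whatever labels you find least confusing.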
> In (on another list) using R to demonstrate the difference
> between different contrasts in 'lm' I set up an example
> where Y is sampled from three different normal distributions
> according to the levels ("A","B","C") of a factor X:
>
> Y<-c(rnorm(mean=0,n=12),rnorm(mean=2,n=12),rnorm(mean=4,n=12))
>
> X<-factor(c(rep("A",12),rep("B",12),rep("C",12)))
>
> Then I do a summary(lm(Y~X)...) using first "Treatment" contrasts
> and then "Helmert" contrasts. Here are the coefficient parts
> of the results in each case:
Just coef() or print() gives you the coefficient names: this is not done
by summary().
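For instance, with data simulated along the lines of your example (the seed is arbitrary, so the numbers will differ from yours), the names are already on the coefficient vector, and summary() merely reuses them as row labels:

```r
# Simulated stand-in for the posted data (arbitrary seed; an assumption).
set.seed(42)
d <- data.frame(X = factor(rep(c("A", "B", "C"), each = 12)))
d$Y <- rnorm(36, mean = rep(c(0, 2, 4), each = 12))

fit_t <- lm(Y ~ X, data = d, contrasts = list(X = "contr.treatment"))
fit_h <- lm(Y ~ X, data = d, contrasts = list(X = "contr.helmert"))

# The names live on the coefficient vector itself.
names_t <- names(coef(fit_t))
names_h <- names(coef(fit_h))
print(names_t)   # "(Intercept)" "XB" "XC"
print(names_h)   # "(Intercept)" "X1" "X2"

# summary() carries the same names over as row labels of its table.
summary_rows_h <- rownames(coef(summary(fit_h)))
print(summary_rows_h)
```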
> summary(lm(Y~X,contrasts=list(X="contr.treatment")))
> Coefficients:
>             Estimate Std. Error t value Pr(>|t|)
> (Intercept)   0.2303     0.3220   0.715  0.47944
> XB            1.3057     0.4554   2.867  0.00716 **
> XC            3.4204     0.4554   7.511 1.23e-08 ***
>
>
> summary(lm(Y~X,contrasts=list(X="contr.helmert")))
> Coefficients:
>             Estimate Std. Error t value Pr(>|t|)
> (Intercept)   1.8057     0.1859   9.713 3.34e-11 ***
> X1            0.6529     0.2277   2.867  0.00716 **
> X2            0.9225     0.1315   7.017 5.00e-08 ***
>
>
> What I'm wondering is why the "effect names" are "XB"
> and "XC" for Treatment, and "X1", "X2" for Helmert.
>
> Why not "XB" and "XC" in both cases? Just as "XB"
> contrasts B with the overall mean and "XC" contrasts C
> with the overall mean, "XA" being implicit, in the
> Treatment contrasts, so "X1" contrasts B with A and
> "X2" contrasts C with (A+B) in Helmert, so there
> is to my mind just as definite an association of "B"
> with the first contrast, and "C" with the second, in
> the Helmert case as in the Treatment case!
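It may help to check what each coefficient estimates in terms of the group means. The sketch below uses simulated data with an arbitrary seed (an assumption, not your data). With R's default baseline, the treatment coefficients are differences from level A rather than from the overall mean. The Helmert coefficients are scaled comparisons of each level with the mean of the preceding ones:

```r
# Simulated stand-in for the posted data (arbitrary seed; an assumption).
set.seed(42)
d <- data.frame(X = factor(rep(c("A", "B", "C"), each = 12)))
d$Y <- rnorm(36, mean = rep(c(0, 2, 4), each = 12))

m <- tapply(d$Y, d$X, mean)   # group means for A, B, C
mA <- m[["A"]]; mB <- m[["B"]]; mC <- m[["C"]]

b_t <- unname(coef(lm(Y ~ X, data = d,
                      contrasts = list(X = "contr.treatment"))))
b_h <- unname(coef(lm(Y ~ X, data = d,
                      contrasts = list(X = "contr.helmert"))))

# Treatment: intercept is the mean of A; XB and XC are differences from A.
expected_t <- c(mA, mB - mA, mC - mA)

# Helmert: intercept is the mean of the group means;
# X1 = (B - A)/2, X2 = (C - mean(A, B))/3.
expected_h <- c((mA + mB + mC) / 3,
                (mB - mA) / 2,
                (mC - (mA + mB) / 2) / 3)

print(all.equal(b_t, expected_t))
print(all.equal(b_h, expected_h))
```

Your posted estimates are consistent with this: X1 = 0.6529 is half of XB = 1.3057.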
>
> I know it's just a matter of "notation", but in the
> Helmert case the association with the names of the
> factor levels has been lost, and it could be useful
> to have it explicit. (Or is it intended simply as a
> reminder that one is using a particular system of
> contrasts?)
>
> Thanks, and best wishes to all,
> Ted.
>
> --------------------------------------------------------------------
> E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
> Fax-to-email: +44 (0)870 094 0861
> Date: 22-Aug-06 Time: 14:45:17
> ------------------------------ XFMail ------------------------------
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595