thr3ads.net - R help - [R] summary(lm ... contrasts=...) [Aug 2006]

(Ted Harding)

2006-Aug-22 13:45 UTC

[R] summary(lm ... contrasts=...)

Hi Folks,

I've encountered something I hadn't been consciously
aware of previously, and I'm wondering what the
explanation might be.

While using R (on another list) to demonstrate the difference
between different contrasts in 'lm', I set up an example
where Y is sampled from three different normal distributions
according to the levels ("A","B","C") of a factor X:

Y<-c(rnorm(mean=0,n=12),rnorm(mean=2,n=12),rnorm(mean=4,n=12))
X<-factor(c(rep("A",12),rep("B",12),rep("C",12)))

Then I do a summary(lm(Y~X)...) using first "Treatment" contrasts
and then "Helmert" contrasts. Here are the coefficient parts
of the results in each case:


summary(lm(Y~X,contrasts=list(X="contr.treatment")))
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   0.2303     0.3220   0.715  0.47944
XB            1.3057     0.4554   2.867  0.00716 **
XC            3.4204     0.4554   7.511 1.23e-08 ***


summary(lm(Y~X,contrasts=list(X="contr.helmert")))
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   1.8057     0.1859   9.713 3.34e-11 ***
X1            0.6529     0.2277   2.867  0.00716 **
X2            0.9225     0.1315   7.017 5.00e-08 ***


What I'm wondering is why the "effect names" are "XB"
and "XC" for Treatment, and "X1", "X2" for Helmert.

Why not "XB" and "XC" in both cases? Just as "XB"
contrasts B with the baseline level A and "XC" contrasts C
with A, "XA" being implicit, in the Treatment contrasts,
so "X1" contrasts B with A and "X2" contrasts C with the
mean of A and B in Helmert, so there is to my mind just as
definite an association of "B" with the first contrast, and
"C" with the second, in the Helmert case as in the
Treatment case!
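
What each coefficient measures can be checked against the group means. The following is only a sketch: the seed and the object names (m, fit_t, fit_h and the "by hand" vectors) are assumptions added for illustration, so the numbers will not match the output shown above.

```r
# Data simulated as in the post; set.seed() is an addition for reproducibility
set.seed(1)
Y <- c(rnorm(12, mean = 0), rnorm(12, mean = 2), rnorm(12, mean = 4))
X <- factor(rep(c("A", "B", "C"), each = 12))

# Group means for A, B, C as a named numeric vector
m <- sapply(split(Y, X), mean)

fit_t <- lm(Y ~ X, contrasts = list(X = "contr.treatment"))
fit_h <- lm(Y ~ X, contrasts = list(X = "contr.helmert"))

# Treatment: intercept is mean(A); XB and XC are differences from A
treat_by_hand <- c(m[["A"]], m[["B"]] - m[["A"]], m[["C"]] - m[["A"]])

# Helmert: intercept is the mean of the group means;
# X1 = (B - A)/2 and X2 = (C - (A + B)/2)/3
helm_by_hand <- c(mean(m),
                  (m[["B"]] - m[["A"]]) / 2,
                  (m[["C"]] - (m[["A"]] + m[["B"]]) / 2) / 3)

all.equal(unname(coef(fit_t)), treat_by_hand)
all.equal(unname(coef(fit_h)), helm_by_hand)
```

Both comparisons should agree up to numerical tolerance, since a one-way model reproduces the group means whichever contrasts are used.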

I know it's just a matter of "notation", but in the
Helmert case the association with the names of the
factor levels has been lost, and it could be useful
to have it explicit. (Or is it intended simply as a
reminder that one is using a particular system of
contrasts?)

Thanks, and best wishes to all,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 22-Aug-06                                       Time: 14:45:17
------------------------------ XFMail ------------------------------

Prof Brian Ripley

2006-Aug-22 16:50 UTC

[R] summary(lm ... contrasts=...)

On Tue, 22 Aug 2006, Ted.Harding at nessie.mcc.ac.uk wrote:
> Hi Folks,
> 
> I've encountered something I hadn't been consciously
> aware of previously, and I'm wondering what the
> explanation might be.
Try
> contr.helmert(letters[1:3])
  [,1] [,2]
a   -1   -1
b    1   -1
c    0    2
> contr.treatment(letters[1:3])
  b c
a 0 0
b 1 0
c 0 1

and note the difference in column names.
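
The same difference can be inspected programmatically. This is a small sketch; the object names h and tr are chosen here purely for illustration.

```r
h  <- contr.helmert(letters[1:3])
tr <- contr.treatment(letters[1:3])

colnames(h)   # NULL: the Helmert matrix has no column names
colnames(tr)  # "b" "c": the treatment matrix is labelled by level
```

When a contrast matrix has no column names, the model columns are simply numbered, which is where "X1" and "X2" come from.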

Those who made the decision to use those column names determined this.
I agreed with them that labelling the second Helmert contrast here as 'c'
would be confusing, especially easy to confuse with treatment contrasts.
However, I thought the treatment contrasts should be labelled b-a and c-a.
We also had arguments about xc vs x.c vs x:c.  AFAIR brevity won.

Once you know how it is done, it is easy to change the behaviour, of 
course: just roll your own contrasts function with the colnames you want.
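
A minimal sketch of that idea follows, assuming one wants the "b-a" style labels mentioned above. The helper contr.treatment.diff is hypothetical (it is not part of R), and the seed is added only to make the example reproducible. The contrast matrix is built explicitly and handed to lm().

```r
# Hypothetical helper, not part of R: treatment contrasts whose columns
# are labelled "<level>-<baseline>" instead of just "<level>"
contr.treatment.diff <- function(levs) {
  cm <- contr.treatment(levs)
  colnames(cm) <- paste0(levs[-1], "-", levs[1])
  cm
}

# Data simulated as in the original post (seed is an addition)
set.seed(1)
Y <- c(rnorm(12, mean = 0), rnorm(12, mean = 2), rnorm(12, mean = 4))
X <- factor(rep(c("A", "B", "C"), each = 12))

# Supply the labelled matrix via the 'contrasts' argument
fit <- lm(Y ~ X, contrasts = list(X = contr.treatment.diff(levels(X))))
names(coef(fit))
```

The estimates are the same as with contr.treatment; only the coefficient labels change, because they are built from the factor name plus the column names of the contrast matrix.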
> While using R (on another list) to demonstrate the difference
> between different contrasts in 'lm', I set up an example
> where Y is sampled from three different normal distributions
> according to the levels ("A","B","C") of a factor X:
> 
> Y<-c(rnorm(mean=0,n=12),rnorm(mean=2,n=12),rnorm(mean=4,n=12))
> X<-factor(c(rep("A",12),rep("B",12),rep("C",12)))
> 
> Then I do a summary(lm(Y~X)...) using first "Treatment" contrasts
> and then "Helmert" contrasts. Here are the coefficient parts
> of the results in each case:
Just coef() or print() gives you the coefficient names: this is not done 
by summary().
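
For instance, a sketch along these lines (the seed and the object name fit are assumptions) shows the names already attached to the fitted model, before summary() is involved:

```r
# Data simulated as in the original post (seed is an addition)
set.seed(1)
Y <- c(rnorm(12, mean = 0), rnorm(12, mean = 2), rnorm(12, mean = 4))
X <- factor(rep(c("A", "B", "C"), each = 12))

fit <- lm(Y ~ X, contrasts = list(X = "contr.helmert"))

# The names are set when the model is fitted
names(coef(fit))

# summary() reuses the same names as row labels of its coefficient table
rownames(coef(summary(fit)))
```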
> summary(lm(Y~X,contrasts=list(X="contr.treatment")))
> Coefficients:
>             Estimate Std. Error t value Pr(>|t|)
> (Intercept)   0.2303     0.3220   0.715  0.47944
> XB            1.3057     0.4554   2.867  0.00716 **
> XC            3.4204     0.4554   7.511 1.23e-08 ***
> 
> 
> summary(lm(Y~X,contrasts=list(X="contr.helmert")))
> Coefficients:
>             Estimate Std. Error t value Pr(>|t|)
> (Intercept)   1.8057     0.1859   9.713 3.34e-11 ***
> X1            0.6529     0.2277   2.867  0.00716 **
> X2            0.9225     0.1315   7.017 5.00e-08 ***
> 
> 
> What I'm wondering is why the "effect names" are "XB"
> and "XC" for Treatment, and "X1", "X2" for Helmert.
> 
> Why not "XB" and "XC" in both cases? Just as "XB"
> contrasts B with the baseline level A and "XC" contrasts C
> with A, "XA" being implicit, in the Treatment contrasts,
> so "X1" contrasts B with A and "X2" contrasts C with the
> mean of A and B in Helmert, so there is to my mind just as
> definite an association of "B" with the first contrast, and
> "C" with the second, in the Helmert case as in the
> Treatment case!
> 
> I know it's just a matter of "notation", but in the
> Helmert case the association with the names of the
> factor levels has been lost, and it could be useful
> to have it explicit. (Or is it intended simply as a
> reminder that one is using a particular system of
> contrasts?)
> 
> Thanks, and best wishes to all,
> Ted.
-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
