thr3ads.net - R help - [R] Example of cell means model [Oct 2003]

If this information is useful, please help other people find it:
Share via:

Douglas Bates

2003-Oct-15 13:40 UTC

[R] Example of cell means model

This is an example from chapter 11 of the 6th edition of Devore's
engineering statistics text.  It happens to be a balanced data set in
two factors but the calculations will also work for unbalanced data.

I create a factor called 'cell' from the text representation of the
Variety level and the Density level using '/' as the separator
character. The coefficients for the linear model 'Yield ~ cell - 1'
are exactly the cell means.
> library(Devore6)
> data(xmp11.07)
> str(xmp11.07)`data.frame':	36 obs. of  3 variables:
 $ Yield  : num  10.5 9.2 7.9 12.8 11.2 13.3 12.1 12.6 14 10.8 ...
 $ Variety: Factor w/ 3 levels "1","2","3": 1 1 1
1 1 1 1 1 1 1 ...
 $ Density: Factor w/ 4 levels
"1","2","3","4": 1 1 1 2 2 2 3 3 3 4
...> xmp11.07$cell = with(xmp11.07, factor(paste(Variety, Density, sep =
'/')))
> xtabs(~ Variety + Density, xmp11.07)       Density
Variety 1 2 3 4
      1 3 3 3 3
      2 3 3 3 3
      3 3 3 3 3> means = xtabs(Yield ~ Variety+Density, xmp11.07)/xtabs(~ Variety + Density,
xmp11.07)
> means       Density
Variety 1         2         3         4        
      1  9.200000 12.433333 12.900000 10.800000
      2  8.933333 12.633333 14.500000 12.766667
      3 16.300000 18.100000 19.933333 18.166667> coef(fm1 <- lm(Yield ~ cell - 1, xmp11.07))  cell1/1   cell1/2   cell1/3   cell1/4   cell2/1   cell2/2   cell2/3   cell2/4 
 9.200000 12.433333 12.900000 10.800000  8.933333 12.633333 14.500000 12.766667 
  cell3/1   cell3/2   cell3/3   cell3/4 
16.300000 18.100000 19.933333 18.166667 

The residual sum of squares for this model is the same as that for the
model with crossed terms
> deviance(fm1)
[1] 38.04> deviance(fm2 <- lm(Yield ~ Variety * Density, xmp11.07))[1] 38.04

but the coefficients are quite different because they represent a
different parameterization.
> coef(fm2)      (Intercept)          Variety2          Variety3          Density2 
       9.20000000       -0.26666667        7.10000000        3.23333333 
         Density3          Density4 Variety2:Density2 Variety3:Density2 
       3.70000000        1.60000000        0.46666667       -1.43333333 
Variety2:Density3 Variety3:Density3 Variety2:Density4 Variety3:Density4 
       1.86666667       -0.06666667        2.23333333        0.26666667 

I hope this answers your question.  Sorry for the delay in
responding to you.

"Francisco Vergara" <gerifalte28 at hotmail.com> writes:
> Many thanks for your reply.  An example would be very helpful to make
> sure that I understand correctly what you described.

Francisco Vergara

2003-Oct-15 18:17 UTC

head link

[R] Example of cell means model

Thanks a lot for your reply. This helps a lot!
Just to confirm, using lm this model will give me the mean yield value for 
each cell in the two way array.  Now if I want to obtain the mean
of group means (like a SS type III approach) using LME (since I have random 
effects in the model) how can I parametrize this?
I could definitivelly use xtabs in a two-way case but in my case I have 2 
other (continuous) covariates that are potential confounders in
the model so I need to keep them to obtain the corrected means.
I added a continuous variable (NewVar) to the dataset Newxmp11.07 and 
obtaned a model with the covariate (fm4) and another without it (fm3)
>Newxmp11.07<-fix(xmp11.07)
>str(Newxmp11.07)`data.frame':   36 obs. of  4 variables:
$ Yield  : num  10.5 9.2 7.9 12.8 11.2 13.3 12.1 12.6 14 10.8 ...
$ Variety: Factor w/ 3 levels "1","2","3": 1 1 1 1
1 1 1 1 1 1 ...
$ Density: Factor w/ 4 levels
"1","2","3","4": 1 1 1 2 2 2 3 3 3 4 ...
$ NewVar : num  10 9 7 12 11 11 12 11 15 16 ...
>fm3<-gls(Yield~-1+Variety + Density, xmp11.07)
>fm4<-gls(Yield~-1+Variety + Density+NewVar, Newxmp11.07)
>fm3Coefficients:
Variety1  Variety2  Variety3  Density2  Density3  Density4
8.922222  9.797222 15.713889  2.911111  4.300000  2.433333

Degrees of freedom: 36 total; 30 residual
Residual standard error: 1.239243
>fm4Coefficients:
   Variety1    Variety2    Variety3    Density2    Density3    Density4
8.75757265  9.70589316 15.43347009  2.88152564  4.32186752  2.40117521
     NewVar
0.01157692

Degrees of freedom: 36 total; 29 residual
Residual standard error: 1.252991

fm4 gives me the mean of the group means for all the varieties but 
apparently it gives me the treatment contrasts for the densities.  If I 
change the order of the factors in the model specification I get
>coef(fm5<-gls(Yield~-1+ Density+Variety+NewVar, Newxmp11.07))   Density1    Density2    Density3    Density4    Variety2    Variety3
8.75757265 11.63909829 13.07944017 11.15874786  0.94832051  6.67589744
     NewVar
0.01157692

This, just like fm4 will include the original intercept value in Density 1 
which is not the actual density 1 mean.  What am I missing? I am sorry if 
these questions are very basic but I want to make sure that I understand 
what I am doing. I guess that this is the price that I am paying for having 
used in the past packages like SAS where you just ask for lsmeans and the 
software will give you a "black box" answer!

Best regards

Francisco

>From: Douglas Bates <bates at stat.wisc.edu>
>To: "Francisco Vergara" <gerifalte28 at hotmail.com>
>CC: r-help at r-project.org
>Subject: Re: [R] Example of cell means model
>Date: 15 Oct 2003 08:40:33 -0500
>
>This is an example from chapter 11 of the 6th edition of Devore's
>engineering statistics text.  It happens to be a balanced data set in
>two factors but the calculations will also work for unbalanced data.
>
>I create a factor called 'cell' from the text representation of the
>Variety level and the Density level using '/' as the separator
>character. The coefficients for the linear model 'Yield ~ cell - 1'
>are exactly the cell means.
>
> > library(Devore6)
> > data(xmp11.07)
> > str(xmp11.07)
>`data.frame':	36 obs. of  3 variables:
>  $ Yield  : num  10.5 9.2 7.9 12.8 11.2 13.3 12.1 12.6 14 10.8 ...
>  $ Variety: Factor w/ 3 levels "1","2","3": 1
1 1 1 1 1 1 1 1 1 ...
>  $ Density: Factor w/ 4 levels
"1","2","3","4": 1 1 1 2 2 2 3 3 3 4 ...
> > xmp11.07$cell = with(xmp11.07, factor(paste(Variety, Density, sep = 
>'/')))
> > xtabs(~ Variety + Density, xmp11.07)
>        Density
>Variety 1 2 3 4
>       1 3 3 3 3
>       2 3 3 3 3
>       3 3 3 3 3
> > means = xtabs(Yield ~ Variety+Density, xmp11.07)/xtabs(~ Variety + 
>Density, xmp11.07)
> > means
>        Density
>Variety 1         2         3         4
>       1  9.200000 12.433333 12.900000 10.800000
>       2  8.933333 12.633333 14.500000 12.766667
>       3 16.300000 18.100000 19.933333 18.166667
> > coef(fm1 <- lm(Yield ~ cell - 1, xmp11.07))
>   cell1/1   cell1/2   cell1/3   cell1/4   cell2/1   cell2/2   cell2/3   
>cell2/4
>  9.200000 12.433333 12.900000 10.800000  8.933333 12.633333 14.500000 
>12.766667
>   cell3/1   cell3/2   cell3/3   cell3/4
>16.300000 18.100000 19.933333 18.166667
>
>The residual sum of squares for this model is the same as that for the
>model with crossed terms
>
> > deviance(fm1)
>[1] 38.04
> > deviance(fm2 <- lm(Yield ~ Variety * Density, xmp11.07))
>[1] 38.04
>
>but the coefficients are quite different because they represent a
>different parameterization.
>
> > coef(fm2)
>       (Intercept)          Variety2          Variety3          Density2
>        9.20000000       -0.26666667        7.10000000        3.23333333
>          Density3          Density4 Variety2:Density2 Variety3:Density2
>        3.70000000        1.60000000        0.46666667       -1.43333333
>Variety2:Density3 Variety3:Density3 Variety2:Density4 Variety3:Density4
>        1.86666667       -0.06666667        2.23333333        0.26666667
>
>I hope this answers your question.  Sorry for the delay in
>responding to you.
>
>"Francisco Vergara" <gerifalte28 at hotmail.com> writes:
>
> > Many thanks for your reply.  An example would be very helpful to make
> > sure that I understand correctly what you described.
_________________________________________________________________
Surf and talk on the phone at the same time with broadband Internet access. 
Get high-speed for as low as $29.95/month (depending on the local service 
providers in your area).  broadband.msn.com

Francisco Vergara

2003-Oct-16 14:41 UTC

head link

[R] Example of cell means model

Many thansk to Dr. Bates to take time follow on this conversation.  Could 
anybody follow up on the question on what the SAS lsmeans function exactly 
does and how to reproduce that on R?

Thanks

Francisco
>From: Douglas Bates <bates at stat.wisc.edu>
>To: "Francisco Vergara" <gerifalte28 at hotmail.com>
>CC: r-help at r-project.org
>Subject: Re: [R] Example of cell means model
>Date: 16 Oct 2003 08:38:35 -0500
>
>"Francisco Vergara" <gerifalte28 at hotmail.com> writes:
>
> > Thanks a lot for your reply. This helps a lot!
> > Just to confirm, using lm this model will give me the mean yield value
> > for each cell in the two way array.  Now if I want to obtain the mean
> > of group means (like a SS type III approach) using LME (since I have
> > random effects in the model) how can I parametrize this?
>
>Didn't I just answer that?
>
> > I could definitivelly use xtabs in a two-way case but in my case I
> > have 2 other (continuous) covariates that are potential confounders in
> > the model so I need to keep them to obtain the corrected means.
> > I added a continuous variable (NewVar) to the dataset Newxmp11.07 and
> > obtaned a model with the covariate (fm4) and another without it (fm3)
>
>I think you misinterpreted what I wrote.  I used xtabs to show an
>alternative, direct calculation of the cell means.  The purpose was to
>to compare the result with the coefficients in the model fit with the
>new factor constructed as I described.
>
> > >Newxmp11.07<-fix(xmp11.07)
> > >str(Newxmp11.07)
> > `data.frame':   36 obs. of  4 variables:
> > $ Yield  : num  10.5 9.2 7.9 12.8 11.2 13.3 12.1 12.6 14 10.8 ...
> > $ Variety: Factor w/ 3 levels
"1","2","3": 1 1 1 1 1 1 1 1 1 1 ...
> > $ Density: Factor w/ 4 levels
"1","2","3","4": 1 1 1 2 2 2 3 3 3 4 ...
> > $ NewVar : num  10 9 7 12 11 11 12 11 15 16 ...
> >
> > >fm3<-gls(Yield~-1+Variety + Density, xmp11.07)
> > >fm4<-gls(Yield~-1+Variety + Density+NewVar, Newxmp11.07)
>
>Why are you using gls for these models?
>
>Why do you not generate a factor to achieve the cell means model that
>(apparently) you want?
>
> > >fm3
> > Coefficients:
> > Variety1  Variety2  Variety3  Density2  Density3  Density4
> > 8.922222  9.797222 15.713889  2.911111  4.300000  2.433333
> >
> > Degrees of freedom: 36 total; 30 residual
> > Residual standard error: 1.239243
> >
> > >fm4
> > Coefficients:
> >    Variety1    Variety2    Variety3    Density2    Density3   
Density4
> > 8.75757265  9.70589316 15.43347009  2.88152564  4.32186752  2.40117521
> >      NewVar
> > 0.01157692
> >
> > Degrees of freedom: 36 total; 29 residual
> > Residual standard error: 1.252991
> >
> > fm4 gives me the mean of the group means for all the varieties but
> > apparently it gives me the treatment contrasts for the densities.  If
> > I change the order of the factors in the model specification I get
>
>Please read what I wrote.
>
> > >coef(fm5<-gls(Yield~-1+ Density+Variety+NewVar, Newxmp11.07))
> >    Density1    Density2    Density3    Density4    Variety2   
Variety3
> > 8.75757265 11.63909829 13.07944017 11.15874786  0.94832051  6.67589744
> >      NewVar
> > 0.01157692
> >
> > This, just like fm4 will include the original intercept value in
> > Density 1 which is not the actual density 1 mean.  What am I missing?
> > I am sorry if these questions are very basic but I want to make sure
> > that I understand what I am doing. I guess that this is the price that
> > I am paying for having used in the past packages like SAS where you
> > just ask for lsmeans and the software will give you a "black
box"
> > answer!
>
>Yes.  I think it would be interesting to ask people who use the
>results from lsmeans to explain what the results represent.  My guess
>is that less than 1% of the people who use lsmeans know what they in
>fact are.
>
>I won't be able to continue this conversation.  I am very busy right
>putting new facilities into R.  Perhaps others on the list will be
>able to respond to your questions.
>
_________________________________________________________________
Try MSN Messenger 6.0 with integrated webcam functionality!

Maybe Matching Threads

Search for more possibly parallel threads

R help - Oct 2003 - Example of cell means model

[R] Example of cell means model

[R] Example of cell means model

[R] Example of cell means model

Maybe Matching Threads