Adaikalavan Ramasamy
2005-Jun-10 17:35 UTC
[R] ANOVA vs REML approach to variance component estimation
Can anyone verify my calculations below or explain why they are wrong ?
I have several animals that were measured thrice. The only blocking
variable is the animal itself. I am interested in calculating the
between and within object variations in R. An artificial example :
y <- c( 2.2, -1.4, -0.5, # animal 1
-0.3 -2.1 1.5, # animal 2
1.3 -0.3 0.5, # animal 3
-1.4 -0.2 1.8) # animal 4
ID <- factor( rep(1:4, each=3) )
1) Using the ANOVA method
summary(aov( y ~ ID ))
Df Sum Sq Mean Sq F value Pr(>F)
ID 3 0.900 0.300 0.1207 0.9453
Residuals 8 19.880 2.485
=> within animal variation = 2.485
=> between animal variation = (0.300 - 2.485)/3 = -0.7283
I am aware that ANOVA can give negative estimates for variances. Is this
such a case or have I coded wrongly ?
2) Using the REML approach
library(nlme)
lme( y ~ 1, rand = ~ 1 | ID)
....
Random effects:
Formula: ~1 | ID
(Intercept) Residual
StdDev: 0.01629769 1.374438
=> within animal variation = 1.374438^2 = 1.88908
=> between animal variation = 0.01629769^2 = 0.0002656147
Is this the correct way of coding for this problem ? I do not have
access to a copy of Pinheiro & Bates at the moment.
Thank you very much in advance.
Regards, Adai
Chuck Cleland
2005-Jun-10 19:10 UTC
[R] ANOVA vs REML approach to variance component estimation
They look fine to me. Also, note varcomp() in the ape package and
VarCorr() in the nlme package. I think in this case the ANOVA estimate
of the intercept variance component is negative because the true value
is close to zero.
> y <- c( 2.2, -1.4, -0.5, # animal 1
+ -0.3, -2.1, 1.5, # animal 2
+ 1.3, -0.3, 0.5, # animal 3
+ -1.4, -0.2, 1.8) # animal 4
> ID <- factor( rep(1:4, each=3) )
> library(nlme)
> library(ape)
> summary(aov(y ~ ID))
Df Sum Sq Mean Sq F value Pr(>F)
ID 3 0.9625 0.3208 0.1283 0.9406
Residuals 8 20.0067 2.5008
> (0.3208 - 2.5008) / 3
[1] -0.7266667
> varcomp(lme(y ~ 1, random = ~ 1 | ID))
ID Within
0.0002709644 1.9062505816
attr(,"class")
[1] "varcomp"
> VarCorr(lme(y ~ 1, random = ~ 1 | ID))
ID = pdLogChol(1)
Variance StdDev
(Intercept) 0.0002709644 0.01646100
Residual 1.9062505816 1.38067034
Adaikalavan Ramasamy wrote:> Can anyone verify my calculations below or explain why they are wrong ?
>
> I have several animals that were measured thrice. The only blocking
> variable is the animal itself. I am interested in calculating the
> between and within object variations in R. An artificial example :
>
> y <- c( 2.2, -1.4, -0.5, # animal 1
> -0.3 -2.1 1.5, # animal 2
> 1.3 -0.3 0.5, # animal 3
> -1.4 -0.2 1.8) # animal 4
> ID <- factor( rep(1:4, each=3) )
>
>
> 1) Using the ANOVA method
>
> summary(aov( y ~ ID ))
> Df Sum Sq Mean Sq F value Pr(>F)
> ID 3 0.900 0.300 0.1207 0.9453
> Residuals 8 19.880 2.485
>
> => within animal variation = 2.485
> => between animal variation = (0.300 - 2.485)/3 = -0.7283
>
> I am aware that ANOVA can give negative estimates for variances. Is this
> such a case or have I coded wrongly ?
>
>
> 2) Using the REML approach
>
> library(nlme)
> lme( y ~ 1, rand = ~ 1 | ID)
> ....
> Random effects:
> Formula: ~1 | ID
> (Intercept) Residual
> StdDev: 0.01629769 1.374438
>
> => within animal variation = 1.374438^2 = 1.88908
> => between animal variation = 0.01629769^2 = 0.0002656147
>
> Is this the correct way of coding for this problem ? I do not have
> access to a copy of Pinheiro & Bates at the moment.
>
> Thank you very much in advance.
>
> Regards, Adai
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html
>
--
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 452-1424 (M, W, F)
fax: (917) 438-0894
Adaikalavan Ramasamy
2005-Jun-12 23:30 UTC
[R] ANOVA vs REML approach to variance component estimation
Thank you for confirming this and introducing me to varcomp().
I have another question that I hope you or someone else can help me
with. I was trying to generalise my codes for variable measurement
levels and discovered that lme() was estimating the within group
variance even with a single measure per subject for all subjects !
Here is an example where we have 12 animals but with single measurement.
y <- c(2.2, -1.4, -0.5, -0.3, -2.1, 1.5,
1.3, -0.3, 0.5, -1.4, -0.2, 1.8)
ID <- factor( 1:12 )
Analysis of variance method correctly says that there is no residual
variance and it equals to total variance.
summary(aov(y ~ ID))
Df Sum Sq Mean Sq
ID 11 20.9692 1.9063
However the REML method is giving me a within animal variance when there
is no replication at animal level. It seems like I can get components of
variance for factors that are not replicated.
library(ape)
varcomp(lme(y ~ 1, random = ~ 1 | ID))
ID Within
1.6712661 0.2350218
Am I reading this correct and can someone kindly explain this to me ?
Thank you again.
Regards, Adai
On Fri, 2005-06-10 at 15:10 -0400, Chuck Cleland wrote:> They look fine to me. Also, note varcomp() in the ape package and
> VarCorr() in the nlme package. I think in this case the ANOVA estimate
> of the intercept variance component is negative because the true value
> is close to zero.
>
> > y <- c( 2.2, -1.4, -0.5, # animal 1
> + -0.3, -2.1, 1.5, # animal 2
> + 1.3, -0.3, 0.5, # animal 3
> + -1.4, -0.2, 1.8) # animal 4
>
> > ID <- factor( rep(1:4, each=3) )
>
> > library(nlme)
> > library(ape)
>
> > summary(aov(y ~ ID))
> Df Sum Sq Mean Sq F value Pr(>F)
> ID 3 0.9625 0.3208 0.1283 0.9406
> Residuals 8 20.0067 2.5008
>
> > (0.3208 - 2.5008) / 3
> [1] -0.7266667
>
> > varcomp(lme(y ~ 1, random = ~ 1 | ID))
> ID Within
> 0.0002709644 1.9062505816
> attr(,"class")
> [1] "varcomp"
>
> > VarCorr(lme(y ~ 1, random = ~ 1 | ID))
> ID = pdLogChol(1)
> Variance StdDev
> (Intercept) 0.0002709644 0.01646100
> Residual 1.9062505816 1.38067034
>
> Adaikalavan Ramasamy wrote:
> > Can anyone verify my calculations below or explain why they are wrong
?
> >
> > I have several animals that were measured thrice. The only blocking
> > variable is the animal itself. I am interested in calculating the
> > between and within object variations in R. An artificial example :
> >
> > y <- c( 2.2, -1.4, -0.5, # animal 1
> > -0.3 -2.1 1.5, # animal 2
> > 1.3 -0.3 0.5, # animal 3
> > -1.4 -0.2 1.8) # animal 4
> > ID <- factor( rep(1:4, each=3) )
> >
> >
> > 1) Using the ANOVA method
> >
> > summary(aov( y ~ ID ))
> > Df Sum Sq Mean Sq F value Pr(>F)
> > ID 3 0.900 0.300 0.1207 0.9453
> > Residuals 8 19.880 2.485
> >
> > => within animal variation = 2.485
> > => between animal variation = (0.300 - 2.485)/3 = -0.7283
> >
> > I am aware that ANOVA can give negative estimates for variances. Is
this
> > such a case or have I coded wrongly ?
> >
> >
> > 2) Using the REML approach
> >
> > library(nlme)
> > lme( y ~ 1, rand = ~ 1 | ID)
> > ....
> > Random effects:
> > Formula: ~1 | ID
> > (Intercept) Residual
> > StdDev: 0.01629769 1.374438
> >
> > => within animal variation = 1.374438^2 = 1.88908
> > => between animal variation = 0.01629769^2 = 0.0002656147
> >
> > Is this the correct way of coding for this problem ? I do not have
> > access to a copy of Pinheiro & Bates at the moment.
> >
> > Thank you very much in advance.
> >
> > Regards, Adai
> >
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html
> >
>