Adaikalavan Ramasamy
2005-Jun-10 17:35 UTC
[R] ANOVA vs REML approach to variance component estimation
Can anyone verify my calculations below or explain why they are wrong ? I have several animals that were measured thrice. The only blocking variable is the animal itself. I am interested in calculating the between and within object variations in R. An artificial example : y <- c( 2.2, -1.4, -0.5, # animal 1 -0.3 -2.1 1.5, # animal 2 1.3 -0.3 0.5, # animal 3 -1.4 -0.2 1.8) # animal 4 ID <- factor( rep(1:4, each=3) ) 1) Using the ANOVA method summary(aov( y ~ ID )) Df Sum Sq Mean Sq F value Pr(>F) ID 3 0.900 0.300 0.1207 0.9453 Residuals 8 19.880 2.485 => within animal variation = 2.485 => between animal variation = (0.300 - 2.485)/3 = -0.7283 I am aware that ANOVA can give negative estimates for variances. Is this such a case or have I coded wrongly ? 2) Using the REML approach library(nlme) lme( y ~ 1, rand = ~ 1 | ID) .... Random effects: Formula: ~1 | ID (Intercept) Residual StdDev: 0.01629769 1.374438 => within animal variation = 1.374438^2 = 1.88908 => between animal variation = 0.01629769^2 = 0.0002656147 Is this the correct way of coding for this problem ? I do not have access to a copy of Pinheiro & Bates at the moment. Thank you very much in advance. Regards, Adai
Chuck Cleland
2005-Jun-10 19:10 UTC
[R] ANOVA vs REML approach to variance component estimation
They look fine to me. Also, note varcomp() in the ape package and VarCorr() in the nlme package. I think in this case the ANOVA estimate of the intercept variance component is negative because the true value is close to zero. > y <- c( 2.2, -1.4, -0.5, # animal 1 + -0.3, -2.1, 1.5, # animal 2 + 1.3, -0.3, 0.5, # animal 3 + -1.4, -0.2, 1.8) # animal 4 > ID <- factor( rep(1:4, each=3) ) > library(nlme) > library(ape) > summary(aov(y ~ ID)) Df Sum Sq Mean Sq F value Pr(>F) ID 3 0.9625 0.3208 0.1283 0.9406 Residuals 8 20.0067 2.5008 > (0.3208 - 2.5008) / 3 [1] -0.7266667 > varcomp(lme(y ~ 1, random = ~ 1 | ID)) ID Within 0.0002709644 1.9062505816 attr(,"class") [1] "varcomp" > VarCorr(lme(y ~ 1, random = ~ 1 | ID)) ID = pdLogChol(1) Variance StdDev (Intercept) 0.0002709644 0.01646100 Residual 1.9062505816 1.38067034 Adaikalavan Ramasamy wrote:> Can anyone verify my calculations below or explain why they are wrong ? > > I have several animals that were measured thrice. The only blocking > variable is the animal itself. I am interested in calculating the > between and within object variations in R. An artificial example : > > y <- c( 2.2, -1.4, -0.5, # animal 1 > -0.3 -2.1 1.5, # animal 2 > 1.3 -0.3 0.5, # animal 3 > -1.4 -0.2 1.8) # animal 4 > ID <- factor( rep(1:4, each=3) ) > > > 1) Using the ANOVA method > > summary(aov( y ~ ID )) > Df Sum Sq Mean Sq F value Pr(>F) > ID 3 0.900 0.300 0.1207 0.9453 > Residuals 8 19.880 2.485 > > => within animal variation = 2.485 > => between animal variation = (0.300 - 2.485)/3 = -0.7283 > > I am aware that ANOVA can give negative estimates for variances. Is this > such a case or have I coded wrongly ? > > > 2) Using the REML approach > > library(nlme) > lme( y ~ 1, rand = ~ 1 | ID) > .... > Random effects: > Formula: ~1 | ID > (Intercept) Residual > StdDev: 0.01629769 1.374438 > > => within animal variation = 1.374438^2 = 1.88908 > => between animal variation = 0.01629769^2 = 0.0002656147 > > Is this the correct way of coding for this problem ? I do not have > access to a copy of Pinheiro & Bates at the moment. > > Thank you very much in advance. > > Regards, Adai > > ______________________________________________ > R-help at stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html >-- Chuck Cleland, Ph.D. NDRI, Inc. 71 West 23rd Street, 8th floor New York, NY 10010 tel: (212) 845-4495 (Tu, Th) tel: (732) 452-1424 (M, W, F) fax: (917) 438-0894
Adaikalavan Ramasamy
2005-Jun-12 23:30 UTC
[R] ANOVA vs REML approach to variance component estimation
Thank you for confirming this and introducing me to varcomp(). I have another question that I hope you or someone else can help me with. I was trying to generalise my codes for variable measurement levels and discovered that lme() was estimating the within group variance even with a single measure per subject for all subjects ! Here is an example where we have 12 animals but with single measurement. y <- c(2.2, -1.4, -0.5, -0.3, -2.1, 1.5, 1.3, -0.3, 0.5, -1.4, -0.2, 1.8) ID <- factor( 1:12 ) Analysis of variance method correctly says that there is no residual variance and it equals to total variance. summary(aov(y ~ ID)) Df Sum Sq Mean Sq ID 11 20.9692 1.9063 However the REML method is giving me a within animal variance when there is no replication at animal level. It seems like I can get components of variance for factors that are not replicated. library(ape) varcomp(lme(y ~ 1, random = ~ 1 | ID)) ID Within 1.6712661 0.2350218 Am I reading this correct and can someone kindly explain this to me ? Thank you again. Regards, Adai On Fri, 2005-06-10 at 15:10 -0400, Chuck Cleland wrote:> They look fine to me. Also, note varcomp() in the ape package and > VarCorr() in the nlme package. I think in this case the ANOVA estimate > of the intercept variance component is negative because the true value > is close to zero. > > > y <- c( 2.2, -1.4, -0.5, # animal 1 > + -0.3, -2.1, 1.5, # animal 2 > + 1.3, -0.3, 0.5, # animal 3 > + -1.4, -0.2, 1.8) # animal 4 > > > ID <- factor( rep(1:4, each=3) ) > > > library(nlme) > > library(ape) > > > summary(aov(y ~ ID)) > Df Sum Sq Mean Sq F value Pr(>F) > ID 3 0.9625 0.3208 0.1283 0.9406 > Residuals 8 20.0067 2.5008 > > > (0.3208 - 2.5008) / 3 > [1] -0.7266667 > > > varcomp(lme(y ~ 1, random = ~ 1 | ID)) > ID Within > 0.0002709644 1.9062505816 > attr(,"class") > [1] "varcomp" > > > VarCorr(lme(y ~ 1, random = ~ 1 | ID)) > ID = pdLogChol(1) > Variance StdDev > (Intercept) 0.0002709644 0.01646100 > Residual 1.9062505816 1.38067034 > > Adaikalavan Ramasamy wrote: > > Can anyone verify my calculations below or explain why they are wrong ? > > > > I have several animals that were measured thrice. The only blocking > > variable is the animal itself. I am interested in calculating the > > between and within object variations in R. An artificial example : > > > > y <- c( 2.2, -1.4, -0.5, # animal 1 > > -0.3 -2.1 1.5, # animal 2 > > 1.3 -0.3 0.5, # animal 3 > > -1.4 -0.2 1.8) # animal 4 > > ID <- factor( rep(1:4, each=3) ) > > > > > > 1) Using the ANOVA method > > > > summary(aov( y ~ ID )) > > Df Sum Sq Mean Sq F value Pr(>F) > > ID 3 0.900 0.300 0.1207 0.9453 > > Residuals 8 19.880 2.485 > > > > => within animal variation = 2.485 > > => between animal variation = (0.300 - 2.485)/3 = -0.7283 > > > > I am aware that ANOVA can give negative estimates for variances. Is this > > such a case or have I coded wrongly ? > > > > > > 2) Using the REML approach > > > > library(nlme) > > lme( y ~ 1, rand = ~ 1 | ID) > > .... > > Random effects: > > Formula: ~1 | ID > > (Intercept) Residual > > StdDev: 0.01629769 1.374438 > > > > => within animal variation = 1.374438^2 = 1.88908 > > => between animal variation = 0.01629769^2 = 0.0002656147 > > > > Is this the correct way of coding for this problem ? I do not have > > access to a copy of Pinheiro & Bates at the moment. > > > > Thank you very much in advance. > > > > Regards, Adai > > > > ______________________________________________ > > R-help at stat.math.ethz.ch mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html > > >