Hello,

is it possible to obtain type III sums of squares for a nested model as
in the following:

lmod <- lm(resp ~ A * B + (C %in% A), mydata)

I have tried

library(car)
Anova(lmod, type="III")

but this gives me an error (and I also understand from the documentation
of Anova as well as from a previous request
(http://finzi.psych.upenn.edu/R/Rhelp02a/archive/64477.html) that it is
not possible to specify nested models with car's Anova).

anova(lmod) works, of course.

My data (given below) is balanced so I expect the results to be similar
for both type I and type III sums of squares. But are they *exactly* the
same? The editor of the journal which I'm sending my manuscript to
requests what he calls "conventional" type III tests and I'm not sure if
I can convince him to accept my type I analysis.

R> mydata
   A B  C  resp
1  1 1  1 34.12
2  1 1  2 32.45
3  1 1  3 44.55
4  1 2  1 20.88
5  1 2  2 22.32
6  1 2  3 27.71
7  2 1  6 38.20
8  2 1  7 31.62
9  2 1  8 38.71
10 2 2  6 18.93
11 2 2  7 20.57
12 2 2  8 31.55
13 3 1  9 40.81
14 3 1 10 42.23
15 3 1 11 41.26
16 3 2  9 28.41
17 3 2 10 24.07
18 3 2 11 21.16

Thanks a lot,

Carsten
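A minimal sketch for readers who want to reproduce the thread: it rebuilds the posted data as a data frame and checks the two properties the question relies on, balance and nesting of C in A. Treating A, B and C as factors is an assumption here (the printout alone does not show the column types), and the names cell_counts and levels_of_A_per_C are chosen only for illustration.

```r
# Posted data, with A, B and C assumed to be factors
mydata <- data.frame(
  A = factor(rep(1:3, each = 6)),
  B = factor(rep(rep(1:2, each = 3), times = 3)),
  C = factor(c(rep(1:3, 2), rep(6:8, 2), rep(9:11, 2))),
  resp = c(34.12, 32.45, 44.55, 20.88, 22.32, 27.71,
           38.20, 31.62, 38.71, 18.93, 20.57, 31.55,
           40.81, 42.23, 41.26, 28.41, 24.07, 21.16)
)

# Balance: number of observations in every A x B cell
cell_counts <- table(mydata$A, mydata$B)
print(cell_counts)

# Nesting: number of distinct A levels in which each level of C occurs
levels_of_A_per_C <- rowSums(table(mydata$C, mydata$A) > 0)
print(levels_of_A_per_C)
```

If every cell count is equal and every level of C occurs in only one level of A, the design is balanced with C nested in A, which is the situation discussed in the replies.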
Carsten Jaeger wrote:
> [...]
> My data (given below) is balanced so I expect the results to be similar
> for both type I and type III sums of squares. But are they *exactly* the
> same? The editor of the journal which I'm sending my manuscript to
> requests what he calls "conventional" type III tests and I'm not sure if
> I can convince him to accept my type I analysis.

In balanced designs, type I-IV SSD's are all identical.

However, I don't think the model does what I think you think it does.

Notice that "nesting" is used with two different meanings. In R it means
that the coding of C only makes sense within levels of A - e.g. if the
levels were numbered 1:3 within each group, but with C==1 when A==1
having nothing to do with C==1 when A==2. SAS does something, er, else...

What I think you want is a model where C is a random term, so that main
effects of A can be tested, like in

> summary(aov(resp ~ A * B + Error(C), dd))

Error: C
          Df  Sum Sq Mean Sq F value Pr(>F)
A          2  33.123  16.562  0.4981 0.6308
Residuals  6 199.501  33.250

Error: Within
          Df Sum Sq Mean Sq F value   Pr(>F)
B          1 915.21  915.21 83.7846 9.57e-05 ***
A:B        2  16.13    8.07  0.7384   0.5168
Residuals  6  65.54   10.92
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(This is essentially the same structure as Martin Bleichner had earlier
today, also @web.de. What is this? An epidemic? ;-))

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907
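The statement that the types of sums of squares coincide in a balanced design can be checked on the posted data. The sketch below is only an illustration, not part of the original reply: it compares sequential (type I) sums of squares for A and B when the two main effects are entered in either order. The data frame is rebuilt from the original post with A, B and C assumed to be factors, and the names tab_ab, tab_ba, ss_A and ss_B are hypothetical.

```r
# Posted data, with A, B and C assumed to be factors
mydata <- data.frame(
  A = factor(rep(1:3, each = 6)),
  B = factor(rep(rep(1:2, each = 3), times = 3)),
  C = factor(c(rep(1:3, 2), rep(6:8, 2), rep(9:11, 2))),
  resp = c(34.12, 32.45, 44.55, 20.88, 22.32, 27.71,
           38.20, 31.62, 38.71, 18.93, 20.57, 31.55,
           40.81, 42.23, 41.26, 28.41, 24.07, 21.16)
)

# Sequential (type I) ANOVA tables with the main effects in either order
tab_ab <- anova(lm(resp ~ A * B, data = mydata))
tab_ba <- anova(lm(resp ~ B * A, data = mydata))

# Sum of squares for each main effect under the two orderings
ss_A <- c(A_first = tab_ab["A", "Sum Sq"], B_first = tab_ba["A", "Sum Sq"])
ss_B <- c(A_first = tab_ab["B", "Sum Sq"], B_first = tab_ba["B", "Sum Sq"])
print(ss_A)
print(ss_B)
```

Because A and B are orthogonal here, the sum of squares for each main effect should not depend on whether it is fitted before or after the other, which is why order-dependent (type I) and order-independent tests agree for this design.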
Bill.Venables at csiro.au
2007-Jul-10 12:44 UTC
[R] type III ANOVA for a nested linear model
The message from this cute little data set is very clear. Consider

> fm <- aov(resp ~ A*B + A/C, mydata)
> drop1(fm, test = "F")
Single term deletions

Model:
resp ~ A * B + A/C
       Df Sum of Sq     RSS    AIC F value  Pr(F)
<none>               65.540 47.261
A:B     2    16.132  81.672 47.222  0.7384 0.5168
A:C     6   199.501 265.041 60.411  3.0440 0.1007

So neither of the non-marginal terms is significant. To address
questions about the main effects, the natural next step is to remove the
interactions. By orthogonality you can safely cut a few corners and do
both at once:

> drop1(update(fm, .~A+B), test = "F")
Single term deletions

Model:
resp ~ A + B
       Df Sum of Sq     RSS   AIC F value     Pr(F)
<none>               281.17 57.47
A       2     33.12  314.30 55.48  0.8246    0.4586
B       1    915.21 1196.38 81.54 45.5695 9.311e-06

There is a very obvious, even trivial, B main effect, but nothing else.
All this becomes even more glaring if you take the unusual step of
plotting the data.

What sort of editor would overlook this clear and demonstrable message
leaping out from the data in favour of some arcane argument about "types
of sums of squares"? Several answers come to mind: a power freak, a SAS
aficionado, an idiot.

If you get nowhere with this editor, my suggestion, hard as it may seem,
is that you do not submit to that kind of mindless ideology and make
fatuous compromises for the sake of immediate publication. If necessary,
part company with that editor and find somewhere else to publish where
the editor has some inkling of what statistical inference is all about.

Bill Venables.
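The "unusual step of plotting the data" can be sketched as follows. This is one possible display, not necessarily the one the author had in mind: cell means by A and B, the marginal means of B, and a base-graphics interaction plot. The data frame is rebuilt from the original post with A, B and C assumed to be factors; cell_means and b_means are illustrative names.

```r
# Posted data, with A, B and C assumed to be factors
mydata <- data.frame(
  A = factor(rep(1:3, each = 6)),
  B = factor(rep(rep(1:2, each = 3), times = 3)),
  C = factor(c(rep(1:3, 2), rep(6:8, 2), rep(9:11, 2))),
  resp = c(34.12, 32.45, 44.55, 20.88, 22.32, 27.71,
           38.20, 31.62, 38.71, 18.93, 20.57, 31.55,
           40.81, 42.23, 41.26, 28.41, 24.07, 21.16)
)

# Mean response in each A x B cell
cell_means <- aggregate(resp ~ A + B, data = mydata, FUN = mean)
print(cell_means)

# Marginal means of the two levels of B
b_means <- tapply(mydata$resp, mydata$B, mean)
print(b_means)

# One trace per level of B across the levels of A
with(mydata, interaction.plot(A, B, resp, ylab = "mean of resp"))
```

Two well-separated, roughly parallel traces would correspond to the reading given above: a large B effect with little evidence of an A effect or an A:B interaction.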
-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Carsten Jaeger
Sent: Tuesday, 10 July 2007 4:15 AM
To: R help list
Subject: [R] type III ANOVA for a nested linear model

[...]

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
The aliasing problem arises in alias(), which is what Anova uses to
detect aliasing. It is simply the fact that anova more or less blithely
ignores the NA's that makes anova behave apparently more 'sensibly' than
Anova.

But like Carsten, I found this difficult to understand. Unordered factors
are supposed to be arbitrary. I can understand why A:C behaves
differently from A:D (where D<-factor(rep(1:3,6))). A:C generates a
27-level factor with only 9 levels populated; A:D has only nine levels.

But it is not so simple. I was going to suggest using AC<-factor(A:C)
instead of C %in% A as a fix, as this generates a 9-level factor with all
levels populated - just as A:D does. But

> alias(lm(resp~A+AC))

generates an alias, where

> alias(lm(resp~A+A:D))

does not. That seemed to me entirely bizarre. table(A,AC) and
table(A,A:D) show identical balance and nesting. Something, somewhere, is
expanding the two differently. But what?

model.matrix is the 'culprit', I think. The two model matrices differ;
model.matrix(resp~A+A:D) has 9 columns, and model.matrix(resp~A+AC) has
eleven.

So now I know what has happened. What I don't understand is why, except
that 'that is what model.matrix does with this factor level numbering'
(refactoring either AC or A:D via as.numeric finally generates identical
- and aliased - behaviour).

I think I am going to invoke the law of unintended consequences and go
find a cold compress.

Steve E

>>> Peter Dalgaard <P.Dalgaard at biostat.ku.dk> 11/07/2007 16:03:13 >>>
>A term C %in% A (or A/C) is not a _specification_ that C is nested in
>A, it is a _directive_ to include the terms A and C:A. Now, C:A involves
>a term for each combination of A and C, of which many are empty if C is
>strictly finer than A. This may well be what is confusing Anova().
>
>In fact, with this (c(1:3,6:11)) coding of C, A:C is completely
>equivalent to C, but if you look at summary(lm(....)) you will see a lot
>of NA coefficients in the A:C case. If you use resp ~ A*B+C, then you
>still get a couple of missing coefficients in the C terms because of
>collinearity with the A terms. (Notice that this is one case where the
>order inside the model formula will matter; C+A*B is not the same.)
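The column counts mentioned above can be reproduced with a short sketch. It builds the factors as described in the thread and compares the number of columns of each design matrix with its numerical rank; a matrix with more columns than rank has aliased coefficients. One-sided formulas are used here because the design matrix does not depend on the response; that choice, and the names mm_AD and mm_AC, are assumptions made for illustration.

```r
# Factors as described in the thread
A  <- factor(rep(1:3, each = 6))
C  <- factor(c(rep(1:3, 2), rep(6:8, 2), rep(9:11, 2)))
D  <- factor(rep(1:3, 6))   # C relabelled 1:3 within each level of A
AC <- factor(A:C)           # drops the unused levels of the interaction

# Number of levels before and after dropping unused combinations
print(c(levels_A_by_C = nlevels(A:C), levels_AC = nlevels(AC)))

# Design matrices for the two parameterisations
mm_AD <- model.matrix(~ A + A:D)
mm_AC <- model.matrix(~ A + AC)

# Columns versus numerical rank of each design matrix
print(c(cols_AD = ncol(mm_AD), rank_AD = qr(mm_AD)$rank))
print(c(cols_AC = ncol(mm_AC), rank_AC = qr(mm_AC)$rank))
```

In A + A:D, R codes D by contrasts within each level of A because the A margin is already in the model, so no redundant columns arise. With A + AC, AC is treated as an ordinary factor with its own set of contrast columns, and some of those are linear combinations of the intercept and the A columns.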
Mendiburu, Felipe (CIP)
2007-Jul-12 14:08 UTC
[R] type III ANOVA for a nested linear model
Dear Carsten,

In this experiment, factor B appears to act as a block (or replicate)
factor, judging by how the levels of A, B and C are arranged in the data.
Factor C is nested in A, so the model should include B, A and C nested in
A; what remains is the error.

Model:           Df
  B              1
  A              2
  C(A)           6
  Error          (2+6)*1 = 8
  Total          17

mydata <- read.table("mydata.txt", header=TRUE)
mydata[,1] <- as.factor(mydata[,1])
mydata[,2] <- as.factor(mydata[,2])
mydata[,3] <- as.factor(mydata[,3])
model <- aov(resp ~ B + A + C/A, mydata)
summary(model)

            Df Sum Sq Mean Sq F value    Pr(>F)
B            1 915.21  915.21 89.6476 1.274e-05 ***
A            2  33.12   16.56  1.6223   0.25621
C            6 199.50   33.25  3.2570   0.06316 .
Residuals    8  81.67   10.21

Best regards,

Felipe de Mendiburu
Statistician
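The degrees of freedom in the breakdown above can be derived from the numbers of factor levels. The following is a worked sketch under the assumption that each level of A contains the same number of levels of C, as in the posted data; all variable names are illustrative.

```r
# Posted data, with A, B and C assumed to be factors
mydata <- data.frame(
  A = factor(rep(1:3, each = 6)),
  B = factor(rep(rep(1:2, each = 3), times = 3)),
  C = factor(c(rep(1:3, 2), rep(6:8, 2), rep(9:11, 2))),
  resp = c(34.12, 32.45, 44.55, 20.88, 22.32, 27.71,
           38.20, 31.62, 38.71, 18.93, 20.57, 31.55,
           40.81, 42.23, 41.26, 28.41, 24.07, 21.16)
)

n_A <- nlevels(mydata$A)               # levels of A
n_B <- nlevels(mydata$B)               # levels of B (blocks)
n_C_per_A <- nlevels(mydata$C) / n_A   # levels of C within each level of A

df_B      <- n_B - 1
df_A      <- n_A - 1
df_C_in_A <- n_A * (n_C_per_A - 1)
df_total  <- nrow(mydata) - 1
df_error  <- df_total - df_B - df_A - df_C_in_A

print(c(B = df_B, A = df_A, "C(A)" = df_C_in_A,
        Error = df_error, Total = df_total))
```

The error degrees of freedom obtained by subtraction agree with the product form (df of A plus df of C(A)) times df of B, because the error term here consists of the interactions of B with A and with C within A.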