Hi,

I am trying a two-way ANOVA with unequal sample sizes, but the results are not as expected.

I take the example from Applied Linear Statistical Models (Neter et al., pp. 889-897, 1996):

growth rate   gender   bone development
1.4           1        1
2.4           1        1
2.2           1        1
2.4           1        2
2.1           2        1
1.7           2        1
2.5           2        2
1.8           2        2
2             2        2
0.7           3        1
1.1           3        1
0.5           3        2
0.9           3        2
1.3           3        2

The expected results are:

source of variation   SS       df   MS       F
gender                0.12     1    0.12     0.74
bone development      4.1897   2    2.0949   12.89**
interaction           0.0754   2    0.0377   0.23
Error                 1.3      8    0.1625

# I use
aov (growrate ~ gender * bonedevelopment)->m
summary(m)

                                 Df Sum Sq Mean Sq F value   Pr(>F)
as.factor(gender)                 2 4.3063  2.1531 13.2501 0.002891 **
as.factor(bonedevlopment)         1 0.0926  0.0926  0.5697 0.472022
as.factor(gender:bonedevlopment)  2 0.0754  0.0377  0.2321 0.798034
Residuals                         8 1.3000  0.1625

# if I change the order of the factors, the results are different
aov (growrate ~ bonedevelopment * gender)->m
summary(m)

                                 Df Sum Sq Mean Sq F value   Pr(>F)
as.factor(bonedevlopment)         1 0.0029  0.0029  0.0176 0.897785
as.factor(gender)                 2 4.3960  2.1980 13.5262 0.002713 **
as.factor(gender:bonedevlopment)  2 0.0754  0.0377  0.2321 0.798034
Residuals                         8 1.3000  0.1625

# In both cases, the results for the main effects differ from those expected in Neter et al.
However, the interaction and the residuals are estimated correctly.
Can anyone help? Either my formula is wrong, or there is some other problem. Is there an easy way to conduct the test as it is done in Neter et al.?
The same problem occurs with anova(lm(....))?

thank you very much

julien CLAUDE
-------------------------------

CLAUDE julien
Université Montpellier II
Institut des Sciences de l'Evolution de Montpellier
Laboratoire de Paléontologie (Morphométrie), Cc64
2, Place Eugène Bataillon.
34095, Montpellier, cedex 5
FRANCE

Phone : (33) 4 67 14 47 82
Fax : (33) 4 67 14 36 10
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

Is this a problem (perhaps) with the dreaded "SAS type III sums of squares"? I don't know the reference, but the authors may be estimating the effect of dropping the main effects while the interaction terms are still included in the model (which is at the very least controversial, and probably wrong).

The sums of squares estimated for unbalanced linear models necessarily vary according to the order in which the factors are added or dropped. What do the authors say about the order they're using?

See
http://finzi.psych.upenn.edu/R/Rhelp/archive/0913.html
http://finzi.psych.upenn.edu/R/Rhelp/archive/3612.html
and responses to them ...

Ben Bolker

--
318 Carr Hall                                bolker at zoo.ufl.edu
Zoology Department, University of Florida    http://www.zoo.ufl.edu/bolker
Box 118525                                   (ph)  352-392-5697
Gainesville, FL 32611-8525                   (fax) 352-392-3704

On Tue, 16 Oct 2001, julien claude wrote:

> Hi,
>
> I am trying a two way anova with unequal sample sizes but results are not
> as expected:
>
> I take the example from Applied Linear Statistical Models (Neter et al.
> pp889-897, 1996)
>
> growth rate   gender   bone development
> 1.4           1        1
> 2.4           1        1
> 2.2           1        1
> 2.4           1        2
> 2.1           2        1
> 1.7           2        1
> 2.5           2        2
> 1.8           2        2
> 2             2        2
> 0.7           3        1
> 1.1           3        1
> 0.5           3        2
> 0.9           3        2
> 1.3           3        2
>
> expected results are
>
> source of variation   SS       df   MS       F
> gender                0.12     1    0.12     0.74
> bone development      4.1897   2    2.0949   12.89**
> interaction           0.0754   2    0.0377   0.23
> Error                 1.3      8    0.1625

There's something fairly fundamental wrong here. Gender in your data has three levels, but your expected results give only 1 df. If you check the book again you will find you have the column labels wrong. This isn't the main problem, though.

> #In the both cases, results for main effects differ from those expected in
> Neter et al.

Yes, but the book clearly warns you that many software packages don't offer the same choices of sums of squares for unbalanced designs that the authors use. R is one of those many packages.

Neter et al., as they carefully explain, present ANOVA tables that summarise comparisons from a bunch of different models, and on page 895 they show all the sets of models they use to construct their ANOVA table.

	-thomas

Thomas Lumley                   Asst. Professor, Biostatistics
tlumley at u.washington.edu     University of Washington, Seattle
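
Thomas's description of the book's table as a summary of several model comparisons can be tried out in base R. The sketch below is an assumption about which model pairs are involved (it goes by the numbers quoted in this thread, not by page 895 itself): each term's columns are dropped from the sum-to-zero design matrix of the full model while all other columns, including the interaction, are kept. The names d, X, asg, rss and extra_ss are made up for illustration, and the factor labels follow the correction suggested above (three levels for bone development, two for gender).

```r
# Data from the original post, with the labels swapped as suggested in the
# replies: the three-level column is bone development, the two-level one gender.
d <- data.frame(
  growth = c(1.4, 2.4, 2.2, 2.4, 2.1, 1.7, 2.5, 1.8, 2.0,
             0.7, 1.1, 0.5, 0.9, 1.3),
  bone   = factor(c(1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3)),
  gender = factor(c(1, 1, 1, 2, 1, 1, 2, 2, 2, 1, 1, 2, 2, 2))
)

# Design matrix of the full model with sum-to-zero coding.
X <- model.matrix(~ bone * gender, data = d,
                  contrasts.arg = list(bone = "contr.sum",
                                       gender = "contr.sum"))
asg <- attr(X, "assign")  # 0 = intercept, 1 = bone, 2 = gender, 3 = interaction

# Residual sum of squares of a least-squares fit on the selected columns.
rss <- function(cols) deviance(lm(d$growth ~ X[, cols, drop = FALSE] - 1))

# Extra sum of squares when one term's columns are dropped from the full
# model while every other column (including the interaction) is kept.
extra_ss <- function(term) rss(asg != term) - rss(rep(TRUE, length(asg)))

ss <- c(bone = extra_ss(1), gender = extra_ss(2), interaction = extra_ss(3))
print(round(ss, 4))
```

If the assumption holds, the three values should come out close to the 4.1897, 0.12 and 0.0754 in the table quoted from the book. Note that the comparison only makes sense with sum-to-zero (or Helmert) coding; with R's default treatment contrasts the dropped columns mean something different.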

julien claude <claude at isem.univ-montp2.fr> writes:

> Hi,
>
> I am trying a two way anova with unequal sample sizes but results are not
> as expected:
>
> I take the example from Applied Linear Statistical Models (Neter et al.
> pp889-897, 1996)
>
> growth rate   gender   bone development
> 1.4           1        1
> 2.4           1        1
> 2.2           1        1
> 2.4           1        2
> 2.1           2        1
> 1.7           2        1
> 2.5           2        2
> 1.8           2        2
> 2             2        2
> 0.7           3        1
> 1.1           3        1
> 0.5           3        2
> 0.9           3        2
> 1.3           3        2
>
> expected results are
>
> source of variation   SS       df   MS       F
> gender                0.12     1    0.12     0.74
> bone development      4.1897   2    2.0949   12.89**
> interaction           0.0754   2    0.0377   0.23
> Error                 1.3      8    0.1625
>
> # I use
> aov (growrate ~ gender * bonedevelopment)->m
> summary(m)
>
> Df Sum Sq Mean Sq F value Pr(>F)
> as.factor(gender) 2 4.3063 2.1531 13.2501
> 0.002891 **
> as.factor(bonedevlopment) 1 0.0926 0.0926 0.5697
> 0.472022
> as.factor(gender:bonedevlopment) 2 0.0754 0.0377 0.2321 0.798034
> Residuals 8 1.3000 0.1625

Ahem. Tab damage detected... and your command and output don't match up. The as.factor(gender:bonedevlopment) is playing with fire... You should calculate factor() of each term. However, it would seem that you already did manage to convert things to factors, or you would have gotten something to this effect:

> evalq(as.factor(gender:bone.development),d)
[1] 1
Levels: 1
Warning messages:
1: Numerical expression has 14 elements: only the first used in: gender:bone.development
2: Numerical expression has 14 elements: only the first used in: gender:bone.development

> #if I change the order of factors, results are different
> aov (growrate ~ bonedevelopment * gender)->m
> summary(m)
>
> Df Sum Sq Mean Sq F value
> Pr(>F)
> as.factor(bonedevlopment) 1 0.0029 0.0029 0.0176
> 0.897785
> as.factor(gender) 2 4.3960 2.1980 13.5262 0.002713 **
> as.factor(gender:bonedevlopment) 2 0.0754 0.0377 0.2321 0.798034
> Residuals 8 1.3000 0.1625
>
> #In the both cases, results for main effects differ from those expected in
> Neter et al.
> However interaction and residuals are well estimated.
> Can anyone help, either I am wrong in the formula, or either is there an
> other problem? Is there a mean to conduct easily the test as in it is in
> Neter et al. ?
> The same problems occurs with anova(lm(....))?

I don't think we're the ones with the problem...

There are various boneheaded ways in which people try to assign some kind of SumSq to main effects in the presence of interaction, and they are all wrong - although maybe not very wrong if the imbalance is slight.

The tests *should* depend on the test order, as is most clearly seen if the predictors are highly collinear.

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
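
The order dependence Peter describes is easy to see with the posted data. What follows is only a sketch: the data frame d and the object names are made up for illustration, each column is converted with factor() on its own as he recommends, and the labels follow the correction suggested elsewhere in the thread (three levels for bone development, two for gender).

```r
# Data from the original post; each classification column is turned into a
# factor separately, and the labels are swapped as the replies suggest.
d <- data.frame(
  growth = c(1.4, 2.4, 2.2, 2.4, 2.1, 1.7, 2.5, 1.8, 2.0,
             0.7, 1.1, 0.5, 0.9, 1.3),
  bone   = factor(c(1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3)),
  gender = factor(c(1, 1, 1, 2, 1, 1, 2, 2, 2, 1, 1, 2, 2, 2))
)

# Sequential (type I) tables: each term is adjusted only for the terms
# that precede it in the formula.
seq_bone_first   <- anova(lm(growth ~ bone * gender, data = d))
seq_gender_first <- anova(lm(growth ~ gender * bone, data = d))
print(seq_bone_first)
print(seq_gender_first)

# Main-effect sums of squares under the two orders, side by side.
main_ss <- rbind(
  bone_first   = seq_bone_first[c("bone", "gender"), "Sum Sq"],
  gender_first = seq_gender_first[c("bone", "gender"), "Sum Sq"]
)
colnames(main_ss) <- c("bone", "gender")
print(round(main_ss, 4))
```

The main-effect rows should match the two summary(m) outputs in the original post, while the interaction and residual rows are the same under either order because those terms always enter last.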

Dear Julien,

At 09:33 PM 16/10/2001 +0200, julien claude wrote:
>Hi,
>
>I am trying a two way anova with unequal sample sizes but results are not
>as expected:
>
>I take the example from Applied Linear Statistical Models (Neter et al.
>pp889-897, 1996)
>
>growth rate   gender   bone development
>1.4           1        1
>2.4           1        1
>2.2           1        1
>2.4           1        2
>2.1           2        1
>1.7           2        1
>2.5           2        2
>1.8           2        2
>2             2        2
>0.7           3        1
>1.1           3        1
>0.5           3        2
>0.9           3        2
>1.3           3        2

You've apparently reversed the data columns for gender and bone development.

>expected results are
>
>source of variation   SS       df   MS       F
>gender                0.12     1    0.12     0.74
>bone development      4.1897   2    2.0949   12.89**
>interaction           0.0754   2    0.0377   0.23
>Error                 1.3      8    0.1625
>
># I use
>aov (growrate ~ gender * bonedevelopment)->m
>summary(m)
>
>                                 Df Sum Sq Mean Sq F value   Pr(>F)
>as.factor(gender)                 2 4.3063  2.1531 13.2501 0.002891 **
>as.factor(bonedevlopment)         1 0.0926  0.0926  0.5697 0.472022
>as.factor(gender:bonedevlopment)  2 0.0754  0.0377  0.2321 0.798034
>Residuals                         8 1.3000  0.1625
>
>#if I change the order of factors, results are different
>aov (growrate ~ bonedevelopment * gender)->m
>summary(m)
>
>                                 Df Sum Sq Mean Sq F value   Pr(>F)
>as.factor(bonedevlopment)         1 0.0029  0.0029  0.0176 0.897785
>as.factor(gender)                 2 4.3960  2.1980 13.5262 0.002713 **
>as.factor(gender:bonedevlopment)  2 0.0754  0.0377  0.2321 0.798034
>Residuals                         8 1.3000  0.1625
>
>#In the both cases, results for main effects differ from those expected in
>Neter et al.
>However interaction and residuals are well estimated.
>Can anyone help, either I am wrong in the formula, or either is there an
>other problem? Is there a mean to conduct easily the test as in it is in
>Neter et al. ?
>The same problems occurs with anova(lm(....))?

The problem is that the analysis in Neter et al. uses what are sometimes called "type III" sums of squares -- that is, testing each term in the model 'after' all others (including main effects 'after' interactions to which the main effects are marginal).
Some would argue that this *never* makes sense, but it *certainly* doesn't make sense if you use "contr.treatment" to code contrasts for factors, which is the default in R. To get the results in Neter, you can use contr.sum or contr.helmert with lm, along with the Anova function in the car package:

> library(car)
>
> Anova(lm(growth.rate ~ gender * bone,
+     contrasts=list(gender='contr.sum', bone='contr.sum')),
+     type='III')
Anova Table (Type III tests)

Response: growth.rate
            Sum Sq Df  F value    Pr(>F)
(Intercept) 34.680  1 213.4154 4.729e-07 ***
gender       0.120  1   0.7385  0.415160
bone         4.190  2  12.8914  0.003145 **
gender:bone  0.075  2   0.2321  0.798034
Residuals    1.300  8
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
>
> Anova(lm(growth.rate ~ gender * bone,
+     contrasts=list(gender='contr.helmert', bone='contr.helmert')),
+     type='III')
Anova Table (Type III tests)

Response: growth.rate
            Sum Sq Df  F value    Pr(>F)
(Intercept) 34.680  1 213.4154 4.729e-07 ***
gender       0.120  1   0.7385  0.415160
bone         4.190  2  12.8914  0.003145 **
gender:bone  0.075  2   0.2321  0.798034
Residuals    1.300  8
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
>

You might think about whether you really prefer this analysis to one that obeys marginality (producing what are sometimes called "type II" sums of squares). The anova function in R computes so-called "sequential" (or "type I") sums of squares, which rarely correspond to hypotheses of interest.

How to calculate F-tests in unbalanced Anova models with interactions is a subject that seems to produce a lot of heat, so you can expect conflicting advice.

I hope that this is useful to you.

John

-----------------------------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
email: jfox at mcmaster.ca
phone: 905-525-9140x23604
web: www.socsci.mcmaster.ca/jfox
-----------------------------------------------------
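
For comparison with the type III tables above, the marginality-respecting ("type II") analysis John mentions can be requested from the same Anova function. This is a minimal sketch, assuming the car package is installed; the data frame d and the names fit and type2 are made up for illustration, and the factor labels follow the correction in his reply (three levels for bone development, two for gender).

```r
library(car)  # provides Anova(); assumed to be installed

# Data from the original post with the corrected factor labels.
d <- data.frame(
  growth = c(1.4, 2.4, 2.2, 2.4, 2.1, 1.7, 2.5, 1.8, 2.0,
             0.7, 1.1, 0.5, 0.9, 1.3),
  bone   = factor(c(1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3)),
  gender = factor(c(1, 1, 1, 2, 1, 1, 2, 2, 2, 1, 1, 2, 2, 2))
)

fit <- lm(growth ~ bone * gender, data = d)

# Type II: each main effect is adjusted for the other main effect but not
# for the interaction; the interaction is adjusted for both main effects.
type2 <- Anova(fit, type = "II")
print(type2)
```

Each main-effect sum of squares here should equal the sequential value obtained when that factor is entered second, so both orderings in the original post contribute one row each. Unlike the type III tables, this result should not depend on the contrast coding, which is why no contrasts argument is given.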