Lars Bergemann
2009-Jul-08 14:34 UTC
[R] Two-way ANOVA gives different results using anova(lm()) than doing it by hand
Hey!

Could you please take a quick look at what I have done? Somehow I get wrong results using the anova(lm()) combination compared to doing a two-way ANOVA by hand.

Running:

    Data <- read.table("Data.txt")
    g <- lm(ExM ~ S1*S2, Data)
    anova(g)

Gives:

    Analysis of Variance Table

    Response: ExM
               Df Sum Sq Mean Sq F value    Pr(>F)
    S1          1 4.3679  4.3679 167.045 < 2.2e-16 ***
    S2          1 0.9427  0.9427  36.053 8.236e-09 ***
    S1:S2       1 0.3231  0.3231  12.357 0.0005371 ***
    Residuals 212 5.5434  0.0261

I compared it to the work done by hand, i.e. I calculated all the different sums of squares using sum() and tapply().

So I know that anova(lm()) gets degrees of freedom equal to 1, 1, 1 and 212 when they should be 5, 5, 25 and 180. Also, the sums of squares are quite different: I get 4.xx, 4.xx, 1.xx, 0.xx, and as you can see, what anova(lm()) gets is different.

The data: S1 has 6 levels, and so does S2. On average, each cell has 6 values. Most cells have exactly 6 values, and there are two cells each with 5, 7, 4 and 8 values, so the average is 6.

Could you please help me understand why it does not work with anova(lm())? I tried quite a few things found with Google, but they all gave me the same result as anova(lm()).

Thanks a lot!

Lars

Attachment: Data.txt
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20090708/19a66254/attachment-0002.txt>
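The by-hand calculation described above can be sketched as follows. This is only an illustration under stated assumptions: the data are simulated (the data frame `sim` and its values are hypothetical, not the contents of Data.txt), and the layout is perfectly balanced with 6 values in every cell. The textbook formulas used here rely on that balance.

```r
# Simulated balanced layout (hypothetical data, not the poster's Data.txt):
# two factors with 6 levels each and 6 replicates per cell, 216 rows in total.
set.seed(1)
sim <- expand.grid(rep = 1:6, S1 = factor(1:6), S2 = factor(1:6))
sim$ExM <- rnorm(nrow(sim))

grand  <- mean(sim$ExM)      # grand mean
n_cell <- 6                  # replicates per cell
a <- nlevels(sim$S1)         # number of S1 levels
b <- nlevels(sim$S2)         # number of S2 levels

m1  <- tapply(sim$ExM, sim$S1, mean)                # S1 level means
m2  <- tapply(sim$ExM, sim$S2, mean)                # S2 level means
m12 <- tapply(sim$ExM, list(sim$S1, sim$S2), mean)  # 6 x 6 table of cell means

# Sums of squares for a balanced two-way layout with interaction.
ss_s1  <- n_cell * b * sum((m1 - grand)^2)
ss_s2  <- n_cell * a * sum((m2 - grand)^2)
ss_int <- n_cell * sum((m12 - outer(m1, m2, "+") + grand)^2)
ss_res <- sum((sim$ExM - ave(sim$ExM, sim$S1, sim$S2))^2)  # within-cell scatter

# Degrees of freedom: 5, 5, 25 and 180 for a 6 x 6 layout with 216 rows.
df_by_hand <- c(a - 1, b - 1, (a - 1) * (b - 1), nrow(sim) - a * b)
```

With unequal cell sizes, as in the data described above, these simple formulas no longer partition the total sum of squares exactly, and anova() reports sequential sums of squares, so small differences from a by-hand calculation would not be surprising.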
Greg Snow
2009-Jul-08 17:11 UTC
[R] Two-way ANOVA gives different results using anova(lm()) than doing it by hand
Well, since we don't have Data.txt it is kind of hard for us to replicate what you have done.

Here goes a guess as to what the problem may be. Have you told R anywhere that S1 and S2 are factors with 6 levels rather than numeric vectors? Or are you just hoping that the computer can read your mind to find out this information? (Reading minds is one of the things that R and computers in general are not very good at yet. I have made a note to my future self to use the TimeTravel package to send a copy of the ESP package back to my past self, but I have not received it yet.)

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111
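One way to check the guess above is to inspect the column types and compare the degrees of freedom under both readings. The sketch below uses simulated data as a stand-in for Data.txt, which is an assumption: the real file may be laid out differently.

```r
# Hypothetical stand-in for Data.txt: level codes stored as plain integers.
set.seed(1)
Data <- expand.grid(rep = 1:6, S1 = 1:6, S2 = 1:6)
Data$ExM <- rnorm(nrow(Data))

str(Data)             # S1 and S2 are listed as "int", not "Factor"
is.factor(Data$S1)    # FALSE

# Numeric predictors: lm() fits one slope per term, so 1 df each.
df_numeric <- anova(lm(ExM ~ S1 * S2, data = Data))$Df

# Factors: one parameter per level beyond the first, so 5, 5 and 25 df.
df_factor <- anova(lm(ExM ~ factor(S1) * factor(S2), data = Data))$Df

df_numeric   # 1 1 1 212
df_factor    # 5 5 25 180
```

The 1, 1, 1, 212 pattern matches the table in the original post, which is consistent with S1 and S2 having been read as numbers.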
Zhiliang Ma
2009-Jul-08 21:25 UTC
[R] Two-way ANOVA gives different results using anova(lm()) than doing it by hand
The following works. I don't know exactly what happens here. I guess "lm" might treat S1 and S2 as quantitative variables, not qualitative variables.

cheers,
Zhiliang

    S1 <- as.character(Data[,1])
    S1 <- as.factor(S1)
    S2 <- as.character(Data[,2])
    S2 <- as.factor(S2)
    data <- data.frame(S1=S1, S2=S2, ExM=Data[,4])
    g <- lm(ExM ~ S1*S2, data)
    anova(g)

    Analysis of Variance Table

    Response: ExM
               Df Sum Sq Mean Sq F value    Pr(>F)
    S1          5 4.7454  0.9491  961.66 < 2.2e-16 ***
    S2          5 4.9548  0.9910 1004.10 < 2.2e-16 ***
    S1:S2      25 1.2993  0.0520   52.66 < 2.2e-16 ***
    Residuals 180 0.1776  0.0010
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
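The same conversion can probably be written more directly, either by calling factor() on the columns in place or by declaring the column types when the file is read. The sketch below is a minimal illustration, assuming a file with a header row and the three columns S1, S2 and ExM; the temporary file and its simulated contents are hypothetical and stand in for Data.txt.

```r
# Hypothetical file standing in for Data.txt (layout and values are assumptions).
set.seed(1)
tmp <- tempfile(fileext = ".txt")
sim <- expand.grid(rep = 1:6, S1 = 1:6, S2 = 1:6)
sim$ExM <- rnorm(nrow(sim))
write.table(sim[, c("S1", "S2", "ExM")], tmp, row.names = FALSE)

# Option 1: read first, then convert the grouping columns in place.
Data <- read.table(tmp, header = TRUE)
Data$S1 <- factor(Data$S1)
Data$S2 <- factor(Data$S2)

# Option 2: declare the column types while reading.
Data2 <- read.table(tmp, header = TRUE,
                    colClasses = c("factor", "factor", "numeric"))

# Either way, the model now has one parameter per level combination.
tab <- anova(lm(ExM ~ S1 * S2, data = Data))
tab$Df   # 5 5 25 180
```

The intermediate as.character() step in the reply above should not be needed for numeric level codes, since factor() accepts them directly.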