Hello everybody,

I am trying to do a factorial ANOVA following this model:

model<-anova(lm(responsevariable~factorA*factorB))
model<-anova(lm(luz$dosel~luz$estado*luz$Bosque))

              Df Sum Sq Mean Sq F value    Pr(>F)
estado         1 6931.1  6931.1 41.6455 7.974e-06 ***
Bosque         1   36.6    36.6  0.2197    0.6456
estado:Bosque  1   36.6    36.6  0.2197    0.6456
Residuals     16 2662.9   166.4

The strange thing is that the sum of squares for the factor Bosque is identical to the SS of the interaction, and both are non-significant. But when I plot the data, the interaction surely is significant...

My data.frame looks as follows:

     Bosque   estado  lux dosel
1   deciduo pristino  703 88.56
2   deciduo pristino  800 90.64
3   deciduo pristino  150 95.84
4   deciduo pristino  245 87.52
5   deciduo pristino 1300 91.68
6   deciduo   activo 1900 26.16
7   deciduo   activo  840 59.44
8   deciduo   activo  323 69.84
9   deciduo   activo  112 75.04
10  deciduo   activo 1360 51.12
11 siemprev   activo  900 41.76
12 siemprev   activo  480 65.68
13 siemprev   activo  350 78.16
14 siemprev   activo  350 37.60
15 siemprev   activo  272 58.40
16 siemprev pristino  100 94.80
17 siemprev pristino   60 95.84
18 siemprev pristino   50 97.92
19 siemprev pristino  270 94.80
20 siemprev pristino  110 97.92

Does somebody understand what I am doing wrong? I have been searching with the R site search, but didn't find many postings on factorial ANOVA.

Thanks a lot in advance for your comments,
Petra

On Sun, 2005-12-25 at 23:01 -0300, Petra Wallem wrote:
> Hello everybody, I am trying to do a factorial anova analysis
> following this model:
>
> model<-anova(lm(responsevariable~factorA*factorB))
> model<-anova(lm(luz$dosel~luz$estado*luz$Bosque))
>
> [...]
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

It would help if you would use the "dump" function and paste the output into an e-mail:

> dump("luz","")

Also, it's much easier to use "data=luz" as an argument in the lm function rather than appending the data frame name to each variable.
I don't think that "model" contains the lm model output. It looks like you are saving the anova table.
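
The two suggestions above can be sketched as follows. This is only an illustration: the data frame is re-entered by hand from the original post, and the object names `fit` and `tab` are hypothetical, chosen here to keep the fitted model and its ANOVA table apart.

```r
# Data re-entered by hand from the original post (rows 1-20, same order).
luz <- data.frame(
  Bosque = factor(rep(c("deciduo", "siemprev"), each = 10)),
  estado = factor(rep(c("pristino", "activo", "activo", "pristino"), each = 5)),
  lux    = c(703, 800, 150, 245, 1300, 1900, 840, 323, 112, 1360,
             900, 480, 350, 350, 272, 100, 60, 50, 270, 110),
  dosel  = c(88.56, 90.64, 95.84, 87.52, 91.68,
             26.16, 59.44, 69.84, 75.04, 51.12,
             41.76, 65.68, 78.16, 37.60, 58.40,
             94.80, 95.84, 97.92, 94.80, 97.92)
)

# Keep the fitted model and the ANOVA table as two separate objects.
fit <- lm(dosel ~ estado * Bosque, data = luz)  # the lm fit itself
tab <- anova(fit)                               # the ANOVA table derived from it
print(tab)

# dump() with file = "" writes R code to the console that recreates luz,
# which can be pasted into an e-mail.
dump("luz", "")
```

With the fit stored separately, `summary(fit)`, `anova(fit)` and `drop1(fit)` can all be run on the same object.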
Rick,
I read your data into a data.frame called data. I suggest you run the model as follows:

fit1<-lm(dosel ~ estado * Bosque, data = data)
summary(fit1)

The results are:

> contrasts(data$Bosque)
         siemprev
deciduo         0
siemprev        1
> contrasts(data$estado)
         pristino
activo          0
pristino        1
> summary(fit1)

Call:
lm(formula = dosel ~ estado * Bosque, data = data)

Residuals:
    Min      1Q  Median      3Q     Max
-30.160  -2.548   0.312   3.588  21.840

Coefficients:
                               Estimate Std. Error  t value Pr(>|t|)
(Intercept)                   5.632e+01  5.769e+00    9.762 3.84e-08 ***
estadopristino                3.453e+01  8.159e+00    4.232 0.000635 ***
Bosquesiemprev                1.249e-15  8.159e+00 1.53e-16 1.000000
estadopristino:Bosquesiemprev 5.408e+00  1.154e+01    0.469 0.645622
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 12.9 on 16 degrees of freedom
Multiple R-Squared: 0.7245,     Adjusted R-squared: 0.6729
F-statistic: 14.03 on 3 and 16 DF,  p-value: 9.615e-05

You will note that the p values for the interaction and the main effect for Bosquesiemprev are no longer the same.

Happy New Year!
John

John Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
Baltimore VA Medical Center GRECC and
University of Maryland School of Medicine Claude Pepper OAIC

University of Maryland School of Medicine
Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524

410-605-7119
- NOTE NEW EMAIL ADDRESS: jsorkin at grecc.umaryland.edu

>>> Rick Bilonick <rab at nauticom.net> 12/25/05 9:09 PM >>>
[...]
I don't think that "model" contains the lm model output. It looks like you are saving the anova table.

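
Why the Bosque p value differs between `summary()` of the lm fit and the ANOVA table can be checked directly. The following is a minimal sketch, assuming R's default treatment contrasts (which the `contrasts()` output in John Sorkin's message suggests) and the data re-entered by hand as `dat`; the names `cell`, `d_activo` and `d_pristino` are hypothetical helpers introduced here. Under that coding, the Bosquesiemprev coefficient is the siemprev minus deciduo difference within the reference level of estado (activo) only, not a main effect averaged over both estado levels.

```r
# Data re-entered by hand from the original post (lux omitted, not needed here).
dat <- data.frame(
  Bosque = factor(rep(c("deciduo", "siemprev"), each = 10)),
  estado = factor(rep(c("pristino", "activo", "activo", "pristino"), each = 5)),
  dosel  = c(88.56, 90.64, 95.84, 87.52, 91.68,
             26.16, 59.44, 69.84, 75.04, 51.12,
             41.76, 65.68, 78.16, 37.60, 58.40,
             94.80, 95.84, 97.92, 94.80, 97.92)
)

fit1 <- lm(dosel ~ estado * Bosque, data = dat)
b    <- coef(fit1)

# Cell means: rows are estado levels, columns are Bosque levels.
cell <- with(dat, tapply(dosel, list(estado, Bosque), mean))

# Bosque difference (siemprev - deciduo) within each level of estado.
d_activo   <- cell["activo", "siemprev"]   - cell["activo", "deciduo"]
d_pristino <- cell["pristino", "siemprev"] - cell["pristino", "deciduo"]

# Coefficient for Bosquesiemprev vs. the difference within activo.
print(c(coefficient = b[["Bosquesiemprev"]], within_activo = d_activo))

# Interaction coefficient vs. the difference of the two differences.
print(c(coefficient = b[["estadopristino:Bosquesiemprev"]],
        diff_of_diffs = d_pristino - d_activo))
```

Because the two activo cell means are identical in these data, that coefficient is zero up to rounding, which is why its p value is 1 in the `summary()` output while the Bosque row of the ANOVA table reports 0.6456.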
Thanks a lot to all of you for your responses. I followed your advice, but to really understand it I actually calculated the ANOVA step by step in an Excel spreadsheet to see whether I get the same SS and MS as in the aov output, and yes, they are the same. So John, you are right, the data is kind of freakish... My only preliminary survey was a boxplot of the interaction, where the data is actually correlated, but I did not expect that this correlation would result in identical sums of squares for treatment and interaction... kind of odd...

Thanks again for your comments and suggestions; I learned some new functions I was not using...

Happy New 2006 to all of you, enjoy the party!!!

Cheers
Petra

On Tue, 2005-12-27 at 13:13, John Wilkinson wrote:
> Petra,
>
> It looks as though the problem is with your data.
> Reading it into 'R' gives:
>
> dat<-read.table("clipboard",header=T,sep="")
> dat
>      Bosque   estado  lux dosel
> 1   deciduo pristino  703 88.56
> 2   deciduo pristino  800 90.64
> 3   deciduo pristino  150 95.84
> 4   deciduo pristino  245 87.52
> 5   deciduo pristino 1300 91.68
> 6   deciduo   activo 1900 26.16
> 7   deciduo   activo  840 59.44
> 8   deciduo   activo  323 69.84
> 9   deciduo   activo  112 75.04
> 10  deciduo   activo 1360 51.12
> 11 siemprev   activo  900 41.76
> 12 siemprev   activo  480 65.68
> 13 siemprev   activo  350 78.16
> 14 siemprev   activo  350 37.60
> 15 siemprev   activo  272 58.40
> 16 siemprev pristino  100 94.80
> 17 siemprev pristino   60 95.84
> 18 siemprev pristino   50 97.92
> 19 siemprev pristino  270 94.80
> 20 siemprev pristino  110 97.92
>
> A straight analysis of variance (aov) model gives:
>
> > dat.aov<-aov(dosel~estado*Bosque,data=dat)
> > summary(dat.aov)
>               Df Sum Sq Mean Sq F value    Pr(>F)
> estado         1 6931.1  6931.1 41.6455 7.974e-06 ***
> Bosque         1   36.6    36.6  0.2197    0.6456
> estado:Bosque  1   36.6    36.6  0.2197    0.6456
> Residuals     16 2662.9   166.4
>
> showing that Bosque and its interaction with estado do indeed have
> the same 'sum of squares' of 36.6.
>
> A preliminary exploration of the data's factors shows:
>
> > with(dat,tapply(dosel,list(estado,Bosque),mean))
>          deciduo siemprev
> activo    56.320   56.320
> pristino  90.848   96.256
>
> > with(dat,tapply(dosel,list(estado,Bosque),sd))
>            deciduo  siemprev
> activo   19.232972 16.817800
> pristino  3.239062  1.577238
>
> This shows that the levels of the factors are highly correlated.
>
> The linear model and its anova confirm this:
>
> > fit.lm<-lm(dosel~estado*Bosque,data=dat)
> > summary(fit.lm)
>
> Call:
> lm(formula = dosel ~ estado * Bosque, data = dat)
>
> Residuals:
>     Min      1Q  Median      3Q     Max
> -30.160  -2.548   0.312   3.588  21.840
>
> Coefficients:
>                                Estimate Std. Error  t value Pr(>|t|)
> (Intercept)                   5.632e+01  5.769e+00    9.762 3.84e-08 ***
> estadopristino                3.453e+01  8.159e+00    4.232 0.000635 ***
> Bosquesiemprev                1.249e-15  8.159e+00 1.53e-16 1.000000
> estadopristino:Bosquesiemprev 5.408e+00  1.154e+01    0.469 0.645622
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Residual standard error: 12.9 on 16 degrees of freedom
> Multiple R-Squared: 0.7245,     Adjusted R-squared: 0.6729
> F-statistic: 14.03 on 3 and 16 DF,  p-value: 9.615e-05
>
> > anova(fit.lm)
> Analysis of Variance Table
>
> Response: dosel
>               Df Sum Sq Mean Sq F value    Pr(>F)
> estado         1 6931.1  6931.1 41.6455 7.974e-06 ***
> Bosque         1   36.6    36.6  0.2197    0.6456
> estado:Bosque  1   36.6    36.6  0.2197    0.6456
> Residuals     16 2662.9   166.4
>
> The drop1 function shows that the model would improve by
> dropping the interaction term: the RSS rises by only 36.56
> (the redundant interaction Sum of Sq), while the AIC falls
> (the lower the better).
>
> > drop1(fit.lm)
> Single term deletions
>
> Model:
> dosel ~ estado * Bosque
>               Df Sum of Sq     RSS    AIC
> <none>                     2662.90 105.83
> estado:Bosque  1     36.56 2699.46 104.10
>
> The only significant effect in the model is thus between estado levels,
> the pristino effect being *** significantly greater than activo for both
> levels of Bosque (as the tapply table above clearly shows).
>
> It pays to do a preliminary survey of the data.
>
> I hope that helps,
>
> John

--
Petra Wallem
Centro de Estudios Avanzados en Ecología & Biodiversidad (CASEB)
Departamento de Ecología
Facultad de Ciencias Biológicas
Pontificia Universidad Católica de Chile
Av. Libertador Bernardo O'Higgins # 340
Casilla 114-D
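
The step-by-step calculation mentioned in the follow-up can also be done in R, and it shows where the identical sums of squares come from. This is a sketch under stated assumptions: the data are re-entered by hand as `dat`, the design is balanced with 5 observations per cell, and the helper names (`cell`, `row_m`, `col_m`, `inter`, `ss_bosque`, `ss_inter`) are hypothetical. In a balanced 2 x 2 design, the Bosque main effect depends on the average of the two within-estado Bosque differences and the interaction on half their difference; when one of those differences is exactly zero, as it is for activo here, the two quantities have the same magnitude and so the same sum of squares.

```r
# Data re-entered by hand from the original post (lux omitted, not needed here).
dat <- data.frame(
  Bosque = factor(rep(c("deciduo", "siemprev"), each = 10)),
  estado = factor(rep(c("pristino", "activo", "activo", "pristino"), each = 5)),
  dosel  = c(88.56, 90.64, 95.84, 87.52, 91.68,
             26.16, 59.44, 69.84, 75.04, 51.12,
             41.76, 65.68, 78.16, 37.60, 58.40,
             94.80, 95.84, 97.92, 94.80, 97.92)
)

n     <- 5                                   # observations per cell (balanced)
cell  <- with(dat, tapply(dosel, list(estado, Bosque), mean))
grand <- mean(dat$dosel)
row_m <- rowMeans(cell)                      # estado means
col_m <- colMeans(cell)                      # Bosque means

# Main-effect SS for Bosque: each Bosque mean is based on n * 2 observations.
ss_bosque <- n * nrow(cell) * sum((col_m - grand)^2)

# Interaction SS: cell mean minus row mean minus column mean plus grand mean.
inter    <- cell - outer(row_m, col_m, "+") + grand
ss_inter <- n * sum(inter^2)

print(c(Bosque = ss_bosque, interaction = ss_inter))
```

Both values should agree with the 36.6 shown in the ANOVA tables above; changing any one activo observation so that the two activo cell means differ would break the equality.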