Dear All,

I have a problem understanding how the interactions in a two-way ANOVA work, because I get conflicting results from a t-test and an ANOVA. For most of you my problem is very simple, I am sure.

I need help with an example, looking at one table I am analyzing. The table is attached and can be imported into R with this command:

scrd <- read.table('/Users/luca/Documents/Analisi_passi/Codice_R/Statistics_results_bump_hole_Audio_Haptic/tables_for_R/table_realism_wood.txt', header=TRUE, colClasses=c('numeric','factor','factor','numeric'))

This table is the result of a simple experiment. Subjects were exposed to some stimuli and were asked to evaluate the degree of realism of the stimuli on a 7-point scale (i.e., the data in column "response"). Each stimulus was presented in two conditions, "A" and "AH", where AH is condition A plus another thing (let's call it "H").

Now, what exactly does the interaction stimulus:condition mean in my table?

I think that if I do the analysis anova(response ~ stimulus*condition) I will get the comparison between the same stimulus in condition A and in condition AH. Am I wrong? For instance, the comparison of stimulus flat_550_W_realism presented in condition A with the same stimulus, flat_550_W_realism, presented in condition AH.

The problem is that if I do a t-test between the values of this stimulus in the A and AH conditions I get a significant difference, while if I do the test with a two-way ANOVA I don't get any difference. How is this possible?

Here are the results of the analysis.

# Here is the result of the ANOVA:
> fit1 <- lm(response ~ stimulus + condition + stimulus:condition, data=scrd)
> # EQUIVALENT TO lm(response ~ stimulus*condition, data=scrd)
>
> anova(fit1)
Analysis of Variance Table

Response: response
                    Df Sum Sq Mean Sq F value   Pr(>F)
stimulus             6  15.05   2.509  1.1000   0.3647
condition            1  36.51  36.515 16.0089 9.64e-05 ***
stimulus:condition   6   1.47   0.244  0.1071   0.9955
Residuals          159 362.67   2.281
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# As you can see, the p-value for stimulus:condition is high.

# Now I do the t-test with the same values from the table for the stimulus presented in the A and AH conditions:

flat_550_W_realism    = c(3,3,5,3,3,3,3,5,3,3,5,7,5,2,3)
flat_550_W_realism_AH = c(7,4,5,3,6,5,3,5,5,7,2,7,5,5)

> t.test(flat_550_W_realism, flat_550_W_realism_AH, var.equal=TRUE)

        Two Sample t-test

data:  flat_550_W_realism and flat_550_W_realism_AH
t = -2.2361, df = 27, p-value = 0.03381
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -2.29198603 -0.09849016
sample estimates:
mean of x mean of y
 3.733333  4.928571

# Now we have a significant difference between these two stimuli (p-value = 0.03381)

Why do I get this behaviour?

Moreover, how could I track, by means of ANOVA, the significant differences between the stimuli presented in the A and AH conditions without doing the t-test?

Please help!

Thanks in advance

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: table_realism_wood.txt
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20110105/8e3524de/attachment.txt>
You really need to spend more time with a good ANOVA textbook, and probably with a consultant who can explain things to you face to face. But here is a basic explanation to get you pointed in the right direction:

Consider a simple 2x2 example with factors A and B, each with 2 levels (1 and 2). Draw a 2x2 grid to represent this. There are 4 groups, and the theory would be that they have means mu11, mu12, mu21, and mu22 (mu12 is for the group with A at level 1 and B at level 2, etc.).

Now you fit the full model with 2 main effects and 1 interaction. If we assume treatment contrasts (the default in R; the coefficients/tests will be different for different contrasts, but the general idea is the same), then the intercept/mean/constant piece will correspond to mu11; the coefficient (only seen if treated as an lm object instead of an aov object) for testing A will be (mu21-mu11), and for testing B it will be (mu12-mu11).

The interaction piece gets a bit more complex: it is (mu11 - mu12 - mu21 + mu22). This makes a bit more sense if we rearrange it to be one of ( (mu22-mu21) - (mu12-mu11) ) or ( (mu22-mu12) - (mu21-mu11) ). It represents the difference in the differences, i.e. we find how much going from A1 to A2 changes things when B is 1, then we find how much going from A1 to A2 changes things when B is 2, then we find the difference in these changes. That is the interaction (and if it is 0, then the effects of A and B are additive and independent, i.e. the amount A changes things does not depend on the value of B and vice versa).

So testing the interaction term is asking whether how much a change in A affects things depends on the value of B. This is very different from comparing mu11 to mu12 (or mu21 to mu22), which is what I think you did in the t-test; that asks a very different question and uses different base assumptions (ignoring any effect of B, additional data, etc.).
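The parametrization described above can be written out explicitly. This is only a sketch, assuming treatment contrasts with level 1 of each factor as the reference; the beta symbols and the bracket notation ([i=2] is 1 when A is at level 2 and 0 otherwise) are introduced here for illustration and do not appear in the original message.

```latex
\begin{aligned}
\mu_{ij}   &= \beta_0 + \beta_A\,[i=2] + \beta_B\,[j=2] + \beta_{AB}\,[i=2]\,[j=2],\\
\beta_0    &= \mu_{11},\\
\beta_A    &= \mu_{21}-\mu_{11},\\
\beta_B    &= \mu_{12}-\mu_{11},\\
\beta_{AB} &= \mu_{11}-\mu_{12}-\mu_{21}+\mu_{22}
            = (\mu_{22}-\mu_{21})-(\mu_{12}-\mu_{11})
            = (\mu_{22}-\mu_{12})-(\mu_{21}-\mu_{11}).
\end{aligned}
```

Setting i=2 and j=2 in the first line and substituting the other three coefficients gives the last line, so the interaction coefficient is zero exactly when the A effect is the same at both levels of B.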
Note that your test on condition is very significant. That test is closer to your t-test, but it still will not match exactly because of those differences.

Now your case is more complicated, since stimulus has 7 levels (6 df), so the interaction is a combination of 6 different differences of differences. This is why you need to spend some time with a good textbook/class to really understand what model(s) you are fitting.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Frodo Jedi
> Sent: Wednesday, January 05, 2011 4:10 PM
> To: r-help at r-project.org
> Subject: [R] Problem with 2-ways ANOVA interactions
>
> [snip]
Maybe a simple concrete example would help:

> tmpdat <- data.frame( One= rep( c('A','B'), each=10 ),
+     Two=rep( c('C','D'), each=5, length.out=20 ),
+     mu1 = rep( c(10, 11, 12, 16), each=5 ) )
>
> tmpdat$e <- with(tmpdat, ave( rnorm(20), One, Two, FUN=scale ) )
> tmpdat$y <- with(tmpdat, mu1+e)
>
> # check the means
> tapply( tmpdat$y, tmpdat[,c('One','Two')], mean )
   Two
One  C  D
  A 10 11
  B 12 16
>
> # now fit the data
> fit <- aov( y ~ One*Two, data=tmpdat )
>
> # look at what was measured
> coef(fit)
(Intercept)        OneB        TwoD   OneB:TwoD
         10           2           1           3
>
> # notice:
> (16-12) - (11-10)
[1] 3
> (16-11) - (12-10)
[1] 3
>
> # another way of thinking
> model.tables(fit)
Tables of effects

 One
One
    A     B
-1.75  1.75

 Two
Two
    C     D
-1.25  1.25

 One:Two
   Two
One     C     D
  A  0.75 -0.75
  B -0.75  0.75
> model.tables(fit, 'means')
Tables of means
Grand mean
12.25

 One
One
   A    B
10.5 14.0

 Two
Two
   C    D
11.0 13.5

 One:Two
   Two
One  C  D
  A 10 11
  B 12 16
>
> fit2 <- aov( y ~ One + Two, data=tmpdat )
> model.tables(fit2)
Tables of effects

 One
One
    A     B
-1.75  1.75

 Two
Two
    C     D
-1.25  1.25
> model.tables(fit2, 'means')
Tables of means
Grand mean
12.25

 One
One
   A    B
10.5 14.0

 Two
Two
   C    D
11.0 13.5
>
> tmpdat2 <- expand.grid( One=c('A','B'), Two=c('C','D') )
> cbind( tmpdat2, fit=predict(fit, tmpdat2), fit2=predict(fit2, tmpdat2) )
  One Two fit  fit2
1   A   C  10  9.25
2   B   C  12 12.75
3   A   D  11 11.75
4   B   D  16 15.25
>

Now go back and replace the 16 with 13 and see how things change.

Also, how are you accounting for people in your model? Did you have multiple people? Did each person report more than one outcome? You probably need to include person as some form of random effect to properly account for them.

This is really getting to the point where you need a consultant (or several more classes).

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow@imail.org
801.408.8111

From: Frodo Jedi [mailto:frodo.jedi@yahoo.com]
Sent: Thursday, January 06, 2011 2:39 PM
To: Greg Snow; r-help@r-project.org
Subject: Re: [R] Problem with 2-ways ANOVA interactions

Dear Greg,

many many thanks, but I still have a doubt:

> Before, I wrongly thought that if I did the analysis anova(response ~ stimulus*condition) I would get the comparison between
> the same stimulus in condition A and in condition AH (e.g. stimulus_1_A, stimulus_1_AH).
> Instead, apparently, the interaction stimulus:condition means that I find the differences between the stimuli keeping the condition fixed!!
> If this is true, then doing the ANOVA with the interaction stimulus:condition is equivalent to doing a ONE-WAY ANOVA first on
> the subset where all the conditions are A and then on the subset where all the conditions are AH? Right?

>> I think you are closer, but not quite there. The test on the interaction tests if the difference between A and AH is the same across the different stimuli. The main effect for condition
>> tests if there is a difference between A and AH.

So you mean that the interaction compares, for example, stimulus 1 in condition A with stimulus 2 in condition AH, right?

Could you please also answer the question I asked at the end? That is what I ultimately want to know:

> So if all of the above is correct, my final question is: how can I track, by means of ANOVA, the significant differences between the stimuli
> presented in the A and AH conditions without going through the t-test? Indeed, my goal was to find, on the one hand, whether globally condition
> AH gives better results than condition A, and on the other hand I needed to know for which stimuli condition AH gives
> better results than condition A.

Thanks
Best regards
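One possible way to approach the final question above, offered only as a sketch and not as the answer given in the thread: refit the same cell-means model with condition nested within stimulus, so that each coefficient is the AH minus A difference for one stimulus, tested against the pooled residual variance. The data below are simulated stand-ins (the names sim, s1 to s7, fit_nested and simple are hypothetical, and the real table is not reproduced), and the sketch deliberately ignores the repeated-measures issue raised earlier about accounting for subjects.

```r
set.seed(1)

# Simulated stand-in for the poster's table: 7 stimuli x 2 conditions,
# 12 ratings per cell (hypothetical values, NOT the real data).
sim <- expand.grid(rep       = 1:12,
                   stimulus  = factor(paste0("s", 1:7)),
                   condition = factor(c("A", "AH")))
sim$response <- 4 + 1 * (sim$condition == "AH") + rnorm(nrow(sim), sd = 1.5)

# stimulus/condition expands to stimulus + stimulus:condition, so each
# "stimulusX:conditionAH" coefficient is the AH - A difference within
# one stimulus, with a standard error based on the pooled residuals.
fit_nested <- lm(response ~ stimulus/condition, data = sim)
simple <- coef(summary(fit_nested))
simple <- simple[grep(":conditionAH$", rownames(simple)), ]

# Seven tests are being run, so adjust the p-values (Holm's method here).
simple <- cbind(simple, p.holm = p.adjust(simple[, "Pr(>|t|)"], method = "holm"))
print(round(simple, 4))

# Sanity check: the estimates equal the raw differences in cell means.
cell <- tapply(sim$response, sim[, c("stimulus", "condition")], mean)
stopifnot(all.equal(unname(simple[, "Estimate"]),
                    unname(cell[, "AH"] - cell[, "A"])))
```

The global question (is AH better than A overall?) is still the main effect of condition in anova(lm(response ~ stimulus*condition)). The nested fit spans the same cell means, so it only re-expresses that model to expose the per-stimulus differences. With the real data, a mixed model with subject as a random effect would likely be more appropriate than this plain lm.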