Jake Kami
2010-Sep-19 23:27 UTC
[R] boxplots nearly identical but still highly significant effect?
dear list, i am running a within-design ANOVA with 4 factors (4, 4, 2 and 2 levels each). the last one is a time factor comprising two different treatment timepoints. i fit a mixed-effects model using lme and apply the anova function to the outcome. according to this analysis, there are highly significant main effects of the first factor and the time factor. i then checked the boxplots for the two 4-level factors for each timepoint separately: there is a difference of barely 1 to 2 units; actually, the plots look pretty much alike. also, there is no notable interaction effect. i am really wondering how this high significance of the time factor can come about, because i cannot see any large difference between the timepoints for any of the remaining factors. i know this might be a very basic statistical question but assistance in every way will be appreciated. best jake
Ben Bolker
2010-Sep-20 13:09 UTC
[R] boxplots nearly identical but still highly significant effect?
Jake Kami <jakejkami <at> gmail.com> writes:

> dear list,
>
> i am running a within-design ANOVA with 4 factors (4,4,2 and 2 levels each).
> the last one is a time factor comprising two different treatment timepoints.
> i fit a mixed-effects model using lme and apply the anova function to the
> outcome. according to this analysis, there are highly significant main
> effect on the first and the time factor. i then checked the boxplots for the
> two 4-level factors for each timepoint separately: there is a difference or
> barely 1 to 2 units; actually, the plots look pretty much alike. also, there
> is no notable interaction effect. i am really wondering how this high
> significance of the time factor can come up then because i can not see any
> huge difference between the timepoint for all of the remaining factors. i
> know this might be a very basic statistical question but assistance in every
> way will be appreciated.

Hard to say too much without more details (it's not clear what your random effects are; trying to fit random effects with fewer than 5 or 6 levels is difficult, so unless your grouping/random factor is another variable that you haven't told us about, it's quite possible that lme is reporting a very small variance for the random factor. But that may be a bit tangential ...)

A couple of possibilities:

(1) Boxplots display descriptive, not inferential, statistics. Especially if your sample sizes are large, the differences between level means could be large in terms of standard errors of the mean but small in terms of population standard deviations.

(2) You do have an orthogonal design, and you do say you don't see effects of interactions, but ... it's possible that some of the 'non-significant' factors are explaining enough of the residual variance that the difference attributable to the 'significant' factors is larger than it appears from the marginal distributions. One way to check this would be to fit a model with only the 'non-significant' factors and then examine the difference in the residuals between levels of the 'significant' factors.

(3) Are you using likelihood-ratio or F tests? If the latter, and if your sample sizes are small enough, the tests may be seriously anticonservative (see Pinheiro and Bates 2000).

But good for you for trying to make sense of your results rather than just reporting them ...
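
To make possibility (1) concrete for a within-subject time factor, here is a minimal simulation sketch. It is written in plain Python (standard library only) rather than R so that it is self-contained, and everything in it is an assumption made up for illustration: the number of subjects, the 1.5-unit time shift, and the between-subject and measurement standard deviations are not taken from the original data. It only shows that a shift of 1 to 2 units can be tiny relative to the spread a boxplot displays and still many standard errors away from zero once each subject is compared with itself.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# All numbers below are hypothetical, chosen only for illustration.
n_subjects = 400      # assumed number of subjects
time_shift = 1.5      # assumed true difference between the two timepoints
between_sd = 10.0     # assumed spread of subject baselines (what boxplots show)
noise_sd = 2.0        # assumed measurement noise within a subject

# Each subject has its own baseline and is measured at both timepoints.
baseline = [random.gauss(50.0, between_sd) for _ in range(n_subjects)]
t1 = [b + random.gauss(0.0, noise_sd) for b in baseline]
t2 = [b + time_shift + random.gauss(0.0, noise_sd) for b in baseline]

# Descriptive view: difference in means relative to the raw spread.
mean_diff = statistics.mean(t2) - statistics.mean(t1)
pooled_sd = math.sqrt((statistics.variance(t1) + statistics.variance(t2)) / 2)

# Inferential view: paired t statistic on within-subject differences.
diffs = [after - before for before, after in zip(t1, t2)]
se_diff = statistics.stdev(diffs) / math.sqrt(n_subjects)
t_paired = statistics.mean(diffs) / se_diff

print(f"mean difference:           {mean_diff:.2f}")
print(f"pooled SD of raw values:   {pooled_sd:.2f}")
print(f"difference in SD units:    {mean_diff / pooled_sd:.2f}")
print(f"SE of the mean difference: {se_diff:.3f}")
print(f"paired t statistic:        {t_paired:.1f}")
```

Under these assumptions the two sets of raw values overlap almost completely, because their spread is dominated by differences between subjects, while the paired comparison removes that subject-level variation. A mixed model with a subject random effect does something similar, which is one reason the boxplots and the ANOVA table can tell such different-looking stories.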