Hi list, I have a question regarding post-hoc extraction of data from a two-way categorical anova. I have a categorical anova of this form: width ~ steepness + patchiness (4 steepness levels, 4 patchiness levels) This simple setup answers if for the widths I collected across different levels of steepness and patchiness significant differences can be found. Is there a way to look at these differences in detail. Lets say that the steepness parameter is significant, then I would like to know between which levels they are significant or if there are levels where this isn't the case. It's a basic question but my R knowledge has faded somewhat... Kind regards, Koen [[alternative HTML version deleted]]
hadley wickham
2007-Dec-12 19:59 UTC
[R] two-way categorical anova post-hoc data extraction
> Hi list, > > I have a question regarding post-hoc extraction of data from a two-way categorical anova. > > I have a categorical anova of this form: > > width ~ steepness + patchiness (4 steepness levels, 4 patchiness levels) > > This simple setup answers if for the widths I collected across different levels of steepness and patchiness significant differences can be found. Is there a way to look at these differences in detail. Lets say that the steepness parameter is significant, then I would like to know between which levels they are significant or if there are levels where this isn't the case. > > It's a basic question but my R knowledge has faded somewhat...I've never found this particularly easy to do. For a recent client I wrote: library(effects) library(multcomp) library(multcompView) effectsum <- function(model, effect) { effects <- as.data.frame(all.effects(model)[[effect]]) mcp <- list("Tukey") names(mcp) <- effect class(mcp) <- "mcp" glht_sum <- summary(glht(model, linfct = mcp)) p <- as.vector(glht_sum$test$pvalues) names(p) <- gsub(" ", "", names(glht_sum$test$tstat)) groups <- multcompLetters(p) effects[, 1] <- factor(effects[, 1]) effects$group <- groups$Letters[as.character(effects[, 1])] effects } which allows you to do: mtcars$cyl <- factor(mtcars$cyl) simple <- lm(mpg ~ wt + cyl, data=mtcars) effects <- effectsum(simple, "cyl") library(ggplot2) qplot(cyl, fit, data=effects, min = lower, max = upper, geom="pointrange", ylab="Mean effect") + geom_text(aes(label = group, y = min(lower) - diff(range(lower)) * 0.07)) ggplot(mtcars, aes(x = cyl, y = mpg)) + geom_crossbar(aes(min=lower, max=upper, y=fit), data=effects, width=0.2) + geom_point(aes(colour=wt)) This gives lsmeans (aka population marginal means) with their standard errors, and groups generated from all pairwise comparisons adjusted for multiple comparisons. I would love to improve this code: to deal with all factors in a model automatically. Additionally, all.effects and multcompLetters are rather fragile with respect to level names - if you get an error try removing any non-alphanumeric characters, Hadley -- http://had.co.nz/