Steven Worthington
2010-Apr-01 22:21 UTC
[R] Adding regression lines to each factor on a plot when using ANCOVA
Dear R users, i'm using a custom function to fit ancova models to a dataset. The data are divided into 12 groups, with one dependent variable and one covariate. When plotting the data, i'd like to add separate regression lines for each group (so, 12 lines, each with their respective individual slopes). My 'model1' uses the group*covariate interaction term, and so the coefficients to plot these lines should be contained within the 'model1' object (there are 25 coefficients and it looks like I need the last 12). The problem is I can't figure out how to extract the relevant coefficients from 'model1' and add them using abline. I imagine there's some way of using the relevant slopes abline(model1$coef[14:25]) together with the intercept, but I can't quite get it right. Can anyone offer a suggestion as to how to go about this? Ideally, What i'd like is to plot each regression line in the same color as the group to which it belongs. I've provided an example with dummy data below best, Steve # ==========================================================# hypothetical data species <- c(1,1,1,2,2,2,3,3,3,3,4,4,4,5,5,5,5,6,6,6,7,7,7,8,8,8,8,9,9,9,9,9,10,10,10,11,11,11,11,12,12,12,12,12) beak.lgth <- c(2.3,4.2,2.7,3.4,4.2,4.8,1.9,2.2,1.7,2.5,15,16.5,14.7,9.6,8.5,9.1,9.4,17.7,15.6,14,6.8,8.5,9.4,10.5,10.9,11.2,11.5,19,17.2,18.9,19.5,19.9,12.6,12.1,12.9,14.1,12.5,15,14.8,4.3,5.7,2.4,3.5,2.9) mass <- c(45.9,47.1,47.6,17.2,17.9,17.7,44.9,44.8,45.3,44.9,39,39.7,41.2,84.8,79.2,78.3,82.8,102.8,107.2,104.1,51.7,45.5,50.6,27.5,26.6,27.5,26.9,25.4,23.7,21.7,22.2,23.8,46.9,51.5,49.4,33.4,33.1,33.2,34.7,39.3,41.7,40.5,42.7,41.8) dataset <- cbind(groups, beak.lgth, mass) # ANCOVA function anc <- function(variable, covariate, group){ # transform data lgVar <- log10(variable) lgCov <- log10(covariate) # separate regression lines for each group model1 <- lm(lgVar ~ lgCov + group + lgCov:group) model1.summ <- summary(model1) model1.anv <- anova(model1) # separate regression lines for each group, but with the same slope model2 <- lm(lgVar ~ lgCov + group) model2.summ <- summary(model2) model2.anv <- anova(model2) # same regression line for all groups model3 <- lm(lgVar ~ lgCov) model3.summ <- summary(model3) model3.anv <- anova(model3) compare <- anova(model1, model2, model3) # compare all models # plots par(mfcol=c(1,2)) boxplot(lgVar ~ group, col="darkgoldenrod1") # plot the variate and covariate by group plot(lgVar ~ lgCov, pch=as.numeric(group), col=as.numeric(group)) legend("topleft", inset=0, legend=as.character(unique(group)), col=as.numeric(unique(group)), pch=as.numeric(unique(group)), pt.cex=1.5) abline(model1) # Need separate regression lines here list(model_1_summary=model1.summ, model_1_ANOVA=model1.anv, model_2_summary=model2.summ, model_2_ANOVA=model2.anv, model_3_summary=model3.summ, model_3_ANOVA=model3.anv, model_comparison=compare) } # call function anc(beak.lgth, mass, species) # ========================================================== -- View this message in context: http://n4.nabble.com/Adding-regression-lines-to-each-factor-on-a-plot-when-using-ANCOVA-tp1748654p1748654.html Sent from the R help mailing list archive at Nabble.com.
RICHARD M. HEIBERGER
2010-Apr-02 00:05 UTC
[R] Adding regression lines to each factor on a plot when using ANCOVA
## Steve, ## please use the ancova function in the HH package. install.packages("HH") library(HH) ## windows.options(record=TRUE) windows.options(record=TRUE) # hypothetical data beak.lgth <- c(2.3,4.2,2.7,3.4,4.2,4.8,1.9,2.2,1.7,2.5,15,16.5,14.7,9.6,8.5,9.1, 9.4,17.7,15.6,14,6.8,8.5,9.4,10.5,10.9,11.2,11.5,19,17.2,18.9, 19.5,19.9,12.6,12.1,12.9,14.1,12.5,15,14.8,4.3,5.7,2.4,3.5,2.9) mass <- c(45.9,47.1,47.6,17.2,17.9,17.7,44.9,44.8,45.3,44.9,39,39.7,41.2, 84.8,79.2,78.3,82.8,102.8,107.2,104.1,51.7,45.5,50.6,27.5,26.6, 27.5,26.9,25.4,23.7,21.7,22.2,23.8,46.9,51.5,49.4,33.4,33.1,33.2, 34.7,39.3,41.7,40.5,42.7,41.8) ## Make species into a factor species <- factor(c(1,1,1,2,2,2,3,3,3,3,4,4,4,5,5,5,5,6,6,6,7,7,7, 8,8,8,8,9,9,9,9,9,10,10,10,11,11,11,11,12,12,12,12,12)) ## then construct a data.frame with the three variables and the log transforms dataset <- data.frame(species, beak.lgth, mass, logBeak=log10(beak.lgth), logMass=log10(mass)) ## default is 7 colors, we need 12 trellis.par.set("superpose.line", Rows(trellis.par.get("superpose.line"), c(1:6, 1:6))) trellis.par.set("superpose.symbol", Rows(trellis.par.get("superpose.symbol"), c(1:6, 1:6))) ancova(logBeak ~ logMass * species, data=dataset) ancova(logBeak ~ logMass + species, data=dataset) ancova(logBeak ~ logMass, groups=species, data=dataset) ancova(logBeak ~ species, x=logMass, data=dataset) bwplot(logBeak ~ species, data=dataset) ## Rich [[alternative HTML version deleted]]
Peter Ehlers
2010-Apr-02 00:31 UTC
[R] Adding regression lines to each factor on a plot when using ANCOVA
Steve, Thanks for providing an example (which does, however need a bit of tweaking; BTW, it's usually not a good idea to cbind your data when what you really want is a data.frame). Your guess about some clever way of using abline() is unfortunately not correct - as the help page indicates, the slope and intercept must be given as single values. So you will have to extract each (intercept, slope) pair from the model coefficients and call abline() on them. A convenient way to do this is to specify the model as mod <- lm(y ~ f/x + 0) (which I first learned from MASS, the book). Here f is your grouping variable. As the book says, this gives "separate regression models of the type 1 + x within the levels of f". The "+ 0" removes the usual intercept which is replaced by individual intercepts for each level of f. For your example this will give 12 intercepts as the first 12 coefficients and 12 slopes as the remaining coefs. Then you can use cof <- coef(mod) for(i in 1:12) abline(a=cof[i], b=cof[12 + i]) to plot the 12 lines. -Peter Ehlers On 2010-04-01 16:21, Steven Worthington wrote:> > Dear R users, > > i'm using a custom function to fit ancova models to a dataset. The data are > divided into 12 groups, with one dependent variable and one covariate. When > plotting the data, i'd like to add separate regression lines for each group > (so, 12 lines, each with their respective individual slopes). My 'model1' > uses the group*covariate interaction term, and so the coefficients to plot > these lines should be contained within the 'model1' object (there are 25 > coefficients and it looks like I need the last 12). The problem is I can't > figure out how to extract the relevant coefficients from 'model1' and add > them using abline. I imagine there's some way of using the relevant slopes > > abline(model1$coef[14:25]) > > together with the intercept, but I can't quite get it right. Can anyone > offer a suggestion as to how to go about this? Ideally, What i'd like is to > plot each regression line in the same color as the group to which it > belongs. > > I've provided an example with dummy data below > > best, > > Steve > > > # ==========================================================> # hypothetical data > species<- > c(1,1,1,2,2,2,3,3,3,3,4,4,4,5,5,5,5,6,6,6,7,7,7,8,8,8,8,9,9,9,9,9,10,10,10,11,11,11,11,12,12,12,12,12) > beak.lgth<- > c(2.3,4.2,2.7,3.4,4.2,4.8,1.9,2.2,1.7,2.5,15,16.5,14.7,9.6,8.5,9.1,9.4,17.7,15.6,14,6.8,8.5,9.4,10.5,10.9,11.2,11.5,19,17.2,18.9,19.5,19.9,12.6,12.1,12.9,14.1,12.5,15,14.8,4.3,5.7,2.4,3.5,2.9) > mass<- > c(45.9,47.1,47.6,17.2,17.9,17.7,44.9,44.8,45.3,44.9,39,39.7,41.2,84.8,79.2,78.3,82.8,102.8,107.2,104.1,51.7,45.5,50.6,27.5,26.6,27.5,26.9,25.4,23.7,21.7,22.2,23.8,46.9,51.5,49.4,33.4,33.1,33.2,34.7,39.3,41.7,40.5,42.7,41.8) > dataset<- cbind(groups, beak.lgth, mass) > > # ANCOVA function > anc<- function(variable, covariate, group){ > # transform data > lgVar<- log10(variable) > lgCov<- log10(covariate) > # separate regression lines for each group > model1<- lm(lgVar ~ lgCov + group + lgCov:group) > model1.summ<- summary(model1) > model1.anv<- anova(model1) > # separate regression lines for each group, but with the same slope > model2<- lm(lgVar ~ lgCov + group) > model2.summ<- summary(model2) > model2.anv<- anova(model2) > # same regression line for all groups > model3<- lm(lgVar ~ lgCov) > model3.summ<- summary(model3) > model3.anv<- anova(model3) > compare<- anova(model1, model2, model3) # compare all models > # plots > par(mfcol=c(1,2)) > boxplot(lgVar ~ group, col="darkgoldenrod1") > # plot the variate and covariate by group > plot(lgVar ~ lgCov, pch=as.numeric(group), col=as.numeric(group)) > legend("topleft", inset=0, legend=as.character(unique(group)), > col=as.numeric(unique(group)), > pch=as.numeric(unique(group)), pt.cex=1.5) > abline(model1) # Need separate regression lines here > list(model_1_summary=model1.summ, model_1_ANOVA=model1.anv, > model_2_summary=model2.summ, > model_2_ANOVA=model2.anv, model_3_summary=model3.summ, > model_3_ANOVA=model3.anv, model_comparison=compare) > } > > # call function > anc(beak.lgth, mass, species) > # ==========================================================>-- Peter Ehlers University of Calgary