Hello R experts,
I have having a difficult time figuring out how to perform and interpret an
ANCOVA of my nested experimental data and would love any suggestions that you
might have.
Here is the deal:
1) I have twelve tanks of fish (1-12), each with a bunch of fish in them
2) I have three treatments (1-3); 4 tanks per treatment. (each tank only has one
treatment applied to it)
3) I sampled multiple fish from each tank (1-3) and would like to nest my tanks
within each treatment (i.e. four tanks nested in treatment 1, four tanks in
treatment 2 and four tanks in treatment 3) in order to account for any
additional variance due to random tank effects within a treatment.
4) The dependent variable in this case is the AREA of an anatomical structure,
which is proportional to body length (the covariate).
5) Here is a simplified example of my data-frame structure and the code to
generate it. I have re-run the following analysis on this example data set for
this email:
treatment<-c(rep(1:3,each=12))
tank<-c(rep(1:12,each=3))
fish<-c(rep(1:3, each=1, times=12))
length<-c(runif(36,15,25))
area<-c(length[1:12]*10+10,length[13:24]*10+20,length[25:36]*10)
fishdata<-data.frame(treatment,tank,fish,length,area);fishdata
# treatment tank fish length area
#1 1 1 1 16.71511 177.1511
#2 1 1 2 21.95281 229.5281
#3 1 1 3 20.39821 213.9821
#4 1 2 1 23.67241 246.7241
#5 1 2 2 17.35663 183.5663
#6 1 2 3 21.61087 226.1087
#7 1 3 1 21.20769 222.0769
#8 1 3 2 20.14224 211.4224
#9 1 3 3 18.50520 195.0520
#10 1 4 1 21.57603 225.7603
#11 1 4 2 20.27112 212.7112
#12 1 4 3 22.78739 237.8739
#13 2 5 1 19.80727 218.0727
#14 2 5 2 15.11602 171.1602
#...and so on...
plot(length,area)
I have tried the following code to run the ANCOVA with the nested design. Area
is the dependent, treatment is a fixed factor, length is a covariate of area,
and tank is nested within treatment.
QUESTION 1. Have I written this code correctly? I a little confused if it should
be tank/treatment of treatment/tank. From what I understand, this part is
testing the equality of the slopes for each treatment group (the interaction
term).
RSslopes<-aov(area~length*treatment+Error(tank/treatment),data=fishdata)
summary(RSslopes)
#Error: tank
# Df Sum Sq Mean Sq
#length 1 753.6 753.6
#
#Error: tank:treatment
# Df Sum Sq Mean Sq
#length 1 4633 4633
#
#Error: Within
# Df Sum Sq Mean Sq F value Pr(>F)
#length 1 25282 25282 2254.451 < 2e-16 ***
#treatment 1 327 327 29.192 7.47e-06 ***
#length:treatment 1 5 5 0.461 0.502
#Residuals 30 336 11
#---
#Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
These results show no significant difference in slopes (P=0.502), so I interpret
that as conforming to the assumption of equal slopes in order to run the actual
ANCOVA and test for a treatment effect.
QUESTION 2. How do I test to see if there is a tank effect at this point,?
I'm not sure which Mean Sq value to test over the residuals, or if I should
wait to do that until the next test?
This next part removes the interaction term to just test the treatment effect
while removing the effect of length. right?
RSancova<-aov(area~length+treatment+Error(tank/treatment),data=fishdata)
summary(RSancova)
#Error: tank
# Df Sum Sq Mean Sq
#length 1 753.6 753.6
#
#Error: tank:treatment
# Df Sum Sq Mean Sq
#length 1 4633 4633
#
#Error: Within
# Df Sum Sq Mean Sq F value Pr(>F)
#length 1 25282 25282 2294.34 < 2e-16 ***
#treatment 1 327 327 29.71 5.9e-06 ***
#Residuals 31 342 11
#---
#Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
So I think this tells me that I do have a treatment effect...but I still would
like to know how to arrange the Mean Sq for the F-test for tank effect.
Now I test to see if removing the interaction term significantly affects the fit
of the model...but this part doesn't work with a nested design!! (it does
work if I do the exact same thing without the nested term.
I get the following:
anova(RSslopes,RSancova)
#Error in UseMethod("anova") :
# no applicable method for 'anova' applied to an object of class
"c('aovlist', 'listof')"
This has something to do with the nested design putting out an aov list, I
think.
Now I really lose it....because I know how to perform a Tukey HSD test to
compare each combination of treatments if I had a regular non-nested ANCOVA, but
when I try it with these tests it can't run...
HSD.test(RSancova,"treatment")
Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors =
stringsAsFactors) :
cannot coerce class 'c("aovlist", "listof")' into
a data.frame
QUESTION 3: any tips at this point to get either of these to work?
I have also tried another method that I found on the help list archive:
require(MASS) ## for oats data set
require(nlme)## for lme()
require(multcomp)## for multiple comparison stuff
Aov.mod <- aov(area~ length.mm+ treatment + Error(tank/treatment), data =
d.sag.right)
Lme.mod <- lme(area ~ length.mm+ treatment, random = ~1 | tank/treatment,
data = d.sag.right)
summary(Aov.mod)
#Error: tank
# Df Sum Sq Mean Sq
#length.mm 1 8181085 8181085
#
#Error: tank:treatment
# Df Sum Sq Mean Sq
#length.mm 1 1822796 1822796
#treatment 1 11304871 11304871
#
#Error: Within
# Df Sum Sq Mean Sq F value Pr(>F)
#length.mm 1 66978274 66978274 161.96 < 2e-16 ***
#treatment 2 8871292 4435646 10.73 0.000106 ***
#Residuals 59 24399419 413549
#---
#Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
anova(Lme.mod)
# numDF denDF F-value p-value
#(Intercept) 1 54 4207.021 <.0001
#length.mm 1 54 183.772 <.0001
#treatment 2 8 9.779 0.0071
summary(Lme.mod)
#Linear mixed-effects model fit by REML
# Data: d.sag.right
# AIC BIC logLik
# 1011.271 1026.16 -498.6353
#
#Random effects:
# Formula: ~1 | tank
# (Intercept)
#StdDev: 193.3108
#
# Formula: ~1 | treatment %in% tank
# (Intercept) Residual
#StdDev: 193.3063 635.9044
#
#Fixed effects: area ~ length.mm + treatment
# Value Std.Error DF t-value p-value
#(Intercept) -3005.7828 741.2499 54 -4.055019 0.0002
#length.mm 523.9419 37.1477 54 14.104286 0.0000
#treatment1000 588.6912 288.0330 8 2.043832 0.0752
#treatment2000 1283.1145 292.3259 8 4.389329 0.0023
# Correlation:
# (Intr) lngth. tr1000
#length.mm -0.956
#treatment1000 -0.246 0.025
#treatment2000 -0.384 0.173 0.567
#
#Standardized Within-Group Residuals:
# Min Q1 Med Q3 Max
#-2.487003437 -0.594427676 0.004423702 0.445992447 2.463725291
#
#Number of Observations: 66
#Number of Groups:
# tank treatment %in% tank
# 11 11
summary(glht(Lme.mod, linfct=mcp("treatment"="Tukey")))
*** caught bus error ***
address 0x0, cause 'non-existent physical address'
etc. R crash notification...
This part always crashes R...
QUESTION 4. WTF? why does it crash R?
QUESTION 5. How do I interpret the anova and summary of the Lme.mod?
QUESTION 6. Why does the Number of Groups at the bottom say 11? and say
treatment %in% tank...that makes me think that I didn't nest it properly.
Well...I think that is it...I'm confused. please help!!
--Sean
[[alternative HTML version deleted]]