Dear R-users,
I’d like to run a nested ANOVA on my data, and since I’m new to both R and
statistics, I’d like to be sure that I’m not doing something stupid.
Here is my data: I’ve measured some response variables (Y, for example leaf
size) on different plants grown under three different conditions (A, B and C).
These plants come from two different species (sp1 and sp2), and within each
species two or three varieties were tested (var1 and var2 within sp1, and var3
to var5 within sp2). Finally, I have three replicates for each
condition*plant combination, and another factor (month) that
corresponds to the “block” effect. Thus, my data look like this:
month<-c("jun","aug","nov","mar","sep","dec","mar","apr","jul","feb","apr","aug","feb","jun","jul","mar","sep","dec","mar","aug","oct","may","aug","dec","feb","may","nov","feb","apr","oct","may","jul","oct","feb","jun","aug","mar","jun","sep","mar","jun","jul","may","sep","nov")
condition<-c(rep("A",15),rep("B",15),rep("C",15))
species<-c(rep("sp1",6),rep("sp2",9),rep("sp1",6),rep("sp2",9),rep("sp1",6),rep("sp2",9))
variety<-c(rep("var1",3),rep("var2",3),rep("var3",3),rep("var4",3),rep("var5",3),rep("var1",3),rep("var2",3),rep("var3",3),rep("var4",3),rep("var5",3),rep("var1",3),rep("var2",3),rep("var3",3),rep("var4",3),rep("var5",3))
leafsize<-c(12.952971,14.183247,14.894708,12.623053,11.053027,14.062297,5.974159,5.273493,7.450258,6.030390,6.412735,6.867507,6.227527,7.153695,6.414014,19.856307,21.966194,21.445263,18.100480,17.887656,17.444490,11.355896,12.246672,11.462910,12.484537,11.742058,11.937823,16.838480,15.412491,17.789735,11.660008,12.355745,12.963518,10.629601,11.781656,10.693390,5.637602,6.181518,6.853488,8.136201,9.224135,8.309939,10.938328,11.070514,10.965592)
# build the data frame directly so that leafsize stays numeric and the other columns are factors
exple<-data.frame(month,condition,species,variety,leafsize,stringsAsFactors=TRUE)
exple
I’d like to estimate the effects of these different factors (condition, species,
variety within species, the condition*species interaction, the interaction
between condition and variety within species, and finally month) on my response
variable, that is to say:
Y = condition + species + variety(species) + condition*species
    + condition*variety(species) + month + residual
The anova I wrote is the following:
anova(lm(leafsize~condition*(species/variety)+month,data=exple))
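To see how R reads this formula, the terms can be listed without fitting anything. This is only a minimal sketch: `f` and `term_labels` are names chosen for the illustration, and the check needs no data.

```r
# Expand the model formula into its individual terms (no data needed)
f <- leafsize ~ condition * (species / variety) + month
term_labels <- attr(terms(f), "term.labels")

# Each label is one row of the resulting anova table;
# "species:variety" is how R writes "variety within species"
print(term_labels)
```

If the listed terms match the model written above (with `species:variety` standing for variety(species)), the formula is at least asking for the intended effects.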
I’ve noticed that, depending on the order of my factors in my formula, the anova
result varies, for example:
anova(lm(leafsize~month+condition*(species/variety),data=exple))
gives slightly different results. If I understand correctly, this is because
anova() calculates type I (sequential) sums of squares. I would therefore like
to calculate type II sums of squares, and I tried Anova() from the car package:
library(car)
Anova( lm(leafsize~month+condition*(species/variety),data=exple))
I get the following error:
“One or more terms aliased in model”.
Whereas if I remove the nesting, the type II anova works, for example:
Anova( lm(leafsize~month+condition*variety,data=exple))
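The “aliased” message may be easier to investigate on the bare design than on the full data. The following is a minimal sketch under stated assumptions: it rebuilds only the condition/species/variety layout described above, leaves out month, and uses a random placeholder response `y` (not the real leaf sizes). The names `design`, `fit` and `aliased` are illustrative only.

```r
# Rebuild only the design: 3 conditions x 5 varieties x 3 replicates
design <- data.frame(
  condition = rep(c("A", "B", "C"), each = 15),
  species   = rep(rep(c("sp1", "sp2"), times = c(6, 9)), times = 3),
  variety   = rep(rep(paste0("var", 1:5), each = 3), times = 3),
  stringsAsFactors = TRUE
)

# Varieties are labelled uniquely across species, so cells such as
# sp1/var3 or sp2/var1 contain no observations
print(table(design$species, design$variety))

# Placeholder response (random numbers, NOT measured data), only so lm() can run
set.seed(1)
design$y <- rnorm(nrow(design))

fit <- lm(y ~ condition * (species / variety), data = design)

# Coefficients reported as NA are the ones lm() could not estimate (aliased)
aliased <- names(coef(fit))[is.na(coef(fit))]
print(aliased)
```

If NA coefficients show up here, they would correspond to species:variety combinations that do not exist in the design, which might be what Anova() is complaining about.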
My questions are:
1- Is my nested ANOVA model appropriate (i.e. does it actually estimate the
effects of the factors as I hope it does)?
2- Why can’t I run my nested ANOVA with type II sums of squares?
I apologize if my knowledge of statistics is so weak that I’ve done or asked
something stupid…
Many thanks in advance for time and consideration.
Olga