Does someone out there have an example of R-code for a split-split plot ANOVA using aov or another function? The design is not balanced. I never set up one in R before and it would be nice to see an example before I tackle a very complex design I have to model. Thanks, Mike Mike Saunders Research Assistant Forest Ecosystem Research Program Department of Forest Ecosystem Sciences University of Maine Orono, ME 04469 207-581-2763 (O) 207-581-4257 (F) [[alternative HTML version deleted]]
Hi Mike, *An example of the use of aov() for a split-plot is in MASS library(MASS) example(Oats) The book also gives a detailed explanation *pp 45-52 of the Pinheiro and Bates book gives you an example of the use of lme() on a split-plot. If you have a non balanced design, lme() from the nlme library might be a better tool than aov(). Also, if you have the lme4 library installed you'll have a lot more flexibility on the formulation of your random effects. regards, Jesus -------------------------------------------------------------- Jes?s Mar?a Fr?as Celayeta School of Food Sci. and Env. Health. Faculty of Tourism and Food Dublin Institute of Technology Cathal Brugha St., Dublin 1. Ireland t +353 1 4024459 f +353 1 4024495 w www.dit.ie/DIT/tourismfood/science/staff/frias.html --------------------------------------------------------------> -----Original Message----- > From: r-help-bounces at stat.math.ethz.ch > [mailto:r-help-bounces at stat.math.ethz.ch]On Behalf Of Mike Saunders > Sent: 01 February 2005 14:36 > To: R Help > Subject: [R] Split-split plot ANOVA > > > Does someone out there have an example of R-code for a > split-split plot ANOVA using aov or another function? The design > is not balanced. I never set up one in R before and it would be > nice to see an example before I tackle a very complex design I > have to model. > > Thanks, > Mike > > Mike Saunders > Research Assistant > Forest Ecosystem Research Program > Department of Forest Ecosystem Sciences > University of Maine > Orono, ME 04469 > 207-581-2763 (O) > 207-581-4257 (F) > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide!http://www.R-project.org/posting-guide.html -- This message has been scanned for content and viruses by the DIT Information Services MailScanner Service, and is believed to be clean. http://www.dit.ie -- This message has been scanned for content and viruses by the DIT Information Services MailScanner Service, and is believed to be clean. http://www.dit.ie
> I have been going over and over the examples in MASS > and the Pinheiro and Bates example, but cannot get my model > to run correctly with either aov or lme. > > Could someone give me a hand with the correct model statement?It would help to see some of the things you have tried already ...> First a description of the design. We are studying > germination rates for > various species under a variety of treaments. This is a > blocked split-split > plot design. The levels and treatments are: > > Blocks: 1-6 > > Whole plot treatment: > Overstory: Yes or No > > Split plot treatments: > Caging (to protect against seed predators): Yes or No > Herbaceous competition (i.e., grass): Yes or No > > Split-split plot treatment: > Tree species: 7 kinds > > The response variable is Lag, which is a indication of when > the seeds first germinated.I would try somthing like lme (fixed= Lag ~ Caging + herbaceous + tree, data= your.data, random= ~ 1 | Overstory/split/splitsplit) Perhaps you want/need to add some interactions as well. Overstory, split and splitsplit would be factors with specific levels for each of the plots, split plots and split-split plots, respectively. Thus what I attempted here is to separate the variables of the hierarchical design of data gathering (which go into the random effects) and the treatments (which go into the fixed effects). The degrees of freedom for the fixed effects are automatically adjusted to the correct level in the hierarchy. Did you try that? What did not work out with it?> Lastly, I have unbalanced data since some treatment > combinations never had any germination.In principle, the REML estimates in lme are not effected by unbalanced data. BUT I do not think that the missing germinations by themselves lead to an unbalanced data set: I assume it is informative that in some treatment combinations there was no germination. Thus, your lag there is something close to infinity (or at least longer than you cared to wait ;-). Thus, I would argue you have to somehow include these data points as well, otherwise you can only make a very restricted statement of the kind: if there was germination, this depended on such and such.> Since the data are highly nonnormal, I hope to do a > permutations test on the F-values for each main effect and > interaction in order to get my p-values.As these are durations a log transformation of your response might be enough. Regards, Lorenz - Lorenz Gygax, Dr. sc. nat. Centre for proper housing of ruminants and pigs Swiss Federal Veterinary Office agroscope FAT T?nikon, CH-8356 Ettenhausen / Switzerland
Hi Mike, Do you have a schematic drawing of how exactly your treatments were applied? In split-plot experiments, it is generally very important to clearly define the sequence of plot sizes, because if you don?t do this properly, then the output will be confusing. Checking if your degrees of freedom at each level are correct should give you a good idea about whether you?ve specified the model in the right way. Generally, I see some problem with your model specification as you seem to have two (not one) treatments in some of your subplots. If I got it right, the sequence of terms should be something like Block/Whole.plot/Caging/Competition/Species at least if it?s a full split-plot. Can you send me some more details on the design? Regards, Christoph Lorenz.Gygax at fat.admin.ch wrote:>>I have been going over and over the examples in MASS >>and the Pinheiro and Bates example, but cannot get my model >>to run correctly with either aov or lme. >> >>Could someone give me a hand with the correct model statement? >> >> > >It would help to see some of the things you have tried already ... > > > >>First a description of the design. We are studying >>germination rates for >>various species under a variety of treaments. This is a >>blocked split-split >>plot design. The levels and treatments are: >> >>Blocks: 1-6 >> >>Whole plot treatment: >> Overstory: Yes or No >> >>Split plot treatments: >> Caging (to protect against seed predators): Yes or No >> Herbaceous competition (i.e., grass): Yes or No >> >>Split-split plot treatment: >> Tree species: 7 kinds >> >>The response variable is Lag, which is a indication of when >>the seeds first germinated. >> >> > >I would try somthing like > >lme (fixed= Lag ~ Caging + herbaceous + tree, > data= your.data, > random= ~ 1 | Overstory/split/splitsplit) > >Perhaps you want/need to add some interactions as well. Overstory, split and >splitsplit would be factors with specific levels for each of the plots, >split plots and split-split plots, respectively. > >Thus what I attempted here is to separate the variables of the hierarchical >design of data gathering (which go into the random effects) and the >treatments (which go into the fixed effects). > >The degrees of freedom for the fixed effects are automatically adjusted to >the correct level in the hierarchy. > >Did you try that? What did not work out with it? > > > >>Lastly, I have unbalanced data since some treatment >>combinations never had any germination. >> >> > >In principle, the REML estimates in lme are not effected by unbalanced data. > >BUT I do not think that the missing germinations by themselves lead to an >unbalanced data set: I assume it is informative that in some treatment >combinations there was no germination. Thus, your lag there is something >close to infinity (or at least longer than you cared to wait ;-). Thus, I >would argue you have to somehow include these data points as well, otherwise >you can only make a very restricted statement of the kind: if there was >germination, this depended on such and such. > > > >>Since the data are highly nonnormal, I hope to do a >>permutations test on the F-values for each main effect and >>interaction in order to get my p-values. >> >> > >As these are durations a log transformation of your response might be >enough. > >Regards, Lorenz >- >Lorenz Gygax, Dr. sc. nat. >Centre for proper housing of ruminants and pigs >Swiss Federal Veterinary Office >agroscope FAT T?nikon, CH-8356 Ettenhausen / Switzerland > >______________________________________________ >R-help at stat.math.ethz.ch mailing list >https://stat.ethz.ch/mailman/listinfo/r-help >PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html > > >
Christoph and others:
I did succeed, but it was not straightforward. I ended up using the
formula:
mod<-aov(Lag~Ov*Pr*He*Sp +
Error(Bl/Ov/Pr:He),data=dat,na.action=na.omit)
and repartioning the sums of squares into the proper factors. I am still
not entirely sure I did it correctly or why R split the sums of squares up the
way it did, but I know I am very close. I suspect that the contrasts need to be
set differently.
I should also note that the R output is rather strange in that I had to do
a lot of searching to find which terms were actually error terms and which ones
were model terms. An example of the output from R shows this:
summary(mod)
Error: Bl
Df Sum Sq Mean Sq
Ov 1 8.466 8.466
Pr 1 10.845 10.845
He 1 72.015 72.015
Sp 2 310.009 155.005
Error: Bl:Ov
Df Sum Sq Mean Sq
Ov 1 1783.17 1783.17
Pr 1 9.70 9.70
He 1 194.48 194.48
Sp 3 66.80 22.27
Error: Bl:Ov:Pr:He
Df Sum Sq Mean Sq F value Pr(>F)
Pr 1 303.01 303.01 2.8526 0.1119
He 1 253.35 253.35 2.3851 0.1433
Sp 6 1119.67 186.61 1.7568 0.1759
Ov:Pr 1 307.52 307.52 2.8952 0.1095
Ov:He 1 88.39 88.39 0.8321 0.3761
Pr:He 1 7.64 7.64 0.0719 0.7922
Ov:Sp 6 285.01 47.50 0.4472 0.8359
Pr:Sp 2 102.27 51.14 0.4814 0.6271
He:Sp 1 43.78 43.78 0.4121 0.5306
Ov:Pr:He 1 60.07 60.07 0.5656 0.4637
Residuals 15 1593.30 106.22
Error: Within
Df Sum Sq Mean Sq F value Pr(>F)
Sp 6 23448.5 3908.1 54.1807 <2e-16 ***
Ov:Sp 6 466.6 77.8 1.0781 0.3766
Pr:Sp 6 183.3 30.6 0.4236 0.8628
He:Sp 6 571.2 95.2 1.3199 0.2494
Ov:Pr:Sp 6 360.4 60.1 0.8328 0.5457
Ov:He:Sp 6 47.6 7.9 0.1100 0.9952
Pr:He:Sp 6 384.4 64.1 0.8881 0.5044
Ov:Pr:He:Sp 6 291.2 48.5 0.6727 0.6718
Residuals 215 15508.1 72.1
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 `
' 1
It was relatively straightforward to do the repartioning most of the time;
occasionally I couldn't figure it out, however. Those crossed factors at
the split-plot level really give me fits sometimes.
Anyway, thanks everyone for the help.
Mike
-----------------------
Dear Mike,
Have you succeeded with your split-plot ANOVA?
Best wishes,
Christoph
Mike Saunders wrote:
Christoph,
Thanks for the help. I think the place where I am having the most
problems is the crossed factors at the split-plot level. A synopsis of the
design follows:
Blocks: There are 6 blocks of treatments spread across a site.
Each block is complete in design (Block = 1-6).
Whole plot: Each block has two plots, one of which has the residual
tree overstory removed and the other intact (Overstory = yes or no).
Split plot: In each whole plot, there are 4 planting grids for the
seeds. A 2 x 2 factorial, at this level, of seed predator control (Caging = yes
or no) and herbaceous competition, i.e., grass seed (Herb = yes or no). Spatial
placement of grids in the whole plot was random and assignment of caging and
herbaceous competition treatments to each grid were random, with the restriction
that each of the four treatment combinations appeared only once. [Note the two
factors are crossed at this level and this has been giving me the most problems
when setting up the model statement].
Split-split plot: Each planting grid is a 3 x 3 array of planting
areas (individual planting areas are about 15 cm x 15 cm). Nine different tree
species (Species = 1 - 9) were randomly assigned to each planting area and 25
seeds planted in each area.
Time to germination, total germination (proportion of the 25) and
rate of germination are response variables of interest.
I am interested in the main effects Overstory, Caging, Herb, and
Species as well as all interactions (although I might throw out 3- and 4-way
interactions later).
A drawing is attached as a pdf.
Thanks in advance for any help you can provide.
Mike
Mike Saunders
Research Assistant
Forest Ecosystem Research Program
Department of Forest Ecosystem Sciences
University of Maine
Orono, ME 04469
207-581-2763 (O)
207-581-4257 (F)
----- Original Message ----- From: "Christoph Scherber"
<Christoph.Scherber@uni-jena.de>
To: <mike_saunders@umenfa.maine.edu>
Cc: <r-help@hypatia.math.ethz.ch>
Sent: Thursday, February 03, 2005 5:40 AM
Subject: Re: [R] Split-split plot ANOVA
Hi Mike,
Do you have a schematic drawing of how exactly your treatments were
applied? In split-plot experiments, it is generally very important
to
clearly define the sequence of plot sizes, because if you don´t do
this
properly, then the output will be confusing. Checking if your
degrees of
freedom at each level are correct should give you a good idea about
whether you´ve specified the model in the right way.
Generally, I see some problem with your model specification as you
seem
to have two (not one) treatments in some of your subplots.
If I got it right, the sequence of terms should be something like
Block/Whole.plot/Caging/Competition/Species
at least if it´s a full split-plot.
Can you send me some more details on the design?
Regards,
Christoph
Lorenz.Gygax@fat.admin.ch wrote:
I have been going over and over the examples in MASS
and the Pinheiro and Bates example, but cannot get my model to run correctly
with either aov or lme.
Could someone give me a hand with the correct model
statement?
It would help to see some of the things you have tried already
...
First a description of the design. We are studying
germination rates for various species under a variety of treaments. This is a
blocked split-split plot design. The levels and treatments are:
Blocks: 1-6
Whole plot treatment:
Overstory: Yes or No
Split plot treatments:
Caging (to protect against seed predators): Yes or No
Herbaceous competition (i.e., grass): Yes or No
Split-split plot treatment:
Tree species: 7 kinds
The response variable is Lag, which is a indication of
when the seeds first germinated.
I would try somthing like
lme (fixed= Lag ~ Caging + herbaceous + tree,
data= your.data,
random= ~ 1 | Overstory/split/splitsplit)
Perhaps you want/need to add some interactions as well.
Overstory, split and
splitsplit would be factors with specific levels for each of
the plots,
split plots and split-split plots, respectively.
Thus what I attempted here is to separate the variables of the
hierarchical
design of data gathering (which go into the random effects)
and the
treatments (which go into the fixed effects).
The degrees of freedom for the fixed effects are automatically
adjusted to
the correct level in the hierarchy.
Did you try that? What did not work out with it?
Lastly, I have unbalanced data since some treatment
combinations never had any germination.
In principle, the REML estimates in lme are not effected by
unbalanced data.
BUT I do not think that the missing germinations by themselves
lead to an
unbalanced data set: I assume it is informative that in some
treatment
combinations there was no germination. Thus, your lag there is
something
close to infinity (or at least longer than you cared to wait
;-). Thus, I
would argue you have to somehow include these data points as
well, otherwise
you can only make a very restricted statement of the kind: if
there was
germination, this depended on such and such.
Since the data are highly nonnormal, I hope to do a
permutations test on the F-values for each main effect and interaction in order
to get my p-values.
As these are durations a log transformation of your response
might be
enough.
Regards, Lorenz
- Lorenz Gygax, Dr. sc. nat.
Centre for proper housing of ruminants and pigs
Swiss Federal Veterinary Office
agroscope FAT Tänikon, CH-8356 Ettenhausen / Switzerland
______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html
Mike Saunders
Research Assistant
Forest Ecosystem Research Program
Department of Forest Ecosystem Sciences
University of Maine
Orono, ME 04469
207-581-2763 (O)
207-581-4257 (F)
[[alternative HTML version deleted]]