Does someone out there have an example of R-code for a split-split plot ANOVA using aov or another function? The design is not balanced. I never set up one in R before and it would be nice to see an example before I tackle a very complex design I have to model. Thanks, Mike Mike Saunders Research Assistant Forest Ecosystem Research Program Department of Forest Ecosystem Sciences University of Maine Orono, ME 04469 207-581-2763 (O) 207-581-4257 (F) [[alternative HTML version deleted]]
Hi Mike, *An example of the use of aov() for a split-plot is in MASS library(MASS) example(Oats) The book also gives a detailed explanation *pp 45-52 of the Pinheiro and Bates book gives you an example of the use of lme() on a split-plot. If you have a non balanced design, lme() from the nlme library might be a better tool than aov(). Also, if you have the lme4 library installed you'll have a lot more flexibility on the formulation of your random effects. regards, Jesus -------------------------------------------------------------- Jes?s Mar?a Fr?as Celayeta School of Food Sci. and Env. Health. Faculty of Tourism and Food Dublin Institute of Technology Cathal Brugha St., Dublin 1. Ireland t +353 1 4024459 f +353 1 4024495 w www.dit.ie/DIT/tourismfood/science/staff/frias.html --------------------------------------------------------------> -----Original Message----- > From: r-help-bounces at stat.math.ethz.ch > [mailto:r-help-bounces at stat.math.ethz.ch]On Behalf Of Mike Saunders > Sent: 01 February 2005 14:36 > To: R Help > Subject: [R] Split-split plot ANOVA > > > Does someone out there have an example of R-code for a > split-split plot ANOVA using aov or another function? The design > is not balanced. I never set up one in R before and it would be > nice to see an example before I tackle a very complex design I > have to model. > > Thanks, > Mike > > Mike Saunders > Research Assistant > Forest Ecosystem Research Program > Department of Forest Ecosystem Sciences > University of Maine > Orono, ME 04469 > 207-581-2763 (O) > 207-581-4257 (F) > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide!http://www.R-project.org/posting-guide.html -- This message has been scanned for content and viruses by the DIT Information Services MailScanner Service, and is believed to be clean. http://www.dit.ie -- This message has been scanned for content and viruses by the DIT Information Services MailScanner Service, and is believed to be clean. http://www.dit.ie
> I have been going over and over the examples in MASS > and the Pinheiro and Bates example, but cannot get my model > to run correctly with either aov or lme. > > Could someone give me a hand with the correct model statement?It would help to see some of the things you have tried already ...> First a description of the design. We are studying > germination rates for > various species under a variety of treaments. This is a > blocked split-split > plot design. The levels and treatments are: > > Blocks: 1-6 > > Whole plot treatment: > Overstory: Yes or No > > Split plot treatments: > Caging (to protect against seed predators): Yes or No > Herbaceous competition (i.e., grass): Yes or No > > Split-split plot treatment: > Tree species: 7 kinds > > The response variable is Lag, which is a indication of when > the seeds first germinated.I would try somthing like lme (fixed= Lag ~ Caging + herbaceous + tree, data= your.data, random= ~ 1 | Overstory/split/splitsplit) Perhaps you want/need to add some interactions as well. Overstory, split and splitsplit would be factors with specific levels for each of the plots, split plots and split-split plots, respectively. Thus what I attempted here is to separate the variables of the hierarchical design of data gathering (which go into the random effects) and the treatments (which go into the fixed effects). The degrees of freedom for the fixed effects are automatically adjusted to the correct level in the hierarchy. Did you try that? What did not work out with it?> Lastly, I have unbalanced data since some treatment > combinations never had any germination.In principle, the REML estimates in lme are not effected by unbalanced data. BUT I do not think that the missing germinations by themselves lead to an unbalanced data set: I assume it is informative that in some treatment combinations there was no germination. Thus, your lag there is something close to infinity (or at least longer than you cared to wait ;-). Thus, I would argue you have to somehow include these data points as well, otherwise you can only make a very restricted statement of the kind: if there was germination, this depended on such and such.> Since the data are highly nonnormal, I hope to do a > permutations test on the F-values for each main effect and > interaction in order to get my p-values.As these are durations a log transformation of your response might be enough. Regards, Lorenz - Lorenz Gygax, Dr. sc. nat. Centre for proper housing of ruminants and pigs Swiss Federal Veterinary Office agroscope FAT T?nikon, CH-8356 Ettenhausen / Switzerland
Hi Mike, Do you have a schematic drawing of how exactly your treatments were applied? In split-plot experiments, it is generally very important to clearly define the sequence of plot sizes, because if you don?t do this properly, then the output will be confusing. Checking if your degrees of freedom at each level are correct should give you a good idea about whether you?ve specified the model in the right way. Generally, I see some problem with your model specification as you seem to have two (not one) treatments in some of your subplots. If I got it right, the sequence of terms should be something like Block/Whole.plot/Caging/Competition/Species at least if it?s a full split-plot. Can you send me some more details on the design? Regards, Christoph Lorenz.Gygax at fat.admin.ch wrote:>>I have been going over and over the examples in MASS >>and the Pinheiro and Bates example, but cannot get my model >>to run correctly with either aov or lme. >> >>Could someone give me a hand with the correct model statement? >> >> > >It would help to see some of the things you have tried already ... > > > >>First a description of the design. We are studying >>germination rates for >>various species under a variety of treaments. This is a >>blocked split-split >>plot design. The levels and treatments are: >> >>Blocks: 1-6 >> >>Whole plot treatment: >> Overstory: Yes or No >> >>Split plot treatments: >> Caging (to protect against seed predators): Yes or No >> Herbaceous competition (i.e., grass): Yes or No >> >>Split-split plot treatment: >> Tree species: 7 kinds >> >>The response variable is Lag, which is a indication of when >>the seeds first germinated. >> >> > >I would try somthing like > >lme (fixed= Lag ~ Caging + herbaceous + tree, > data= your.data, > random= ~ 1 | Overstory/split/splitsplit) > >Perhaps you want/need to add some interactions as well. Overstory, split and >splitsplit would be factors with specific levels for each of the plots, >split plots and split-split plots, respectively. > >Thus what I attempted here is to separate the variables of the hierarchical >design of data gathering (which go into the random effects) and the >treatments (which go into the fixed effects). > >The degrees of freedom for the fixed effects are automatically adjusted to >the correct level in the hierarchy. > >Did you try that? What did not work out with it? > > > >>Lastly, I have unbalanced data since some treatment >>combinations never had any germination. >> >> > >In principle, the REML estimates in lme are not effected by unbalanced data. > >BUT I do not think that the missing germinations by themselves lead to an >unbalanced data set: I assume it is informative that in some treatment >combinations there was no germination. Thus, your lag there is something >close to infinity (or at least longer than you cared to wait ;-). Thus, I >would argue you have to somehow include these data points as well, otherwise >you can only make a very restricted statement of the kind: if there was >germination, this depended on such and such. > > > >>Since the data are highly nonnormal, I hope to do a >>permutations test on the F-values for each main effect and >>interaction in order to get my p-values. >> >> > >As these are durations a log transformation of your response might be >enough. > >Regards, Lorenz >- >Lorenz Gygax, Dr. sc. nat. >Centre for proper housing of ruminants and pigs >Swiss Federal Veterinary Office >agroscope FAT T?nikon, CH-8356 Ettenhausen / Switzerland > >______________________________________________ >R-help at stat.math.ethz.ch mailing list >https://stat.ethz.ch/mailman/listinfo/r-help >PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html > > >
Christoph and others: I did succeed, but it was not straightforward. I ended up using the formula: mod<-aov(Lag~Ov*Pr*He*Sp + Error(Bl/Ov/Pr:He),data=dat,na.action=na.omit) and repartioning the sums of squares into the proper factors. I am still not entirely sure I did it correctly or why R split the sums of squares up the way it did, but I know I am very close. I suspect that the contrasts need to be set differently. I should also note that the R output is rather strange in that I had to do a lot of searching to find which terms were actually error terms and which ones were model terms. An example of the output from R shows this: summary(mod) Error: Bl Df Sum Sq Mean Sq Ov 1 8.466 8.466 Pr 1 10.845 10.845 He 1 72.015 72.015 Sp 2 310.009 155.005 Error: Bl:Ov Df Sum Sq Mean Sq Ov 1 1783.17 1783.17 Pr 1 9.70 9.70 He 1 194.48 194.48 Sp 3 66.80 22.27 Error: Bl:Ov:Pr:He Df Sum Sq Mean Sq F value Pr(>F) Pr 1 303.01 303.01 2.8526 0.1119 He 1 253.35 253.35 2.3851 0.1433 Sp 6 1119.67 186.61 1.7568 0.1759 Ov:Pr 1 307.52 307.52 2.8952 0.1095 Ov:He 1 88.39 88.39 0.8321 0.3761 Pr:He 1 7.64 7.64 0.0719 0.7922 Ov:Sp 6 285.01 47.50 0.4472 0.8359 Pr:Sp 2 102.27 51.14 0.4814 0.6271 He:Sp 1 43.78 43.78 0.4121 0.5306 Ov:Pr:He 1 60.07 60.07 0.5656 0.4637 Residuals 15 1593.30 106.22 Error: Within Df Sum Sq Mean Sq F value Pr(>F) Sp 6 23448.5 3908.1 54.1807 <2e-16 *** Ov:Sp 6 466.6 77.8 1.0781 0.3766 Pr:Sp 6 183.3 30.6 0.4236 0.8628 He:Sp 6 571.2 95.2 1.3199 0.2494 Ov:Pr:Sp 6 360.4 60.1 0.8328 0.5457 Ov:He:Sp 6 47.6 7.9 0.1100 0.9952 Pr:He:Sp 6 384.4 64.1 0.8881 0.5044 Ov:Pr:He:Sp 6 291.2 48.5 0.6727 0.6718 Residuals 215 15508.1 72.1 --- Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1 It was relatively straightforward to do the repartioning most of the time; occasionally I couldn't figure it out, however. Those crossed factors at the split-plot level really give me fits sometimes. Anyway, thanks everyone for the help. Mike ----------------------- Dear Mike, Have you succeeded with your split-plot ANOVA? Best wishes, Christoph Mike Saunders wrote: Christoph, Thanks for the help. I think the place where I am having the most problems is the crossed factors at the split-plot level. A synopsis of the design follows: Blocks: There are 6 blocks of treatments spread across a site. Each block is complete in design (Block = 1-6). Whole plot: Each block has two plots, one of which has the residual tree overstory removed and the other intact (Overstory = yes or no). Split plot: In each whole plot, there are 4 planting grids for the seeds. A 2 x 2 factorial, at this level, of seed predator control (Caging = yes or no) and herbaceous competition, i.e., grass seed (Herb = yes or no). Spatial placement of grids in the whole plot was random and assignment of caging and herbaceous competition treatments to each grid were random, with the restriction that each of the four treatment combinations appeared only once. [Note the two factors are crossed at this level and this has been giving me the most problems when setting up the model statement]. Split-split plot: Each planting grid is a 3 x 3 array of planting areas (individual planting areas are about 15 cm x 15 cm). Nine different tree species (Species = 1 - 9) were randomly assigned to each planting area and 25 seeds planted in each area. Time to germination, total germination (proportion of the 25) and rate of germination are response variables of interest. I am interested in the main effects Overstory, Caging, Herb, and Species as well as all interactions (although I might throw out 3- and 4-way interactions later). A drawing is attached as a pdf. Thanks in advance for any help you can provide. Mike Mike Saunders Research Assistant Forest Ecosystem Research Program Department of Forest Ecosystem Sciences University of Maine Orono, ME 04469 207-581-2763 (O) 207-581-4257 (F) ----- Original Message ----- From: "Christoph Scherber" <Christoph.Scherber@uni-jena.de> To: <mike_saunders@umenfa.maine.edu> Cc: <r-help@hypatia.math.ethz.ch> Sent: Thursday, February 03, 2005 5:40 AM Subject: Re: [R] Split-split plot ANOVA Hi Mike, Do you have a schematic drawing of how exactly your treatments were applied? In split-plot experiments, it is generally very important to clearly define the sequence of plot sizes, because if you don´t do this properly, then the output will be confusing. Checking if your degrees of freedom at each level are correct should give you a good idea about whether you´ve specified the model in the right way. Generally, I see some problem with your model specification as you seem to have two (not one) treatments in some of your subplots. If I got it right, the sequence of terms should be something like Block/Whole.plot/Caging/Competition/Species at least if it´s a full split-plot. Can you send me some more details on the design? Regards, Christoph Lorenz.Gygax@fat.admin.ch wrote: I have been going over and over the examples in MASS and the Pinheiro and Bates example, but cannot get my model to run correctly with either aov or lme. Could someone give me a hand with the correct model statement? It would help to see some of the things you have tried already ... First a description of the design. We are studying germination rates for various species under a variety of treaments. This is a blocked split-split plot design. The levels and treatments are: Blocks: 1-6 Whole plot treatment: Overstory: Yes or No Split plot treatments: Caging (to protect against seed predators): Yes or No Herbaceous competition (i.e., grass): Yes or No Split-split plot treatment: Tree species: 7 kinds The response variable is Lag, which is a indication of when the seeds first germinated. I would try somthing like lme (fixed= Lag ~ Caging + herbaceous + tree, data= your.data, random= ~ 1 | Overstory/split/splitsplit) Perhaps you want/need to add some interactions as well. Overstory, split and splitsplit would be factors with specific levels for each of the plots, split plots and split-split plots, respectively. Thus what I attempted here is to separate the variables of the hierarchical design of data gathering (which go into the random effects) and the treatments (which go into the fixed effects). The degrees of freedom for the fixed effects are automatically adjusted to the correct level in the hierarchy. Did you try that? What did not work out with it? Lastly, I have unbalanced data since some treatment combinations never had any germination. In principle, the REML estimates in lme are not effected by unbalanced data. BUT I do not think that the missing germinations by themselves lead to an unbalanced data set: I assume it is informative that in some treatment combinations there was no germination. Thus, your lag there is something close to infinity (or at least longer than you cared to wait ;-). Thus, I would argue you have to somehow include these data points as well, otherwise you can only make a very restricted statement of the kind: if there was germination, this depended on such and such. Since the data are highly nonnormal, I hope to do a permutations test on the F-values for each main effect and interaction in order to get my p-values. As these are durations a log transformation of your response might be enough. Regards, Lorenz - Lorenz Gygax, Dr. sc. nat. Centre for proper housing of ruminants and pigs Swiss Federal Veterinary Office agroscope FAT Tänikon, CH-8356 Ettenhausen / Switzerland ______________________________________________ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Mike Saunders Research Assistant Forest Ecosystem Research Program Department of Forest Ecosystem Sciences University of Maine Orono, ME 04469 207-581-2763 (O) 207-581-4257 (F) [[alternative HTML version deleted]]