Jennifer Mollon
2009-Nov-09 09:57 UTC
[R] Incomplete, unbalanced design, and pseudoreplication?
Hello,

I am trying to help someone who has carried out an experiment and I'm finding it quite difficult to understand the appropriate model to use & code it.

The response is a measurement - the amount of DNA extracted during the experiment. There were 2 factors to be tested - one is the condition under which the experiment took place and the other is the type of DNA to be extracted. Each set of factors was replicated, so condition A and DNA type A were tested twice using the same input material. Finally, the whole experiment was repeated twice, but in one of the experiments there was not enough input material and one of the DNA types (call it type D) was not tested at all, but all other levels of that factor and the condition factor were tested.

From this, I think:
1. The replicates within each experiment are pseudoreplicates - there are pairs of measures with the same input material, and both factor levels are the same.
2. The 2 experiments can be treated as blocks, but they are not balanced or complete.

There are 2 questions of interest to the experimenter:
1. Does the amount of DNA extracted differ for the different DNA types under the different conditions?
2. One of the conditions is new, and of particular interest. Under this condition, are there significantly different amounts of DNA extracted depending on DNA type? There are 2 particular contrasts of interest here, call them DNA types B&C vs A, and B&C vs D. DNA type D is only tested in the second experiment.

I would be very grateful for comments about the analysis of this complicated data set. Are my beliefs above correct, regarding the design? If so, which R packages and methods can help me with this analysis? In particular, how should the error term be structured for this design? And finally, are the 2 research questions best answered by 2 separate analyses (e.g. the second one looking at only the one condition in isolation), or can a single analysis of a full model answer both of these questions?

Many thanks for your consideration and time,
Jen
Dieter Menne
2009-Nov-09 13:23 UTC
[R] Incomplete, unbalanced design, and pseudoreplication?
Jennifer Mollon wrote:
>
> The response is a measurement - the amount of DNA extracted during the
> experiment. There were 2 factors to be tested - one is the condition
> under which the experiment took place and the other is the type of DNA
> to be extracted. Each set of factors was replicated, so condition A
> and DNA type A were tested twice using the same input material.
> Finally, the whole experiment was repeated twice, but in one of the
> experiments there was not enough input material and one of the DNA
> types (call it type D) was not tested at all, but all other levels of
> that factor and the condition factor were tested. From this, I think:
> ....
>

This is a classical case for procedure lme in package nlme, which is quite robust for unbalanced designs. Arrange your data in the long form; preferably use characters instead of numbers to describe the levels. Check the book by Pinheiro/Bates for details; or, as a starter, the examples in library/nlme/scripts.

Dieter

Condition  DNA   Run   Amount
CondA      DNAA  Run1  33
CondA      DNAB  Run1  22
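As a rough illustration of the suggestion above, here is a minimal sketch of how such long-form data might be passed to lme. Everything beyond the column names Condition, DNA, Run and Amount is an assumption made for illustration: the amounts are simulated, the second condition level (CondB), the Rep and Sample columns, and the nested random term ~ 1 | Run/Sample are hypothetical choices, not something specified in the thread. The Sample level is there so that the two replicates sharing input material are not treated as independent observations.

```r
library(nlme)

# Hypothetical long-form layout; the amounts are simulated, not real data.
set.seed(1)
d <- expand.grid(Rep = 1:2,
                 DNA = c("DNAA", "DNAB", "DNAC", "DNAD"),
                 Condition = c("CondA", "CondB"),
                 Run = c("Run1", "Run2"),
                 stringsAsFactors = TRUE)

# Assumption: DNA type D is missing from the first run (incomplete block).
d <- subset(d, !(Run == "Run1" & DNA == "DNAD"))

# Assumption: each Condition x DNA cell within a run shares input material,
# so the two replicates in a cell form one "Sample".
d$Sample <- interaction(d$Condition, d$DNA, drop = TRUE)

# Simulate a run effect, a per-sample effect, and replicate noise.
cell     <- interaction(d$Run, d$Sample, drop = TRUE)
run_eff  <- unname(c(Run1 = 0, Run2 = 3)[as.character(d$Run)])
samp_eff <- rnorm(nlevels(cell), sd = 2)[as.integer(cell)]
d$Amount <- 30 + run_eff + samp_eff + rnorm(nrow(d), sd = 1)

# Fixed effects: both factors and their interaction.
# Random effects: samples nested within runs (an assumed structure).
fit <- lme(Amount ~ Condition * DNA, random = ~ 1 | Run/Sample, data = d)

anova(fit)    # tests for Condition, DNA and their interaction
VarCorr(fit)  # estimated variance components
```

With only two runs, the run-level variance is estimated from very little information, so treating Run as a fixed effect instead may be worth comparing.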