Dear R-helpers,

I have a dataset named "qu", organized as follows:

Sample  Run  Replicate  Value
1       1    1          25
1       1    2          40
1       1    3          33
1       1    4          29
1       2    1          37
1       2    2          44
1       2    3          45
1       3    1          25
1       3    2          40
1       4    1          33
1       4    2          29
1       4    3          25
2       ...

Basically, a sample was run on an assay multiple times within a single day. Each of these results is a "Replicate". The run was then repeated several times on consecutive days (variable "Run"). There are 210 such samples.

I need to calculate the CV for each sample:
- within run (between replicates): that's easy to do in Excel
- between runs: that's the problem.

I was thinking of using either 'aov' or 'lme' to solve this. However, I don't know how to interpret the output. For example, the summary output from 'aov(Value~Run+Replicate, subset(qu,Sample==79))' for one sample was:

            Df  Sum Sq  Mean Sq  F value  Pr(>F)
Run          1   4.000    4.000   0.3214  0.6104
Replicate    1  73.500   73.500   5.9062  0.0933 .
Residuals    3  37.333   12.444
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Do you guys think this is the correct approach?
How do I extract these numbers (sums of squares) to store in a separate data frame for further calculations?

And how should I interpret the "Residuals" row in this setting?

I will appreciate your comments.

--
Michal J. Figurski
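For the within-run part, a minimal sketch in R rather than Excel might look like the following. It uses only the Sample 1 rows shown in the excerpt above; the helper name cv_pct and the output column name CV_within_pct are made up for illustration, and the CV is taken here as 100 * sd / mean, which is an assumption about the definition intended.

```r
# Rows for Sample 1, copied from the excerpt in the post above
qu1 <- data.frame(
  Sample    = 1L,
  Run       = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4),
  Replicate = c(1, 2, 3, 4, 1, 2, 3, 1, 2, 1, 2, 3),
  Value     = c(25, 40, 33, 29, 37, 44, 45, 25, 40, 33, 29, 25)
)

# Hypothetical helper: coefficient of variation in percent (assumed definition)
cv_pct <- function(x) 100 * sd(x) / mean(x)

# One CV per Sample x Run combination, i.e. between replicates within a run
within_cv <- aggregate(Value ~ Sample + Run, data = qu1, FUN = cv_pct)
names(within_cv)[names(within_cv) == "Value"] <- "CV_within_pct"

print(within_cv)
```

The same call should work on the full "qu" data frame, since aggregate() groups by every Sample and Run combination it finds.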
Michal Figurski
2008-Oct-07 14:25 UTC
[R] How to store the results of multiple iterations of 'aov' in a data.frame?
A follow-up to "Need to calculate within- and between-run CV"

I have a dataset of 210 samples, each of which was run several times on several consecutive days. The data for one sample are below:

    qu.s <- structure(list(
        Sample = c(44L, 44L, 44L, 44L, 44L, 44L, 44L, 44L, 44L, 44L),
        Run    = c(1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L),
        Rep    = c(1, 2, 1, 2, 3, 4, 1, 2, 3, 4),
        value  = c(120L, 107L, 117L, 124L, 118L, 127L, 110L, 113L, 109L, 113L)),
        .Names = c("Sample", "Run", "Rep", "value"),
        row.names = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L),
        class = "data.frame")

and the code I used is below:

    a=aov(value ~ Run, data=qu.s)

I assume that 'Residuals' is the variance between replicates in my data.

Since I have to run the above code 210 times, my question is: how do I store the results of 'aov' in a separate data.frame?

For each iteration of 'aov' I need to store: the _sums of squares_ for Run and Residuals, the _N_ of replicates and the _mean_. Later I want to use these to calculate the coefficients of variation (CV) for each sample.

I looked at the structure of the aov object, but the sums of squares are not listed there, though 'summary(a)' prints them.

Please help.

--
Michal Figurski
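One possible way to collect these numbers, offered as a sketch rather than a definitive answer: summary() of an aov fit returns a list whose first element is the ANOVA table as a data frame, so the sums of squares can be read from its "Sum Sq" column. The helper name aov_row and the output column names are hypothetical. Two assumptions are made: Run is wrapped in factor(), because an integer Run is otherwise fitted as a single numeric slope with 1 Df, and "N" is taken to mean the total number of observations per sample.

```r
# Data for Sample 44, as posted in the follow-up above
qu.s <- data.frame(
  Sample = 44L,
  Run    = c(1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L),
  Rep    = c(1, 2, 1, 2, 3, 4, 1, 2, 3, 4),
  value  = c(120L, 107L, 117L, 124L, 118L, 127L, 110L, 113L, 109L, 113L)
)

# Hypothetical helper: one row of one-way ANOVA results for one sample
aov_row <- function(d) {
  fit <- aov(value ~ factor(Run), data = d)  # Run as a factor (assumption)
  tab <- summary(fit)[[1]]                   # ANOVA table as a data frame
  data.frame(
    Sample   = d$Sample[1],
    SS_run   = tab[1, "Sum Sq"],   # row 1: the Run term
    SS_resid = tab[2, "Sum Sq"],   # row 2: Residuals
    Df_run   = tab[1, "Df"],
    Df_resid = tab[2, "Df"],
    N        = nrow(d),            # total observations (assumed meaning of N)
    Mean     = mean(d$value)
  )
}

# Fit each sample separately and stack the rows into one data frame
res <- do.call(rbind, lapply(split(qu.s, qu.s$Sample), aov_row))
print(res)
```

With the full dataset in place of qu.s, split() would yield one piece per Sample and res would have one row per sample. Rows are indexed by position because the row names of the summary table are padded with spaces.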