Hi all,

I am having trouble wrapping my head around a probably simple issue:

After using the reshape package, I have a melted data frame with the columns group (factor), time (int), condition (factor), and value (int).

These are experimental data, obtained from different treatment groups (group) under different conditions at different time points.

I would now like to run an ANOVA, draw boxplots, and calculate means to compare groups for every combination of condition and time point, with something like

    fit <- lm(value ~ group, data = [subset of data with combination of condition/timepoint])
    summary(fit)
    p <- ggplot([subset of data with combination of condition/timepoint], aes(x = group, y = value)) + geom_boxplot()
    print(p)
    tapply([subset of data with combination of condition/timepoint]$value, [subset of data with combination of condition/timepoint]$group, mean)

How can I loop through these combinations and output the data in an elegant way?

Thanks so much!

Best,

Kai
Bert Gunter
2016-Aug-31 22:28 UTC
[R] Looping through different groups of variables in models
Kai:

1. I think that this is a very bad idea, statistically, if I understand you correctly. Generally, your model should incorporate all groups, time points, and conditions together, not individually.

2. But plotting results in "small multiples" (aka "trellis plots") may be useful. This is done in ggplot through "faceting", which you could read up on and try. (I use lattice, not ggplot, to do this sort of thing, so I can't help with code.)

3. However, I think your question is mostly statistical in nature (define "elegant"), and if so, it is off topic here. You might therefore try stats.stackexchange.com instead to get ideas on how to approach your data, solicit other opinions on whether what you want to do makes sense (and if not, what else), etc. Or, perhaps better yet, consult a local statistical resource.

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Wed, Aug 31, 2016 at 2:58 PM, Kai Mx <govokai at gmail.com> wrote:
> [...]
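As an illustration of points 1 and 2 above (this is not code from the original message), here is a minimal sketch of a single model that includes all three factors, followed by a faceted boxplot. The data are simulated, the level names are made up, and treating time as categorical is an assumption; whether this model suits the real experiment is exactly the statistical question raised in point 3.

```r
# Simulated data with the same columns as the melted data frame
# described in the question (values and level names are invented).
set.seed(1)
dat <- expand.grid(rep = 1:5,
                   group = c("cont", "exp"),
                   time = 1:5,
                   condition = c("cold", "hot"))
dat$group <- factor(dat$group)
dat$condition <- factor(dat$condition)
dat$value <- sample(100:200, nrow(dat), replace = TRUE)

# Point 1: one model with all three factors and their interactions,
# instead of a separate model per time/condition subset.
# Time is treated as categorical here, which is an assumption.
fit_all <- lm(value ~ group * condition * factor(time), data = dat)
print(anova(fit_all))

# Point 2: small multiples, one boxplot panel per condition/time cell.
# This part only runs if the ggplot2 package is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(dat, ggplot2::aes(x = group, y = value)) +
    ggplot2::geom_boxplot() +
    ggplot2::facet_grid(condition ~ time)
  print(p)
}
```

With facet_grid(condition ~ time), conditions form the rows and time points the columns, so group differences can be compared across all cells in one figure.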
Jim Lemon
2016-Aug-31 22:39 UTC
[R] Looping through different groups of variables in models
Hi Kai,

Perhaps something like this:

    # simulated example data; factor() is explicit because data.frame()
    # no longer converts strings to factors by default (R >= 4.0.0)
    kmdf <- data.frame(group = factor(rep(c("exp", "cont"), each = 50)),
                       time = factor(rep(1:5, 20)),
                       condition = factor(rep(rep(c("hot", "cold"), each = 25), 2)),
                       value = sample(100:200, 100))
    for (timeindx in levels(kmdf$time)) {
      for (condindx in levels(kmdf$condition)) {
        cat("Time", timeindx, "Condition", condindx, "\n")
        subdat <- kmdf[kmdf$time == timeindx & kmdf$condition == condindx, ]
        fit <- lm(value ~ group, subdat)
        print(summary(fit))
        # with a factor on the x axis, plot() draws boxplots
        plot(subdat$group, subdat$value)
        # results are not auto-printed inside a loop, so print explicitly
        print(by(subdat$value, subdat$group, mean))
      }
    }

Getting elegant output is another matter. Have a look at packages meant to produce fancier R output.

Jim

On Thu, Sep 1, 2016 at 7:58 AM, Kai Mx <govokai at gmail.com> wrote:
> [...]
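On the question of tidier output, one possible approach (a sketch, not something proposed in the thread) is to collect one row of results per time/condition combination in a data frame instead of printing inside the loop. It uses the same simulated data as Jim's example and base R only; summarise_cell is a hypothetical helper name, and the column names are chosen for illustration.

```r
# Same simulated data layout as in the loop example, with explicit factors.
set.seed(42)
kmdf <- data.frame(group = factor(rep(c("exp", "cont"), each = 50)),
                   time = factor(rep(1:5, 20)),
                   condition = factor(rep(rep(c("hot", "cold"), each = 25), 2)),
                   value = sample(100:200, 100))

# Split into one data frame per time/condition combination.
cells <- split(kmdf, list(kmdf$time, kmdf$condition), drop = TRUE)

# Hypothetical helper: fit the per-cell model and return one summary row.
summarise_cell <- function(d) {
  fit <- lm(value ~ group, data = d)
  means <- tapply(d$value, d$group, mean)
  data.frame(time = d$time[1],
             condition = d$condition[1],
             mean_cont = means[["cont"]],
             mean_exp = means[["exp"]],
             p_value = anova(fit)[["Pr(>F)"]][1])
}

# Stack the per-cell rows into a single results table.
results <- do.call(rbind, lapply(cells, summarise_cell))
rownames(results) <- NULL
print(results)
```

The resulting table can be written out with write.csv() or passed to a table-formatting package. Note that it reports many separate tests, so the concern about fitting subsets individually still applies.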