Hi all,

I apologize for this probably stupid question, but I really can't figure it out.
I have a dataframe like this:

group <- c(rep('A', 8), rep('B', 15), rep('C', 6))
time <- c(rep(seq(1:4), 2), rep(seq(1:5), 3), rep(seq(1:3), 2))
value <- runif (29, 1, 10)
dfx <- data.frame (group, time, value)

I want to calculate mean and standard deviation for all values that belong to the same group and the same time and end up with a dataframe with the columns time, group, mean and sd that contains the calculated values for every group at every time point only once (12).

What is the most elegant way to do this? Oh, and I would like to avoid renaming columns (like the _X1/_X2 created by casting with multiple functions), if possible.

I am sure that this is pretty basic, but I have already wasted a ridiculous amount of time on this.

Thanks,

Kai
Hi

See ?aggregate:

aggregate(dfx$value, list(group=dfx$group, time=dfx$time), function(x) c(mean(x), sd(x)))

The plyr package might also help you.

Regards
Petr

> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
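A note on the aggregate() call suggested above: because the summary function returns two values, aggregate() stores them in a single matrix column named x, not in separate mean and sd columns. The following is a minimal base R sketch of one way to get the four columns asked for. The set.seed() call and the names agg and res are assumptions added for illustration only; they are not part of the original posts.

```r
# Rebuild the example data from the original post.
set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible
group <- c(rep("A", 8), rep("B", 15), rep("C", 6))
time  <- c(rep(1:4, 2), rep(1:5, 3), rep(1:3, 2))
value <- runif(29, 1, 10)
dfx   <- data.frame(group, time, value)

# Name the two statistics inside the summary function so that
# aggregate() labels the columns of the matrix it returns.
agg <- aggregate(value ~ group + time, data = dfx,
                 FUN = function(x) c(mean = mean(x), sd = sd(x)))

# agg$value is a two-column matrix; splice it in as ordinary columns.
res <- data.frame(agg[c("time", "group")], agg$value)

# Sort by group, then time, and reset the row names.
res <- res[order(res$group, res$time), ]
rownames(res) <- NULL

print(res)
```

The result has one row per group/time combination and the columns time, group, mean and sd, with no renaming step afterwards.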
You can use data.table:

> group <- c(rep('A', 8), rep('B', 15), rep('C', 6))
> time <- c(rep(seq(1:4), 2), rep(seq(1:5), 3), rep(seq(1:3), 2))
> value <- runif (29, 1, 10)
> dfx <- data.frame (group, time, value)
> require(data.table)
> dfx <- data.table(dfx)
> dfx[,
+     list(mean = mean(value), sd = sd(value))
+     , by = list(group, time)
+     ]
      group time     mean        sd
 [1,]     A    1 7.902432 0.8484807
 [2,]     A    2 5.583566 1.1996167
 [3,]     A    3 3.412691 1.1138794
 [4,]     A    4 7.786522 2.2367483
 [5,]     B    1 6.669257 2.1476769
 [6,]     B    2 2.902291 1.6630821
 [7,]     B    3 6.913593 0.9110182
 [8,]     B    4 4.713124 0.9521689
 [9,]     B    5 7.285824 1.5884689
[10,]     C    1 3.799665 3.7728015
[11,]     C    2 9.218785 0.9415034
[12,]     C    3 5.098077 3.5256497

On Wed, Aug 31, 2011 at 4:19 AM, Kai Megerle <govokai at gmail.com> wrote:
> What is the most elegant way to do this?
--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?