I have a data frame and I would like to reshape it to wide format while at the same time applying a different aggregate function to each column, and for some columns several aggregate functions:

test1 = data.frame(
  id = c(rep('101',8),rep('102',8)),
  phase = rep(c('D','D','L','L'),4),
  day = rep(c('1','1','1','1','2','2','2','2'),2),
  col1 = c(rep(1,8),rep(2,8)),
  col2 = c(runif(8,min=0,max=1),runif(8,min=0,max=10))
)

In this example, I would like to end up with 2 rows (one for each of the 2 ids) and separate columns for each phase-day combination. Values of col1 should just be summed, and for col2 there should be one column with the mean AND one with the standard deviation for each phase-day combination.

The real data have many more columns, so I guess I will need to provide a list of functions?

Thank you in advance!
This sounds like a job for the melt/cast scheme in the reshape package. There is a tutorial for using it here:
http://www.r-statistics.com/2012/01/aggregation-and-restructuring-data-from-r-in-action/

Good luck,
Tal

On Wed, Jan 18, 2012 at 11:21 AM, dimitris fekas <feki_o@yahoo.co.uk> wrote:
> I have a data frame and I would like to reshape it to wide format while at
> the same time applying different aggregate functions to each column AND at
> times multiple aggregate functions:
> [...]
Hello, and thanks for your answer.

I have already melted the data and can reshape it with cast, but I still cannot find a way to provide a list of aggregate functions per column, or to apply multiple aggregate functions to some of them (i.e. mean AND standard deviation).
dimitris fekas wrote on 01/18/2012 03:21:44 AM:
> [...]
> In this example, I would like to end up with 2 rows (for the 2 ids)
> and different columns for phase-day. Values of col1
> should just be summed and for col 2 there should be a column with the
> mean AND one with standard deviation for each phase-day combination.
> [...]

It's not clear to me what you want to end up with. You seem to want four separate columns for each phase-day combination. But then you describe summary statistics for both col1 and col2. Can you provide an example, test2, to make it clear?

Jean
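
One possible reading of the request is one row per id and one column per (column, statistic, phase, day) combination. The sketch below does this in base R with aggregate() and reshape(), not with the melt/cast approach suggested earlier in the thread. The `funs` list, the fixed seed, and the `column_stat_phase_day` naming scheme are all assumptions made for illustration; nothing in the thread confirms this is the intended layout.

```r
set.seed(1)  # assumption: fixed seed so the random col2 values are reproducible

test1 <- data.frame(
  id    = c(rep('101', 8), rep('102', 8)),
  phase = rep(c('D', 'D', 'L', 'L'), 4),
  day   = rep(c('1', '1', '1', '1', '2', '2', '2', '2'), 2),
  col1  = c(rep(1, 8), rep(2, 8)),
  col2  = c(runif(8, min = 0, max = 1), runif(8, min = 0, max = 10)),
  stringsAsFactors = FALSE
)

# Hypothetical specification: a named list of functions for each value column
funs <- list(
  col1 = list(sum = sum),
  col2 = list(mean = mean, sd = sd)
)

# Build a long summary: one row per id x phase x day x column x statistic
pieces <- list()
for (v in names(funs)) {
  for (f in names(funs[[v]])) {
    a <- aggregate(test1[[v]],
                   by  = test1[c("id", "phase", "day")],
                   FUN = funs[[v]][[f]])
    pieces[[length(pieces) + 1]] <- data.frame(
      id    = a$id,
      key   = paste(v, f, a$phase, a$day, sep = "_"),  # e.g. col2_mean_D_1
      value = a$x,
      stringsAsFactors = FALSE
    )
  }
}
long <- do.call(rbind, pieces)

# Spread the keys into columns: one row per id
wide <- reshape(long, idvar = "id", timevar = "key", direction = "wide")
names(wide) <- sub("^value\\.", "", names(wide))  # drop the "value." prefix
rownames(wide) <- NULL
```

With more value columns, only `funs` needs to grow; the loop and the reshape step stay the same. If a different layout is wanted (for example, statistics as rows), the `key` construction is the line to change.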