Hi All,

Is there an easy way to reduce a data.frame to 1 'id' per row while keeping information from the other rows of that same variable, if applicable? e.g.:

# data

multi[1:15,]
     id         r  n wi   wi.tau         z   k alliance a.rater   eml   treatment outcome  o.rater german
 1  100 0.2800000 44 41 21.72514 0.2876821 210     <NA>    <NA>  <NA>        <NA>    <NA>   Client   <NA>
 2  100 0.2800000 44 41 21.80953 0.2876821 182     <NA>    <NA> Early        <NA>    <NA>     <NA>   <NA>
 3  100 0.2800000 44 41 22.36641 0.2876821 206     <NA>  Client  <NA>        <NA>    <NA>     <NA>   <NA>
 4  100 0.2800000 44 41 23.59224 0.2876821 188     <NA>    <NA>  <NA>        <NA>    <NA>     <NA>  Other
 5  100 0.2800000 44 41 23.83157 0.2876821 147      WAI    <NA>  <NA>        <NA>    <NA>     <NA>   <NA>
 6  101 0.0000000 37 34 19.65678 0.0000000 182     <NA>    <NA> Early        <NA>    <NA>     <NA>   <NA>
 7  101 0.5423790 37 34 17.65078 0.6075200  98     <NA>    <NA>  <NA> Psychodymic    <NA>     <NA>   <NA>
 8  101 0.5423790 37 34 19.58820 0.6075200 210     <NA>    <NA>  <NA>        <NA>    <NA> Observer   <NA>
 9  101 0.5423790 37 34 21.09334 0.6075200 188     <NA>    <NA>  <NA>        <NA>    <NA>     <NA>  Other
10  101 0.9075737 37 34 19.65678 1.5135878 182     <NA>    <NA>  Late        <NA>    <NA>     <NA>   <NA>
11 103a 0.4950000 18 15 10.36364 0.5426615  90     <NA>    <NA>  <NA>        <NA>     SCL     <NA>   <NA>
12 103a 0.6171548 18 15 11.32425 0.7203964 210     <NA>    <NA>  <NA>        <NA>    <NA> Observer   <NA>
13 103a 0.6171548 18 15 11.34714 0.7203964 182     <NA>    <NA> Early        <NA>    <NA>     <NA>   <NA>
14 103a 0.6171548 18 15 11.49606 0.7203964 206     <NA>  Client  <NA>        <NA>    <NA>     <NA>   <NA>
15 103a 0.6171548 18 15 11.81150 0.7203964 188     <NA>    <NA>  <NA>        <NA>    <NA>     <NA>  Other

# with the goal of having a reduced df (1 id per row) like this:

     id         r  n wi   wi.tau         z   k alliance a.rater   eml   treatment outcome  o.rater german
 1  100 0.2800000 44 41 21.72514 0.2876821 210      wai  client early        <NA>    <NA>   Client  other
    101 etc...

Ideally, I would like to reduce by id and r, if the values are the same, and keep any discrepant values as a separate row (if possible), e.g.:

 6  101 0.0000000 37 34 19.65678 0.0000000 182     <NA>    <NA> Early        <NA>    <NA>     <NA>   <NA>
 7  101 0.5423790 37 34 17.65078 0.6075200  98     <NA>    <NA>  Late Psychodymic    <NA> Observer  Other

I appreciate any assistance,

AC

Hi

You can use aggregate or tapply. You did not specify which function to use for the "reduction", so I assume mean:

    aggregate(multi[, c("n", "wi", "wi.tau", "z", "k")], multi[, c("id", "r")], mean, na.rm = TRUE)

This does not handle the character columns, though. For those you could try ?ave, or the split/sapply approach.

There could be another issue with the r values, which seem to be fractional numerics: depending on how they were created, values that print the same may not be exactly equal.

Regards
Petr

r-help-bounces at r-project.org wrote on 25.02.2010 06:44:03:

> Hi All,
>
> Is there an easy way to reduce a data.frame to 1 'id' per row while keeping
> information from the other rows of that same variable, if applicable?
>
> Ideally, I would like to reduce by id and r, if the values are the same, and
> keep any discrepant values as a separate row (if possible)
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

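The split/sapply route and the floating-point caveat mentioned above could look roughly like the sketch below. It is only an illustration under stated assumptions: the toy data frame copies a few values from the post, the helper first_non_na is a hypothetical name, the rounding to 6 decimals is an arbitrary tolerance, and "keep the first non-missing value per column" is one guess at the intended reduction rule, not something the original poster confirmed.

```r
# Toy data: a handful of rows and columns copied from the posted 'multi'.
multi <- data.frame(
  id      = c("100", "100", "100", "101", "101", "101"),
  r       = c(0.28, 0.28, 0.28, 0.0, 0.542379, 0.542379),
  n       = c(44, 44, 44, 37, 37, 37),
  a.rater = c(NA, NA, "Client", NA, NA, NA),
  eml     = c(NA, "Early", NA, "Early", NA, NA),
  o.rater = c("Client", NA, NA, NA, NA, "Observer"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: first non-missing value of a vector.
# Indexing an empty vector with [1] yields an NA of the same type.
first_non_na <- function(x) x[!is.na(x)][1]

# Assumption: r values that agree to 6 decimals belong to the same group.
multi$r <- round(multi$r, 6)

# One piece per observed (id, r) pair, each collapsed to a single row.
pieces  <- split(multi, list(multi$id, multi$r), drop = TRUE)
reduced <- do.call(rbind, lapply(pieces, function(d) {
  as.data.frame(lapply(d, first_non_na), stringsAsFactors = FALSE)
}))

# Restore a readable ordering and plain row names.
reduced <- reduced[order(reduced$id, reduced$r), ]
rownames(reduced) <- NULL
print(reduced)
```

Rows that share an id but differ in r stay separate, which matches the "discrepant values as a separate row" request. For numeric columns such as wi.tau this rule keeps the first value rather than averaging, so swap in mean for those columns if that is what is wanted.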
Perhaps the reshape package?

It's just about impossible to read your data layout. Could you resubmit the example using dput()?

Thanks

--- On Thu, 2/25/10, AC Del Re <delre at wisc.edu> wrote:

> From: AC Del Re <delre at wisc.edu>
> Subject: [R] reducing data.frame
> To: r-help at r-project.org
> Received: Thursday, February 25, 2010, 12:44 AM
>
> Hi All,
>
> Is there an easy way to reduce a data.frame to 1 'id' per row while keeping
> information from the other rows of that same variable, if applicable?
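For anyone unsure what resubmitting with dput() involves, here is a minimal sketch. The small data frame is made up for illustration (only three columns, with values borrowed from the post); with the real data one would presumably call dput(multi[1:15, ]) and paste the printed text into the message.

```r
# Made-up miniature of the posted data, for illustration only.
small <- data.frame(
  id  = c("100", "100", "101"),
  r   = c(0.28, 0.28, 0),
  eml = c(NA, "Early", "Early"),
  stringsAsFactors = FALSE
)

# dput() writes an R expression that recreates the object; this text is
# what gets pasted into the email instead of the printed table.
dput(small)

# A reader can rebuild the object by evaluating that text.
txt     <- paste(capture.output(dput(small)), collapse = "\n")
rebuilt <- eval(parse(text = txt))
print(all.equal(small, rebuilt))
```

Unlike a printed table, the dput() text survives line wrapping in mail clients and preserves column types and NA values.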