Hi All,
Is there an easy way to reduce a data.frame to 1 'id' per row while keeping
information from the other rows of that same variable, if applicable? e.g.:
# data
multi[1:15,]
     id         r  n wi   wi.tau         z   k alliance a.rater   eml   treatment outcome  o.rater german
 1  100 0.2800000 44 41 21.72514 0.2876821 210     <NA>    <NA>  <NA>        <NA>    <NA>   Client   <NA>
 2  100 0.2800000 44 41 21.80953 0.2876821 182     <NA>    <NA> Early        <NA>    <NA>     <NA>   <NA>
 3  100 0.2800000 44 41 22.36641 0.2876821 206     <NA>  Client  <NA>        <NA>    <NA>     <NA>   <NA>
 4  100 0.2800000 44 41 23.59224 0.2876821 188     <NA>    <NA>  <NA>        <NA>    <NA>     <NA>  Other
 5  100 0.2800000 44 41 23.83157 0.2876821 147      WAI    <NA>  <NA>        <NA>    <NA>     <NA>   <NA>
 6  101 0.0000000 37 34 19.65678 0.0000000 182     <NA>    <NA> Early        <NA>    <NA>     <NA>   <NA>
 7  101 0.5423790 37 34 17.65078 0.6075200  98     <NA>    <NA>  <NA> Psychodymic    <NA>     <NA>   <NA>
 8  101 0.5423790 37 34 19.58820 0.6075200 210     <NA>    <NA>  <NA>        <NA>    <NA> Observer   <NA>
 9  101 0.5423790 37 34 21.09334 0.6075200 188     <NA>    <NA>  <NA>        <NA>    <NA>     <NA>  Other
10  101 0.9075737 37 34 19.65678 1.5135878 182     <NA>    <NA>  Late        <NA>    <NA>     <NA>   <NA>
11 103a 0.4950000 18 15 10.36364 0.5426615  90     <NA>    <NA>  <NA>        <NA>     SCL     <NA>   <NA>
12 103a 0.6171548 18 15 11.32425 0.7203964 210     <NA>    <NA>  <NA>        <NA>    <NA> Observer   <NA>
13 103a 0.6171548 18 15 11.34714 0.7203964 182     <NA>    <NA> Early        <NA>    <NA>     <NA>   <NA>
14 103a 0.6171548 18 15 11.49606 0.7203964 206     <NA>  Client  <NA>        <NA>    <NA>     <NA>   <NA>
15 103a 0.6171548 18 15 11.81150 0.7203964 188     <NA>    <NA>  <NA>        <NA>    <NA>     <NA>  Other
# with the goal of having a reduced df (1 id per row) like this:
     id         r  n wi   wi.tau         z   k alliance a.rater   eml   treatment outcome  o.rater german
 1  100 0.2800000 44 41 21.72514 0.2876821 210      wai  client early        <NA>    <NA>   Client  other
101 etc...
Ideally, I would like to reduce by id and r, if the values are the same and
keep any discrepant values as a separate row (if possible), e.g.:
 6  101 0.0000000 37 34 19.65678 0.0000000 182     <NA>    <NA> Early        <NA>    <NA>     <NA>   <NA>
 7  101 0.5423790 37 34 17.65078 0.6075200  98     <NA>    <NA>  Late Psychodymic    <NA> Observer  Other
I appreciate any assistance,
AC
Hi
You can use aggregate or tapply. You did not specify which function to use
for the "reduction", so I assume mean.
# numeric columns to average; adjust the selection as needed
aggregate(multi[, c("n", "wi", "wi.tau", "z", "k")],
          by = multi[, c("id", "r")],
          FUN = mean, na.rm = TRUE)
This does not handle character columns, though. For those you could try
?ave or a split/sapply approach.
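A minimal sketch of how the numeric and character parts could be combined, assuming "reduction" means the mean for numeric columns and the first non-missing value for character columns. The toy data frame and the helper first_non_na are made up for illustration; only the column names and values are taken from the original post.

```r
# Toy subset of the posted data (values copied from rows 1-3 and 6-7).
multi <- data.frame(
  id      = c("100", "100", "100", "101", "101"),
  r       = c(0.28, 0.28, 0.28, 0, 0.542379),
  wi.tau  = c(21.72514, 21.80953, 22.36641, 19.65678, 17.65078),
  a.rater = c(NA, NA, "Client", NA, NA),
  eml     = c(NA, "Early", NA, "Early", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: first non-missing value in a group, or NA if none.
first_non_na <- function(x) {
  x <- as.character(x[!is.na(x)])
  if (length(x) > 0) x[1] else NA_character_
}

keys <- multi[, c("id", "r")]

# Numeric columns: mean within each id/r combination.
num <- aggregate(multi[, "wi.tau", drop = FALSE], by = keys,
                 FUN = mean, na.rm = TRUE)

# Character columns: first non-NA value within each id/r combination.
chr <- aggregate(multi[, c("a.rater", "eml")], by = keys,
                 FUN = first_non_na)

# One row per id/r combination; discrepant r values stay on separate rows.
reduced <- merge(num, chr, by = c("id", "r"))
print(reduced)
```

If a group holds several different non-missing values in one character column, this sketch silently keeps the first; pasting the unique values together would be one alternative.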
There could be another issue with the r values: they appear to be fractional
numerics, and depending on how they were created, values that print the same
may not be exactly equal.
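To illustrate the floating-point concern with a small sketch: two values that print the same can differ in their last bits and would then land in different groups. Rounding r before grouping is one possible workaround; the choice of 7 digits is an assumption based on the precision printed in the post.

```r
# Two ways of arriving at "0.3" that are not bit-for-bit identical.
a <- 0.1 + 0.2
b <- 0.3
print(a == b)                    # exact comparison: FALSE
print(isTRUE(all.equal(a, b)))   # comparison with a tolerance: TRUE

# Grouping on the raw values sees two distinct keys ...
r <- c(a, b)
print(length(unique(r)))            # 2
# ... while rounding first puts both values in the same group.
print(length(unique(round(r, 7))))  # 1
```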
Regards
Petr
r-help-bounces at r-project.org wrote on 25.02.2010 06:44:03:
> Hi All,
>
> Is there an easy way to reduce a data.frame to 1 'id' per row while keeping
> information from the other rows of that same variable, if applicable?
Perhaps the reshape package? It's just about impossible to read your data layout. Could you resubmit the example using dput()? Thanks
--- On Thu, 2/25/10, AC Del Re <delre at wisc.edu> wrote:
> From: AC Del Re <delre at wisc.edu>
> Subject: [R] reducing data.frame
> To: r-help at r-project.org
> Received: Thursday, February 25, 2010, 12:44 AM
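On the dput() suggestion above, a short sketch of what resubmitting could look like. The two-row data frame is a made-up stand-in for the poster's multi, using values from the first two rows of the post.

```r
# Toy data frame standing in for `multi` (values from rows 1-2 of the post).
multi <- data.frame(
  id  = c("100", "100"),
  r   = c(0.28, 0.28),
  eml = c(NA, "Early"),
  stringsAsFactors = FALSE
)

# dput() prints an R expression that recreates the object;
# paste that text into the email instead of the printed table.
dput(multi)

# Round trip: capture the text, evaluate it, and compare with the original.
txt  <- paste(capture.output(dput(multi)), collapse = "\n")
copy <- eval(parse(text = txt))
print(all.equal(multi, copy))
```

For a large data frame, something like dput(head(multi, 15)) keeps the posted text short while still preserving column types.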