Dear all,

I have a dataset, and I wanted to merge the rows with duplicated IDs by calculating the means or medians from the duplicate rows. I tried using the command duplicated(x), but it only tells where the duplicated rows are.

Any suggestions will be appreciated.

--
View this message in context: http://www.nabble.com/Merge-partially-duplicated-rows-tp24790752p24790752.html
Sent from the R help mailing list archive at Nabble.com.
On Aug 3, 2009, at 9:24 AM, Rnewbie wrote:

> Dear all,
>
> I have a dataset, and I wanted to merge the rows with duplicated IDs by
> calculating the means or medians from the duplicate rows. I tried using the
> command duplicated(x), but it only tells where the duplicated rows are.

You might want to look at the ave function. It will apply a function within each ID group, and you can assign the result as another row in the data frame before you exclude the duplicates.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
On Aug 3, 2009, at 7:12 PM, David Winsemius wrote:

> On Aug 3, 2009, at 9:24 AM, Rnewbie wrote:
>
>> Dear all,
>>
>> I have a dataset, and I wanted to merge the rows with duplicated IDs by
>> calculating the means or medians from the duplicate rows. I tried using the
>> command duplicated(x), but it only tells where the duplicated rows are.
>
> You might want to look at the ave function. It will calculate a
> function within IDs and you can assign that as another row in the
> datafrme before you exclude the duplicates.

Err... I meant to say another column, not another row.

> tst <- data.frame(ID = sample(c("1234", "4567", "2346"), 10, replace=TRUE), val=rnorm(10))
> tst
     ID         val
1  2346  0.22659389
2  2346  0.46835154
3  2346 -0.53702251
4  2346 -1.00187606
5  1234  0.90843566
6  2346 -0.59654370
7  4567 -0.04355647
8  1234  0.65332120
9  4567 -2.22517105
10 1234 -0.26911187
> tst$IDmn <- ave(tst$val, tst$ID)  # default function for ave is mean, but others can be used
> tst
     ID         val       IDmn
1  2346  0.22659389 -0.2880994
2  2346  0.46835154 -0.2880994
3  2346 -0.53702251 -0.2880994
4  2346 -1.00187606 -0.2880994
5  1234  0.90843566  0.4308817
6  2346 -0.59654370 -0.2880994
7  4567 -0.04355647 -1.1343638
8  1234  0.65332120  0.4308817
9  4567 -2.22517105 -1.1343638
10 1234 -0.26911187  0.4308817

> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
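For completeness, here is a minimal sketch of the remaining step described above: dropping the duplicate rows once the per-ID summary is in place. The data frame and the names merged and agg are made up for illustration and are not from the original post. The aggregate() call at the end is an alternative not mentioned in the thread, shown only for comparison.

```r
# Hypothetical example data: three IDs, two of them repeated.
tst <- data.frame(
  ID  = c("a", "a", "b", "c", "c", "c"),
  val = c(1, 3, 10, 2, 4, 9),
  stringsAsFactors = FALSE
)

# Per-ID mean and median, repeated on every row of the group.
# FUN must be passed by name: ave() treats any unnamed arguments
# after the first as grouping variables.
tst$IDmn  <- ave(tst$val, tst$ID)                 # default FUN is mean
tst$IDmed <- ave(tst$val, tst$ID, FUN = median)

# Keep only the first row for each ID, and only the summary columns.
merged <- tst[!duplicated(tst$ID), c("ID", "IDmn", "IDmed")]

# Alternative in one step: aggregate() returns one row per ID directly.
agg <- aggregate(val ~ ID, data = tst, FUN = mean)
```

The ave() route keeps every other column of the first row for each ID, which may matter if the rows are only partially duplicated. aggregate() is shorter when only the summarised column is needed.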
Thanks very much.