-- Original message --
From: Dévaványai Agamemnón <devavanyai@citromail.hu>
To: r-hel@r-project.org, r-hel@r-project.org
Sent: 29 July 2010, 16:29
Subject: duplicates

Sorry! I'll try it again.

Dear R Users!

I have a data frame with duplicate cases: var1 is duplicated, and the duplicates differ in var2.

  var1 var2 var3 var4 var5
     1    4  500    1    2
     1    3  200    2    5
     1    8  125    1    9
     2    2  120    2   52
     2    6   22    1   20
     2    9  400    1   22
     3    1  100    2    8
     3    2  200    5   40
     4    8   20    1   60

I want to delete the duplicated var1 rows that rank low on var2 and, for each var1, keep the case that ranks highest on var2. I would like to keep the whole row (with the other variables):

  var1 var2 var3 var4 var5
     1    8  125    1    9
     2    9  400    1   22
     3    2  200   50   40
     4    8  200    1   60

Thanks
Ag

Does this work? (Untested)

    library(plyr)
    # for each value of var1, keep the row with the largest var2
    ddply(your_dataframe, "var1", function(x) {
      x[which.max(x$var2), ]
    })

----------------------------------------------------------------------------
ir. Thierry Onkelinx
Research Institute for Nature and Forest (INBO)
team Biometrics & Quality Assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be
www.inbo.be

Hi,

A rather complicated one-liner, assuming your data frame is named test:

    do.call(rbind, lapply(split(test, test$var1), function(x) x[which.max(x[, "var2"]), ]))

Here it is in 3 lines:

    test.s <- split(test, test$var1)  # split the data frame by var1
    result <- lapply(test.s, function(x) x[which.max(x[, "var2"]), ])  # in each piece, select the row with the maximum var2
    do.call(rbind, result)  # put everything into one data frame again

There could be issues if you have NA values in var1 or var2.

Regards
Petr

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
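
On the NA caveat in the split/lapply answer above, a minimal sketch of what can go wrong. The data frame here is hypothetical and not from the thread; it only illustrates that split() drops rows whose grouping value is NA, and that which.max() returns nothing for a group whose var2 is entirely NA, so that group silently disappears from the result.

```r
# Hypothetical data (not from the thread): NA in both var1 and var2
test <- data.frame(
  var1 = c(1, 1, NA, 2, 2),
  var2 = c(4, NA, 7, NA, NA),
  var3 = c(500, 200, 125, 120, 22)
)

test.s <- split(test, test$var1)   # the row with var1 = NA is in no piece
result <- lapply(test.s, function(x) x[which.max(x[, "var2"]), ])
out <- do.call(rbind, result)

names(test.s)         # groups that survived the split
sapply(result, nrow)  # rows selected per group; 0 where var2 is all NA
out                   # combined result
```

If incomplete rows should simply be ignored, one option is to filter first with something like test[complete.cases(test[, c("var1", "var2")]), ], but whether that is appropriate depends on what the NA values mean in your data.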
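
Hi,

Please try ?rle

    # x is your data frame, with var1 in column 1 and var2 in column 2
    t.x <- x[order(x[, 1], x[, 2]), ]      # sort by var1, then by var2
    t.x[cumsum(rle(t.x[, 1])$lengths), ]   # keep the last row of each run of var1

-----
A R learner.
--
View this message in context: http://r.789695.n4.nabble.com/Fwd-duplicates-tp2306555p2306617.html
Sent from the R help mailing list archive at Nabble.com.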
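
The rle() suggestion above works because, once the rows are sorted by var1 and then var2, the row with the largest var2 is the last one in each run of equal var1 values, and the cumulative sum of the run lengths gives the positions of those last rows. A sketch on the input table from the original question follows; the names test, sorted, by_rle and by_dup are chosen here for illustration, and the duplicated() variant is an alternative not mentioned in the thread.

```r
# Input table from the original question
test <- read.table(header = TRUE, text = "
var1 var2 var3 var4 var5
1 4 500 1 2
1 3 200 2 5
1 8 125 1 9
2 2 120 2 52
2 6 22 1 20
2 9 400 1 22
3 1 100 2 8
3 2 200 5 40
4 8 20 1 60
")

# Sort by var1, then var2, so the largest var2 comes last within each var1
sorted <- test[order(test$var1, test$var2), ]

# rle() returns the length of each run of equal var1 values;
# cumsum() of those lengths is the index of the last row of each run
last_in_group <- cumsum(rle(sorted$var1)$lengths)
by_rle <- sorted[last_in_group, ]

# Alternative: keep the rows whose var1 does not occur again further down
by_dup <- sorted[!duplicated(sorted$var1, fromLast = TRUE), ]

by_rle
```

Both variants keep one full row per var1, taken from the input table as posted.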