thr3ads.net - R help - [R] removing duplicate rows [May 2010]

Jim Bouldin

2010-May-12 00:07 UTC

[R] removing duplicate rows

I'm trying to identify and remove rows in a data frame that are duplicated
only on particular columns within it (i.e. not on all columns).  The
"unique" function looks for uniqueness across all columns of a data frame.
Identifying unique rows based only on specific columns of interest returns
only those columns, not all of the columns in the original frame.  I tried
this, then added an identifier column to the truncated data frame, and
then tried merging it with the original data frame and selecting only
those rows containing the identifier.  But this did not work no matter how
the arguments were altered: all records were returned instead of only the
unique ones.  Completely stumped, so any help is appreciated. Thanks.
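
The behaviour described above can be reproduced with a small sketch. The data frame and its column names (site, species, height) are hypothetical and not from the original post; the merge call is only one plausible form of the attempt described.

```r
# Hypothetical data: rows 1 and 2 share the same (site, species) pair.
dat <- data.frame(site    = c("A", "A", "B"),
                  species = c("oak", "oak", "pine"),
                  height  = c(10, 12, 7))

# unique() on the key columns keeps only those columns; height is dropped.
keys <- unique(dat[, c("site", "species")])
names(keys)

# Tag each unique key combination, then merge back onto the original.
# Every original row matches some key combination, so no rows are removed.
keys$id <- seq_len(nrow(keys))
merged <- merge(dat, keys, by = c("site", "species"))
nrow(merged) == nrow(dat)
```

Under these assumptions the merge cannot filter anything, because the identifier is attached to key combinations and not to individual rows.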



Jim Bouldin, PhD
Research Ecologist
Department of Plant Sciences, UC Davis
Davis CA, 95616
530-554-1740

Sean Anderson

2010-May-12 00:28 UTC

[R] removing duplicate rows

On Tue, May 11, 2010 at 9:07 PM, Jim Bouldin <jrbouldin at ucdavis.edu> wrote:
> I'm trying to identify and remove rows in a data frame that are duplicated
> only on particular columns within it (i.e. not on all columns).

This is probably the cleanest way:

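# Rows 1 and 2 have the same value of y
dat <- data.frame(x = c(1, 2, 3), y = c(1, 1, 3))
# Keep the first row for each distinct y; all columns are retained
subset(dat, !duplicated(y))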
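
The same idea extends to duplicates defined by several columns at once, which is what the original question asks about. This is a minimal sketch, assuming a hypothetical data frame whose key columns are named site and species; none of these names come from the thread.

```r
# Hypothetical data: rows 1 and 2 share the same (site, species) pair.
dat <- data.frame(site    = c("A", "A", "B"),
                  species = c("oak", "oak", "pine"),
                  height  = c(10, 12, 7))

# Columns that define a duplicate (an assumption for this example).
key_cols <- c("site", "species")

# duplicated() on a data frame compares whole rows of the key columns,
# flagging every repeat after the first. Indexing the full data frame
# with the negated flags keeps all of its columns.
deduped <- dat[!duplicated(dat[key_cols]), ]
deduped
```

Passing fromLast = TRUE to duplicated() would keep the last occurrence of each key combination in place of the first.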

See this thread (among others) for some other options:
http://finzi.psych.upenn.edu/Rhelp10/2010-January/224658.html
