Dear R-helpers,

I know how to use unique to select unique rows, e.g.

unique.rows<-unique(dataframe)

but I would like to select rows that are unique on only TWO of my dataframe's columns (so two rows with the same values in these two columns would not both be kept, even if they had different values in other columns).

For example, I have a dataframe with 10 columns, two of which are LATITUDE and LONGITUDE. I wish to keep only one row per unique combination of these two columns, so I've tried:

unique.latlong<-extracted[unique(paste(extracted$latitude,extracted$longitude)),]

but this returns a dataframe of missing values (NAs).

Could anyone point me in the right direction?

Thanks!
Mark Na
Henrique Dallazuanna
2009-Jun-29 22:19 UTC
[R] How to select partially (not completely) unique rows?
Try this:

DF[!duplicated(DF[,c("lat","lon")]),]

On Mon, Jun 29, 2009 at 6:55 PM, Mark Na <mtb954@gmail.com> wrote:
> I wish to keep only one row per unique combination of these
> two columns, so I've tried:
>
> unique.latlong<-extracted[unique(paste(extracted$latitude,extracted$longitude)),]
>
> but this is returning a dataframe of missing values (NAs).

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
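To spell out why the original attempt returns NAs and how the duplicated() approach differs, here is a minimal sketch. The data frame is a hypothetical stand-in for the poster's `extracted`, and the `count` column is invented purely for illustration. Indexing a data frame with character strings looks them up as row names, and the pasted "latitude longitude" strings match no row name, so every selected row comes back as NA.

```r
# Hypothetical stand-in for the poster's `extracted` data frame.
extracted <- data.frame(
  latitude  = c(50.1, 50.1, 51.2, 51.2, 52.3),
  longitude = c(-105.0, -105.0, -106.5, -107.0, -108.0),
  count     = c(3, 7, 1, 4, 9)
)

# The original attempt: character strings used as a row index are treated
# as row names. None of them match "1", "2", ..., so all rows are NA.
key <- paste(extracted$latitude, extracted$longitude)
attempt <- extracted[unique(key), ]

# duplicated() returns a logical vector that is TRUE for every row repeating
# an earlier latitude/longitude pair; negating it keeps first occurrences.
keep <- !duplicated(extracted[, c("latitude", "longitude")])
unique.latlong <- extracted[keep, ]

# The pasted key also works once it is used with duplicated() rather than
# as an index in its own right.
unique.latlong2 <- extracted[!duplicated(key), ]
```

Both versions keep the first row found for each coordinate pair. Passing fromLast = TRUE to duplicated() would keep the last one instead.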
Paul Hiemstra
2009-Jun-30 09:19 UTC
[R] How to select partially (not completely) unique rows?
Hi Mark,

If you convert the data.frame to a Spatial class (see the sp-package documentation), you can use the function zerodist to find points that share the same coordinates.

cheers,
Paul

--
Drs. Paul Hiemstra
Department of Physical Geography
Faculty of Geosciences
University of Utrecht
Heidelberglaan 2
P.O. Box 80.115
3508 TC Utrecht
Phone: +3130 274 3113 Mon-Tue
Phone: +3130 253 5773 Wed-Fri
http://intamap.geo.uu.nl/~paul
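Paul's suggestion might look roughly like the sketch below. It assumes the sp package is installed and that the coordinate columns are named latitude and longitude as in the original post. The data frame is again a hypothetical stand-in. zerodist() reports pairs of row indices whose coordinates coincide, and remove.duplicates(), also from sp, is used here as one way to drop the repeats.

```r
# Hypothetical stand-in for the poster's data; column names are assumed.
extracted <- data.frame(
  latitude  = c(50.1, 50.1, 51.2),
  longitude = c(-105.0, -105.0, -106.5),
  count     = c(3, 7, 1)
)

# Only run the spatial part when sp is available.
if (requireNamespace("sp", quietly = TRUE)) {
  pts <- extracted
  # Promote the data frame to a SpatialPointsDataFrame
  # (x = longitude, y = latitude).
  sp::coordinates(pts) <- ~ longitude + latitude

  # Each row of `pairs` holds the indices of two points at the same location.
  pairs <- sp::zerodist(pts)

  # Keep one point per location.
  pts_unique <- sp::remove.duplicates(pts)
}
```

For plain de-duplication on two columns the duplicated() one-liner is simpler. The sp route is mainly worth it when the data will be used as spatial objects anyway, or when near-coincident points should be merged through the zero argument of zerodist().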