thr3ads.net - R help - [R] How to delete a duplicate observation [Sep 2007]

nuyaying

2007-Sep-13 16:50 UTC

[R] How to delete a duplicate observation

I have a data set with 3 variables V1, V2, V3.  If 2 data points
have the same values on both V1 and V2, I want to delete the one that
has the smaller V3 value, i.e., in the data below, I want to delete
the first observation.  How can I do that?  Thanks in advance!

V1  V2  V3
3    3     1
3    3     4

Peter Dalgaard

2007-Sep-13 18:17 UTC

nuyaying wrote:
> I have a data set with 3 variables V1, V2, V3.  If there are 2 data points
> have the same values on both V1 and V2,  I want to delete one of them which
> has smaller V3 value.    i.e., in the data below, I want to delete 
> the first observation.  How can I do that ?    Thanks in advance!      
>
> V1  V2  V3
> 3    3     1
> 3    3     4
>

Tricky one... I think something like this should work:

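l <- split(d$V3, list(d$V1,d$V2))       # V3 values grouped by (V1, V2)
ixl <- lapply(l, function(x) {
   # x is a plain vector here, so use length(); nrow(x) would be NULL
   if ((n <- length(x)) == 2)
      seq_len(n) != which.min(x)         # drop the smaller of the pair
   else
      rep(TRUE, n)                       # keep everything else
})
ix <- unsplit(ixl, list(d$V1,d$V2))     # back to the original row order
d[ix,]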

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907

Sundar Dorai-Raj

2007-Sep-13 18:42 UTC

nuyaying said the following on 9/13/2007 9:50 AM:
>
> I have a data set with 3 variables V1, V2, V3.  If there are 2 data points
> have the same values on both V1 and V2,  I want to delete one of them which
> has smaller V3 value.    i.e., in the data below, I want to delete 
> the first observation.  How can I do that ?    Thanks in advance!      
> 
> V1  V2  V3
> 3    3     1
> 3    3     4
> 

How about:

## some sample data
d <- read.table(textConnection("V1 V2 V3
3 3 2
3 3 4
3 3 1
3 2 1
3 2 5"), header = TRUE)

## the code
d <- d[rev(do.call("order", d)), ]
d <- d[!duplicated(d[1:2]), ]
d
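
To see why this works, here is a small sketch of the same two steps on the sample data, with the intermediate result kept in separate objects. The names d_sorted and d_keep are only for illustration, and read.table(text = ...) is used in place of textConnection(), which assumes a reasonably recent version of R.

```r
## same sample data as above
d <- read.table(text = "V1 V2 V3
3 3 2
3 3 4
3 3 1
3 2 1
3 2 5", header = TRUE)

## step 1: sort on V1, V2, V3 and reverse, so that within each
## (V1, V2) group the row with the largest V3 comes first
d_sorted <- d[rev(do.call("order", d)), ]

## step 2: duplicated() flags every repeat of a (V1, V2) pair after its
## first occurrence, so negating it keeps one row per pair
d_keep <- d_sorted[!duplicated(d_sorted[c("V1", "V2")]), ]
d_keep
##   V1 V2 V3
## 2  3  3  4
## 5  3  2  5
```

The row names in the output are the original row numbers, so it is easy to check which observations survived.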

HTH,

--sundar

Greg Snow

2007-Sep-13 18:58 UTC

How about (assuming the data is in the data frame my.df):

my.df2 <- my.df[order(my.df$V3, decreasing=TRUE),]
my.df3 <- my.df2[ !duplicated( my.df2[,c('V1','V2')] ), ]

If the order of the rows matters, then we will need to add a couple of steps
to reorder.  You did not say what to do if 3 or more points match;
this approach keeps the single row with the largest V3 value among all
rows matching on V1 and V2.
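
The extra steps to reorder could look something like the sketch below. It assumes the data frame is called my.df as above; the sample values, the helper column .row and the name my.df4 are made up for illustration.

```r
## made-up sample data; my.df is the name assumed above
my.df <- data.frame(V1 = c(3, 3, 3, 3, 3),
                    V2 = c(3, 3, 3, 2, 2),
                    V3 = c(2, 4, 1, 1, 5))

## remember the original row positions (.row is a hypothetical helper column)
my.df$.row <- seq_len(nrow(my.df))

## largest V3 first, then keep the first row of each (V1, V2) pair
my.df2 <- my.df[order(my.df$V3, decreasing = TRUE), ]
my.df3 <- my.df2[!duplicated(my.df2[, c("V1", "V2")]), ]

## put the surviving rows back in their original order and drop the helper
my.df4 <- my.df3[order(my.df3$.row), ]
my.df4$.row <- NULL
my.df4
```

With these values the result should contain the second and fifth rows of my.df, in that order.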

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at intermountainmail.org
(801) 408-8111
 
 
