thr3ads.net - R help - [R] Finding (swapped) repetitions of numbers pairs across two columns [Dec 2012]

Emmanuel Levy

2012-Dec-27 20:30 UTC

[R] Finding (swapped) repetitions of numbers pairs across two columns

Hi,

I've had this problem for a while and tackled it in a quite dirty way,
so I'm wondering if a better solution exists:

If we have two vectors:

v1 = c(0,1,2,3,4)
v2 = c(5,3,2,1,0)

How can I remove one instance of the "3,1" / "1,3" duplicate?

At the moment I'm using the following solution, which is quite horrible:

v1 = c(0,1,2,3,4)
v2 = c(5,3,2,1,0)
ft <- cbind(v1, v2)
direction = apply( ft, 1, function(x) return(x[1]>x[2]))
ft.tmp = ft
ft[which(direction),1] = ft.tmp[which(direction),2]
ft[which(direction),2] = ft.tmp[which(direction),1]
uniques     = apply( ft, 1, function(x) paste(x, collapse="%") )
uniques     = unique(uniques)
ft.unique   = matrix(unlist(strsplit(uniques,"%")), ncol=2, byrow=TRUE)


Any better solution would be very welcome!

All the best,

Emmanuel

Marc Schwartz

2012-Dec-27 20:39 UTC


Try this:
> unique(t(apply(cbind(v1, v2), 1, sort)))
     [,1] [,2]
[1,]    0    5
[2,]    1    3
[3,]    2    2
[4,]    0    4


Basically, sort each row so that you don't have to worry about the
permutations of values, then get the unique rows as a result.
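
The two steps can be traced one at a time. The sketch below is only illustrative: the intermediate names (`pairs`, `sorted_pairs`, `keep`, `first_seen`) are not from the thread, and the `duplicated()` variant is an assumed alternative for the case where the surviving rows should keep their original column order.

```r
v1 <- c(0, 1, 2, 3, 4)
v2 <- c(5, 3, 2, 1, 0)
pairs <- cbind(v1, v2)

# Step 1: sort within each row. apply() returns one column per input row,
# so transpose to get back to one pair per row. unname() drops dimnames.
sorted_pairs <- unname(t(apply(pairs, 1, sort)))

# Step 2: unique() on a matrix drops repeated rows.
deduped <- unique(sorted_pairs)

# Variant (assumption, not from the thread): flag repeats on the sorted copy
# but subset the original matrix, so the first occurrence of each pair keeps
# its original column order (e.g. 4,0 stays 4,0).
keep <- !duplicated(sorted_pairs)
first_seen <- pairs[keep, , drop = FALSE]
```

Here `deduped` holds the same four rows as the output above, while `first_seen` keeps rows 1, 2, 3 and 5 of the original matrix unchanged.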

Regards,

Marc Schwartz

Emmanuel Levy

2012-Dec-27 20:48 UTC


I did not know that unique worked on entire rows!

That is great, thank you very much!

Emmanuel


On 27 December 2012 22:39, Marc Schwartz <marc_schwartz at me.com> wrote:
> unique(t(apply(cbind(v1, v2), 1, sort)))

Marc Schwartz

2012-Dec-27 20:59 UTC


Yep. There are methods for:
> methods(unique)
[1] unique.array           unique.data.frame      unique.default
[4] unique.matrix          unique.numeric_version unique.POSIXlt

and for the matrix and data.frame methods, unique rows will be returned by
default. For array and matrix objects, you can change the MARGIN argument to a
different value (e.g. 2 for columns).

See ?unique for more information, notably the Details and Value sections.
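
As a small illustration of the MARGIN argument, here is a sketch on a made-up matrix; `m` and the result names are hypothetical and chosen only so that one row and one column repeat.

```r
# Row 2 repeats row 1, and column 3 repeats column 1.
m <- rbind(c(1, 2, 1),
           c(1, 2, 1),
           c(3, 4, 3))

# MARGIN = 1 is the default: repeated rows are dropped.
rows_unique <- unique(m)

# MARGIN = 2 compares columns instead, so the third column is dropped.
cols_unique <- unique(m, MARGIN = 2)

# The data.frame method has no MARGIN argument; it compares rows.
df_unique <- unique(as.data.frame(m))
```

With this input, `rows_unique` is a 2 x 3 matrix, `cols_unique` is 3 x 2, and `df_unique` has two rows.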

Marc


arun

2012-Dec-28 02:49 UTC


Hi,

You could also use:
unique(t(apply(cbind(v1,v2),1,function(x) x[order(x)])))
#or
unique(t(apply(cbind(v1,v2),1,sort.int,method="quick")))

By comparing different methods:
set.seed(51)
v1<-sample(0:9,1e5,replace=TRUE)
set.seed(49)
v2<-sample(0:9,1e5,replace=TRUE)
system.time(res1<-unique(t(apply(cbind(v1, v2), 1, sort))))
#   user  system elapsed
# 11.373   0.188  11.918

system.time(res2<-unique(t(apply(cbind(v1,v2),1,sort.int,method="quick"))))
#   user  system elapsed
#  7.088   0.120   7.446

identical(res1,res2)
#[1] TRUE
system.time(res3 <- unique(t(apply(cbind(v1,v2),1,function(x) x[order(x)])))) #found to be faster
#   user  system elapsed
#  2.693   0.072   2.857

identical(res1,res3)
#[1] TRUE
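
All three timings above still call a function once per row. One further option, not proposed in the thread and offered here only as a sketch, is to build the sorted pairs with the vectorized base functions pmin() and pmax(). The names `res_vec` and `res_ref` are made up for this example, and no timing is quoted because it depends on the machine and R version.

```r
set.seed(51)
v1 <- sample(0:9, 1e5, replace = TRUE)
set.seed(49)
v2 <- sample(0:9, 1e5, replace = TRUE)

# pmin()/pmax() return the element-wise smaller and larger value of each
# pair in one vectorized call, so no per-row function call is needed.
res_vec <- unique(cbind(pmin(v1, v2), pmax(v1, v2)))

# Reference result from the row-wise approach, with dimnames dropped so
# that only the values are compared.
res_ref <- unname(unique(t(apply(cbind(v1, v2), 1, function(x) x[order(x)]))))

# Check that both approaches give the same rows in the same order.
same_result <- isTRUE(all.equal(res_vec, res_ref))
```

This only works as written for two columns; with more columns, the row-wise sort remains the more general approach.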



A.K.


