thr3ads.net - R help - [R] create a factor variable from two numeric variables when order is irrelevant [Jun 2011]

Daniel Malter

2011-Jun-28 19:59 UTC

[R] create a factor variable from two numeric variables when order is irrelevant

Hi all,

I have two numeric variables that form combinations in a matched sample.
Let's say I have five levels of x and y. What I am seeking to create is a
factor variable that ignores the order of x and y, i.e., the factor should
indicate x=1, y=5, as the same factor as x=5, y=1. Obviously, this becomes
increasingly cumbersome to do by hand as the number of levels increases.

f<-1:5
x<-sample(f,100,replace=T)
y<-sample(f,100,replace=T)
d<-matrix(cbind(x,y),ncol=2)

#A working solution is to remove the order, multiply one column by a scaling
#constant, add the second column, and create the factor for this numeric
#value. However, I was wondering whether there is a less awkward, more direct
#way to do this.

i<-apply(t(apply(d,1,function(x) sort(x))),1,function(y) 10*y[1]+y[2])
i<-factor(i)
i
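
One caveat about the scaling constant above: 10 only keeps the codes distinct while every level is below 10. The following is a minimal sketch, not part of the original post, that assumes positive integer levels and uses made-up data; it takes the constant from the data instead of hard-coding it.

```r
# Made-up data with a level above 9. With a constant of 10, the sorted
# pairs (1, 12) and (2, 2) would both be coded as 22.
d <- cbind(x = c(1, 12, 2), y = c(12, 1, 2))

k <- max(d) + 1             # any constant larger than the largest level works
s <- t(apply(d, 1, sort))   # sort within each row to drop the order
i <- factor(k * s[, 1] + s[, 2])
i
```

Rows 1 and 2 hold the same unordered pair and so share a level, while row 3 gets its own.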

Thanks for your help,
Daniel




--
View this message in context:
http://r.789695.n4.nabble.com/create-a-factor-variable-from-two-numeric-variables-when-order-is-irrelevant-tp3631318p3631318.html
Sent from the R help mailing list archive at Nabble.com.

David Winsemius

2011-Jun-28 20:53 UTC

[R] create a factor variable from two numeric variables when order is irrelevant

On Jun 28, 2011, at 3:59 PM, Daniel Malter wrote:
> Hi all,
>
> I have two numeric variables that form combinations in a matched sample.
> Let's say I have five levels of x and y. What I am seeking to create is a
> factor variable that ignores the order of x and y, i.e., the factor should
> indicate x=1, y=5, as the same factor as x=5, y=1. Obviously, this becomes
> increasingly cumbersome to do by hand as the number of levels increases.
>
> f<-1:5
> x<-sample(f,100,replace=T)
> y<-sample(f,100,replace=T)
> d<-matrix(cbind(x,y),ncol=2)
>
> #A working solution is to remove the order, multiply one column by a scaling
> #constant, add the second column, and create the factor for this numeric
> #value. However, I was wondering whether there is a less awkward, more direct
> #way to do this.
>
> i<-apply(t(apply(d,1,function(x) sort(x))),1,function(y) 10*y[1]+y[2])
> i<-factor(i)
> i
I came up with the same solution, but implemented it a bit differently:

 > d <- pmin(x,y)+10*pmax(x,y)

 > sort(unique(d))
  [1] 11 21 22 31 32 33 41 42 43 44 51 52 53 54 55

 > d <- factor(pmin(x,y)+10*pmax(x,y))
 > unique(d)
  [1] 41 42 32 54 51 21 22 33 53 11 31 44 43 52 55
Levels: 11 21 22 31 32 33 41 42 43 44 51 52 53 54 55
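
If readable labels are preferred over numeric codes, the same pmin/pmax idea can be combined with paste(). This is only a sketch under assumptions: `pair_label` is a hypothetical helper name, not something from the thread or from any package, and the data are made up. Because the two values are joined with a separator, no scaling constant is needed, so it also copes with levels of 10 or more.

```r
# Sketch only: pair_label is a hypothetical helper, not from the thread.
pair_label <- function(x, y, sep = "-") {
  # pmin/pmax are vectorised, so no row-wise apply() is needed;
  # the smaller value always comes first, which removes the order.
  factor(paste(pmin(x, y), pmax(x, y), sep = sep))
}

# Made-up data: the first two pairs are the same combination in reverse order.
x <- c(1, 5, 12, 2)
y <- c(5, 1, 1, 2)
g <- pair_label(x, y)
levels(g)
```

Calling pair_label(x, y) on the x and y vectors from the original post should give at most 15 levels, one per unordered pair.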


It seems that you might find the BioC people doing something
isomorphic to this with gene allele pairs using their fancy S4 methods.

--
David Winsemius, MD
West Hartford, CT
