thr3ads.net - R help - [R] Computing an ordering on subsets of a data frame [Apr 2007]

Lukas Biewald

2007-Apr-18 21:49 UTC

[R] Computing an ordering on subsets of a data frame

If I have a data frame X that looks like this:

A B
- -
1 2
1 3
1 4
2 3
2 1
2 1
3 2
3 1
3 3

and I want to make another column which has the rank of B computed
separately for each value of A.

I.e. something like:

A B C
- - -
1 2 1
1 3 2
1 4 3
2 3 3
2 1 1
2 1 2
3 2 2
3 1 1
3 3 3

by(X, X[,1], function(x) { rank(x[,2], ties.method="random") } )
almost seems to work, but the data is not in a frame, and I can't figure out
how to merge it back into X properly.

Thanks,
Lukas

jim holtman

2007-Apr-19 02:54 UTC

[R] Computing an ordering on subsets of a data frame

Does this do what you want?
> x <- "A B
+ 1 2
+ 1 3
+ 1 4
+ 2 3
+ 2 1
+ 2 1
+ 3 2
+ 3 1
+ 3 3"
> x <- read.table(textConnection(x), header=TRUE)
> x$C <- ave(x$B, x$A, FUN=rank)
> x
  A B   C
1 1 2 1.0
2 1 3 2.0
3 1 4 3.0
4 2 3 3.0
5 2 1 1.5
6 2 1 1.5
7 3 2 2.0
8 3 1 1.0
9 3 3 3.0


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
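
Note that ave() with a bare FUN=rank averages tied values (hence the 1.5s above), whereas the original question asked for distinct ranks within each group. A minimal sketch of one way to get that, assuming it is acceptable to wrap rank() in an anonymous function so that ties.method can be passed through; the column names C_first and C_random are only illustrative:

```r
# Same data as in the thread
x <- data.frame(A = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                B = c(2, 3, 4, 3, 1, 1, 2, 1, 3))

# Ties broken by order of appearance within each group (deterministic)
x$C_first <- ave(x$B, x$A, FUN = function(v) rank(v, ties.method = "first"))

# Ties broken at random within each group (varies from run to run)
x$C_random <- ave(x$B, x$A, FUN = function(v) rank(v, ties.method = "random"))

print(x)
```

Because ave() returns a vector aligned with the rows of x, no merge step is needed, whatever order the rows are in.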

Steven McKinney

2007-Apr-20 02:54 UTC

[R] Computing an ordering on subsets of a data frame

Hi Lukas,

Using by() or its cousins tapply() etc. is tricky,
as you need to properly merge results back into X.

You can do that by adding a key ID variable to X, 
and carrying along that key ID variable in calls
to by() etc., though I haven't tested out a method.
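
The key-ID idea described above might look roughly like the following. This is only a sketch under stated assumptions: the column name id and the helper objects pieces and ranks are made up for illustration, and ties.method = "first" is used instead of "random" so the result is reproducible.

```r
X <- data.frame(A = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                B = c(2, 3, 4, 3, 1, 1, 2, 1, 3))

# Hypothetical key column that identifies each original row
X$id <- seq_len(nrow(X))

# For each level of A, return the key together with the within-group rank of B
pieces <- by(X, X$A, function(d) {
  data.frame(id = d$id, C = rank(d$B, ties.method = "first"))
})

# Stack the per-group results and join them back onto X via the key
ranks <- do.call(rbind, pieces)
X <- merge(X, ranks, by = "id")

print(X)
```

merge() sorts the result on the key by default, so the rows come back in their original order.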

You can also create a new column in X to hold the
results, and then sort the subsections of X in a
for() loop.
> X <- data.frame(A = c(1,1,1,2,2,2,3,3,3), B = c(2,3,4,3,1,1,2,1,3))
> X
  A B
1 1 2
2 1 3
3 1 4
4 2 3
5 2 1
6 2 1
7 3 2
8 3 1
9 3 3
> 
> X$C <- rep(as.numeric(NA), nrow(X))
> 
> sortLevels <- unique(X$A)
> 
> for(i in seq(along = sortLevels)) {
+   sortIdxp <- X$A == sortLevels[i]
+   X$C[sortIdxp] <- rank(X$B[sortIdxp], ties.method = "random")
+ }
> X
  A B C
1 1 2 1
2 1 3 2
3 1 4 3
4 2 3 3
5 2 1 1
6 2 1 2
7 3 2 2
8 3 1 1
9 3 3 3

Merging results back in after using
tapply() or by() is harder if your
data frame is in random order, but the
for() loop approach with indexing
still works fine.
> set.seed(123)
> Y <- X[sample(9), ]
> Y
  A B C
3 1 4 3
7 3 2 2
9 3 3 3
6 2 1 2
5 2 1 1
1 1 2 1
2 1 3 2
8 3 1 1
4 2 3 3
> Y$C <- rep(as.numeric(NA), nrow(Y))
> 
> sortLevels <- unique(Y$A)
> ## You can also use levels() instead of unique() if Y$A is a factor.
> 
> for(i in seq(along = sortLevels)) {
+   sortIdxp <- Y$A == sortLevels[i]
+   Y$C[sortIdxp] <- rank(Y$B[sortIdxp], ties.method = "random")
+ }
> Y
  A B C
3 1 4 3
7 3 2 2
9 3 3 3
6 2 1 2
5 2 1 1
1 1 2 1
2 1 3 2
8 3 1 1
4 2 3 3
> oY <- order(Y$A)
> Y[oY,]
  A B C
3 1 4 3
1 1 2 1
2 1 3 2
6 2 1 2
5 2 1 1
4 2 3 3
7 3 2 2
9 3 3 3
8 3 1 1
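
For comparison, the same per-group indexing can probably be written without an explicit loop using split() and unsplit() from base R. This is a sketch, not the method used in the session above; the shuffled data frame is typed in by hand rather than drawn with sample(), and ties.method = "first" replaces "random" so the output is reproducible.

```r
# Rows deliberately not grouped by A
Y <- data.frame(A = c(1, 3, 3, 2, 2, 1, 1, 3, 2),
                B = c(4, 2, 3, 1, 1, 2, 3, 1, 3))

# Split B by the levels of A and rank within each piece
byGroup <- lapply(split(Y$B, Y$A), rank, ties.method = "first")

# unsplit() puts each piece back into the row positions it came from
Y$C <- unsplit(byGroup, Y$A)

print(Y)
```

As with the for() loop, the ranks land on the right rows regardless of how Y is ordered.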
 
HTH
 

Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckinney at bccrc.ca
tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3

Canada


 

 

