thr3ads.net - R help - [R] sorting a data.frame using a vector [Nov 2004]

Toralf Kirsten

2004-Nov-26 17:12 UTC

[R] sorting a data.frame using a vector

Hi all,
I'm looking for an efficient solution (speed and memory) for the 
following problem:
Given
- a data.frame x containing numbers of type double
   with nrow(x)>ncol(x) and unique row labels and
- a character vector y containing the labels in the desired order

Now, I'd like to sort the rows of the data.frame x w.r.t. the order of 
labels in y.

example:
x <- data.frame(c(1:4),c(5:8))
row.names(x)<-LETTERS[1:4]
y <- c("C","A","D","B")


My current solution is like this:
if(!is.null(y) && is.vector(y)) {
    nObj <- length(y)
    for (i in 1:nObj) {
      sObj <- y[i]
      k <- c(1:nrow(x))[row.names(x)==sObj]
      if (i != k) {
        names <- row.names(x)
        tObj <- row.names(x[i,])
        temp <- x[i,]
        x[i,] <- x[k,]
        x[k,] <- temp
        names[i] <- sObj
        names[k] <- tObj
        row.names(x) <- names
     }
   }
}

But I'm not happy with it because it is not really efficient. Any other 
suggestions are welcome!

Thanks, Toralf

Peter Dalgaard

2004-Nov-26 17:22 UTC

Toralf Kirsten <tkirsten at izbi.uni-leipzig.de> writes:
> Hi all,
> I'm looking for an efficient solution (speed and memory) for the
> following problem:
> Given
> - a data.frame x containing numbers of type double
>    with nrow(x)>ncol(x) and unique row labels and
> - a character vector y containing the labels in the desired order
> 
> Now, I'd like to sort the rows of the data.frame x w.r.t. the order of
> labels in y.
> 
> example:
> x <- data.frame(c(1:4),c(5:8))
> row.names(x)<-LETTERS[1:4]
> y <- c("C","A","D","B")
> 
> 
> My current solution is like this:
> if(!is.null(y) && is.vector(y)) {
>     nObj <- length(y)
>     for (i in 1:nObj) {
>       sObj <- y[i]
>       k <- c(1:nrow(x))[row.names(x)==sObj]
>       if (i != k) {
>         names <- row.names(x)
>         tObj <- row.names(x[i,])
>         temp <- x[i,]
>         x[i,] <- x[k,]
>         x[k,] <- temp
>         names[i] <- sObj
>         names[k] <- tObj
>         row.names(x) <- names
>      }
>    }
> }
> 
> But I'm not happy with it because it is not really efficient. Any
> other suggestions are welcome!
Anything wrong with x[y,] ???

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
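
For readers following along, here is a minimal sketch of the x[y,] suggestion applied to the example data from the original post. The name x_sorted is only an illustration and does not appear in the thread.

```r
# Example data from the original post
x <- data.frame(c(1:4), c(5:8))
row.names(x) <- LETTERS[1:4]
y <- c("C", "A", "D", "B")

# Index the rows by name; they come back in the order given by y
x_sorted <- x[y, ]
print(x_sorted)

# The row names of the result should now match y exactly
stopifnot(identical(row.names(x_sorted), y))
```

This relies on the row labels being unique and on every label in y being present in row.names(x), as stated in the question.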

Liaw, Andy

2004-Nov-27 04:39 UTC

> From: Peter Dalgaard
> 
> Toralf Kirsten <tkirsten at izbi.uni-leipzig.de> writes:
> 
> > Hi all,
> > I'm looking for an efficient solution (speed and memory) for the
> > following problem:
> > Given
> > - a data.frame x containing numbers of type double
> >    with nrow(x)>ncol(x) and unique row labels and
> > - a character vector y containing the labels in the desired order
> > 
> > Now, I'd like to sort the rows of the data.frame x w.r.t. the order of
> > labels in y.
> > 
> > example:
> > x <- data.frame(c(1:4),c(5:8))
> > row.names(x)<-LETTERS[1:4]
> > y <- c("C","A","D","B")
> > 
> > 
> > My current solution is like this:
> > if(!is.null(y) && is.vector(y)) {
> >     nObj <- length(y)
> >     for (i in 1:nObj) {
> >       sObj <- y[i]
> >       k <- c(1:nrow(x))[row.names(x)==sObj]
> >       if (i != k) {
> >         names <- row.names(x)
> >         tObj <- row.names(x[i,])
> >         temp <- x[i,]
> >         x[i,] <- x[k,]
> >         x[k,] <- temp
> >         names[i] <- sObj
> >         names[k] <- tObj
> >         row.names(x) <- names
> >      }
> >    }
> > }
> > 
> > But I'm not happy with it because it is not really efficient. Any
> > other suggestions are welcome!
> 
> Anything wrong with x[y,] ???
Well... sometimes:
> nm <- as.character(sample(1:1e5))
> x <- data.frame(x1=rnorm(1e5), row.names=1:1e5)
> system.time(x[nm, , drop=FALSE], gcFirst=TRUE)
[1] 155.13   0.01 156.10     NA     NA
> system.time(x2<-x[match(nm, rownames(x)), , drop=FALSE], gcFirst=TRUE)
[1] 0.37 0.00 0.37   NA   NA
> all(rownames(x2) == nm)
[1] TRUE
> R.version
         _              
platform i386-pc-mingw32
arch     i386           
os       mingw32        
system   i386, mingw32  
status                  
major    2              
minor    0.1            
year     2004           
month    11             
day      15             
language R              

Cheers,
Andy 
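
The match() approach from the timing above can be sketched as a small self-contained script. This is only an illustration under a few assumptions: n is reduced to 1000 so it runs quickly, the seed is arbitrary, and the names idx and x1_by_name are not from the thread. Timings depend on the R version, so it is worth measuring with system.time() on your own data rather than relying on the numbers quoted above.

```r
set.seed(1)                        # arbitrary seed, only for reproducibility
n  <- 1000                         # much smaller than the 1e5 used above
x  <- data.frame(x1 = rnorm(n), row.names = 1:n)
nm <- as.character(sample(1:n))

# Translate the labels to integer row positions once, then index by position
idx <- match(nm, rownames(x))
stopifnot(!anyNA(idx))             # every label in nm must exist in x
x2 <- x[idx, , drop = FALSE]

# Indexing by name should give the same rows in the same order
x1_by_name <- x[nm, , drop = FALSE]
stopifnot(identical(rownames(x2), nm),
          identical(rownames(x1_by_name), rownames(x2)),
          identical(x1_by_name$x1, x2$x1))
```

The anyNA() check is an addition here: match() returns NA for a label that is not among the row names, and indexing with NA would silently produce a row of NAs.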
