Hi all,

I'm looking for an efficient solution (in both speed and memory) to the
following problem.

Given
- a data.frame x containing numbers of type double,
  with nrow(x) > ncol(x) and unique row labels, and
- a character vector y containing those labels in the desired order,

I'd like to sort the rows of the data.frame x according to the order of
the labels in y.

Example:

  x <- data.frame(c(1:4), c(5:8))
  row.names(x) <- LETTERS[1:4]
  y <- c("C", "A", "D", "B")

My current solution looks like this:

  if (!is.null(y) && is.vector(y)) {
    nObj <- length(y)
    for (i in 1:nObj) {
      sObj <- y[i]
      k <- c(1:nrow(x))[row.names(x) == sObj]
      if (i != k) {
        names <- row.names(x)
        tObj <- row.names(x[i, ])
        temp <- x[i, ]
        x[i, ] <- x[k, ]
        x[k, ] <- temp
        names[i] <- sObj
        names[k] <- tObj
        row.names(x) <- names
      }
    }
  }

But I'm not happy with it because it is not really efficient. Any
other suggestions are welcome!

Thanks,
Toralf
Toralf Kirsten <tkirsten at izbi.uni-leipzig.de> writes:

> Now, I'd like to sort the rows of the data.frame x w.r.t. the order of
> labels in y.
> [...]
> But I'm not happy with it because it is not really efficient. Any
> other suggestions are welcome!

Anything wrong with x[y,] ???

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
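For reference, a minimal sketch of what the x[y, ] suggestion does on the example data from the original post. The column names a and b are made up here for readability (the original call leaves them unnamed), and drop = FALSE is added only so that the result stays a data.frame even if x has a single column.

```r
# Example data from the original post; column names a and b are illustrative only
x <- data.frame(a = 1:4, b = 5:8)
row.names(x) <- LETTERS[1:4]
y <- c("C", "A", "D", "B")

# Character row indexing looks each label of y up in row.names(x)
x_sorted <- x[y, , drop = FALSE]
print(x_sorted)
```

The rows of x_sorted appear in the order given by y, and the row names travel with the rows, so no explicit swapping is needed.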
> From: Peter Dalgaard
>
> Anything wrong with x[y,] ???

Well... sometimes:

> nm <- as.character(sample(1:1e5))
> x <- data.frame(x1=rnorm(1e5), row.names=1:1e5)
> system.time(x[nm, , drop=FALSE], gcFirst=TRUE)
[1] 155.13   0.01 156.10     NA     NA
> system.time(x2<-x[match(nm, rownames(x)), , drop=FALSE], gcFirst=TRUE)
[1] 0.37 0.00 0.37   NA   NA
> all(rownames(x2) == nm)
[1] TRUE
> R.version
         _
platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    2
minor    0.1
year     2004
month    11
day      15
language R

Cheers,
Andy

> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
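The match() variant timed above can be wrapped in a small helper. This is only a sketch under stated assumptions: reorder_rows is a hypothetical name, not a function from R or any package, and the check for labels missing from rownames(x) is an addition that the thread does not discuss. The data is kept small here, so it says nothing about the timings reported above for R 2.0.1.

```r
# Hypothetical helper; the name and the error check are illustrative assumptions
reorder_rows <- function(x, y) {
  idx <- match(y, rownames(x))   # integer position of each label within x
  if (anyNA(idx)) {
    stop("labels not found in rownames(x): ",
         paste(y[is.na(idx)], collapse = ", "))
  }
  x[idx, , drop = FALSE]         # integer row indexing; result stays a data.frame
}

# Same setup as in the timing example, with fewer rows
nm <- as.character(sample(1:1000))
x  <- data.frame(x1 = rnorm(1000), row.names = 1:1000)
x2 <- reorder_rows(x, nm)
```

Without the check, a label that does not occur in rownames(x) would produce an NA index and therefore a row of NA values rather than an error, so failing early seemed the safer choice for this sketch.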