Hi,

I have a huge matrix (4000 * 2000 data points) and I would like to retrieve the coordinates (column and row) for the top 50 (or x) values. Some positions in the matrix have NA as a value. These should be discarded.

My current method is to replace all NAs with 0, then rank all the values, and then extract the positions with the 50 highest ranks. It is very time-consuming!

Is there a simpler way to do this?

Thank you,
Ulrich

--
View this message in context: http://r.789695.n4.nabble.com/Find-the-50-highest-values-in-a-matrix-tp2259721p2259721.html
Sent from the R help mailing list archive at Nabble.com.
A matrix is just a vector with dimensions, so order() should work. I haven't verified the following code.

    a <- matrix(rnorm(4000*2000), 4000, 2000)
    b <- order(a, na.last=TRUE, decreasing=TRUE)[1:50]

Use %% or %/% to get the row and column numbers.

Nikhil Kaza
Asst. Professor, City and Regional Planning
University of North Carolina
nikhil.list at gmail.com

On Jun 18, 2010, at 1:41 AM, uschlecht wrote:
> Hi,
>
> I have a huge matrix (4000 * 2000 data points) and I would like to retrieve
> the coordinates (column and row) for the top 50 (or x) values. Some
> positions in the matrix have NA as a value. These should be discarded.
>
> My current method is to replace all NAs by 0, then rank all the values and
> then extract the positions with the 50 highest ranks. It is very
> time-consuming!
>
> Is there a simpler way to do this?
>
> Thank you,
> Ulrich

    b - b%%nrow(a)
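The suggestion above to use %% and %/% can be sketched as follows. This is a minimal, hedged example on a small made-up matrix, not code from the thread: it assumes R's column-major storage, shifts the linear index to 0-based before dividing (a plain idx %% nrow(a) would return 0 for the last row), and uses na.last = NA so that order() drops the NAs entirely.

```r
# Small made-up matrix so the result is easy to check by hand
set.seed(1)
a <- matrix(rnorm(12), nrow = 4, ncol = 3)
a[2, 3] <- NA

# Linear (column-major) indices of the 5 largest values;
# na.last = NA removes NA positions from the result
idx <- order(a, decreasing = TRUE, na.last = NA)[1:5]

# Shift to 0-based before the modulo / integer division,
# then shift back to R's 1-based row and column numbers
rows <- (idx - 1) %% nrow(a) + 1
cols <- (idx - 1) %/% nrow(a) + 1

top <- cbind(row = rows, col = cols, value = a[idx])
print(top)
```

Indexing the matrix with cbind(rows, cols) should return the same values as a[idx], which is a quick way to check the arithmetic.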
Hi:

Here's a faked-up example:

    a <- matrix(rnorm(4000*2000), 4000, 2000)
    # Generate some NAs in the matrix
    nr <- sample(1:4000, 50)
    nc <- sample(1:2000, 50)
    a[nr, nc] <- NA
    # Convert to data frame:
    b <- data.frame(row = rep(1:4000, 2000), col = rep(1:2000, each = 4000), x = as.vector(a))
    # Relatively time consuming... about 13.5 s on my machine
    bb <- b[rev(order(b$x, na.last = FALSE)), ]

    > bb[1:10, ]
             row  col        x
    691269  3269  173 5.103704
    7815076 3076 1954 4.961544
    4999621 3621 1250 4.953265
    500469   469  126 4.937655
    5878224 2224 1470 4.929150
    4287270 3270 1072 4.913791
    4442521 2521 1111 4.896869
    4668867  867 1168 4.863504
    5716575  575 1430 4.760778
    3055274 3274  764 4.758995

HTH,
Dennis

On Thu, Jun 17, 2010 at 10:41 PM, uschlecht <ulrich.schlecht@stanford.edu> wrote:
> Hi,
>
> I have a huge matrix (4000 * 2000 data points) and I would like to retrieve
> the coordinates (column and row) for the top 50 (or x) values. Some
> positions in the matrix have NA as a value. These should be discarded.
>
> My current method is to replace all NAs by 0, then rank all the values and
> then extract the positions with the 50 highest ranks. It is very
> time-consuming!
>
> Is there a simpler way to do this?
>
> Thank you,
> Ulrich
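As a possible variation on the data frame approach above, the sketch below orders the matrix directly and converts only the top positions to coordinates with base R's arrayInd(), so no full-size data frame is built. The helper name top_n_positions and the small test matrix are assumptions for illustration, not part of the thread, and no timing is claimed for it.

```r
# Hypothetical helper: coordinates and values of the n largest
# non-NA entries of a numeric matrix m
top_n_positions <- function(m, n = 50) {
  # na.last = NA drops NA positions; result holds linear indices
  idx <- order(m, decreasing = TRUE, na.last = NA)
  idx <- head(idx, n)
  # arrayInd() maps linear indices to (row, col) pairs
  pos <- arrayInd(idx, dim(m))
  data.frame(row = pos[, 1], col = pos[, 2], x = m[idx])
}

# Small made-up matrix with some NAs scattered in it
set.seed(42)
m <- matrix(rnorm(200 * 100), 200, 100)
m[sample(length(m), 500)] <- NA

res <- top_n_positions(m, 10)
print(res)
```

Calling top_n_positions(a, 50) on the 4000 x 2000 matrix from the example would return the same kind of row/col/x table as bb[1:50, ], without the original row names.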