Hi,

I need to extract the second largest element from each row of a matrix. Below is my solution, but I think there should be a more efficient way to accomplish the same, or not?

    set.seed(1)
    a <- matrix(rnorm(9), 3, 3)
    sec.large <- as.vector(apply(a, 1, order, decreasing = TRUE)[2, ])
    ans <- sapply(seq_along(sec.large), function(i) a[i, sec.large[i]])
    ans

Thanks in advance for your help,
Lars.
On 4/26/2011 2:01 PM, Lars Bishop wrote:
> I need to extract the second largest element from each row of a
> matrix.

One way is the following:

    a <- matrix(rnorm(9), 3, 3)
    aa <- a[order(row(a), -a)]
    matrix(aa, nrow(a), byrow = TRUE)[, 2]

I hope it helps.

Best,
Dimitris

--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center
Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Web: http://www.erasmusmc.nl/biostatistiek/
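To see why the order(row(a), -a) trick works, it may help to trace it on a small matrix whose values can be checked by eye. The matrix below and the names idx, sorted_rows and second are made up for illustration; only the two-key order() call comes from the post above.

```r
# Hypothetical 3 x 3 matrix with distinct values.
a <- matrix(c(3, 9, 4,
              7, 1, 8,
              5, 6, 2), nrow = 3, byrow = TRUE)

# Two sort keys: row index first, then descending value within each row.
idx <- order(row(a), -a)
aa  <- a[idx]   # row 1 in descending order, then row 2, then row 3

# Refill row by row: column 1 holds each row's maximum, column 2 the runner-up.
sorted_rows <- matrix(aa, nrow(a), byrow = TRUE)
second <- sorted_rows[, 2]
second
```

The whole matrix is sorted in a single call, so no per-row function call is needed. The cost is one large sort instead of many small ones.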
Would this work (shorter)?

    apply(a, 1, function(x) x[order(x, decreasing = TRUE)[2]])

Note the decreasing = TRUE: plain order(x)[2] picks the second smallest element, which coincides with the second largest only for three-column matrices such as the example.
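A quick check of how the sort direction affects the result, using a made-up four-element row (the vector x and the two result names are illustrative only). order() sorts ascending by default, so index 2 of its result points at the second smallest value.

```r
# Hypothetical row with four distinct values.
x <- c(10, 40, 20, 30)

# Ascending order: position 2 is the second smallest value.
second_smallest <- x[order(x)[2]]

# Descending order: position 2 is the second largest value.
second_largest <- x[order(x, decreasing = TRUE)[2]]

# With exactly three values the two coincide, which hides the difference.
y <- c(10, 40, 20)
same_for_three <- y[order(y)[2]] == y[order(y, decreasing = TRUE)[2]]
```

Testing on a matrix with more than three columns is therefore a cheap way to catch this kind of slip.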
Try

    apply(a, 1, function(x) sort(x, decreasing = TRUE)[2])

Peter Ehlers
There are probably many, but this one is reasonably compact, one-step, and readable:

    ans2 <- apply(a, 1, function(i) sort(i)[dim(a)[2] - 1])
    ans2

Refreshing my mail client proves I was right about there being many solutions, but this is the first (so far) to use the dim attribute.

--
David Winsemius, MD
West Hartford, CT
On Apr 26, 2011, at 14:36, David Winsemius wrote:
> ans2 <- apply(a, 1, function(i) sort(i)[ dim(a)[2]-1])

Anything with sort() or order() will have complexity O(n*log(n)) or worse (n is the number of columns), whereas finding the k-th largest element has complexity O(k*n).

For moderate n, this may be unimportant, but you could potentially find a speedup using

    -sort.int(-i, partial = 2)[2]

(partial sorting does not accept decreasing = TRUE, hence the negation) or

    max(i[-which.max(i)])

--
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
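The two cheaper alternatives can be compared on a single made-up row. This is only a sketch: the vector i and the via_* names are hypothetical. As far as I know, sort.int() rejects decreasing = TRUE when partial is given, so the partial sort is done either at position n - 1 of the ascending order or at position 2 of the negated vector.

```r
# Hypothetical row with distinct values; the second largest is 1.7.
i <- c(0.3, 2.5, -1.2, 1.7, 0.9)
n <- length(i)

# Partial sort: only position n - 1 is guaranteed to be in its sorted place.
via_partial <- sort.int(i, partial = n - 1)[n - 1]

# Same idea on the negated vector, so the target sits at position 2.
via_negation <- -sort.int(-i, partial = 2)[2]

# Two linear scans: locate one maximum, drop it, take the maximum of the rest.
via_max <- max(i[-which.max(i)])

# With a tied maximum only one copy is dropped, so the tied value comes back.
tied <- c(5, 5, 1)
via_max_tied <- max(tied[-which.max(tied)])
```

All approaches in this thread share that tie behaviour: duplicates count as separate positions, so a row whose maximum occurs twice reports the maximum as its second largest element.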
A different approach is to use order() to sort first by row number and then break the ties by value. It is quick when there are lots of short rows.

    f1 <- function(x)
        apply(x, 1, function(row) sort(row, decreasing = TRUE)[2])
    f2 <- function(x)
        -apply(-x, 1, function(row) sort.int(row, partial = 2)[2])
    f3 <- function(x) {
        # order by row number then by value
        y <- t(x)
        array(y[order(col(y), y)], dim(y))[nrow(y) - 1, ]
    }
    f4 <- function(x)
        apply(x, 1, function(row) max(row[-which.max(row)]))

    x <- matrix(runif(1e5 * 6), nrow = 1e5)
    library(rbenchmark)
    benchmark(r1 <- f1(x), r2 <- f2(x), r3 <- f3(x), r4 <- f4(x),
              replications = 5,
              columns = c("test", "replications", "elapsed"),
              order = "elapsed")
    #          test replications elapsed
    # 3 r3 <- f3(x)            5    1.08
    # 4 r4 <- f4(x)            5   12.59
    # 2 r2 <- f2(x)            5   23.19
    # 1 r1 <- f1(x)            5   59.54
    identical(r1, r2) && identical(r1, r3) && identical(r1, r4)
    # [1] TRUE

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
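In the same spirit of avoiding a per-row apply() call, another vectorized option is to mask one maximum per row and then look up the maximum of what remains. This variant was not proposed in the thread. The sketch below uses base R's max.col() with matrix indexing, the function name second_largest_maxcol is hypothetical, and it assumes a numeric matrix with at least two columns and no NA values.

```r
# Hypothetical helper, not from the thread: second largest value per row.
second_largest_maxcol <- function(x) {
  rows <- seq_len(nrow(x))
  # Column position of one maximum in each row (first one if tied).
  top <- max.col(x, ties.method = "first")
  # Mask that single entry so it can no longer win.
  x[cbind(rows, top)] <- -Inf
  # The maximum of the masked rows is the second largest of the originals.
  x[cbind(rows, max.col(x, ties.method = "first"))]
}

# Small made-up matrix for a check against the sort-based version.
m <- matrix(c(3, 9, 4,
              7, 1, 8,
              5, 6, 2), nrow = 3, byrow = TRUE)
res_maxcol <- second_largest_maxcol(m)
res_sort   <- apply(m, 1, function(row) sort(row, decreasing = TRUE)[2])
```

The function modifies only its local copy of x. How it compares in speed with f3 and f4 would have to be measured on the data at hand; no timings are claimed here.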
And I hit the send button before adding the timings for the case with lots of columns and few rows. f3 changes from the best to the worst in this case. There is rarely one most efficient function for all datasets.

    x <- t(x)
    benchmark(r1 <- f1(x), r2 <- f2(x), r3 <- f3(x), r4 <- f4(x),
              replications = 5,
              columns = c("test", "replications", "elapsed"),
              order = "elapsed")
    #          test replications elapsed
    # 4 r4 <- f4(x)            5    0.19
    # 2 r2 <- f2(x)            5    0.24
    # 1 r1 <- f1(x)            5    0.79
    # 3 r3 <- f3(x)            5    3.75
    identical(r1, r2) && identical(r1, r3) && identical(r1, r4)
    # [1] TRUE
    dim(x)
    # [1]      6 100000

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
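Since the ranking depends on the shape of the matrix, it can be worth timing the candidates on both orientations of one's own data. The sketch below does that with base R's system.time() rather than rbenchmark. The helper time_candidates and the small test matrices are hypothetical, f3 and f4 are copied from the post above, and the sizes are kept small so it runs quickly. The numbers it produces will differ by machine.

```r
f3 <- function(x) {
  # order by row number then by value
  y <- t(x)
  array(y[order(col(y), y)], dim(y))[nrow(y) - 1, ]
}
f4 <- function(x) apply(x, 1, function(row) max(row[-which.max(row)]))

# Hypothetical helper: elapsed seconds for `reps` runs of each candidate.
time_candidates <- function(x, reps = 5) {
  candidates <- list(f3 = f3, f4 = f4)
  sapply(candidates, function(f) {
    system.time(for (k in seq_len(reps)) f(x))[["elapsed"]]
  })
}

set.seed(1)
tall <- matrix(runif(2000 * 6), nrow = 2000)  # many short rows
wide <- t(tall)                               # few long rows

# Both candidates should agree before their timings are compared.
stopifnot(isTRUE(all.equal(f3(tall), f4(tall))),
          isTRUE(all.equal(f3(wide), f4(wide))))

timings <- rbind(tall = time_candidates(tall),
                 wide = time_candidates(wide))
timings
```

Scaling the matrices up toward the real problem size gives a more reliable picture than these small ones.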