Marius Hofert
2012-Dec-08 11:28 UTC
[R] How to efficiently compare each row in a matrix with each row in another matrix?
Dear expeRts,

I have two matrices A and B. They have the same number of columns but possibly different numbers of rows. I would like to compare each row of A with each row of B and check whether all entries in a row of A are less than or equal to all entries in a row of B. Here is a minimal working example:

A <- rbind(matrix(1:4, ncol=2, byrow=TRUE), c(6, 2)) # (3, 2) matrix
B <- matrix(1:10, ncol=2) # (5, 2) matrix
( ind <- apply(B, 1, function(b) apply(A, 1, function(a) all(a <= b))) ) # (3, 5) = (nrow(A), nrow(B)) matrix

The question is: How can this be implemented more efficiently in R, that is, in a faster way?

Thanks & cheers,

Marius
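For readers who want to see exactly what the nested apply() call in the question computes, here is a minimal reference sketch written as an explicit double loop. The helper name rows_leq_loop is hypothetical (it does not appear in the thread), and the loop is meant only as a readable baseline, not as a fast solution.

```r
# Toy data from the question.
A <- rbind(matrix(1:4, ncol = 2, byrow = TRUE), c(6, 2))  # rows: (1,2), (3,4), (6,2)
B <- matrix(1:10, ncol = 2)                               # rows: (1,6), (2,7), ..., (5,10)

# Hypothetical helper (not from the thread): explicit double loop.
rows_leq_loop <- function(A, B) {
  stopifnot(ncol(A) == ncol(B))
  out <- matrix(FALSE, nrow = nrow(A), ncol = nrow(B))
  for (i in seq_len(nrow(A))) {
    for (j in seq_len(nrow(B))) {
      # TRUE only if every entry of row i of A is <= the entry
      # in the same column of row j of B
      out[i, j] <- all(A[i, ] <= B[j, ])
    }
  }
  out
}

ind_loop <- rows_leq_loop(A, B)  # (nrow(A), nrow(B)) logical matrix
```

Entry [i, j] of the result refers to row i of A and row j of B, which is the same layout as `ind` in the question.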
Thomas Stewart
2012-Dec-08 14:46 UTC
[R] How to efficiently compare each row in a matrix with each row in another matrix?
One option is to consider a Kronecker-type expansion. See code below.

-tgs

perhaps <- function(A,B){
    nA <- nrow(A)
    nB <- nrow(B)
    # stack B nA times, repeat each row of A nB times, then compare row by row
    C <- kronecker(matrix(1,nrow=nA,ncol=1),B) >= kronecker(A,matrix(1,nrow=nB,ncol=1))
    matrix(rowSums(C) == ncol(A), nA, nB, byrow=TRUE)
}

Marius <- function(A,B) apply(B, 1, function(b) apply(A, 1, function(a) all(a <= b)))

N <- 1000
M <- 5
P <- 5000
A <- matrix(runif(N,1,1000),nrow=N,ncol=M)
B <- matrix(runif(M,1,1000),nrow=P,ncol=M)
system.time(perhaps(A,B))
system.time(Marius(A,B))
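To make the Kronecker-type expansion more concrete, the following sketch builds the two expanded matrices for the toy data from the question and inspects them separately. The intermediate names B_stacked and A_repeated are assumptions introduced here for illustration; the thread builds both inside a single expression.

```r
A <- rbind(matrix(1:4, ncol = 2, byrow = TRUE), c(6, 2))  # (3, 2) matrix
B <- matrix(1:10, ncol = 2)                               # (5, 2) matrix
nA <- nrow(A)
nB <- nrow(B)

# B stacked nA times: the row order is B1..B5, B1..B5, B1..B5
B_stacked <- kronecker(matrix(1, nrow = nA, ncol = 1), B)

# each row of A repeated nB times: A1 five times, then A2, then A3
A_repeated <- kronecker(A, matrix(1, nrow = nB, ncol = 1))

# Row (i - 1) * nB + j of C compares row i of A with row j of B, column by column
C <- B_stacked >= A_repeated

# A pair passes if all ncol(A) comparisons are TRUE; byrow = TRUE puts
# the nB results for row i of A into row i of the answer
res <- matrix(rowSums(C) == ncol(A), nA, nB, byrow = TRUE)
```

Both expanded matrices have nrow(A) * nrow(B) rows, so the memory needed grows with the product of the two row counts.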
Hofert Jan Marius
2012-Dec-08 15:28 UTC
[R] How to efficiently compare each row in a matrix with each row in another matrix?
Nice idea, Thomas, thanks. I could further decrease run time a bit by building the required matrices by hand. Any other ideas?

Marius <- function(A, B) apply(B, 1, function(b) apply(A, 1, function(a) all(a <= b)))

perhaps <- function(A, B){
    nA <- nrow(A)
    nB <- nrow(B)
    C <- kronecker(matrix(1, nrow=nA, ncol=1), B) >= kronecker(A, matrix(1, nrow=nB, ncol=1))
    matrix(rowSums(C) == ncol(A), nA, nB, byrow=TRUE)
}

Marius.2.0 <- function(A, B){
    nA <- nrow(A)
    nB <- nrow(B)
    C <- do.call(rbind, rep(list(B), nA)) >= matrix(rep(A, each=nB), ncol=ncol(B))
    matrix(rowSums(C) == ncol(A), nA, nB, byrow=TRUE)
}

M <- 5
N <- 1000
P <- 5000
A <- matrix(runif(N,1,1000), nrow=N, ncol=M)
B <- matrix(runif(M,1,1000), nrow=P, ncol=M)

system.time(Marius(A, B))[[3]] # ~ 18s
system.time(foo <- perhaps(A, B))[[3]] # ~ 1.4s
system.time(bar <- Marius.2.0(A, B))[[3]] # ~ 1s

stopifnot(all.equal(foo, bar))
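As one further idea that does not appear in the thread, the comparison can also be done one column at a time with outer(), combining the per-column results with a logical AND. The helper name rows_leq_outer is hypothetical, and no timing is claimed here; whether it beats the versions above would have to be measured. Note also that the benchmark data in the thread draw only N (respectively M) random values and let matrix() recycle them, whereas this sketch draws one value per cell.

```r
# Hypothetical helper (not from the thread): loop over the (few) columns,
# compare all row pairs of one column at once with outer().
rows_leq_outer <- function(A, B) {
  stopifnot(ncol(A) == ncol(B))
  res <- matrix(TRUE, nrow = nrow(A), ncol = nrow(B))
  for (k in seq_len(ncol(A))) {
    # outer(x, y, "<=")[i, j] is x[i] <= y[j]
    res <- res & outer(A[, k], B[, k], "<=")
  }
  res
}

# Small random test data, one random value per cell
set.seed(1)
A2 <- matrix(runif(20 * 3), nrow = 20, ncol = 3)
B2 <- matrix(runif(30 * 3), nrow = 30, ncol = 3)

# Check against the nested apply() from the question
ref2 <- apply(B2, 1, function(b) apply(A2, 1, function(a) all(a <= b)))
stopifnot(all(rows_leq_outer(A2, B2) == ref2))
```

This keeps only nrow(A) x nrow(B) matrices in memory instead of the (nrow(A) * nrow(B)) x ncol(A) expansions, at the cost of a loop over the columns.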
arun
2012-Dec-08 18:29 UTC
[R] How to efficiently compare each row in a matrix with each row in another matrix?
Hi,

Maybe this:

N <- 1000
M <- 5
P <- 5000
set.seed(15)
A <- matrix(runif(N,1,1000),nrow=N,ncol=M)
set.seed(425)
B <- matrix(runif(M,1,1000),nrow=P,ncol=M)

Marius.3.0 <- function(A,B){
    do.call(cbind, lapply(split(B,row(B)), function(x) colSums(x>=t(A))==ncol(A)))
}

system.time(Marius.3.0(A,B))
#   user  system elapsed
#  0.524   0.000   0.523
system.time(Marius.2.0(A,B))
#   user  system elapsed
#  0.972   0.236   1.212
system.time(perhaps(A,B))
#   user  system elapsed
#  1.232   0.244   1.482
system.time(Marius(A,B))
#   user  system elapsed
# 19.266   0.000  19.298

With the toy example:

A <- rbind(matrix(1:4, ncol=2, byrow=TRUE), c(6, 2)) # (3, 2) matrix
B <- matrix(1:10, ncol=2) # (5, 2) matrix
ind <- apply(B, 1, function(b) apply(A, 1, function(a) all(a <= b)))
ind
#       [,1]  [,2]  [,3]  [,4]  [,5]
#[1,]  TRUE  TRUE  TRUE  TRUE  TRUE
#[2,] FALSE FALSE  TRUE  TRUE  TRUE
#[3,] FALSE FALSE FALSE FALSE FALSE
Marius.3.0(A,B)
#          1     2     3     4     5
#[1,]  TRUE  TRUE  TRUE  TRUE  TRUE
#[2,] FALSE FALSE  TRUE  TRUE  TRUE
#[3,] FALSE FALSE FALSE FALSE FALSE
str(ind)
# logi [1:3, 1:5] TRUE FALSE FALSE TRUE FALSE FALSE ...
str(Marius.3.0(A,B))
# logi [1:3, 1:5] TRUE FALSE FALSE TRUE FALSE FALSE ...
# - attr(*, "dimnames")=List of 2
#  ..$ : NULL
#  ..$ : chr [1:5] "1" "2" "3" "4" ...

A.K.
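The core of Marius.3.0 is the expression colSums(x >= t(A)) == ncol(A), which relies on R's recycling of a vector against a matrix in column-major order. The sketch below isolates that step for a single row of B from the toy example; the variable names b, tA, cmp and hits are introduced here for illustration only.

```r
A <- rbind(matrix(1:4, ncol = 2, byrow = TRUE), c(6, 2))  # rows: (1,2), (3,4), (6,2)
b <- c(3, 8)                                              # third row of B in the toy example

tA <- t(A)       # (2, 3) matrix: column i holds row i of A

# b has length nrow(tA), so recycling lines b[1] up with the first row of tA
# and b[2] with the second row, in every column
cmp <- b >= tA

# A column with ncol(A) TRUE values means that row of A is <= b in every entry
hits <- colSums(cmp) == ncol(A)
```

The alignment works only because length(b) equals nrow(t(A)), that is, because A and B have the same number of columns.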
arun
2012-Dec-08 18:43 UTC
[R] How to efficiently compare each row in a matrix with each row in another matrix?
Hi,

Just to add:

N <- 1000
M <- 5
P <- 5000
set.seed(15)
A <- matrix(runif(N,1,1000),nrow=N,ncol=M)
set.seed(425)
B <- matrix(runif(M,1,1000),nrow=P,ncol=M)

Marius.3.0 <- function(A,B){
    do.call(cbind, lapply(split(B,row(B)), function(x) colSums(x>=t(A))==ncol(A)))
}

Marius.2.0 <- function(A, B){
    nA <- nrow(A)
    nB <- nrow(B)
    C <- do.call(rbind, rep(list(B), nA)) >= matrix(rep(A, each=nB), ncol=ncol(B))
    matrix(rowSums(C) == ncol(A), nA, nB, byrow=TRUE)
}

system.time(z3.0<-Marius.3.0(A,B))
#   user  system elapsed
#  0.524   0.020   0.548
system.time(z2.0<-Marius.2.0(A,B))
#   user  system elapsed
#  0.968   0.216   1.189
system.time(z1<-perhaps(A,B))
#   user  system elapsed
#  1.264   0.204   1.473

attr(z3.0,"dim")<-dim(z2.0)
identical(z3.0,z2.0)
#[1] TRUE
identical(z1,z3.0)
#[1] TRUE

A.K.
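The result of Marius.3.0 carries column names that come from the names of the split() list, so it is not identical() to the unnamed reference matrix until those names are dropped. As an alternative to resetting the "dim" attribute as done above, the following sketch uses unname(); the variable names ref and alt are assumptions made for this illustration.

```r
A <- rbind(matrix(1:4, ncol = 2, byrow = TRUE), c(6, 2))  # (3, 2) matrix
B <- matrix(1:10, ncol = 2)                               # (5, 2) matrix

# Reference result from the question (no dimnames)
ref <- apply(B, 1, function(b) apply(A, 1, function(a) all(a <= b)))

# Same expression as Marius.3.0 in the thread (column names "1", ..., "5")
alt <- do.call(cbind, lapply(split(B, row(B)),
                             function(x) colSums(x >= t(A)) == ncol(A)))

colnames(alt)                 # names inherited from the split() list
alt_clean <- unname(alt)      # drop dimnames before comparing

# The values agree entry by entry once the names are out of the way
stopifnot(all(ref == alt_clean))
```

Comparing with all(ref == alt) also works when only the values matter, since `==` ignores the dimnames here.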
arun
2012-Dec-10 23:17 UTC
[R] How to efficiently compare each row in a matrix with each row in another matrix?
Hi Jonathan,

Thanks for the email. I cross-checked my output against the output generated by the initial solution ("ind").

perhaps <- function(A,B){
    nA <- nrow(A)
    nB <- nrow(B)
    C <- kronecker(matrix(1,nrow=nA,ncol=1),B) >= kronecker(A,matrix(1,nrow=nB,ncol=1))
    matrix(rowSums(C) == ncol(A), nA, nB, byrow=TRUE)
}

# rowMaxs()/rowMins() are not base R; functions with these names exist e.g. in the
# matrixStats package (the thread does not say which package was used)
Marius.5.0.Prev <- function(A,B) outer(rowMaxs(A),rowMins(B),'<') # Jonathan's function
Marius.5.0 <- function(A,B) outer(rowMaxs(A),rowMins(B),'<=') # updated Jonathan's function
Marius.4.0 <- function(A,B){apply(B,1,function(x) colSums(x>=t(A)))==ncol(A)}

A <- rbind(matrix(1:4, ncol=2, byrow=TRUE), c(6, 2)) # (3, 2) matrix
B <- matrix(1:10, ncol=2) # (5, 2) matrix
ind <- apply(B, 1, function(b) apply(A, 1, function(a) all(a <= b))) # original function
ind
#       [,1]  [,2]  [,3]  [,4]  [,5]
#[1,]  TRUE  TRUE  TRUE  TRUE  TRUE
#[2,] FALSE FALSE  TRUE  TRUE  TRUE
#[3,] FALSE FALSE FALSE FALSE FALSE
Marius.4.0(A,B)
#       [,1]  [,2]  [,3]  [,4]  [,5]
#[1,]  TRUE  TRUE  TRUE  TRUE  TRUE
#[2,] FALSE FALSE  TRUE  TRUE  TRUE
#[3,] FALSE FALSE FALSE FALSE FALSE
perhaps(A,B)
#       [,1]  [,2]  [,3]  [,4]  [,5]
#[1,]  TRUE  TRUE  TRUE  TRUE  TRUE
#[2,] FALSE FALSE  TRUE  TRUE  TRUE
#[3,] FALSE FALSE FALSE FALSE FALSE
Marius.5.0(A,B)
#       [,1]  [,2]  [,3]  [,4]  [,5]
#[1,] FALSE  TRUE  TRUE  TRUE  TRUE
#[2,] FALSE FALSE FALSE  TRUE  TRUE
#[3,] FALSE FALSE FALSE FALSE FALSE
Marius.5.0.Prev(A,B)
#       [,1]  [,2]  [,3]  [,4]  [,5]
#[1,] FALSE FALSE  TRUE  TRUE  TRUE
#[2,] FALSE FALSE FALSE FALSE  TRUE
#[3,] FALSE FALSE FALSE FALSE FALSE

A.K.

----- Original Message -----
From: "j2kennel at gmail.com" <j2kennel at gmail.com>
To: smartpink111 at yahoo.com
Cc:
Sent: Monday, December 10, 2012 5:39 PM
Subject: Re: How to efficiently compare each row in a matrix with each row in another matrix?

Hello Arun,

I saw your message. For some reason it doesn't let me post on the help site. It looks like I forgot an equal sign. It wasn't a problem with the random numbers because there was little chance a number would be repeated. It should be:

Marius.5.0 <- function(A,B) outer(rowMaxs(A),rowMins(B),'<=') # Jonathan's code

However, if you manually look at the example you provided, Marius.4.0 doesn't provide the correct answer either. There is one too many TRUE values (location [2,3]). The updated Marius.5.0 gives the correct result.

-Jonathan
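The disagreement above appears to come from two different readings of the question. The code in the first post compares rows entry by entry (column k of A against column k of B), while the max/min version requires every entry of a row of A to be at most every entry of a row of B. The sketch below contrasts the two on the toy data, using base R apply(A, 1, max) and apply(B, 1, min) as stand-ins for rowMaxs() and rowMins(); the variable names elementwise, maxmin and differ are introduced here for illustration.

```r
A <- rbind(matrix(1:4, ncol = 2, byrow = TRUE), c(6, 2))  # rows: (1,2), (3,4), (6,2)
B <- matrix(1:10, ncol = 2)                               # rows: (1,6), (2,7), ..., (5,10)

# Reading 1 (code in the first post): a[k] <= b[k] for every column k
elementwise <- apply(B, 1, function(b) apply(A, 1, function(a) all(a <= b)))

# Reading 2 (max/min version): max(a) <= min(b)
maxmin <- outer(apply(A, 1, max), apply(B, 1, min), "<=")

# Positions where the two readings disagree
differ <- which(elementwise != maxmin, arr.ind = TRUE)

# Example: row 2 of A is (3, 4) and row 3 of B is (3, 8).
# Entry by entry: 3 <= 3 and 4 <= 8 both hold.
# Max/min: max(3, 4) = 4 is not <= min(3, 8) = 3.
c(elementwise[2, 3], maxmin[2, 3])
```

Which reading is intended is for the original poster to decide; the minimal working example in the first post implements the entry-by-entry one, and Marius.4.0 and perhaps agree with it.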