Dear helpeRs,

I have two matrices:

mat1 <- expand.grid(0:2, 0:2, 0:2)
mat2 <- aa[c(19, 16, 13, 24, 8), ]

where mat2 is always a subset of mat1.

I need to find the corresponding row numbers in mat1 for each row in mat2.
For this I have the following code:

apply(mat2, 1, function(x) {
    which(apply(mat1, 1, function(y) {
        sum(x == y)
    }) == ncol(mat1))
})

The code is vectorized, but I wonder if there is a simpler (hence faster)
matrix computation that I miss.

Thank you,
Adrian

--
Adrian Dusa
Romanian Social Data Archive
1, Schitu Magureanu Bd
050025 Bucharest sector 5
Romania
Tel./Fax: +40 21 3126618 \
          +40 21 3120210 / int.101
Here is a slightly more compact version of your function which might run
faster (I did not test timings) since it does not use the sum:

apply(mat2, 1, function(x)
    which(apply(mat1, 1, function(y) all(x == y)) == TRUE))

-Christos

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Adrian Dusa
Sent: Saturday, January 20, 2007 5:15 PM
To: r-help at stat.math.ethz.ch
Subject: [R] comparing two matrices

[...]

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
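For readers who want to run the row-by-row comparison, here is a minimal sketch. It assumes that `aa` in the original post was meant to be `mat1` (the poster corrects this later in the thread), and it converts the `expand.grid()` result to a matrix so that the output carries no row names. The name `idx` is introduced here for illustration only.

```r
# Assumption: mat2 is drawn from mat1 (the original post wrote `aa`).
mat1 <- as.matrix(expand.grid(0:2, 0:2, 0:2))
mat2 <- mat1[c(19, 16, 13, 24, 8), ]

# For each row of mat2, find the row of mat1 that agrees in every column.
# all(x == y) replaces sum(x == y) == ncol(mat1) from the original code.
idx <- apply(mat2, 1, function(x) {
  which(apply(mat1, 1, function(y) all(x == y)))
})

print(idx)
```

Because the rows of mat1 are unique and every row of mat2 occurs in mat1, each inner `which()` returns exactly one index, so `apply()` simplifies the result to a plain vector. If a row could match zero or several rows, the result would be a list instead.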
On Sun, 2007-01-21 at 00:14 +0200, Adrian Dusa wrote:
> [...]
> The code is vectorized, but I wonder if there is a simpler (hence faster)
> matrix computation that I miss.

I have not fully tested this, but how about:

mat1 <- matrix(1:20, ncol = 4, byrow = TRUE)
mat2 <- matrix(1:60, ncol = 4, byrow = TRUE)
mat2 <- mat2[sample(15), ]

> mat1
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    5    6    7    8
[3,]    9   10   11   12
[4,]   13   14   15   16
[5,]   17   18   19   20

> mat2
      [,1] [,2] [,3] [,4]
 [1,]   13   14   15   16
 [2,]    5    6    7    8
 [3,]   41   42   43   44
 [4,]   17   18   19   20
 [5,]   21   22   23   24
 [6,]   25   26   27   28
 [7,]   53   54   55   56
 [8,]    9   10   11   12
 [9,]   57   58   59   60
[10,]   33   34   35   36
[11,]   49   50   51   52
[12,]   45   46   47   48
[13,]    1    2    3    4
[14,]   29   30   31   32
[15,]   37   38   39   40

> which(apply(matrix(mat2 %in% mat1, dim(mat2)), 1, all))
[1]  1  2  4  8 13

HTH,

Marc Schwartz
Hello Marc and Dimitris,

There was an error in my first example (therefore not reproducible), so

mat1 <- expand.grid(0:2, 0:2, 0:2)
mat2 <- mat1[c(19, 16, 13, 24, 8), ]

Your solution works if and only if the elements in both matrices are unique.
Unfortunately, it does not apply for my matrices where elements do repeat
(only the rows are unique).

> which(apply(matrix(mat1 %in% mat2, dim(mat1)), 1, all))
integer(0)

> which((mat1 %in% mat2)[1:nrow(mat1)])
integer(0)

Another solution would be using base 3 operations:

mat1 <- expand.grid(0:2, 0:2, 0:2)[, 3:1]
mat2 <- mat1[c(19, 16, 13, 24, 8), ]

mylines <- mat2[, 1]
for (i in 2:ncol(mat2)) {mylines <- 3*mylines + mat2[, i]}
mylines + 1
[1] 19 16 13 24  8

I was still hoping for a direct matrix function to avoid the for() loop.

Thanks,
Adrian

On Sunday 21 January 2007 01:06, Marc Schwartz wrote:
> [...]
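The base 3 loop above computes a weighted sum of the columns, so it can also be written as a single matrix product. The following is only a sketch under the same assumptions as the loop: mat1 is the complete 0:2 grid with its columns reversed, so that row i encodes the number i - 1 in base 3. The names `weights` and `rows` are introduced here for illustration.

```r
# Same setup as the loop version, converted to a numeric matrix so %*% applies.
mat1 <- as.matrix(expand.grid(0:2, 0:2, 0:2)[, 3:1])
mat2 <- mat1[c(19, 16, 13, 24, 8), ]

# Base 3 place values, most significant column first: 9, 3, 1.
weights <- 3^((ncol(mat2) - 1):0)

# One matrix product replaces the for() loop; drop() turns the
# resulting one-column matrix back into a plain vector.
rows <- drop(mat2 %*% weights) + 1

print(rows)
```

This relies on mat1 being the full grid in exactly this order; it does not search mat1 at all, it computes the row number directly. With many columns the place values eventually exceed what a double can represent exactly (beyond roughly 30 columns for base 3), so the approach would need care for very wide matrices.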
I think the following should work in your case:

mat1 <- data.matrix(expand.grid(0:2, 0:2, 0:2))
mat2 <- mat1[c(19, 16, 13, 24, 8), ]
############
ind1 <- apply(mat1, 1, paste, collapse = "/")
ind2 <- apply(mat2, 1, paste, collapse = "/")
match(ind2, ind1)

I hope it helps.

Best,
Dimitris

Quoting Adrian Dusa <dusa.adrian at gmail.com>:
> [...]
On Sunday 21 January 2007 12:04, Dimitris Rizopoulos wrote:
> I think the following should work in your case:
>
> mat1 <- data.matrix(expand.grid(0:2, 0:2, 0:2))
> mat2 <- mat1[c(19, 16, 13, 24, 8), ]
> ############
> ind1 <- apply(mat1, 1, paste, collapse = "/")
> ind2 <- apply(mat2, 1, paste, collapse = "/")
> match(ind2, ind1)

Oh yes, I thought about that too.
It works fast enough for small matrices, but I deal with very large ones.
Using paste() on such matrices decreases the speed dramatically.

Thanks again,
Adrian
But this is using paste() the wrong way round. A better way would be

> join <- function(x) do.call("paste", c(as.data.frame(x), sep = "\r"))
> which(join(mat1) %in% join(mat2))
[1]  8 13 16 19 24

This is essentially the technique used by duplicated.data.frame

Bill Venables

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Adrian Dusa
Sent: Sunday, 21 January 2007 8:17 PM
To: Dimitris Rizopoulos
Cc: marc_schwartz at comcast.net; r-help at stat.math.ethz.ch
Subject: Re: [R] comparing two matrices

[...]
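One detail worth illustrating: `which(... %in% ...)` reports the matching rows in mat1's order, whereas the original question asks for the row number corresponding to each row of mat2, in mat2's order. A minimal sketch of the difference follows, reusing the `join()` helper from the message above and swapping in `match()` (as in the earlier paste-based suggestion) for the second result. The variable names `in_mat1_order` and `in_mat2_order` are illustrative only.

```r
mat1 <- expand.grid(0:2, 0:2, 0:2)
mat2 <- mat1[c(19, 16, 13, 24, 8), ]

# Paste whole columns together in one vectorized call, giving one key per row.
join <- function(x) do.call("paste", c(as.data.frame(x), sep = "\r"))

# Rows of mat1 that occur somewhere in mat2, sorted by position in mat1.
in_mat1_order <- which(join(mat1) %in% join(mat2))

# For each row of mat2, the position of its first match in mat1.
in_mat2_order <- match(join(mat2), join(mat1))

print(in_mat1_order)
print(in_mat2_order)
```

Both forms call `paste()` once per matrix rather than once per row, which is presumably the point of the "wrong way round" remark; which of the two results is wanted depends on whether the order of mat2 matters.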