Ok, this was misleading. And was not that important. My result matrix should look like this:

    1   2   3   4   5   6   7 ...
1   p1  p2
2       p
3
4

p1 etc are the frequencies of the combinations.

1 and 2 for instance do not appear in my example. So the values would be zero. Actually, this part is not too important. I would be happy enough to solve the challenge with the frequencies of the pairs.

Thanks
Hermann

2015-10-07 2:40 GMT+02:00 Boris Steipe <boris.steipe at utoronto.ca>:

> Since order is not important to you, you can order your pairs (e.g.
> decreasing) before compiling the frequencies.
> But I don't understand the second part about values "that do not appear in
> the matrix". Do you mean you want to assess all combinations? If that's the
> case I would think about a hash table or other indexed data structure,
> rather than iterating through a matrix.
>
> B.
>
> On Oct 6, 2015, at 4:59 PM, Hermann Norpois <hnorpois at gmail.com> wrote:
>
> > Hello,
> >
> > I have a matrix mat (see dput(mat))
> >
> > > mat
> >      [,1] [,2]
> > [1,]    5    6
> > [2,]    6    5
> > [3,]    5    4
> > [4,]    5    5
> > ....
> >
> > I want the frequencies of the pairs in a new matrix, whereas the
> > combination 5 and 6 is the same as 6 and 5 (see the first two rows of mat).
> > In other words: What is the probability of each combination (each row)
> > ignoring the order in the combination. As a result I would like to have a
> > matrix that includes rows and cols 0, 1, 2 ... max (mat) that do not appear
> > in my matrix.
> >
> > dput (mat)
> > structure(c(5, 6, 5, 5, 4, 3, 6, 7, 4, 7, 5, 5, 5, 5, 6, 5, 5,
> > 4, 3, 6, 7, 4, 7, 5, 5, 5, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7,
> > 6, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7, 6), .Dim = c(26L, 2L))
> >
> > Thanks
> > Hermann
> >
> > ______________________________________________
> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
Still not sure I understand. But here is what I think you might mean:

# Your data
mat <- structure(c(5, 6, 5, 5, 4, 3, 6, 7, 4, 7, 5, 5, 5, 5, 6, 5, 5,
                   4, 3, 6, 7, 4, 7, 5, 5, 5, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7,
                   6, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7, 6), .Dim = c(26L, 2L))

# Create a square matrix with enough space to have an element for each pair. Since
# order is not important, only the upper triangle is used. If the matrix is
# large and sparse, a different approach might be needed.
freq <- matrix(numeric(max(mat) * max(mat)), nrow = max(mat), ncol = max(mat))

# Loop over your input
for (i in 1:nrow(mat)) {
    # Sort the elements of a row by size.
    x <- sort(mat[i,])
    # Increment the corresponding element of the frequency matrix
    freq[x[1], x[2]] <- freq[x[1], x[2]] + 1
}

freq

Cheers,
B.

On Oct 7, 2015, at 1:17 AM, Hermann Norpois <hnorpois at gmail.com> wrote:

> Ok, this was misleading. And was not that important. My result matrix should look like this:
> [...]
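For comparison, the same upper-triangle count as in Boris's loop can be obtained without an explicit loop. This is only a sketch and not something posted in the thread: it assumes every entry of mat is a positive integer, and the names lo, hi and lev are made up for illustration.

```r
mat <- structure(c(5, 6, 5, 5, 4, 3, 6, 7, 4, 7, 5, 5, 5, 5, 6, 5, 5,
                   4, 3, 6, 7, 4, 7, 5, 5, 5, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7,
                   6, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7, 6), .Dim = c(26L, 2L))

# Order each pair without a loop: smaller value first, larger value second
lo <- pmin(mat[, 1], mat[, 2])
hi <- pmax(mat[, 1], mat[, 2])

# Fix the factor levels so that values which never occur (here 1 and 2)
# still get a row and a column of zeros
lev <- seq_len(max(mat))
freq <- table(factor(lo, levels = lev), factor(hi, levels = lev))

freq
```

Because the smaller value always indexes the row, all counts land on the diagonal or in the upper triangle, as in the loop version.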
As with Boris, I'm not sure what you are looking for, but this may help

> # To get all possibilities, create a grid
> grd <- expand.grid(0:9, 0:9)
> # Extract those with smaller first column values
> grd <- grd[grd$Var1 <= grd$Var2,]
> # Tabulate after pasting first and second column
> grd2 <- data.frame(table(apply(grd, 1, paste0, collapse=" - ")))
>
> # Combine the two tables and subtract 1 to get rid of the counts from grd2$Freq
> dta2 <- rbind(grd2, dta)
> freqs <- data.frame(xtabs(Freq~Var1, dta2) - 1)
> str(freqs)
'data.frame':   55 obs. of  2 variables:
 $ Var1: Factor w/ 55 levels "0 - 0","0 - 1",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ Freq: num  0 0 0 0 0 0 0 0 0 0 ...
> freqs[c(40:50), ]
    Var1 Freq
40 4 - 9    0
41 5 - 5    2
42 5 - 6   10
43 5 - 7    4
44 5 - 8    0
45 5 - 9    0
46 6 - 6    0
47 6 - 7    2
48 6 - 8    0
49 6 - 9    0
50 7 - 7    0

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352

-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Boris Steipe
Sent: Wednesday, October 7, 2015 8:10 AM
To: Hermann Norpois
Cc: r-help
Subject: Re: [R] Measure the frequencies of pairs in a matrix

Still not sure I understand. But here is what I think you might mean:
[...]
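David's code uses an object dta that is not defined anywhere in the message as archived. The sketch below fills that gap with a guess so the steps can be run end to end: dta is assumed to be the table of observed pairs, each row sorted and pasted as "a - b". That definition is an assumption, not David's original line, although it does reproduce the counts he shows (5 - 6 ten times, 5 - 7 four times, and so on).

```r
mat <- structure(c(5, 6, 5, 5, 4, 3, 6, 7, 4, 7, 5, 5, 5, 5, 6, 5, 5,
                   4, 3, 6, 7, 4, 7, 5, 5, 5, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7,
                   6, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7, 6), .Dim = c(26L, 2L))

# ASSUMPTION: dta is not given in the archived message. Here it is the table
# of observed pairs, with each row sorted so that "6 - 5" becomes "5 - 6".
dta <- data.frame(table(apply(mat, 1, function(x) paste0(sort(x), collapse = " - "))))

# All possible unordered pairs of 0..9, each counted exactly once
grd <- expand.grid(0:9, 0:9)
grd <- grd[grd$Var1 <= grd$Var2, ]
grd2 <- data.frame(table(apply(grd, 1, paste0, collapse = " - ")))

# Stack both tables, sum per pair, and subtract the 1 contributed by grd2
dta2 <- rbind(grd2, dta)
freqs <- data.frame(xtabs(Freq ~ Var1, dta2) - 1)

# Show only the pairs that actually occur
freqs[freqs$Freq > 0, ]
```

The grid is what guarantees that pairs with zero occurrences still appear in freqs; without it, only the observed pairs would be listed.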
You could also call table() on the columns of the input matrix, first converting them to factors with levels 1:max. Then add together the upper and lower triangles of the table if order is not important. E.g.,

f2 <- function (mat)
{
    maxMat <- max(mat)
    stopifnot(is.matrix(mat), all(mat %in% seq_len(maxMat)))
    L <- split(factor(mat, levels = seq_len(maxMat)), col(mat))
    Table <- do.call(table, unname(L))
    ignoreOrder <- function(M) {
        stopifnot(length(dim(M)) == 2)
        lower <- lower.tri(M, diag = FALSE)
        upper <- upper.tri(M, diag = FALSE)
        M[lower] <- M[lower] + t(M)[lower]
        M[upper] <- t(M)[upper]
        M
    }
    ignoreOrder(Table)
}

> mat <- structure(c(5, 6, 5, 5, 4, 3, 6, 7, 4, 7, 5, 5, 5, 5, 6, 5, 5,
+ 4, 3, 6, 7, 4, 7, 5, 5, 5, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7,
+ 6, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7, 6), .Dim = c(26L, 2L))
> f2(mat)

     1  2  3  4  5  6  7
  1  0  0  0  0  0  0  0
  2  0  0  0  0  0  0  0
  3  0  0  0  2  0  0  2
  4  0  0  2  0  4  0  0
  5  0  0  0  4  2 10  4
  6  0  0  0  0 10  0  2
  7  0  0  2  0  4  2  0

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Wed, Oct 7, 2015 at 6:09 AM, Boris Steipe <boris.steipe at utoronto.ca> wrote:

> Still not sure I understand. But here is what I think you might mean:
> [...]
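The "add together the upper and lower triangles" step in Bill's f2 can also be written as a single matrix expression. This is a sketch rather than Bill's code: it assumes a two-column matrix of positive integers, and the names lev, tab, m and sym are illustrative only.

```r
mat <- structure(c(5, 6, 5, 5, 4, 3, 6, 7, 4, 7, 5, 5, 5, 5, 6, 5, 5,
                   4, 3, 6, 7, 4, 7, 5, 5, 5, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7,
                   6, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7, 6), .Dim = c(26L, 2L))

# Count ordered pairs; fixed levels keep rows and columns for unused values
lev <- seq_len(max(mat))
tab <- table(factor(mat[, 1], levels = lev), factor(mat[, 2], levels = lev))

# Copy the counts into a plain numeric matrix before doing arithmetic on them
m <- matrix(as.vector(tab), nrow = length(lev), dimnames = list(lev, lev))

# m + t(m) adds each cell to its mirror image. That counts the diagonal
# twice, so the diagonal is subtracted once.
sym <- m + t(m) - diag(diag(m), nrow = nrow(m))

sym
```

The result is symmetric, so sym["5", "6"] and sym["6", "5"] both hold the count for the unordered pair 5 and 6.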
More like this?

> mat <- structure(c(5, 6, 5, 5, 4, 3, 6, 7, 4, 7, 5, 5, 5, 5, 6, 5, 5,
+ 4, 3, 6, 7, 4, 7, 5, 5, 5, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7,
+ 6, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7, 6), .Dim = c(26L, 2L))
>
> # Convert columns in mat so first column is always smaller
> mat2 <- data.frame(t(apply(mat, 1, range)))
> mat2$X1 <- factor(mat2$X1, 1:9)
> mat2$X2 <- factor(mat2$X2, 1:9)
> tbl <- xtabs(~X1+X2, mat2)
> tbl.p <- tbl/sum(tbl)
> round(tbl.p, 2)
   X2
X1     1    2    3    4    5    6    7    8    9
  1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
  2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
  3 0.00 0.00 0.00 0.08 0.00 0.00 0.08 0.00 0.00
  4 0.00 0.00 0.00 0.00 0.15 0.00 0.00 0.00 0.00
  5 0.00 0.00 0.00 0.00 0.08 0.38 0.15 0.00 0.00
  6 0.00 0.00 0.00 0.00 0.00 0.00 0.08 0.00 0.00
  7 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
  8 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
  9 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

This puts everything on the diagonal and upper triangle. To get the lower triangle just use

> tbl <- xtabs(~X2+X1, mat2)

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352

-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Hermann Norpois
Sent: Wednesday, October 7, 2015 12:17 AM
To: Boris Steipe; r-help
Subject: Re: [R] Measure the frequencies of pairs in a matrix

Ok, this was misleading. And was not that important. My result matrix should look like this:
[...]
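If only the pairs that actually occur are of interest, the table of proportions from David's xtabs approach can be flattened into a data frame with one row per pair. A sketch under the same assumptions as David's code (values between 1 and 9); the name long and the column name Prop are made up here.

```r
mat <- structure(c(5, 6, 5, 5, 4, 3, 6, 7, 4, 7, 5, 5, 5, 5, 6, 5, 5,
                   4, 3, 6, 7, 4, 7, 5, 5, 5, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7,
                   6, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7, 6), .Dim = c(26L, 2L))

# As in David's message: range() returns c(min, max), so X1 <= X2 in every row
mat2 <- data.frame(t(apply(mat, 1, range)))
mat2$X1 <- factor(mat2$X1, 1:9)
mat2$X2 <- factor(mat2$X2, 1:9)
tbl <- xtabs(~X1 + X2, mat2)

# One row per (X1, X2) cell, holding the proportion instead of the count
long <- as.data.frame(tbl / sum(tbl), responseName = "Prop")

# Keep only the pairs that occur at least once
long[long$Prop > 0, ]
```

For this data the filtered result has seven rows, and the proportions sum to 1.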