R. Mark Sharp
2021-Jun-02 01:59 UTC
[R] wanting to count instances of values in each cell of a series of simulated symmetric matrices of the same size
I want to capture the entire distribution of values for each cell in a sequence of symmetric matrices of the same size. The diagonal values are all 0.5 so I need only the values above or below the diagonal.

A small example with three of the structures I am wanting to count follows:

       F      G      H      I     J
F 0.6250 0.3750 0.2500 0.1875 0.125
G 0.3750 0.6250 0.2500 0.1875 0.125
H 0.2500 0.2500 0.5000 0.1875 0.125
I 0.1875 0.1875 0.1875 0.5000 0.250
J 0.1250 0.1250 0.1250 0.2500 0.500

       F      G      H      I     J
F 0.5625 0.3125 0.1875 0.1250 0.125
G 0.3125 0.5625 0.1875 0.1250 0.125
H 0.1875 0.1875 0.5000 0.1875 0.125
I 0.1250 0.1250 0.1875 0.5000 0.250
J 0.1250 0.1250 0.1250 0.2500 0.500

        F       G      H       I      J
F 0.50000 0.25000 0.1250 0.09375 0.0625
G 0.25000 0.50000 0.1250 0.09375 0.0625
H 0.12500 0.12500 0.5000 0.18750 0.1250
I 0.09375 0.09375 0.1875 0.50000 0.2500
J 0.06250 0.06250 0.1250 0.25000 0.5000

To be more specific, I have coded up a solution for a single cell with the sequence of values (one from each matrix) in a vector.

I used match() below and it works with a matrix but I do not know how to do what is in the if statements with matrices. Since the number of values and the values will be different among the various cells a simple array structure does not seem appropriate and I am assuming I will need to use a list but I would like to do as much as I can with matrices for speed and clarity.

#' Counts the number of occurrences of each kinship value seen for a pair of
#' individuals.
#'
#' @examples
#' \donttest{
#' set.seed(20210529)
#' kSamples <- sample(c(0, 0.0675, 0.125, 0.25, 0.5, 0.75), 10000, replace = TRUE,
#'                    prob = c(0.005, 0.3, 0.15, 0.075, 0.0375, 0.01875))
#' kVC <- list(kinshipValues = numeric(0),
#'             kinshipCounts = numeric(0))
#' for (kSample in kSamples) {
#'   kVC <- countKinshipValues(kSample, kVC$kinshipValues, kVC$kinshipCounts)
#' }
#' kVC
#' ## $kinshipValues
#' ## [1] 0.2500 0.1250 0.0675 0.7500 0.5000 0.0000
#' ##
#' ## $kinshipCounts
#' ## [1] 301 2592 5096 1322 592 97
#' }
#'
#' @param kValue numeric value being counted (kinship value in
#' \emph{nprcgenekeepr})
#' @param kinshipValues vector of unique values of \code{kValue} seen
#' thus far.
#' @param kinshipCounts vector of the counts of the unique values of
#' \code{kValue} seen thus far.
#' @export
countKinshipValues <- function(kValue, kinshipValues = numeric(0),
                               kinshipCounts = numeric(0)) {
  kinshipValue <- match(kValue, kinshipValues, nomatch = -1L)
  if (kinshipValue == -1L) {
    kinshipValues <- c(kinshipValues, kValue)
    kinshipCounts[length(kinshipCounts) + 1] <- 1
  } else {
    kinshipCounts[kinshipValue] <- kinshipCounts[kinshipValue] + 1
  }
  list(kinshipValues = kinshipValues,
       kinshipCounts = kinshipCounts)
}

Mark

R. Mark Sharp, Ph.D.
Data Scientist and Biomedical Statistical Consultant
7526 Meadow Green St.
San Antonio, TX 78251
mobile: 210-218-2868
rmsharp at me.com
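For the single-cell case, one possible shortcut is to tally the whole vector at once with table() rather than growing the value and count vectors one sample at a time. The following is only a sketch under assumptions: it reuses the sample() call from the roxygen example as stand-in data, and it returns the values in sorted order rather than in order of first appearance, so the layout differs from what countKinshipValues() produces. It also assumes the values survive the round trip through the character names that table() assigns.

```r
# Stand-in data: the same simulated kinship values as in the roxygen example.
set.seed(20210529)
kSamples <- sample(c(0, 0.0675, 0.125, 0.25, 0.5, 0.75), 10000, replace = TRUE,
                   prob = c(0.005, 0.3, 0.15, 0.075, 0.0375, 0.01875))

# table() counts every distinct value in one pass. The names of the result
# are the distinct values as character strings, sorted by value.
tab <- table(kSamples)

# Repackage into the same list layout used by countKinshipValues().
kVC <- list(kinshipValues = as.numeric(names(tab)),
            kinshipCounts = as.vector(tab))
kVC
```

Because the counts come back as a plain list of two vectors, the result can be stored per cell in a list, which matches the concern that different cells will hold different numbers of distinct values.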
Bert Gunter
2021-Jun-02 02:44 UTC
[R] wanting to count instances of values in each cell of a series of simulated symmetric matrices of the same size
Come again?! The diagonal values in your example are not all .5.

If space is not an issue, a straightforward approach is to collect all the matrices into a 3d array and use indexing. Here is a simple reprex (as you did not provide one in a convenient form, e.g., via dput()):

x <- matrix(1:9, nrow = 3); y <- x + 10
diag(x) <- diag(y) <- 0
print(x); print(y)

## Now you need to populate a 3 x 3 x 2 array with these matrices
## How you do this depends on your naming conventions
## You might use a loop, or ls() and assign(),
## or collect your matrices into a list and use do.call() or ...
## You will *not* want to do this if you have lots of matrices:
list_of_mats <- list(x, y)
arr <- array(do.call(c, list_of_mats), dim = c(3, 3, length(list_of_mats)))
arr
arr[2, 3, ] ## all the values in the [2,3] cell of the matrices; do whatever you want with them.

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it." -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Tue, Jun 1, 2021 at 7:00 PM R. Mark Sharp via R-help <r-help at r-project.org> wrote:
> I want to capture the entire distribution of values for each cell in a
> sequence of symmetric matrices of the same size. The diagonal values are
> all 0.5 so I need only the values above or below the diagonal.
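As a follow-up to the 3d array idea, here is one way the per-cell tallies might be collected for every cell above the diagonal. It is a sketch, not a tested solution for the real data: the three small matrices, the helper makeMat(), and the "F-G" style names are made up for illustration, and the result is kept in a list because cells can have different numbers of distinct values.

```r
# Hypothetical stand-ins for the simulated kinship matrices: three small
# symmetric matrices with dimnames, built only for illustration.
ids <- c("F", "G", "H")
makeMat <- function(offDiag) {
  m <- matrix(0.5, 3, 3, dimnames = list(ids, ids))
  m[upper.tri(m)] <- offDiag
  m[lower.tri(m)] <- t(m)[lower.tri(m)]  # mirror to keep the matrix symmetric
  m
}
mats <- list(makeMat(c(0.25, 0.125, 0.125)),
             makeMat(c(0.25, 0.0625, 0.125)),
             makeMat(c(0.375, 0.125, 0.125)))

# Stack the matrices into a 3 x 3 x 3 array, keeping the row/column names.
arr <- array(unlist(mats),
             dim = c(dim(mats[[1]]), length(mats)),
             dimnames = c(dimnames(mats[[1]]), list(NULL)))

# Row and column indices of the cells above the diagonal.
cells <- which(upper.tri(arr[, , 1]), arr.ind = TRUE)

# One table of value counts per cell, taken across the third dimension.
counts <- lapply(seq_len(nrow(cells)), function(k) {
  table(arr[cells[k, "row"], cells[k, "col"], ])
})
names(counts) <- paste(ids[cells[, "row"]], ids[cells[, "col"]], sep = "-")

counts[["F-G"]]  # distribution of values seen for the F-G pair
```

Looping over only the upper-triangle indices avoids tabulating each symmetric pair twice and skips the diagonal entirely.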