Hi,

I have a quick question regarding grouping data in R. I have the following matrix:

A = 0 1 1 0
    1 0 0 1
    1 0 0 1
    1 1 0 0
    1 0 0 1
    0 1 1 0

I would like to learn how to group the data on the unique rows of A and also count the number of times each row occurs. The command unique(A) provides a matrix with the unique rows, i.e., B:

B = 0 1 1 0
    1 0 0 1
    1 1 0 0

but I am also interested in learning how to count the number of times each unique row occurs. The result can be in either one or two matrices, e.g.,

B = 0 1 1 0
    1 0 0 1
    1 1 0 0

Count = 1
        3
        2

or,

C = 0 1 1 0 1
    1 0 0 1 3
    1 1 0 0 2

Thanks in advance.

Rob Kissell
Robert.Kissell at Citigroup.com
Here is a way, though probably not the best way:

    table(apply(A, 1, function(x){paste(x, collapse="")}))

Kissell, Robert [EQRE] wrote:
> I would like to learn how I can group the data on unique rows of A and also
> count the number of times the row occurred.

--
Chuck Cleland, Ph.D.
NDRI, Inc.
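The paste/table idea above returns counts labelled by strings such as "0110", sorted alphabetically rather than in the order of unique(A). A minimal sketch of how those counts could be lined up with the unique rows to get the B/Count or C layout asked for follows. The names keys, counts, B_keys, Count and C are illustrative choices, not part of the original reply. Note that for A as posted the counts work out to 2, 3 and 1, so the Count column differs from the illustrative one in the question.

```r
# Matrix from the original question
A <- matrix(c(0, 1, 1, 0,
              1, 0, 0, 1,
              1, 0, 0, 1,
              1, 1, 0, 0,
              1, 0, 0, 1,
              0, 1, 1, 0), ncol = 4, byrow = TRUE)

# One string key per row, e.g. "0110"
keys <- apply(A, 1, paste, collapse = "")

# Number of rows sharing each key; table() sorts the keys alphabetically
counts <- table(keys)

# Unique rows in order of first appearance, and their keys
B <- unique(A)
B_keys <- apply(B, 1, paste, collapse = "")

# Look the counts up by key so they follow the row order of B
Count <- as.vector(counts[B_keys])

# Single-matrix form: unique rows with the count as a last column
C <- cbind(B, Count)
print(C)
```

With collapse = "" the keys are only unambiguous when every entry is a single character, as it is here; for general data a separator such as collapse = "," would be a safer assumption.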
Kissell, Robert [EQRE] <robert.kissell <at> citigroup.com> writes:
> I would like to learn how I can group the data on unique rows of A and also
> count the number of times the row occurred.

You have already provided the answer, unique(A), to the first part of your question. Here are two solutions to the second part:

1. Since each row can be regarded as the representation of a binary number:

       table(A %*% 2^(0:3))

2. Another possibility, not dependent on the binary nature of the data, is to define:

       "%+.*%" <- function(a, b) apply(b, 2, function(x) apply(t(a) == x, 2, all))

   This function is the +.x of APL. It defines an infix function that does a matrix multiply of matrix a and matrix b, except that it replaces the usual inner product of two vectors x and y with all(x == y).

   In terms of this function, the answer is:

       colSums(A %+.*% t(unique(A)))
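To make the binary-number solution above concrete, here is a small worked sketch on the matrix from the question. The weights 2^(0:3) treat column 1 as the least significant bit, so the table is labelled by integer codes rather than by rows. The names w, code, tab and bits, and the decoding step that turns the codes back into rows, are additions for illustration and were not part of the original reply.

```r
# Matrix from the original question
A <- matrix(c(0, 1, 1, 0,
              1, 0, 0, 1,
              1, 0, 0, 1,
              1, 1, 0, 0,
              1, 0, 0, 1,
              0, 1, 1, 0), ncol = 4, byrow = TRUE)

# Weights 1, 2, 4, 8: column 1 is the least significant bit
w <- 2^(0:3)

# One integer code per row: 0110 -> 6, 1001 -> 9, 1100 -> 3
code <- as.vector(A %*% w)

# Counts per code, sorted by code value (3, 6, 9)
tab <- table(code)
print(tab)

# Decode each code back into its 0/1 row (illustrative extra step)
bits <- t(sapply(as.numeric(names(tab)), function(k) (k %/% w) %% 2))
print(cbind(bits, Count = as.vector(tab)))
```

Because the codes are sorted numerically, the rows come out in a different order from unique(A); this approach also assumes every entry of A is 0 or 1.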
In rereading this, the solution works but my comments on the names of the APL operators were off. I think this would make more sense in terms of naming:

    "%all.==%" <- function(a, b) apply(b, 2, function(x) apply(t(a) == x, 2, all))
    colSums(A %all.==% t(unique(A)))

Gabor Grothendieck <ggrothendieck <at> myway.com> writes:
: 2. Another possibility not dependent on the binary nature of the
: data is to define:
:
: "%+.*%" <- function(a,b)apply(b,2,function(x)apply(t(a) == x,2,all))
:
: This function is the +.x of APL. It defines an infix function that does a
: matrix multiply of matrix a and matrix b except it replaces the usual inner
: product of two vectors x and y with all(x==y).
:
: In terms of this function, the answer is:
:
: colSums( A %+.*% t(unique(A)) )
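As a rough illustration of what the renamed operator computes, the sketch below keeps the intermediate logical matrix instead of passing it straight to colSums. Entry [i, j] is TRUE when row i of A equals row j of unique(A), so the column sums are the counts and each row has exactly one TRUE. The names B, M, Count, C and group are illustrative and not from the original post.

```r
# Matrix from the original question
A <- matrix(c(0, 1, 1, 0,
              1, 0, 0, 1,
              1, 0, 0, 1,
              1, 1, 0, 0,
              1, 0, 0, 1,
              0, 1, 1, 0), ncol = 4, byrow = TRUE)

# Operator as defined in the post: compares every row of a
# with every column of b using all(x == y)
"%all.==%" <- function(a, b) apply(b, 2, function(x) apply(t(a) == x, 2, all))

B <- unique(A)

# 6 x 3 logical matrix: M[i, j] is TRUE when A[i, ] equals B[j, ]
M <- A %all.==% t(B)

# Column sums give the number of occurrences of each unique row
Count <- colSums(M)

# Single-matrix form requested in the question
C <- cbind(B, Count)
print(C)

# Index of the matching unique row for each row of A (illustrative extra)
group <- apply(M, 1, which)
print(group)
```

The counts follow the row order of unique(A), so no re-matching by key is needed; the cost is a comparison of every row of A against every unique row.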