J?r?me, As a first attempt, how about the function below. It works (or not) by randomly sorting the rows and columns, then searching the table for "squares" with the corners = matrix(c(1,0,0,1),ncol=2) and subtracting them from 1 to give matrix(c(0,1,1,0),ncol=2) (and vice versa). Randomized matrices can be produced as a chain where each permutation is seeded with the previous one. Potential problems: 1. Does it really explore all possible permutations? Does it do it in an unbiased way? 2. Related to above: there is potential autocorrelation (although I haven't found it with my data set), so might need some dememorisation steps. 3. It's slow and dememorisation makes it slower. 4. It isn't clear to me whether it needs the added stochastic element, i.e. squares are only flipped if "runif(1)<0.5". In practice it seems to work without it (as far as I can tell, i.e. it isn't autocorrelated using my data set). I suspect there's a much better way of doing this, which might take the margins as an input, and therefore wouldn't be directly derived from any particular matrix structure. Paul ############################################################### # function to permute binary matrix keeping margins fixed. # the input "x" is the binary matrix to be permuted permute.struct<- function(x) { x<-x[rownames(x)[sample(1:nrow(x))],colnames(x)[sample(1:ncol(x))]] pattern<-c(1,0,0,1) for(i in 1:(nrow(x)-1)) for(j in 1:(ncol(x)-1)) for(ii in (i+1):nrow(x)) for(jj in (j+1):ncol(x)) { query<-c(x[c(i,ii),c(j,jj)]) if((all(query-pattern==0) || all(query==1-pattern)) && runif(1)<0.5) x[c(i,ii),c(j,jj)] <- 1 - x[c(i,ii),c(j,jj)] } x } ############################################################### -------------------------------------------------------- From: Mathieu J?r?me <jerome.mathieu_at_snv.jussieu.fr> Date: Thu 05 Apr 2007 - 13:34:01 GMT Dear R Users, How can I randomize a matrix (with contains only 0 and 1) with the constraint that margin totals (columns and rows sums) are kept constant? Some have called this "swap permutation" or something like that. The principle is to find bloc of 10 01 and change it for 01 10 there can be several rows or columns between these numbers. It probably exists in R, but I can't find the function. I am aware of permcont, but it works only for 2x2 matrices thanks in advance Jerome -- J?r?me Mathieu Laboratory of soil tropical ecology UPMC jerome.mathieu at snv.jussieu.fr
Charles C. Berry
2007-Oct-02 17:05 UTC
[R] permutations of a binary matrix with fixed margins
On Tue, 2 Oct 2007, Paul Johnson wrote:> J?r?me, > > As a first attempt, how about the function below. It works (or not) by > randomly sorting the rows and columns, then searching the table for > "squares" with the corners = matrix(c(1,0,0,1),ncol=2) and subtracting them > from 1 to give matrix(c(0,1,1,0),ncol=2) (and vice versa). Randomized > matrices can be produced as a chain where each permutation is seeded with > the previous one. > > Potential problems: > 1. Does it really explore all possible permutations?Yes. If you randomize. Not sure about 'only if you randomize'.> Does it do it in an unbiased way?No. If you mean are all choices equiprobable. So, you either need to add importance weights or do a rejection step. Zaman and Simberloff (2002, Environmental and Ecological Statistics 9, 405--421) go through this and propose methods for the problem. They give an example of a binary matrix with margins (2, 1, 1) and (1, 2, 1). There are five matrices that satisfy the constraints that the margins are as specified and the cells are either zero or one. Study of that simple setup is revealing of the problems encountered in developing sampling schemes for the general problem. I recall they have a C program for the sampler they propose. Probably the path of least resistance is to get hold of that.> 2. Related to above: there is potential autocorrelation (although I haven't > found it with my data set), so might need some dememorisation steps. > 3. It's slow and dememorisation makes it slower. > 4. It isn't clear to me whether it needs the added stochastic element, i.e. > squares are only flipped if "runif(1)<0.5". In practice it seems to work > without it (as far as I can tell, i.e. it isn't autocorrelated using my data > set). > > I suspect there's a much better way of doing this, which might take the > margins as an input, and therefore wouldn't be directly derived from any > particular matrix structure. >I thought that, too. You can see my _retraction_ here: http://finzi.psych.upenn.edu/R/Rhelp02a/archive/98897.html The difficulty is that under the constraints and the relevant null hypothesis all matrices are equiprobable. It is easy to generate random matrices satisfying all constraints, but not the null. Brute force combinatorics is not attractive either. The number of binary matrices can be very large for problems of practical size. As an example, if you have a 20 by 20 binary table with two margins of rep(10,20), there are many more than 2^100 possible tables. There is a straightforward way of enumerating them, but it is not computationally feasible. HTH, Chuck> Paul > > ############################################################### > > # function to permute binary matrix keeping margins fixed. > # the input "x" is the binary matrix to be permuted > > permute.struct<- > function(x) > { > x<-x[rownames(x)[sample(1:nrow(x))],colnames(x)[sample(1:ncol(x))]] > pattern<-c(1,0,0,1) > for(i in 1:(nrow(x)-1)) > for(j in 1:(ncol(x)-1)) > for(ii in (i+1):nrow(x)) > for(jj in (j+1):ncol(x)) > { > query<-c(x[c(i,ii),c(j,jj)]) > if((all(query-pattern==0) || all(query==1-pattern)) && > runif(1)<0.5) > x[c(i,ii),c(j,jj)] <- 1 - x[c(i,ii),c(j,jj)] > } > x > } > > ############################################################### > > -------------------------------------------------------- > From: Mathieu J?r?me <jerome.mathieu_at_snv.jussieu.fr> > Date: Thu 05 Apr 2007 - 13:34:01 GMT > > > Dear R Users, > How can I randomize a matrix (with contains only 0 and 1) with the > constraint that margin totals (columns and rows sums) are kept constant? > Some have called this "swap permutation" or something like that. > > The principle is to find bloc of > 10 > 01 > and change it for > 01 > 10 > there can be several rows or columns between these numbers. It probably > exists in R, but I can't find the function. I am aware of permcont, but it > works only for 2x2 matrices > > thanks in advance > Jerome > > -- > J?r?me Mathieu > Laboratory of soil tropical ecology > UPMC > jerome.mathieu at snv.jussieu.fr > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >Charles C. Berry (858) 534-2098 Dept of Family/Preventive Medicine E mailto:cberry at tajo.ucsd.edu UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901