On Mon, Feb 13, 2012 at 10:05:02AM -0800, zheng wei
wrote:> Dear All,
>
> Sorry for the typoes earlier, let me repost the question.
>
> Suppose I want to generate sequences of length 3 from two symbols {1,2}, we
get the following 8 sequences
> 1 1 1
> 1 1 2
> 1 2 1
> 1 2 2
> 2 1 1
> 2 1 2
> 2 2 1
> 2 2 2
>
> However, I do not want all these 8 sequences. I call two sequencs to be
isomorphic if one sequence could be obtained from the other by relabelling the
symbols. For example, 111 is isomorphic to 222, 112 is isomorphic to 221.?By
eliminating all these isomorphic ones, what I want is the following
> 1 1 1
> 1 1 2
> 1 2 1
> 2 1 1
Eliminating isomorphic sequences may be done differently,
if we select different representatives of each equivalence
class. The following also eliminates isomorphic 1,2 sequences
1 1 1
1 1 2
1 2 1
1 2 2
Is this solution OK?
> In general, I need to generate non-isomorphic sequences of length p from t
distinct symbols. For example, when p=3, t=3 we have
> matrix(c(1,2,3,1,1,2,2,1,1,1,2,1,1,1,1),3,5)
>
> [1,]??? 1??? 1??? 2??? 1??? 1
> [2,]??? 2??? 1??? 1??? 2??? 1
> [3,]??? 3??? 2??? 1??? 1??? 1
>
> When p=4, t=4 we have
>
matrix(c(1,2,3,4,1,1,2,3,1,2,1,3,1,2,3,1,2,1,1,3,2,3,1,1,2,1,3,1,1,1,2,2,1,2,1,2,1,2,2,1,1,1,1,2,1,1,2,1,1,2,1,1,2,1,1,1,1,1,1,1),4,15)
>
> [1,]??? 1??? 1??? 1??? 1??? 2??? 2??? 2??? 1??? 1???? 1???? 1???? 1????
1???? 2???? 1
> [2,]??? 2??? 1??? 2??? 2??? 1??? 3??? 1??? 1??? 2???? 2???? 1???? 1????
2???? 1???? 1
> [3,]??? 3??? 2??? 1??? 3??? 1??? 1??? 3??? 2??? 1???? 2???? 1???? 2????
1???? 1???? 1
> [4,]??? 4??? 3??? 3??? 1??? 3??? 1??? 1??? 2??? 2???? 1???? 2???? 1????
1???? 1???? 1
>
>
> In general, I need to do this for arbitrary choices of p and t.
If p and t are not too large, try the following
check.row <- function(x)
{
y <- unique(x)
all(y == seq.int(along=y))
}
p <- 4
tt <- 4 # do not rewrite t() for transpose
elem <- lapply(as.list(pmin(1:p, tt)), function(x) seq.int(length=x))
a <- as.matrix(rev(expand.grid(rev(elem))))
ok <- apply(a, 1, check.row)
out <- a[ok, ]
out
Var4 Var3 Var2 Var1
[1,] 1 1 1 1
[2,] 1 1 1 2
[3,] 1 1 2 1
[4,] 1 1 2 2
[5,] 1 1 2 3
[6,] 1 2 1 1
[7,] 1 2 1 2
[8,] 1 2 1 3
[9,] 1 2 2 1
[10,] 1 2 2 2
[11,] 1 2 2 3
[12,] 1 2 3 1
[13,] 1 2 3 2
[14,] 1 2 3 3
[15,] 1 2 3 4
This solution differs from yours, for example, in
the row c(1, 2, 3, 3), which is in your solution
represented by c(2, 3, 1, 1). This a different choice
of the representatives. Is the choice important?
A related thread started at
https://stat.ethz.ch/pipermail/r-help/2012-January/301489.html
There was an additional requirement that each of t symbols
has at least one occurrence.
Hope this helps.
Petr Savicky.