thr3ads.net - R help - [R] Combining multiple probability weights for the sample() function. [Jun 2015]

Benjamin Ward (ENV)

2015-Jun-02 12:26 UTC

[R] Combining multiple probability weights for the sample() function.

Dear R-List,

I have a set of possibilities I want to sample from:

bases <- list(c('A', 'C'), c('A', 'G'), c('C', 'T'))
possibilities <- as.matrix(expand.grid(bases))

> possibilities
     Var1 Var2 Var3
[1,] "A"  "A"  "C"
[2,] "C"  "A"  "C"
[3,] "A"  "G"  "C"
[4,] "C"  "G"  "C"
[5,] "A"  "A"  "T"
[6,] "C"  "A"  "T"
[7,] "A"  "G"  "T"
[8,] "C"  "G"  "T"

Suppose I randomly sample one of these rows. With uniform sampling, it is
25% likely that my choice will have an identical first and second letter (e.g.
[1,] "A"  "A"  "C"). It is also 25% likely that my choice will have an
identical first and third letter (e.g. [4,] "C"  "G"  "C"). It is not
possible for the second and third letters of my choice to be identical.
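
The baseline figures above can be checked directly. The following is a minimal sketch, assuming uniform sampling over the eight rows; the name match_rates is introduced here for illustration and is not from the original post.

```r
# Rebuild the eight candidate triplets from the post.
bases <- list(c("A", "C"), c("A", "G"), c("C", "T"))
possibilities <- as.matrix(expand.grid(bases, stringsAsFactors = FALSE))

# Proportion of rows in which each pair of positions matches.
# Under uniform sampling these proportions are the match probabilities.
# (match_rates is an illustrative name, not from the original post.)
match_rates <- c(
  "1==2" = mean(possibilities[, 1] == possibilities[, 2]),
  "1==3" = mean(possibilities[, 1] == possibilities[, 3]),
  "2==3" = mean(possibilities[, 2] == possibilities[, 3])
)
print(match_rates)
```

Rows 1 and 5 match on letters 1 and 2, rows 2 and 4 match on letters 1 and 3, and no row matches on letters 2 and 3.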

What I would like to do, is sample one of the rows, but given the constraint
that the probability of drawing identical letters 1 and 2 should be 50% or 0.5,
and at the same time the probability of drawing identical letters 1 and 3 should
be 50%. I am unsure on how to do this, but I know it involves coming up with a
modified set of weights for the sample() function. My progress is below, any
advice is much appreciated.

Best Wishes,

Ben Ward, UEA.


So I have used the following code to come up with a matrix, which contains
weightings according to each criterion:

possibilities <- as.matrix(expand.grid(bases))
# One column per choice; rows flag whether letters 1==2, 1==3, 2==3
identities <- apply(possibilities, 1, function(x) c(x[1] == x[2], x[1] == x[3], x[2] == x[3]))
prob <- matrix(rep(0, length(identities)), ncol = ncol(identities))
# Split 0.5 across the matching choices and 0.5 across the non-matching ones
consProb <- apply(identities, 1, function(x){0.5 / length(which(x))})
polProb <- apply(identities, 1, function(x){0.5 / length(which(!x))})
for(i in 1:nrow(identities)){
  prob[i, which(identities[i,])] <- consProb[i]
  prob[i, which(!identities[i,])] <- polProb[i]
}
rownames(prob) <- c("1==2", "1==3", "2==3")
colnames(prob) <- apply(possibilities, 1, function(x) paste(x, collapse = ", "))

This code gives the following matrix:

        A, A, C    C, A, C    A, G, C    C, G, C    A, A, T    C, A, T    A, G, T    C, G, T
1==2 0.25000000 0.08333333 0.08333333 0.08333333 0.25000000 0.08333333 0.08333333 0.08333333
1==3 0.08333333 0.25000000 0.08333333 0.25000000 0.08333333 0.08333333 0.08333333 0.08333333
2==3 0.06250000 0.06250000 0.06250000 0.06250000 0.06250000 0.06250000 0.06250000 0.06250000

Each column is one of the choices from 'possibilities', and each row
gives a series of weights based on three different criteria:

Row 1: if it is possible among the choices for letter 1 == letter 2, the
combined chance of those choices should be 50%.
Row 2: if it is possible among the choices for letter 1 == letter 3, the
combined chance of those choices should be 50%.
Row 3: if it is possible among the choices for letter 2 == letter 3, the
combined chance of those choices should be 50%.

So:

If I used sample(x = 1:nrow(possibilities), size = 1, prob = prob[1,])
repeatedly, I expect about half the choices to contain identical letters 1 and
2.

If I used sample(x = 1:nrow(possibilities), size = 1, prob = prob[2,])
repeatedly, I expect about half the choices to contain identical letters 1 and
3.

If I used sample(x = 1:nrow(possibilities), size = 1, prob = prob[3,])
repeatedly, I would expect about half the choices to contain identical letters
2 and 3, except that in this case it is not possible.

Note that rows 1 and 2 each sum to 1; row 3 sums to only 0.5, because no
choice has identical letters 2 and 3.
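
The expectation for a single criterion can be checked by simulation. This is a rough sketch for the "1==2" weights only; the seed, the number of draws, and the names match12, w12 and observed are assumptions made here, not part of the original post. Note that sample() treats prob as relative weights, which need not sum to one.

```r
set.seed(42)  # arbitrary seed, used only to make the sketch reproducible

bases <- list(c("A", "C"), c("A", "G"), c("C", "T"))
possibilities <- as.matrix(expand.grid(bases, stringsAsFactors = FALSE))

# Weights for the "1==2" criterion: 0.5 shared by the matching rows,
# 0.5 shared by all the other rows.
match12 <- possibilities[, 1] == possibilities[, 2]
w12 <- ifelse(match12, 0.5 / sum(match12), 0.5 / sum(!match12))

# Draw many row indices with these weights.
idx <- sample(seq_len(nrow(possibilities)), size = 10000,
              replace = TRUE, prob = w12)

# Empirical share of draws in which letters 1 and 2 are identical.
observed <- mean(match12[idx])
print(sum(w12))
print(observed)
```

The observed share should land close to 0.5, which agrees with the first row of the matrix.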

What I would like to do - if it is possible - is combine these three sets of
weights into one set, that when used with
sample(x = 1:nrow(possibilities), size = 1, prob = MAGICPROB) will give me a
list of choices, where ~50% of them contain identical letters 1 and 2, AND ~50%
of them contain identical letters 1 and 3, AND ~50% again contain identical
letters 2 and 3 (except in this example as it is not possible from the choices).

Can multiple probability weightings be combined in such a manner?





Adams, Jean

2015-Jun-02 13:57 UTC

Ben,

Perhaps I am missing something, but couldn't you simply reduce your
possibilities to:

possibilities[c(1, 5, 2, 4), ]
     Var1 Var2 Var3
[1,] "A"  "A"  "C"
[2,] "A"  "A"  "T"
[3,] "C"  "A"  "C"
[4,] "C"  "G"  "C"

If you sample from these four rows you will have a 50% chance that Var1 and
Var2 are equal and a 50% chance that Var1 and Var3 are equal.
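
The reduction can be tried out with a short simulation. This is a sketch only, assuming uniform sampling from the four retained rows; the names reduced, p12 and p13, the seed, and the number of draws are chosen here for illustration.

```r
set.seed(1)  # arbitrary seed for reproducibility

bases <- list(c("A", "C"), c("A", "G"), c("C", "T"))
possibilities <- as.matrix(expand.grid(bases, stringsAsFactors = FALSE))

# Keep only the rows where letter 1 matches letter 2 (rows 1, 5)
# or letter 1 matches letter 3 (rows 2, 4).
reduced <- possibilities[c(1, 5, 2, 4), ]

# Sample row indices uniformly, with replacement.
idx <- sample(seq_len(nrow(reduced)), size = 10000, replace = TRUE)
draws <- reduced[idx, , drop = FALSE]

p12 <- mean(draws[, 1] == draws[, 2])  # share with letters 1 and 2 identical
p13 <- mean(draws[, 1] == draws[, 3])  # share with letters 1 and 3 identical
print(c(p12 = p12, p13 = p13))
```

Every retained row satisfies exactly one of the two conditions, so the two shares add up to one and each should be near 0.5.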

Jean


Jim Lemon

2015-Jun-03 01:25 UTC

Hi Ben,
While Jean's answer looks correct, I think that there is something
amiss with your specification of the problem. You have eight
combinations in your "possibilities". So if you draw samples "x"
where:

If p(x = possibilities[1,] | possibilities[5,]) = 0.5 AND
 p(x = possibilities[2,] | possibilities[4,]) = 0.5

then:

p(x = possibilities[3,] | possibilities[6,] | possibilities[7,] |
possibilities[8,]) = 0

You state that you want the probability of drawing a triplet with
identical second and third bases to be zero, which is ensured by
having no such "possibilities" in the set to be sampled. The two
conditions (identical first and second, identical first and third) are
disjoint, and the sum of all probabilities cannot exceed one, so the
other four "possibilities" must have probabilities of zero.
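
The argument above can be written out as a short derivation. The symbol p_i, meaning the probability of drawing row i of "possibilities", is notation introduced here and does not appear in the thread.

```latex
\begin{align*}
p_1 + p_5 &= \tfrac{1}{2} && \text{(letters 1 and 2 identical)} \\
p_2 + p_4 &= \tfrac{1}{2} && \text{(letters 1 and 3 identical)} \\
\sum_{i=1}^{8} p_i &= 1, \quad p_i \ge 0 \\
\Rightarrow\; p_3 + p_6 + p_7 + p_8 &= 1 - \tfrac{1}{2} - \tfrac{1}{2} = 0
\;\Rightarrow\; p_3 = p_6 = p_7 = p_8 = 0
\end{align*}
```

Because no probability can be negative, a sum of zero forces each of the four remaining terms to be zero.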

Jim


Boris Steipe

2015-Jun-03 18:26 UTC

If letters 1 and 2 must be equal with p=0.5, and 1 and 3 must be equal with
p=0.5, then letter 1 must be the same as either 2 or 3. Therefore:

Choose a letter.
Make a pair of (letter, (not letter)).
Reverse the pair with p = 0.5
Concatenate your letter and the pair.
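
The four steps above might be sketched in R as follows. This is one possible reading, not a definitive implementation: it assumes a four-letter alphabet and takes "(not letter)" to mean any other letter drawn uniformly, so it ignores the per-position choices in "bases". The names alphabet and draw_triplet are hypothetical.

```r
set.seed(1)  # arbitrary seed for reproducibility

# Assumed alphabet; the original post restricts each position further.
alphabet <- c("A", "C", "G", "T")

# Hypothetical helper implementing the four steps.
draw_triplet <- function() {
  first <- sample(alphabet, 1)                   # choose a letter
  other <- sample(setdiff(alphabet, first), 1)   # a different letter
  pair <- c(first, other)                        # (letter, not letter)
  if (runif(1) < 0.5) pair <- rev(pair)          # reverse with p = 0.5
  c(first, pair)                                 # concatenate letter and pair
}

# replicate() returns a 3 x n matrix, so transpose to one triplet per row.
draws <- t(replicate(10000, draw_triplet()))
p12 <- mean(draws[, 1] == draws[, 2])
p13 <- mean(draws[, 1] == draws[, 3])
print(c(p12 = p12, p13 = p13))
```

Each triplet matches on letters 1 and 2 or on letters 1 and 3, never both, so the two shares sum to one and each should be close to 0.5.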


Is that what you need?


B.



Daniel Nordlund

2015-Jun-03 19:28 UTC

Ben,

If I correctly understand your requirements, you can't do what you are
asking.  If you only have the eight possibilities that you list, then to
get letters 1 and 2 to match 50% of the time you must select row 1
with probability=.25 and row 5 with probability=.25.  To have the first
and third letters match 50% of the time you must select rows 2 and 4
each with probability=.25.  Those probabilities sum to 1, so you can
never select any of the other rows.
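
The weights described above can be passed to sample() directly. This is a sketch under the assumption that the 0.5 is split evenly within each pair of rows; the name magic_prob and the seed are chosen here for illustration. Any other split with rows 1 and 5 summing to 0.5 and rows 2 and 4 summing to 0.5 would meet the same two conditions.

```r
set.seed(7)  # arbitrary seed for reproducibility

bases <- list(c("A", "C"), c("A", "G"), c("C", "T"))
possibilities <- as.matrix(expand.grid(bases, stringsAsFactors = FALSE))

# Rows 1 and 5 have letters 1 and 2 identical; rows 2 and 4 have
# letters 1 and 3 identical. All remaining rows get weight zero.
magic_prob <- c(0.25, 0.25, 0, 0.25, 0.25, 0, 0, 0)

# Exact probabilities implied by the weights.
exact12 <- sum(magic_prob[possibilities[, 1] == possibilities[, 2]])
exact13 <- sum(magic_prob[possibilities[, 1] == possibilities[, 3]])

# Simulated draws; rows with zero weight never appear.
idx <- sample(seq_len(nrow(possibilities)), size = 10000,
              replace = TRUE, prob = magic_prob)
print(c(exact12 = exact12, exact13 = exact13))
print(table(idx))
```

Only rows 1, 2, 4 and 5 appear in the table of draws.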

Am I missing something?

Dan

-- 
Daniel Nordlund
Bothell, WA USA
