Dear R users, I would like to sample from a mixture distribution p1*f1+p2*f2+p3*f3 with f1,f2,f3 three different forms of distributions. I know that in the case of two distributions I have to sample the mixture compoment membership. How can I weight my three distributions with their respective probabilities? Evgenia -- View this message in context: http://www.nabble.com/Mixture-of--Distributions-tp15618948p15618948.html Sent from the R help mailing list archive at Nabble.com.
On 21-Feb-08 20:58:25, Evgenia wrote:> > Dear R users, > I would like to sample from a mixture distribution > p1*f1+p2*f2+p3*f3 with f1,f2,f3 three different forms > of distributions. I know that in the case of two > distributions I have to sample the mixture compoment > membership. > > How can I weight my three distributions with their > respective probabilities? > > EvgeniaThere are several ways in which you could write code to do this, but all amount to the fact that each value sampled from such a mixture is obtained by a) choose between f1, f2, f3 at random according to the probabilities p1, p2, p3 (it is assumed p1+p2+p3=1). b) sample 1 value from whichever of f1, f2, f3 was chosen. Suggestion: Suppose the functions rf1(n), rf2(n), rf3(n) respectively sample n values from f1, f2 and f3. S0 <- cbind(rf1(n),rf2(n),rf3(n)) ix <- sample(c(1,2,3),n,3,prob=c(p1,p2,p3),replace=TRUE) S <- S0[,ix] will produce n values, each of which is a sample of size 1 from the mixture. Example: rf1 <- function(n){rnorm(n,0,1)} ## normal distribution with mean=0 rf2 <- function(n){rnorm(n,4,1)} ## normal distribution with mean=4 rf3 <- function(n){rnorm(n,9,1)} ## normal distribution with mean=9 p1 <- 0.2; p2 <- 0.3; p3 <-0.5 S0 <- cbind(rf1(500),rf2(500),rf3(500)) ix <- sample(c(1,2,3),500,prob=c(p1,p2,p3),replace=TRUE) S <- S0[,ix] hist(S,breaks=50) Hoping this helps. Ted. -------------------------------------------------------------------- E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk> Fax-to-email: +44 (0)870 094 0861 Date: 21-Feb-08 Time: 21:39:53 ------------------------------ XFMail ------------------------------
I just realised I made a bad mistaqke (see below) On 21-Feb-08 21:39:56, Ted Harding wrote:> On 21-Feb-08 20:58:25, Evgenia wrote: >> >> Dear R users, >> I would like to sample from a mixture distribution >> p1*f1+p2*f2+p3*f3 with f1,f2,f3 three different forms >> of distributions. I know that in the case of two >> distributions I have to sample the mixture compoment >> membership. >> >> How can I weight my three distributions with their >> respective probabilities? >> >> Evgenia > > There are several ways in which you could write code > to do this, but all amount to the fact that each value > sampled from such a mixture is obtained by > a) choose between f1, f2, f3 at random according to the > probabilities p1, p2, p3 (it is assumed p1+p2+p3=1). > b) sample 1 value from whichever of f1, f2, f3 was chosen. > > Suggestion: > > Suppose the functions rf1(n), rf2(n), rf3(n) respectively > sample n values from f1, f2 and f3.S0 <- cbind(rf1(n),rf2(n),rf3(n)) ix <- sample(c(1,2,3),n,3,prob=c(p1,p2,p3),replace=TRUE) So far so good! S <- S0[,ix] But this will not do what I intended (i.e. select element ix[1] from the first row os S0, ix[2] from the second row, and so on). Instead, it will return an nxn matrix with n rows, and n columns which are copies of columns of S0 in the order selected by ix! The following will do the correct thing, though there must be a neater way of doing it! The example below has been corrected. S <- S0[,1]*(ix==1) + S0[,2]*(ix==2) + S0[,3]*(ix==3)> will produce n values, each of which is a sample of size 1 > from the mixture. > > Example:rf1 <- function(n){rnorm(n,0,1)} ## normal distribution with mean=0 rf2 <- function(n){rnorm(n,4,1)} ## normal distribution with mean=4 rf3 <- function(n){rnorm(n,9,1)} ## normal distribution with mean=9 p1 <- 0.2; p2 <- 0.3; p3 <-0.5 S0 <- cbind(rf1(500),rf2(500),rf3(500)) ix <- sample(c(1,2,3),500,prob=c(p1,p2,p3),replace=TRUE) S <- S0[,1]*(ix==1) + S0[,2]*(ix==2) + S0[,3]*(ix==3) hist(S,breaks=50)> > Hoping this helps. > Ted.And again! Ted. -------------------------------------------------------------------- E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk> Fax-to-email: +44 (0)870 094 0861 Date: 21-Feb-08 Time: 22:20:04 ------------------------------ XFMail ------------------------------
(Ted Harding) <Ted.Harding at manchester.ac.uk> wrote in news:XFMail.080221222007.Ted.Harding at manchester.ac.uk:> I just realised I made a bad mistaqke (see below) > > On 21-Feb-08 21:39:56, Ted Harding wrote: >> On 21-Feb-08 20:58:25, Evgenia wrote: >>> >>> Dear R users, >>> I would like to sample from a mixture distribution >>> p1*f1+p2*f2+p3*f3 with f1,f2,f3 three different forms >>> of distributions. I know that in the case of two >>> distributions I have to sample the mixture compoment >>> membership. >>> >>> How can I weight my three distributions with their >>> respective probabilities? >>> >>> Evgenia >> >> There are several ways in which you could write code >> to do this, but all amount to the fact that each value >> sampled from such a mixture is obtained by >> a) choose between f1, f2, f3 at random according to the >> probabilities p1, p2, p3 (it is assumed p1+p2+p3=1). >> b) sample 1 value from whichever of f1, f2, f3 was chosen. >> >> Suggestion: >> >> Suppose the functions rf1(n), rf2(n), rf3(n) respectively >> sample n values from f1, f2 and f3. > > S0 <- cbind(rf1(n),rf2(n),rf3(n)) > ix <- sample(c(1,2,3),n,3,prob=c(p1,p2,p3),replace=TRUE) > > So far so good! > > S <- S0[,ix] > > But this will not do what I intended (i.e. select element ix[1] > from the first row os S0, ix[2] from the second row, and so on). > > Instead, it will return an nxn matrix with n rows, and n columns > which are copies of columns of S0 in the order selected by ix! > > The following will do the correct thing, though there must > be a neater way of doing it! The example below has been > corrected. > > S <- S0[,1]*(ix==1) + S0[,2]*(ix==2) + S0[,3]*(ix==3)I discovered a shorter method (and I even think I might understand why it works.) Try: S0[col(S0)==ix] R loops through the S0 matrix and finds only those entries where the column numbers match the ix[] values. I would have thought that the first row and ix[1]th column entry should be first but the machinery seems to be working differently. The user should realize that all the column==1 hits will occur first #-----toy code------- rf1 <- function(n){rnorm(n,0,1)} ## normal distribution with mean=0 rf2 <- function(n){rnorm(n,4,1)} ## normal distribution with mean=4 rf3 <- function(n){rnorm(n,9,1)} ## normal distribution with mean=9 S0 <- cbind(rf1(10),rf2(10),rf3(10)) p1 <- 0.2; p2 <- 0.3; p3 <-0.5 set.seed<-37 ix <- sample(c(1,2,3),10,prob=c(p1,p2,p3),replace=TRUE) #> ix # [1] 2 3 1 3 1 2 1 2 2 1 S0 #---------- S0 [,1] [,2] [,3] [1,] 0.5143426 3.823013 9.666210 [2,] 0.7044110 6.095287 10.276792 [3,] -0.3597926 5.848588 8.801378 [4,] 0.1346979 5.518667 8.357181 [5,] -0.6856507 2.882090 9.893790 [6,] 0.5882977 4.864201 8.708929 [7,] -0.4241657 2.891672 8.056254 [8,] 1.6243569 2.981563 11.684422 [9,] -0.3072410 4.379916 10.679470 [10,] -0.6289604 3.119195 7.216593 #-------------------------- S0[col(S0)==ix] # [1] 0.1346979 -0.3072410 3.8230128 5.8485880 4.8642010 2.9815631 # [7] 10.2767916 9.8937904 8.0562542 7.2165927 R seems to be looping through the columns "looking" for the TRUE values in the col(S0)==ix matrix. col(S0)==ix [,1] [,2] [,3] [1,] FALSE TRUE FALSE [2,] FALSE FALSE TRUE [3,] FALSE TRUE FALSE [4,] TRUE FALSE FALSE [5,] FALSE FALSE TRUE [6,] FALSE TRUE FALSE [7,] FALSE FALSE TRUE [8,] FALSE TRUE FALSE [9,] TRUE FALSE FALSE [10,] FALSE FALSE TRUE -- Regards; David Winsemius