Hi,

I have a mixture pdf with three components, each of which follows a 6-dimensional normal distribution.

I used mvrnorm() from the MASS library to generate 1000 samples for each component, and I added them to get random samples that should follow the mixture distribution.

I then used Mclust() from the mclust library to fit a model to the samples, and strange things happened. First it gave a warning:

    > samplesMclust <- Mclust( samples )
    Warning messages:
    1: In summary.mclustBIC(Bic, data, G = G, modelNames = modelNames) :
      best model occurs at the min or max # of components considered
    2: In Mclust(samples) : optimal number of clusters occurs at min choice

Then I typed

    > samplesMclust

    best model: XXI with 1 components

It says the best model has 1 component! I am confused... Is it because the way I generate the samples is wrong?

Thanks so much!

--------------------------
Peng Jiang
Ph.D. Candidate
Antai College of Economics & Management
Department of Mathematics
Shanghai Jiaotong University (Minhang Campus)
800 Dongchuan Road
200240 Shanghai
P. R. China
You should not add the three six-dimensional variables! By adding them you get a single multivariate normal variable, not a mixture.

To get a mixture with probabilities p1 for the first component, p2 for the second and p3 for the third (p1+p2+p3=1), simulate a uniform [0,1] variable X and return a draw from the first component if X < p1, from the second if p1 <= X < p1+p2, and from the third if X >= p1+p2.

--- On Mon, 16/6/08, Peng Jiang <jp021 at sjtu.edu.cn> wrote:

> From: Peng Jiang <jp021 at sjtu.edu.cn>
> Subject: [R] simulating Gaussian Mixture Method
> To: R-help at r-project.org
> Received: Monday, 16 June, 2008, 3:48 PM
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained,
> reproducible code.
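The uniform-draw rule described in the reply above can be sketched in R roughly as follows. The mixing probabilities, means and covariance are made-up values for illustration only (the original post does not give them), and the variable names `p`, `mus`, `Sigma` and `label` are likewise hypothetical.

```r
library(MASS)  # provides mvrnorm()

set.seed(1)                       # assumption: fixed seed, only for reproducibility
n <- 1000                         # number of mixture samples wanted
d <- 6                            # dimension of each component
p <- c(0.5, 0.3, 0.2)             # hypothetical mixing probabilities, sum to 1
mus <- list(rep(0, d), rep(4, d), rep(-4, d))  # hypothetical component means
Sigma <- diag(d)                  # hypothetical common covariance matrix

# One full n x d sample from each component, as in the original post
S <- lapply(mus, function(mu) mvrnorm(n, mu = mu, Sigma = Sigma))

# One uniform draw per row decides which component that row comes from
X <- runif(n)
label <- ifelse(X < p[1], 1L, ifelse(X < p[1] + p[2], 2L, 3L))

# Take row i from the component selected for row i
samples <- matrix(NA_real_, nrow = n, ncol = d)
for (k in 1:3) {
  idx <- label == k
  samples[idx, ] <- S[[k]][idx, , drop = FALSE]
}
```

With well-separated means like these, passing `samples` to Mclust() would be expected to favour more than one component, although the model it selects depends on the data and on the mclust version.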
Adding them together will not give you a sample from a mixture; it gives you a sample from another multivariate normal distribution.

Rather than adding them together, what you have to do is select from each of the samples with the appropriate probability. For example, suppose you had two components, with mixing probability 0.6 for the first and hence 0.4 for the second. Also suppose S1 (1000 x 6) and S2 (1000 x 6) are the two samples generated by mvrnorm. Then to get a sample from a mixture of the two you would need to do

    S12 <- array(c(S1, S2), dim = c(1000*6, 2))  ## both in the one matrix, as two columns
    comp <- cbind(1:6000, ifelse(runif(1000) < 0.6, 1, 2))  ## the 1000 component labels are recycled across the 6 columns
    Smix <- matrix(S12[comp], nrow = 1000)

This should give you a sample from the mixture.

Bill Venables
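As a quick check of the point made in the replies, that adding the component samples gives another multivariate normal: the sum of two independent normal vectors is normal with mean mu1 + mu2 and covariance Sigma1 + Sigma2. The sketch below compares the sample moments of the summed data with those values. The means and covariances are hypothetical, chosen only for illustration.

```r
library(MASS)  # provides mvrnorm()

set.seed(42)                      # assumption: fixed seed, only for reproducibility
n <- 1000
d <- 6
mu1 <- rep(0, d)                  # hypothetical mean of component 1
mu2 <- rep(5, d)                  # hypothetical mean of component 2
Sigma1 <- diag(d)                 # hypothetical covariance of component 1
Sigma2 <- 2 * diag(d)             # hypothetical covariance of component 2

S1 <- mvrnorm(n, mu = mu1, Sigma = Sigma1)
S2 <- mvrnorm(n, mu = mu2, Sigma = Sigma2)

# Elementwise sum, i.e. what the original post did
Ssum <- S1 + S2

# Largest deviation of the sample moments from mu1 + mu2 and Sigma1 + Sigma2
mean_err <- max(abs(colMeans(Ssum) - (mu1 + mu2)))
cov_err  <- max(abs(cov(Ssum) - (Sigma1 + Sigma2)))
```

Both deviations should be small relative to the parameter values, which is consistent with Mclust() reporting a single component for summed data.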