This question teeters on the brink of general statistics rather than an R-specific issue, but I'm hoping it boils down to how I coded V&R's EM algorithm for finite mixture modeling and not something worse (i.e. that my plan is impossible). Briefly, I'm trying to simulate an experiment where the hypothesis concerns bimodal vs. unimodal distributions, in order to see what conditions (e.g. sample size, size of the underlying effect) would allow a difference in distribution type to be detectable. My code keeps giving me results so terrible that they can't be right ... I hope. There are several steps where things may be going wrong, including the inherent power of the EM algorithm in finite mixture modeling.

Now for details. I'm designing a psychology experiment. The hypothesis is that stimulus items selected from category A give rise to responses that fall into a bimodal distribution (a univariate Gaussian mixture), whereas items from category B give rise to responses that fall into a normal distribution. Moreover, both A and B give rise to the same "upper" distribution, but A also gives rise to an extra lower distribution. For any given A item and any given subject, the response might fall into either the lower or the upper subdistribution.

To test whether A and B really have different types of distributions, multiple subjects (20, say) respond to many items of each type (100 each, say). Each subject thus generates two distributions, A vs. B. I run the EM algorithm to find the means of the two subdistributions in each; the algorithm finds two means for both, say Am1, Am2, Bm1, Bm2. Since the algorithm is told to find two distributions even if there's really only one (as hypothesized for B), it seems wrong to test H0: Am1 = Am2 vs. H0: Bm1 = Bm2 - I'd expect both to be rejected. Rather, I want to test H0: Am1 = Bm1 (should differ) vs. H0: Am2 = Bm2 (should be the same).
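The design above can be sketched for a single fake subject like this (a minimal sketch; the numbers here are illustrative placeholders, not my pilot values):

```r
# One simulated subject: A responses come from a two-component
# Gaussian mixture, B responses from a single Gaussian.
ni <- 100                           # items per condition
Ap <- 0.6                           # mixing weight of the lower A component
lower <- rbinom(ni, 1, Ap)          # which component each A item falls in
A <- ifelse(lower == 1,
            rnorm(ni, mean = 80,  sd = 44),   # lower A component
            rnorm(ni, mean = 115, sd = 17))   # shared upper component
B <- rnorm(ni, mean = 115, sd = 17)           # unimodal B responses
```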
So I run the EM algorithm described in Venables & Ripley (2002: 436ff) for finding the parameters of the two subdistributions in a Gaussian mixture. My R translation handles their example fine, so I suppose it's fine. I assume that the functions in the mclust package would also work, but I haven't tried them yet.

The V&R algorithm only finds local optima, so I wrote further code to compare multiple solutions and pick the one that gives the lowest value of the objective function. There seem to be only ever two solutions when the initial guess for the upper mean is placed at the location of the distribution's maximum and the initial guess for the lower mean is varied; so to save time, the two solutions are found by starting the two means at the same initial value and then moving the lower mean's initial value down until the second solution appears. This part SEEMS to be working OK.

Then I fake up some data, generating by-subject distributions for A and for B, using values that I suspect are realistic based on a pilot experiment. In particular, Am2 = Bm1 = Bm2 (approximately) while Am1 < Bm1. But here things go wrong: even after running many fake experiments, the extracted Am1 and Bm1 wobble around randomly across "subjects", instead of showing Am1 < Bm1 as predicted. The following fake experiment is typical.
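For reference, the kind of fit I mean is sketched below - a generic from-scratch EM for a univariate two-component Gaussian mixture, not V&R's exact code, and with no multiple-start machinery:

```r
# Generic EM for a univariate two-component Gaussian mixture.
# x: data; mu: initial guesses for the two means.
em2 <- function(x, mu, s = rep(sd(x), 2), p = 0.5, iter = 100) {
  for (i in seq_len(iter)) {
    # E-step: posterior probability that each point is from component 1
    d1 <- p       * dnorm(x, mu[1], s[1])
    d2 <- (1 - p) * dnorm(x, mu[2], s[2])
    r  <- d1 / (d1 + d2)
    # M-step: update mixing weight, means, and SDs
    p     <- mean(r)
    mu[1] <- sum(r * x) / sum(r)
    mu[2] <- sum((1 - r) * x) / sum(1 - r)
    s[1]  <- sqrt(sum(r * (x - mu[1])^2) / sum(r))
    s[2]  <- sqrt(sum((1 - r) * (x - mu[2])^2) / sum(1 - r))
  }
  list(p = p, mu = mu, sd = s)
}
```

Usage would be something like `fit <- em2(A, mu = c(min(A), max(A)))`, with `fit$mu` giving the two fitted means.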
These were the input parameters:

ns   = 20    # Number of subjects
ni   = 100   # Number of items (in A and B)
Ap   = 0.6   # True probability of falling into lower distribution for A
             # Seems counterintuitively large, but this is what V&R's
             # algorithm gave for my pilot data, and changing it to
             # Ap = 0.3 doesn't seem to help
Au1  = 80    # True Am1 (based on pilot data)
Aus1 = 30    # True SD for Am1 across subjects (just made up)
As1  = 44    # True SD of lower A distribution (based on pilot data)
             # (assumed to be the same for all subjects)
Au2  = 115   # True Am2 (based on pilot data)
Aus2 = 30    # True SD for Am2 across subjects (just made up)
As2  = 17    # True SD of upper A distribution (based on pilot data)
Bp   = 0.6   # Ditto for distribution B
Bu1  = 90
Bus1 = 30
Bs1  = 44
Bu2  = 115
Bus2 = 30
Bs2  = 17

These were the terrible results:

Subject       Am1         Bm1
  1       53.61215    47.33698
  2       63.81618    38.90335
  3       91.92926    68.35133
  4       74.38221    30.20928
  5       33.59179    88.28502
  6       92.94248    54.28026
  7       21.74774    83.08491
  8       148.1716   116.81887
  9       74.26598    68.99257
 10       96.0699     88.71108
 11       66.56136    39.68733
 12      107.08474    43.30455
 13       17.2929     65.73852
 14       31.77744    40.15622
 15       68.14085    38.51634
 16       59.58934    99.1507
 17       10.21375   178.60228
 18       59.46793    82.12094
 19       63.66598   103.5837
 20       70.46074    60.05804

> t.test(A, B, paired = T)

        Paired t-test

data:  A and B
t = -0.5612, df = 19, p-value = 0.5812

So choose one or more of the following:

(1) If those are realistic initial guesses, the means are too close or the SDs are too high to expect the EM algorithm to find them.
(2) Give each subject 1000 items, or run 100 subjects, and it'll work. (If so, how can I speed up my simulations to see if this strategy has a chance?)
(3) Test the hypothesis in a totally different way, such as....
(4) Looks fine; go bug-hunting.
(5) Use a different finite mixture package.
(6) Etc....

--
James Myers
Graduate Institute of Linguistics
National Chung Cheng University
168 University Road, Min-Hsiung
Chia-Yi 62102 TAIWAN

Email: Lngmyers at ccu.edu.tw
Web: http://www.ccunix.ccu.edu.tw/~lngmyers/
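P.S. One failure mode worth ruling out before option (4): the two EM components are exchangeable, so the fitted pair of means can come back in either order ("label switching"), which by itself would make the per-subject Am1 and Bm1 wobble even when the fits are fine. A minimal guard, assuming the fit returns an unordered pair of means in `fit$mu` (a hypothetical name):

```r
# Force a consistent labeling before the paired t-test:
# always call the smaller fitted mean "m1" and the larger "m2".
order_means <- function(mu) sort(mu)

# Hypothetical per-subject usage, where fitA$mu and fitB$mu are the
# unordered pairs of fitted means for that subject's A and B data:
# Am <- order_means(fitA$mu); Am1 <- Am[1]; Am2 <- Am[2]
# Bm <- order_means(fitB$mu); Bm1 <- Bm[1]; Bm2 <- Bm[2]
```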