Dear all,
I am having trouble overlaying, with lines(), the normal distributions
identified by Mclust on a histogram or on a mclust1Dplot. Here is some
sample code to illustrate:
library(mclust)
set.seed(22)
foo <- c(rnorm(400, 10, 2), rnorm(500, 17, 4))
mcl <- Mclust(foo, G=2)
mcl.sd <- sqrt(mcl$parameters$variance$sigmasq)
#### cluster sizes, in the same order as mcl$parameters$mean
mcl.size <- c(sum(mcl$classification == 1),
              sum(mcl$classification == 2))
x <- pretty(0:44, 100)
#### my plot of histogram and lines of normal distributions
#### SEEMS OK (or am I wrong?) using frequencies (bins have width 1):
histA <- hist(foo, breaks = 0:44, ylim = c(0, 100))
lines(x, dnorm(x, mcl$parameters$mean[1], mcl.sd[1]) * mcl.size[1],
      col = 2, lwd = 2)
lines(x, dnorm(x, mcl$parameters$mean[2], mcl.sd[2]) * mcl.size[2],
      col = 2, lwd = 2)
#### my plot of histogram and lines of normal distributions
#### IS wrong when using prob:
mclust1Dplot(foo, parameters = mcl$parameters, z = mcl$z,
             what = "density")
histA <- hist(foo, breaks = 0:44, probability = TRUE, add = TRUE)
lines(x, dnorm(x, mcl$parameters$mean[2], mcl.sd[2]), col = 2, lwd = 2)
lines(x, dnorm(x, mcl$parameters$mean[1], mcl.sd[1]), col = 2, lwd = 2)
In the second plot the bell-shaped curves are obviously too high, so I
seem to be missing something obvious in how to scale the dnorm() curves
there. I tried several things, such as scaling dnorm() by the
proportion of individuals assigned to clusters 1 and 2 respectively,
but with no success.
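To make the scaling idea concrete, here is a minimal sketch of what I understand the weighting on the density scale to be. It uses only base R, and the weights, means and standard deviations are the true simulation values standing in for the fitted ones (with a real fit they would presumably come from mcl$parameters$pro, mcl$parameters$mean and mcl.sd). The names w, mu, s, comp and mix are made up for this illustration.

```r
# Illustrative stand-ins for the fitted values (assumption: true
# simulation parameters are used here, not Mclust output).
w  <- c(400, 500) / 900   # mixing proportions, sum to 1
mu <- c(10, 17)           # component means
s  <- c(2, 4)             # component standard deviations

# Component k on the density scale is its weight times its normal density.
comp <- function(x, k) w[k] * dnorm(x, mu[k], s[k])
mix  <- function(x) comp(x, 1) + comp(x, 2)

# Two unweighted dnorm() curves each integrate to 1 (total area 2),
# whereas the weighted components together integrate to 1, which is
# the scale of a histogram drawn with probability = TRUE.
area.unscaled <- integrate(function(x) dnorm(x, mu[1], s[1]) +
                             dnorm(x, mu[2], s[2]), -Inf, Inf)$value
area.weighted <- integrate(mix, -Inf, Inf)$value

# Same simulated data as above, with weighted components overlaid.
set.seed(22)
foo <- c(rnorm(400, 10, 2), rnorm(500, 17, 4))
x <- seq(0, 44, length.out = 401)
hist(foo, breaks = 0:44, probability = TRUE)
lines(x, comp(x, 1), col = 2, lwd = 2)      # weighted component 1
lines(x, comp(x, 2), col = 2, lwd = 2)      # weighted component 2
lines(x, mix(x), col = 4, lwd = 2, lty = 2) # full mixture density
```

This is only a sketch under the stated assumptions, not a checked answer for the Mclust fit itself.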
Could someone help me spot my errors?
Many thanks in advance,
Fred J.