Quoting "cladoo.26" <cladoo.26 at laposte.net>:
> Hi,
>
> My questions concern the function 'mclustBIC', which computes the BIC for a
> range of cluster numbers and several models on the given data, and the
> function 'mclustModel', which chooses the best model and the best number of
> clusters according to the results of the former.
>
> 1) When trying the following example (see ?mclustModel), I get negative BIC
> values computed by 'mclustBIC', and the best model according to the results
> of 'mclustModel' is the one with the highest BIC (i.e. the closest to zero).
>
> irisBIC <- mclustBIC(iris[,-5])
> plot(irisBIC)
> mclustModel(iris[,-5], irisBIC)
>
> Because I can't find anything about this point, could someone confirm that
> when the BIC values are positive, we try to minimize the criterion (the
> model with the smallest BIC is the best one), but when the BIC values are
> negative we look for the highest BIC (the model with the BIC closest to
> zero is the best one)?

The mclust package seems to be using a definition of BIC that is the
negative of the usual one, i.e. the bic() function in the mclust package
returns

    2 * loglik - nparams * log(n)

where "loglik" is the log likelihood, "n" is the number of observations
and "nparams" is the number of parameters.

BIC is normally defined as

    -2 * loglik + nparams * log(n)

and the optimal model is the one with the minimum BIC. With mclust's sign
convention, however, you want to maximize it. The rule does not depend on
whether the values happen to be positive or negative: under a given
convention you always minimize, or always maximize.

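To make the sign convention concrete, here is a small base-R sketch. The two helper functions and the log-likelihoods and parameter counts are made up for illustration (they are not part of mclust and not results from the iris data); the point is only that the two definitions are negatives of each other and therefore pick the same model.

```r
# Hypothetical helpers, not part of mclust: the two sign conventions.
bic_usual  <- function(loglik, nparams, n) -2 * loglik + nparams * log(n)
bic_mclust <- function(loglik, nparams, n)  2 * loglik - nparams * log(n)

# Made-up values for three candidate models, purely illustrative.
loglik  <- c(A = -320, B = -290, C = -285)
nparams <- c(A = 8, B = 14, C = 26)
n <- 150

usual        <- bic_usual(loglik, nparams, n)
mclust_style <- bic_mclust(loglik, nparams, n)

# Minimizing the usual BIC and maximizing the mclust-style BIC
# select the same model.
best_usual  <- names(which.min(usual))
best_mclust <- names(which.max(mclust_style))
print(rbind(usual, mclust_style))
print(c(best_usual, best_mclust))
```

Whether the values come out positive or negative depends on the data and the likelihood, not on which rule applies.
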
> 2) Does the $G component of the output of 'mclustModel' represent the best
> number of clusters for the chosen model?

According to the documentation it does, and you can verify from your
plot that the VEV model with 2 components has the maximum "BIC".

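If you prefer to check this in code rather than from the plot, something along these lines should work. It is a sketch that assumes the mclust package is installed and that the value returned by mclustModel() has the 'modelName' and 'G' components described in its help page; the exact model selected may depend on your mclust version.

```r
library(mclust)

# Same example as in the question.
irisBIC <- mclustBIC(iris[, -5])
best    <- mclustModel(iris[, -5], irisBIC)

# Model chosen and number of mixture components at the maximum "BIC".
print(best$modelName)
print(best$G)
```
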
> Many thanks. This is my first post on R-help, but I have often consulted
> the forum over the past 4 years.
>
> Cladoo
>