Which model to keep (negative BIC)
Quoting "cladoo.26" <cladoo.26 at laposte.net>:
Hi,

My questions concern the function 'mclustBIC', which computes the BIC for a range of numbers of clusters and several models on the given data, and the function 'mclustModel', which chooses the best model and the best number of clusters according to the results of 'mclustBIC'.

1) When trying the following example (see ?mclustModel), I get negative BIC values from 'mclustBIC', and the best model according to 'mclustModel' is the one with the highest BIC (i.e. the closest to zero).

    irisBIC <- mclustBIC(iris[,-5])
    plot(irisBIC)
    mclustModel(iris[,-5], irisBIC)

Because I can't find anything about this point, could someone confirm that when the BIC values are positive, we try to minimize the criterion (the model with the smallest BIC is the best one), but when the BIC values are negative we look for the highest BIC (the model with the BIC closest to zero is the best one)?
The mclust package seems to be using a definition of BIC that is the
negative of the usual one, i.e. the bic() function in the mclust package
returns
2 * loglik - nparams * log(n)
where "loglik" is the log likelihood, "n" is the number of observations
and "nparams" is the number of parameters.
BIC is normally defined as
-2 * loglik + nparams * log(n)
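and the optimal model is the one with the minimum BIC. However, because
mclust flips the sign, in this case you want to maximize it. The direction
depends on which definition is used, not on whether the values happen to
be positive or negative.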
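The two sign conventions can be compared with a minimal sketch that uses base R only. The three linear models and the helper bic_both() are hypothetical, chosen purely for illustration; they are not part of mclust. The sketch assumes the two formulas quoted above and shows that minimizing the usual BIC and maximizing the sign-flipped one select the same model.

```r
# Three illustrative candidate models on the iris data (an assumption for
# this sketch; any set of models with a logLik() method would do).
fits <- list(
  m1 = lm(Sepal.Length ~ 1, data = iris),
  m2 = lm(Sepal.Length ~ Sepal.Width, data = iris),
  m3 = lm(Sepal.Length ~ Sepal.Width + Petal.Length, data = iris)
)

# Hypothetical helper: compute BIC under both sign conventions.
bic_both <- function(fit) {
  ll <- logLik(fit)
  loglik <- as.numeric(ll)      # log likelihood
  nparams <- attr(ll, "df")     # number of estimated parameters
  n <- nobs(fit)                # number of observations
  c(usual   = -2 * loglik + nparams * log(n),  # smaller is better
    flipped =  2 * loglik - nparams * log(n))  # larger is better
}

# One row per model, one column per convention.
bic_table <- t(sapply(fits, bic_both))
print(bic_table)

# Both conventions pick the same model.
best_usual   <- names(which.min(bic_table[, "usual"]))
best_flipped <- names(which.max(bic_table[, "flipped"]))
identical(best_usual, best_flipped)
```

The "usual" column should agree with what stats::BIC() returns for each fit, and the "flipped" column is simply its negative, so the ranking of models is unchanged.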
2) Does the $G component of the output of 'mclustModel' represent the best number of clusters for the chosen model?
According to the documentation it does, and you can verify from your plot that the VEV model with 2 components has the maximum "BIC".
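As a quick check, the selected model and its number of components can be read directly from the result. This is a sketch that assumes the mclust package is installed and reuses the calls from the question; the variable name best is introduced here for illustration, and the model actually selected may depend on the mclust version.

```r
library(mclust)

# Same calls as in the question.
irisBIC <- mclustBIC(iris[, -5])
best <- mclustModel(iris[, -5], irisBIC)

# Name of the selected covariance model and its number of components.
best$modelName
best$G

# Lists the top models by mclust's (sign-flipped) BIC for comparison.
summary(irisBIC)
```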
Many thanks. This is my first post on R-help, but I have been consulting the forum regularly for 4 years. Cladoo