My question is about mclust and the values of BIC and log likelihood
that it reports. The relevant part of my R script is:
######################### BEGIN MDS ANALYSIS #########################
#load packages (metaMDS is from vegan, mclustBIC is from mclust)
library(vegan)
library(mclust)
#load data
data <- read.table("Ecoli33_Barry.dis", header = TRUE, row.names = 1)
#perform MDS Scaling
mds <- metaMDS(data, k = Dimensions, trymax = 20, autotransform = TRUE,
               noshare = 0.1, wascores = TRUE, expand = TRUE, trace = FALSE,
               plot = FALSE, old.wa = FALSE)
######################### BEGIN EM ANALYSIS #########################
#Use the points determined by MDS to perform EM clustering.
#Allow only the unconstrained models. Sometimes, constrained models mess
#things up!
EMclusters <- mclustBIC(mds$points, G = Clusterrange,
                        modelNames = c("VII", "VVI", "VVV"), prior = NULL,
                        control = emControl(),
                        initialization = list(hcPairs = NULL, subset = NULL,
                                              noise = NULL),
                        Vinv = NULL, warn = FALSE, x = NULL)
The input data are in the form of an N x N matrix of pairwise genetic
distances between strains. Those distances can either be the total
number of differences over X characters, or can be normalized to the
fraction of characters that differ by dividing the number of
differences by X.
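To make the two forms of input concrete, here is a minimal base-R sketch
of the normalization. The 3 x 3 matrix `counts` and its strain names are
made up for illustration; only X = 5866 comes from the actual data.

```r
# Hypothetical pairwise difference counts for three strains (not real data)
counts <- matrix(c(  0, 120, 300,
                   120,   0, 250,
                   300, 250,   0),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("s1", "s2", "s3"), c("s1", "s2", "s3")))

X <- 5866  # number of characters compared

# Fraction of characters that differ: every distance divided by the same X
fractions <- counts / X

# The two matrices differ only by the constant factor X
all.equal(fractions * X, counts)
```

The two inputs therefore carry the same relative distances and differ
only in scale.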
When the data are the total number of differences (over 5866 characters),
the optimal model is VVV, for which the BIC is -944.1225 and the log
likelihood is -452.8305. Two clusters are found.
When the data are normalized to the fraction of characters that differ,
the optimal model is VII, for which the BIC is 202.3095 and the log
likelihood is 127.3786. Four clusters are found.
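As a consistency check on these numbers, the sketch below recomputes the
BIC from the log likelihood using the definition given in the mclust
documentation, BIC = 2 * loglik - npar * log(n). It assumes n = 33
observations (suggested by the file name) and d = 2 MDS dimensions; both
are assumptions, not values shown in the script above. `npar_mclust` and
`bic_mclust` are helpers written only for this check and are not part of
mclust.

```r
# Number of free parameters for the three models used above, with G
# components in d dimensions: (G - 1) mixing proportions + G * d means
# + the model-specific covariance parameters.
npar_mclust <- function(model, G, d) {
  cov_par <- switch(model,
                    VII = G,                    # one variance per cluster
                    VVI = G * d,                # d diagonal variances per cluster
                    VVV = G * d * (d + 1) / 2)  # full covariance per cluster
  (G - 1) + G * d + cov_par
}

# BIC in the mclust sign convention (larger is better)
bic_mclust <- function(loglik, model, G, d, n) {
  2 * loglik - npar_mclust(model, G, d) * log(n)
}

n <- 33  # assumed number of strains
d <- 2   # assumed number of MDS dimensions

bic_counts    <- bic_mclust(-452.8305, "VVV", G = 2, d = d, n = n)
bic_fractions <- bic_mclust( 127.3786, "VII", G = 4, d = d, n = n)
round(c(bic_counts, bic_fractions), 3)
```

Under those assumptions both values agree with the reported BICs to
within rounding, so the BIC and log likelihood figures are at least
consistent with each other in both runs.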
There are several things that I do not understand:
(1) How can log likelihood be a positive number?