On Sun, Dec 11, 2011 at 8:43 PM, kbrownk <kbro... at gmail.com> wrote:
The R function hclust is used to do cluster analysis, but based on the R
help I see no way to print the actual fusion distances (that is, the
vertical distances for each connected pair of branches seen in the
cluster dendrogram).
Any ideas? I'd like to use them to test for significant differences from
the mean fusion distance (i.e. the Best Cut Test).
To perform a cluster analysis I'm using:
x <- dist(mydata, method = "euclidean") # distance matrix
y <- hclust(x, method = "ward") # clustering (i.e. fusion) method
plot(y) # display dendrogram
You need to dig a bit deeper in the help file :) The return value of
hclust() is a list that contains, among others, the components
'merge' and 'height'. The 'merge' component tells you which objects
or clusters were merged at each step, and the 'height' component
gives the merging height (your fusion distance) at that step. The
(slightly) tricky part is relating the 'merge' component to the
actual objects; AFAIK there is no function for that. Calling cutree()
with the argument k varied between 2 and n should basically do it for
you, but you need to match the results to the entries in 'merge'.
Maybe someone else knows a better way to do this.
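
A rough sketch of the above, in case it helps. It is only an illustration: the built-in USArrests data stands in for 'mydata', the "average" linkage is an arbitrary choice, and members() is a hypothetical helper written for this example, not something that ships with R.

```r
# Stand-in data (an assumption); replace with your own 'mydata'.
mydata <- scale(USArrests)

x <- dist(mydata, method = "euclidean")  # distance matrix
y <- hclust(x, method = "average")       # linkage chosen only for illustration

# One row per merge step: what was joined, and at which height.
# In 'merge', a negative entry -j is the single observation j, and a
# positive entry j is the cluster that was formed at the earlier step j.
fusion <- data.frame(step   = seq_along(y$height),
                     left   = y$merge[, 1],
                     right  = y$merge[, 2],
                     height = y$height)
print(head(fusion))

# Mean fusion distance over all n - 1 merges.
mean(y$height)

# Hypothetical helper: observation indices contained in the cluster
# created at a given step, found by following 'merge' recursively.
members <- function(hc, step) {
  unlist(lapply(hc$merge[step, ], function(i) {
    if (i < 0) -i else members(hc, i)
  }))
}

# Objects joined together at step 10, by label.
y$labels[members(y, 10)]

# Cross-check with cutree(): after s merges there are n - s clusters,
# so the objects merged at step s should share one group label there.
n <- nrow(mydata)
s <- 10
groups <- cutree(y, k = n - s)
unique(groups[members(y, s)])
```

The fusion data frame has one row per merge, so fusion$height is the vector you would compare against its mean.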
HTH,
Peter