Manually modifying an hclust dendrogram to remove singletons

Dear R-Help,

I have a clustering problem with hclust that I hope someone can help
me with. Consider the classic hclust example:

     hc <- hclust(dist(USArrests), "ave")
     plot(hc)

I would like to cut the tree so that every cluster contains at least
some minimum number of items, and in particular so that there are no
singletons. For example, in the plot above you can see that Hawaii is
split off on its own at quite a high level. I would like to avoid
having a single item form its own cluster like this. How can I
achieve this?
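
To make the problem concrete, here is a minimal sketch (not a solution)
that uses only cutree() and table() from base R to report the smallest
cluster size at each cut and to list any singletons. The range of k and
the choice of k = 5 are arbitrary assumptions, made only for
illustration.

```r
hc <- hclust(dist(USArrests), "ave")

# Smallest cluster size when the tree is cut into k groups.
# The range 2:10 is an arbitrary choice for illustration.
ks <- 2:10
min_size <- sapply(ks, function(k) min(table(cutree(hc, k = k))))
names(min_size) <- ks
print(min_size)

# Names of the items that end up alone in a cluster at one cut.
# k = 5 is an arbitrary choice for illustration.
grp <- cutree(hc, k = 5)
sizes <- table(grp)
singleton_ids <- as.integer(names(sizes)[sizes == 1])
singletons <- names(grp)[grp %in% singleton_ids]
print(singletons)
```

Any k for which min_size is 1 gives a cut that contains at least one
singleton, which is the situation I would like to avoid.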

I have tried manually modifying the tree using dendrapply, but have
not been able to produce a valid solution so far.

Suggestions are welcome.

Best wishes,

Mark