Document clustering for R
If you can compute the distance matrix yourself, you can use methods such as pam, agnes and hclust, which operate on dissimilarity matrices of any kind. You may also perform multidimensional scaling with isoMDS, sammon or cmdscale and then apply any clustering algorithm for n*p data (n observations, p variables) to the resulting coordinates. Best, Christian
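The approach described above can be sketched in R as follows. This is only an illustration under stated assumptions: the document-term matrix is made up, and `cosine_dist` is a hypothetical helper written for this example, not a function from the cluster package. It uses cmdscale for the scaling step; isoMDS or sammon from MASS could be substituted.

```r
library(cluster)  # provides pam and agnes; hclust, cmdscale and kmeans are in stats

# Toy document-term matrix (made-up counts): rows are documents, columns are terms.
dtm <- matrix(c(2, 1, 0, 0,
                3, 2, 0, 1,
                0, 0, 4, 2,
                0, 1, 3, 3),
              nrow = 4, byrow = TRUE,
              dimnames = list(paste0("doc", 1:4),
                              c("gene", "protein", "market", "stock")))

# Hypothetical helper: cosine dissimilarity, 1 - (x . y) / (|x| |y|),
# for all pairs of rows. Rows with zero norm would give NaN and
# need to be removed beforehand.
cosine_dist <- function(m) {
  norms <- sqrt(rowSums(m^2))
  sim <- tcrossprod(m) / outer(norms, norms)
  as.dist(1 - sim)
}

d <- cosine_dist(dtm)

# Option 1: cluster directly on the dissimilarity matrix.
pam_fit   <- pam(d, k = 2)                   # partitioning around medoids
hc_fit    <- hclust(d, method = "average")   # hierarchical clustering
hc_groups <- cutree(hc_fit, k = 2)

# Option 2: embed the documents with classical MDS, then run any
# clustering method for n*p data on the coordinates.
coords <- cmdscale(d, k = 2)
km_fit <- kmeans(coords, centers = 2, nstart = 10)
```

The cluster labels are then available as `pam_fit$clustering`, `hc_groups` and `km_fit$cluster`. Note that clara itself takes the raw data rather than a dissimilarity matrix, which is why the sketch uses pam instead.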
On Mon, 12 Sep 2005, Raymond K Pon wrote:
I'm working on a project related to document clustering. I know that R has clustering algorithms such as clara, but it supports only two distance metrics, Euclidean and Manhattan, which are not very useful for clustering documents. I was wondering how easy it would be to extend the clustering package in R to support other distance metrics, such as cosine distance, or whether there is an API for custom distance metrics. Best regards, Raymond Pon pon3 at llnl.gov x43062
Christian Hennig, University College London, Department of Statistical Science, Gower St., London WC1E 6BT, phone +44 207 679 1698, chrish at stats.ucl.ac.uk, www.homepages.ucl.ac.uk/~ucakche