Clustering with 'agnes'
5 messages · Arnav Sheth, Uwe Ligges, Christian Hennig
Arnav Sheth wrote:
Hello,

I have a question regarding clustering with the agnes() function from the 'cluster' package. I was wondering if anyone knows how I can identify cluster points after running agnes().

For example, I created a dataset with points randomly scattered around (0,0), (0,1) and (1,0). After clustering, the dendrogram shows all the clustered points, and I get the ordering, the heights and the agglomerative coefficient. But nowhere do I see the three actual points listed. Although agnes keeps merging until there is one main cluster, it is clear that at three clusters, each cluster consists of points around one of the three main points. Is there any way to have R give me the actual cluster points at three (or any number of) clusters, i.e. (0,0), (0,1) and (1,0)? A visual display of the clusters would be even better.

I have tried using identify() after converting the agnes object to an hclust object, but that only gives me a listing of the points in each cluster.

I hope this question is clear. I am a little new to both clustering and using R for clustering, so please ask me to clarify if anything is unclear. Your help would be most appreciated!
See the example in ?agnes, where the points are labeled. The easiest way is to call agnes() on a data.frame with rownames. Uwe Ligges
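A minimal sketch of the labeling suggestion, assuming simulated data like that described in the question. The seed, the group size `n`, the spread `sd = 0.1`, the row names `p1`, `p2`, ... and the choice of average linkage are all illustrative assumptions, not values from the thread. It also cuts the tree at three clusters with cutree() to list which labeled points fall in each cluster.

```r
library(cluster)  # provides agnes() and pltree()

set.seed(42)      # arbitrary seed, for reproducibility only
n <- 20           # points per group (assumed)
centers <- rbind(c(0, 0), c(0, 1), c(1, 0))

# Simulate n points around each of the three centers
xy <- do.call(rbind, lapply(1:3, function(i) {
  cbind(x = rnorm(n, centers[i, 1], sd = 0.1),
        y = rnorm(n, centers[i, 2], sd = 0.1))
}))

# A data.frame with rownames, so that agnes() carries the labels along
dat <- data.frame(xy, row.names = paste0("p", seq_len(nrow(xy))))

ag <- agnes(dat, method = "average")
pltree(ag)                            # dendrogram with the row names as labels

# Cut the tree at three clusters and list the members of each
grp <- cutree(as.hclust(ag), k = 3)
members <- split(rownames(dat), grp)

# Visual display: scatterplot coloured by cluster membership
plot(dat, col = grp, pch = 19)
```

Here `members` is a list with one character vector of row names per cluster, and the scatterplot gives the visual display asked for in the question.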
______________________________________________ R-help at stat.math.ethz.ch mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
On Wed, 4 Feb 2004, Arnav Sheth wrote:
Hi Uwe, Thanks for the tip. I already have row labels. My problem is (referring to the example in my original message): how can I get R to tell me that, up to three clusters, the points are clustered around (0,0), (1,0) and (0,1)? Perhaps it is not even possible; I am not sure. With regards, Arnav
Hi, the underlying principle of hierarchical clustering is *not* that the clusters can be represented by centroid points. Most methods are distance-based, i.e. they can be computed even in the absence of any R^p representation of the points. If you want to recover centroids, you should use kmeans, normal mixture clustering (mclust) or pam/clara. Of course, you can also take the points belonging to an agnes cluster and compute their mean vector (or any other summary statistic), but that is not what hierarchical clustering is meant to do (it may be reasonable with Ward's method, though). Christian
***********************************************************************
Christian Hennig
Fachbereich Mathematik-SPST/ZMS, Universitaet Hamburg
hennig at math.uni-hamburg.de, http://www.math.uni-hamburg.de/home/hennig/
#######################################################################
I recommend www.boag-online.de
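The options Christian mentions can be sketched as follows, again on simulated data. The seed, group size, spread, and `nstart = 10` are illustrative assumptions, not values from the thread. Part (a) computes the mean vector of each agnes cluster (using Ward's method, the case he notes as reasonable), part (b) reads the centroids from kmeans(), and part (c) reads the medoids from pam(), which are actual data points rather than means. mclust is left out to keep the example to packages shipped with R.

```r
library(cluster)  # provides agnes() and pam()

set.seed(42)      # arbitrary seed, for reproducibility only
n <- 20           # points per group (assumed)
centers <- rbind(c(0, 0), c(0, 1), c(1, 0))

# Simulate n points around each of the three centers
dat <- as.data.frame(do.call(rbind, lapply(1:3, function(i) {
  cbind(x = rnorm(n, centers[i, 1], sd = 0.1),
        y = rnorm(n, centers[i, 2], sd = 0.1))
})))

# (a) Hierarchical clustering, then the mean vector of each cluster
ag <- agnes(dat, method = "ward")
grp <- cutree(as.hclust(ag), k = 3)
agnes_means <- aggregate(dat, by = list(cluster = grp), FUN = mean)

# (b) k-means returns the centroids directly
km <- kmeans(dat, centers = 3, nstart = 10)
km$centers

# (c) pam returns medoids, i.e. one representative observation per cluster
pm <- pam(dat, k = 3)
pm$medoids

# Visual display: points coloured by agnes cluster, cluster means as crosses
plot(dat, col = grp, pch = 19)
points(agnes_means$x, agnes_means$y, pch = 4, cex = 2, lwd = 2)
```

With well-separated groups like these, all three summaries should land close to (0,0), (0,1) and (1,0). With overlapping groups they can differ noticeably, which is the distinction Christian draws between centroid-based and distance-based methods.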
Hey Christian, That clarifies a lot of things. Thank you. With regards, Arnav

----- Original Message -----
From: "Christian Hennig" <fm3a004 at math.uni-hamburg.de>
To: "Arnav Sheth" <sheth at economics.rutgers.edu>
Cc: "Uwe Ligges" <ligges at statistik.uni-dortmund.de>; <R-help at stat.math.ethz.ch>
Sent: Thursday, February 05, 2004 8:02 AM
Subject: Re: [R] Clustering with 'agnes'