Hi,
I have a dataset which has more than two clusters (say 3 clusters).
I used kmeans to cluster the dataset.
I am wondering how I can plot the clustering result in a two-dimensional
figure.
The example in the kmeans help file is as follows:
x <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
           matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))
cl <- kmeans(x, 2, 20)
plot(x, col = cl$cluster)
It seems that this only works for 2 clusters. When my dataset has more
than two clusters, the plot is obviously wrong.
Any suggestions for plotting more than two clusters?
Thanks!!!
plot clusters
3 messages · pingzhao, Martin Maechler, Christian Hennig
"pingzhao" == pingzhao <pingzhao at waffle.cs.dal.ca>
on Fri, 25 Apr 2003 16:45:19 -0300 writes:
pingzhao> Hi,
pingzhao> I have a dataset which has more than two clusters (say 3 clusters).
pingzhao> I used kmeans to cluster the dataset.
pingzhao> I am wondering how I can plot the clustering result on a two-dimensional
pingzhao> figure????
pingzhao> The example in the kmeans help file is as follows:
pingzhao> x <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
pingzhao>            matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))
pingzhao> cl <- kmeans(x, 2, 20)
pingzhao> plot(x, col = cl$cluster)
pingzhao> It seems that this only works for 2 clusters. When my dataset has more
pingzhao> than two clusters, the plot is obviously wrong.
Huh?!
Probably the clustering (method) wasn't what you expected,
but the plot results are obviously correct!
x3 <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
            matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2),
            matrix(rnorm(100, mean = 1.8, sd = 0.3), ncol = 2))
cl3 <- kmeans(x3, 3, 20)
plot(x3, col = cl3$cluster)
## or even clearer, and also usable in black & white:
plot(x3, col = cl3$cluster, pch = cl3$cluster)
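To check that the colours really follow the fitted partition, the cluster centres can be overlaid and a legend added. What follows is only a minimal sketch: the seed, the `nstart` value, the axis labels and the legend position are illustrative assumptions, not part of the example above.

```r
set.seed(42)  # assumed seed, only so that the sketch is reproducible
x3 <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
            matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2),
            matrix(rnorm(100, mean = 1.8, sd = 0.3), ncol = 2))
# nstart = 10 (assumed) reruns kmeans from several random starts
cl3 <- kmeans(x3, centers = 3, iter.max = 20, nstart = 10)

# one colour and one plotting symbol per cluster label (1, 2, 3)
plot(x3, col = cl3$cluster, pch = cl3$cluster, xlab = "x1", ylab = "x2")
# overlay the fitted centres; row i of cl3$centers belongs to cluster i
points(cl3$centers, col = 1:3, pch = 8, cex = 2)
legend("topleft", legend = paste("cluster", 1:3), col = 1:3, pch = 1:3)
```

If the colours look "wrong" in such a plot, the centres show where kmeans actually placed the groups, which helps to separate a plotting problem from an unexpected clustering.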
-----
The much more interesting question is what to do when `x' has
more than two *dimensions*.
The easiest approach is then to plot the first two principal
components, but I'm sure Christian Hennig will tell us about much
more sophisticated solutions come Monday...
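The principal-components idea could be sketched roughly as follows. This is an illustration under assumptions, not a recommended recipe: the four-dimensional simulated data `x4`, the seed, the `nstart` value and the use of `prcomp()` with its default (centred, unscaled) settings are all choices made for this sketch.

```r
set.seed(1)  # assumed seed, only for reproducibility
# hypothetical data: three groups of 50 points in 4 dimensions
x4 <- rbind(matrix(rnorm(200, mean = 0,   sd = 0.3), ncol = 4),
            matrix(rnorm(200, mean = 1,   sd = 0.3), ncol = 4),
            matrix(rnorm(200, mean = 1.8, sd = 0.3), ncol = 4))

# cluster in the full 4-dimensional space ...
cl4 <- kmeans(x4, centers = 3, iter.max = 20, nstart = 10)

# ... and use the principal components for display only
pc <- prcomp(x4)                  # centred, unscaled PCA
scores <- pc$x[, 1:2]             # coordinates on the first two components
plot(scores, col = cl4$cluster, pch = cl4$cluster,
     xlab = "PC1", ylab = "PC2")

# project the cluster centres with the same centring and rotation
centres_pc <- predict(pc, cl4$centers)[, 1:2]
points(centres_pc, col = 1:3, pch = 8, cex = 2)
```

The projection keeps only the two directions of largest variance, so clusters that are separated along other directions may appear to overlap in this plot.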
Regards,
Martin
1 day later
Hi,
On Sat, 26 Apr 2003, Martin Maechler wrote:
The much more interesting question is what to do when `x' has more than two *dimensions*. The easiest is to then plot on the first two principal components, but I'm sure Christian Hennig will tell about much more sophisticated solutions coming Monday...
...not without being asked explicitly ;-)
Christian
Christian Hennig
Seminar fuer Statistik, ETH-Zentrum (LEO), CH-8092 Zuerich (currently)
and Fachbereich Mathematik-SPST/ZMS, Universitaet Hamburg
hennig at stat.math.ethz.ch, http://stat.ethz.ch/~hennig/
hennig at math.uni-hamburg.de, http://www.math.uni-hamburg.de/home/hennig/
I recommend www.boag.de