kmeans - R-help | R Mailing Lists

kmeans

Tue, Jun 3, 2003 9:59 AM

Dear helpers
 
I was working with kmeans from package mva and found some strange behaviour. When I run the kmeans algorithm several times on the same dataset, I get the same partition. I simulated a small example with 6 observations and ran kmeans, supplying the centers and allowing just one iteration. I expected the algorithm to simply allocate each observation to the nearest center, but I think that is not the result I get...
 
Here are the simulated data

     [,1] [,2]
[1,] -1.0    0
[2,]  0.0    3
[3,]  2.0    0
[4,]  2.5    6
[5,]  7.0    1
[6,]  9.0    4

$cluster
[1] 1 1 1 1 2 2
$centers
   [,1] [,2]
1 0.875 2.75
2 8.000 2.50
$withinss
[1] 38.9375  6.5000
$size
[1] 4 2
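
The two steps at issue here (allocating each observation to its nearest center, then moving each center to the mean of its observations) can be sketched by hand in base R. This is only an illustrative sketch: the starting centers `init` and the helper `sqdist` are hypothetical, since the post does not show the centers that were actually supplied.

```r
# Data exactly as printed in the post.
x <- matrix(c(-1.0, 0,
               0.0, 3,
               2.0, 0,
               2.5, 6,
               7.0, 1,
               9.0, 4),
            ncol = 2, byrow = TRUE)

# Hypothetical starting centers (NOT from the post, which does not show them).
init <- matrix(c(1, 2,
                 8, 2),
               ncol = 2, byrow = TRUE)

# Hypothetical helper: squared Euclidean distance from every observation
# (rows) to every center (columns).
sqdist <- function(x, centers) {
  sapply(seq_len(nrow(centers)), function(k) {
    rowSums(sweep(x, 2, centers[k, ])^2)
  })
}

# Step 1: allocate each observation to its nearest center.
alloc <- max.col(-sqdist(x, init), ties.method = "first")

# Step 2: move each center to the mean of the observations allocated to it.
updated <- t(sapply(seq_len(nrow(init)), function(k) {
  colMeans(x[alloc == k, , drop = FALSE])
}))

print(alloc)
print(updated)
```

After step 2 the centers are cluster means and no longer equal the supplied `init`, which is the kind of difference described above. Note that for the data exactly as printed, this gives 2.25 for the second coordinate of the first center, whereas the posted output shows 2.75; the sketch makes no attempt to reconcile the two.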
 
 
Any hints?
 
Thanks a lot 
 
Luis Silva

Brian Ripley

Tue, Jun 3, 2003 10:38 AM

On Tue, 3 Jun 2003, Luis Miguel Almeida da Silva wrote:

Why does that surprise you?

That's not what the documentation says it does:

     The data given by `x' is clustered by the k-means algorithm. When
     this terminates, all cluster centres are at the mean of their
     Voronoi sets (the set of data points which are nearest to the
     cluster centre).

which is true in your example.  It has run one iteration of re-allocation,
as you can see by reading the source code or the reference.
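
The property quoted from the documentation can be checked directly on a fitted object. The following is a minimal sketch, assuming a current R installation where `kmeans` is provided by the stats package rather than mva; the starting centers `init` are hypothetical, and the default number of iterations is used rather than the single iteration from the original post.

```r
# Data exactly as printed in the original post.
x <- matrix(c(-1.0, 0,
               0.0, 3,
               2.0, 0,
               2.5, 6,
               7.0, 1,
               9.0, 4),
            ncol = 2, byrow = TRUE)

# Hypothetical starting centers (the post does not show the ones used).
init <- matrix(c(1, 2,
                 8, 2),
               ncol = 2, byrow = TRUE)

fit <- kmeans(x, centers = init)

# Mean of the observations assigned to each cluster.
cluster_means <- t(sapply(seq_len(nrow(fit$centers)), function(k) {
  colMeans(x[fit$cluster == k, , drop = FALSE])
}))

# Index of the nearest returned center for each observation.
d <- sapply(seq_len(nrow(fit$centers)), function(k) {
  rowSums(sweep(x, 2, fit$centers[k, ])^2)
})
nearest <- max.col(-d, ties.method = "first")

# Each returned center should be the mean of its cluster, and each
# observation should lie nearest to its own cluster's center.
centres_are_means  <- isTRUE(all.equal(unname(fit$centers), cluster_means))
voronoi_consistent <- all(nearest == fit$cluster)

print(centres_are_means)
print(voronoi_consistent)
```

If both checks hold, the returned `centers` are the means of their Voronoi sets, so they will generally differ from whatever centers were passed in.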

[...]

Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595