Hi All,

I was using the following command to run kmeans on the Iris dataset:

    Kmeans_model<-kmeans(dataFrame[,c(1,2,3,4)],centers=3)

This gave proper results for me. However, our application generates the R commands dynamically, and there is a requirement that column names, rather than column indices, be passed to the R commands. To incorporate this, I tried writing the command in the following ways:

    kmeans_model<-kmeans((SepalLength+SepalWidth+PetalLength+PetalWidth),centers=3)

or

    kmeans_model<-kmeans(as.matrix(SepalLength,SepalWidth,PetalLength,PetalWidth),centers=3)

In both cases, the results differ from what we saw with the first command (with column indices). Can you please let us know what is going wrong here, and how column names can be used in kmeans to obtain the correct results?

Many thanks,
Raji

-- View this message in context: http://r.789695.n4.nabble.com/Help-in-kmeans-tp3430433p3430433.html Sent from the R help mailing list archive at Nabble.com.
Help in kmeans
4 messages · Raji, Christian Hennig, raji sankaran
I'm not going to comment on column names, but this is just to make you aware that the results of k-means depend on random initialisation. This means you can get different results if you run it several times on the same data. It basically gives you a local optimum, and there may be more than one of these. Call set.seed() with the same value before each run to see whether this explains your problem.

Best regards,
Christian
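The point about random initialisation can be checked with a short sketch. It assumes the built-in iris data frame rather than the poster's dataFrame, and the seed value and nstart = 25 are arbitrary illustrative choices, not recommendations from the thread.

```r
# Sketch: the same kmeans() call is reproducible once the seed is fixed.
x <- iris[, 1:4]                  # built-in iris, assumed stand-in for dataFrame

set.seed(42)                      # arbitrary seed
fit_a <- kmeans(x, centers = 3)

set.seed(42)                      # same seed, so the same random starting centres
fit_b <- kmeans(x, centers = 3)

# Identical cluster assignments when both runs start from the same seed
same_clusters <- identical(fit_a$cluster, fit_b$cluster)
print(same_clusters)

# Several random starts: kmeans() keeps the solution with the lowest
# total within-cluster sum of squares, which reduces sensitivity to the start.
set.seed(42)
fit_c <- kmeans(x, centers = 3, nstart = 25)
print(fit_c$tot.withinss)
```

If two runs without a fixed seed disagree while two seeded runs agree, the difference comes from initialisation and not from how the columns were selected.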
On Wed, 6 Apr 2011, Raji wrote:
> Hi All, I was using the following command for performing kmeans for Iris dataset. [...]
*** --- *** Christian Hennig University College London, Department of Statistical Science Gower St., London WC1E 6BT, phone +44 207 679 1698 chrish at stats.ucl.ac.uk, www.homepages.ucl.ac.uk/~ucakche
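On the column-name part of the original question, which the reply above leaves open, a minimal sketch may help. It assumes the built-in iris data frame, whose columns are named Sepal.Length, Sepal.Width, Petal.Length and Petal.Width (the poster's dataFrame apparently uses names without dots). In R, `+` adds vectors element by element, and as.matrix() converts only its first argument, so both attempted commands hand kmeans() a single column instead of four.

```r
# Sketch: selecting the clustering columns by name (built-in iris assumed).
cols <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")

# `+` sums the four columns row by row into ONE numeric vector,
# so kmeans() would cluster on a single derived variable.
summed <- with(iris, Sepal.Length + Sepal.Width + Petal.Length + Petal.Width)
print(length(summed))

# as.matrix() uses only its first argument; the remaining vectors
# are not bound on as extra columns.
one_col <- with(iris, as.matrix(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width))
print(dim(one_col))

# Indexing by a character vector of names selects the same four columns
# as indexing by position.
by_name  <- iris[, cols]
by_index <- iris[, c(1, 2, 3, 4)]
same_data <- identical(by_name, by_index)
print(same_data)

# With the same seed, both selections give the same clustering.
set.seed(1)                       # arbitrary seed
fit_name <- kmeans(by_name, centers = 3)
set.seed(1)
fit_index <- kmeans(by_index, centers = 3)
same_fit <- identical(fit_name$cluster, fit_index$cluster)
print(same_fit)
```

For dynamically generated commands, passing a character vector of names to `[` as above avoids relying on attached variables; cbind() on the individual vectors would be another way to build a four-column matrix.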