Hello,

I'd like to classify data with the kmeans algorithm. In my case, I should get 2 clusters in the output. Here is my data:

      colCandInd  colCandMed
    1         82      2950.5
    2         83      1831.5
    3       1192      2899.0
    4       1193      2103.5

The first cluster is the first two rows; the second cluster is the last two rows. Here is the code:

    x = colCandList$colCandInd
    y = colCandList$colCandMed
    m = matrix(c(x, y), nrow = length(colCandList$colCandInd), ncol = 2)
    kres = kmeans(m, 2)

Is there a way to retrieve both clusters from the output of the algorithm so that I can process each cluster separately? (I am looking for something like kres$clustList, where I can process each cluster.) kres$cluster did not yield what I expected.

Thanks for your help.

--
View this message in context: http://r.789695.n4.nabble.com/kmeans-how-to-retrieve-clusters-tp4426427p4426427.html
Sent from the R help mailing list archive at Nabble.com.
kmeans: how to retrieve clusters
4 messages · ikuzar, Peter Langfelder, R. Michael Weylandt
On Mon, Feb 27, 2012 at 3:18 PM, ikuzar <razuki at hotmail.fr> wrote:
Hello,

I'd like to classify data with the kmeans algorithm. In my case, I should get 2 clusters in the output. Here is my data:

      colCandInd  colCandMed
    1         82      2950.5
    2         83      1831.5
    3       1192      2899.0
    4       1193      2103.5

The first cluster is the first two rows; the second cluster is the last two rows. Here is the code:

    x = colCandList$colCandInd
    y = colCandList$colCandMed
    m = matrix(c(x, y), nrow = length(colCandList$colCandInd), ncol = 2)
    kres = kmeans(m, 2)

Is there a way to retrieve both clusters from the output of the algorithm so that I can process each cluster separately? (I am looking for something like kres$clustList, where I can process each cluster.) kres$cluster did not yield what I expected.
Not sure what you mean by "process each cluster" and why kres$cluster is not what you expected. kres$cluster will tell you which cluster each point (row of your matrix) belongs to. The result depends on how you initialize the kmeans since the inter-point distances are quite similar to one another. For example, I get

    > set.seed(10)
    > kres = kmeans(m, 2)
    > kres$cluster
    [1] 2 2 1 1
    > set.seed(1)
    > kres = kmeans(m, 2)
    > kres$cluster
    [1] 1 1 2 2
    > set.seed(200)
    > kres = kmeans(m, 2)
    > kres$cluster
    [1] 2 2 1 1
    > kres = kmeans(m, 2)
    > kres$cluster
    [1] 1 2 1 2

So 3 times out of 4 I get the result you expect, and once a different one. If you need the result in a different format, that should be no problem.
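A minimal sketch of one such format, assuming the four rows from the original post and base R only: split() on a data frame returns a list with one element per cluster, which can then be processed with lapply(). The name clustList is borrowed from the question and is not a component of the kmeans result; the seed and nstart = 25 are arbitrary choices made here so that the run is repeatable and less sensitive to the random start.

```r
# Data copied from the original post.
colCandList <- data.frame(
  colCandInd = c(82, 83, 1192, 1193),
  colCandMed = c(2950.5, 1831.5, 2899.0, 2103.5)
)
m <- as.matrix(colCandList)

# Arbitrary seed, used only so the example is repeatable.
set.seed(1)

# nstart = 25 is an assumed value: kmeans tries 25 random starts
# and keeps the solution with the smallest within-cluster sum of squares.
kres <- kmeans(m, centers = 2, nstart = 25)

# One data frame per cluster, named by the cluster label ("1", "2").
# clustList is a hypothetical name taken from the question.
clustList <- split(colCandList, kres$cluster)

# Process each cluster in turn; column means are just a placeholder task.
clusterMeans <- lapply(clustList, colMeans)
```

Each element of clustList keeps the original row names, so the rows can still be traced back to the input. The cluster labels themselves (which group is called 1 and which 2) remain arbitrary between runs.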
Hi,

OK, I understand what you mean. I wanted the output to contain the data sorted and grouped by cluster, but I have to do that myself using kres$cluster.

Thanks for your help.

--
View this message in context: http://r.789695.n4.nabble.com/kmeans-how-to-retrieve-clusters-tp4426427p4427543.html
Sent from the R help mailing list archive at Nabble.com.
But it won't be hard at all: you can likely get what you need using the tapply() function (or ave()).

Michael
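As a rough illustration of that suggestion, assuming the same four rows and the cluster labels from the set.seed(10) run earlier in the thread (hard-coded here as cl, since labels vary between runs): tapply() gives one summary value per cluster, ave() repeats the per-cluster value on every row, and order() sorts the rows so that each cluster is contiguous. The variable names and the choice of mean as the summary are assumptions made for the example.

```r
# Data copied from the original post.
colCandList <- data.frame(
  colCandInd = c(82, 83, 1192, 1193),
  colCandMed = c(2950.5, 1831.5, 2899.0, 2103.5)
)

# Cluster labels as reported for the set.seed(10) run in the thread;
# in practice this would be kres$cluster.
cl <- c(2, 2, 1, 1)

# tapply(): one value per cluster (here the mean of colCandMed).
medByCluster <- tapply(colCandList$colCandMed, cl, mean)

# ave(): the per-cluster mean, repeated for each row of its cluster.
colCandList$clusterMean <- ave(colCandList$colCandMed, cl)

# Rows reordered so that members of the same cluster are adjacent.
sorted <- colCandList[order(cl), ]
```

With these labels, medByCluster holds 2501.25 for cluster 1 and 2391 for cluster 2, and sorted lists rows 3, 4, 1, 2.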
On Tue, Feb 28, 2012 at 4:33 AM, ikuzar <razuki at hotmail.fr> wrote:
Hi,

OK, I understand what you mean. I wanted the output to contain the data sorted and grouped by cluster, but I have to do that myself using kres$cluster.

Thanks for your help.