I would appreciate some help with clustering
4 messages · Sarah Goslee, mauede at alice.it, Stavros Macrakis
Is this homework? If so, you should discuss it with your instructor, not us. Regardless, the methods you suggested are a reasonable place to start, and you could have tried them before asking here. You may well have gotten the results you needed without waiting for an answer on the list. Sarah
On Tue, Dec 30, 2008 at 6:59 AM, <mauede at alice.it> wrote:
I have a binary vector whose length is known.
Such a vector contains an unspecified number of 1s.
My goal is
1. to generate as many clusters as the number of 1s
2. to place each 1 as close as possible to the center of its own cluster
Example. Say I have the following binary vector:
v <- c(0,0,1,0,0,0,0,1,0,1,0,0)
Then I have to get 3 clusters.
I can generate a matrix containing the distance of each element from each of the
cluster centers (the 1s):
v  | 1st_1  2nd_1  3rd_1
---|---------------------
0  |   2      7      9
0  |   1      6      8
1  |   0      5      7
0  |   1      4      6
0  |   2      3      5
0  |   3      2      4
0  |   4      1      3
1  |   5      0      2
0  |   6      1      1
1  |   7      2      0
0  |   8      3      1
0  |   9      4      2
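A minimal base-R sketch of how a distance matrix like the one above could be built. The variable names `centers` and `d` and the column labels are illustrative assumptions, not part of the original message.

```r
# Binary vector from the example
v <- c(0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0)

# Positions of the 1s, taken here as the cluster centers
centers <- which(v == 1)

# Rows: elements of v; columns: centers; entries: absolute index distance
d <- abs(outer(seq_along(v), centers, "-"))
colnames(d) <- paste0("center_", centers)  # hypothetical labels
d
```

Each row of `d` corresponds to one element of `v`, and each column to one of the 1s.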
Should I pass this matrix to the R function "dist" and then use, for instance, PAM or
k-means to get the expected 3 clusters?
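Since the centers are already known, one possible alternative to PAM or k-means is to assign each element directly to its nearest 1. The sketch below assumes that reading of the problem; it is not the method settled on in this thread, and the name `cluster` is illustrative.

```r
v <- c(0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0)
centers <- which(v == 1)

# Distance of every element to every center (absolute index difference)
d <- abs(outer(seq_along(v), centers, "-"))

# Index of the nearest center for each element.
# which.min() returns the first minimum, so ties go to the leftmost center.
cluster <- apply(d, 1, which.min)
cluster
```

Element 9 is equidistant from the 1s at positions 8 and 10, so the tie-breaking rule decides which cluster it joins; a different rule may suit the application better.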
I would greatly appreciate some help.
Thank you so much.
Maura
Sarah Goslee http://www.functionaldiversity.org
On Tue, Dec 30, 2008 at 8:44 AM, <mauede at alice.it> wrote:
It is not homework. It is part of a project in which a binary matrix, whose 1s mark the positions of the highest DWT coefficient energy, is used as a template to extract signal features. The approach I am following requires each row of the binary matrix (corresponding to a DWT scale level) to be clustered separately, subject to two requirements: generate as many clusters as there are 1s, and place the 1s at the centers of their respective clusters.
Perhaps look into the rle (run-length encoding) function? It may be
useful to you directly, or the methods it uses internally may be
useful.
-s
PS The word 'cluster' may be confusing in some contexts. Perhaps it
would be better to call them 'runs'.
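A short sketch of what `rle` returns for the example vector, in case it helps. The variable names `r` and `ends` are illustrative, and recovering the positions this way assumes no two 1s are adjacent (adjacent 1s would merge into a single run).

```r
v <- c(0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0)

# Run-length encoding: consecutive equal values collapse into one run
r <- rle(v)
r$lengths  # 2 1 4 1 1 1 2
r$values   # 0 1 0 1 0 1 0

# End position of each run within v
ends <- cumsum(r$lengths)

# Positions of the 1s recovered from the runs
ends[r$values == 1]  # 3 8 10
```

The lengths of the zero runs give the gaps before, between, and after the 1s, which is the information needed to decide where one run should end and the next begin.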