question about similarities cluster using hierclust

Thu, Jun 10, 2004 2:04 AM

my major is bioinformatics, and i'm trying to cluster (agglomerate
the closest pair of observations) in R.


i already have my own similarity metric, but do not know how to
cluster based on similarities instead of dissimilarities.


the help document of hierclust mentions the parameter "sim",
which seems good to me, but it doesn't appear anywhere in the code of
the hierclust() function, and there is no example of it.  so could
anybody, perhaps the author, please help me?

thanks in advance

xinan yang
xinan@molgen.mpg.de

Martin Maechler

Thu, Jun 10, 2004 2:30 AM

Hmm,

why on earth are you using hierclust() from the ORPHANED package
'multiv',  when there's  hclust() in the core 'stats' package
and 'agnes' in the recommended 'cluster' package ?

To your question  "similarities -> dissimilarities"
the textbooks all deal with this.

Assuming similarities s_ij in [0,1]  {which you can get by scaling},
the transformations usually mentioned are, e.g.,
       d_ij := 1 - s_ij
       d_ij := sqrt(1 - (s_ij)^2)
also   d_ij := sqrt(1 -   s_ij)

but really, in your situation where you're defining your
similarities yourself, you probably should rather think about
defining your dissimilarities yourself *directly* {i.e. not via
the above formulae}.
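
For illustration only, here is a minimal sketch of the first
transformation, d_ij := 1 - s_ij, fed into hclust().  The similarity
matrix 's', its values and its labels are made up for this example and
are not from the original question; the "average" linkage is likewise
just an assumption, not a recommendation.

```r
# Hypothetical symmetric similarity matrix with entries in [0, 1]
# (illustrative values only).
s <- matrix(c(1.0, 0.9, 0.2, 0.1,
              0.9, 1.0, 0.3, 0.2,
              0.2, 0.3, 1.0, 0.8,
              0.1, 0.2, 0.8, 1.0),
            nrow = 4, byrow = TRUE,
            dimnames = list(letters[1:4], letters[1:4]))

# d_ij := 1 - s_ij, wrapped as a "dist" object, which is what hclust()
# expects.  sqrt(1 - s^2) or sqrt(1 - s) could be substituted here.
d <- as.dist(1 - s)

# Agglomerative clustering with hclust() from the 'stats' package.
hc <- hclust(d, method = "average")

# Cut the tree into two groups and inspect the membership.
groups <- cutree(hc, k = 2)
print(groups)
```

The same 'd' can presumably also be passed to agnes() from the
'cluster' package, which accepts dissimilarity objects.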

Martin Maechler
