
Similarity matrix

On Tue, 10 Apr 2001, Frank E Harrell Jr wrote:

            
Testing with S-PLUS 6.0 shows that hclust uses dist = 1 - sim there, so the
simple assumption is correct.

d <- dist(longley.y)
d <- d/max(d)
hclust(d, "ave")
$merge:
      [,1] [,2]
 [1,]   -2   -4
 [2,]   -6   -8
 [3,]   -1   -3
 [4,]  -14  -15
 [5,]  -10  -11
 [6,]   -5    2
 [7,]   -9  -12
 [8,]  -13    5
 [9,]    1    3
[10,]  -16    4
[11,]   -7    7
[12,]    8   10
[13,]    6   11
[14,]    9   13
[15,]   12   14

$height:
 [1] 0.006262043 0.011753372 0.014643545 0.022447014 0.030057803 0.046146438
 [7] 0.047591522 0.061849713 0.087427750 0.106310219 0.123025045 0.153018638
[13] 0.221579969 0.384352922 0.570969820

$order:
 [1] 13 10 11 16 14 15  2  4  1  3  5  6  8  7  9 12

hclust(sim=1-d, method="ave")
$merge:
      [,1] [,2]
 [1,]   -2   -4
 [2,]   -6   -8
 [3,]   -1   -3
 [4,]  -14  -15
 [5,]  -10  -11
 [6,]   -5    2
 [7,]   -9  -12
 [8,]  -13    5
 [9,]    3    1
[10,]  -16    4
[11,]   -7    7
[12,]   10    8
[13,]   11    6
[14,]   13    9
[15,]   14   12

$height:
 [1] 0.9937379 0.9882466 0.9853565 0.9775530 0.9699422 0.9538536 0.9524085
 [8] 0.9381503 0.9125723 0.8936898 0.8769749 0.8469813 0.7784200 0.6156471
[15] 0.4290302

$order:
 [1]  7  9 12  5  6  8  1  3  2  4 16 14 15 13 10 11

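which is the same tree, but with the heights expressed as similarities
(each is 1 minus the corresponding distance height). Only the left/right
order of some merges, and hence $order, differs.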
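
As a quick check of the claim, the two $height vectors printed above can be compared directly. This is only a sketch: the first part uses the printed values as they appear in the post, and the second part uses a made-up 3 x 3 similarity matrix (`sim`, not from the post) to show one way to get the same effect in R, assuming R's hclust(), which takes a dissimilarity rather than a sim= argument.

```r
# Heights copied from the two hclust() printouts above.
h_dist <- c(0.006262043, 0.011753372, 0.014643545, 0.022447014, 0.030057803,
            0.046146438, 0.047591522, 0.061849713, 0.087427750, 0.106310219,
            0.123025045, 0.153018638, 0.221579969, 0.384352922, 0.570969820)
h_sim <- c(0.9937379, 0.9882466, 0.9853565, 0.9775530, 0.9699422, 0.9538536,
           0.9524085, 0.9381503, 0.9125723, 0.8936898, 0.8769749, 0.8469813,
           0.7784200, 0.6156471, 0.4290302)

# Largest gap between (1 - distance height) and the similarity height.
# A gap at the level of print rounding is consistent with dist = 1 - sim.
max_gap <- max(abs((1 - h_dist) - h_sim))
heights_agree <- max_gap < 1e-6

# Hypothetical toy similarity matrix (illustration only, not from the post).
sim <- matrix(c(1.0, 0.9, 0.2,
                0.9, 1.0, 0.3,
                0.2, 0.3, 1.0), nrow = 3, byrow = TRUE)

# In R, convert the similarities to dissimilarities before clustering,
# then map the merge heights back to the similarity scale.
fit <- hclust(as.dist(1 - sim), method = "average")
sim_heights <- 1 - fit$height
```

With average linkage the conversion is harmless because the linkage is linear in the distances, so the average of 1 - sim equals 1 minus the average of sim.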