gdist and gower distance

2 messages · Alessio Boattini, Jari Oksanen

#
Dear All,
 
I would like to ask for clarification on the Gower distance matrix calculated by the function gdist in the library mvpart.
Here is a dummy example:
Loading required package: survival 
Loading required package: splines 
 mvpart package loaded: extends rpart to include
 multivariate and distance-based partitioning
     [,1] [,2]
[1,]    1    2
[2,]    3    4
[3,]    5    6
         1        2
2 2.828427         
3 5.656854 2.828427
 
##########################
Doing the calculations by hand according to the formula in the gdist help page, I get the same results. The formula given is:
 'euclidean'   d[jk] = sqrt(sum (x[ij]-x[ik])^2)
#################################
[1] 2.828427
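
The hand calculation for the Euclidean case can be sketched in base R as follows. This is only an illustration: it assumes the dummy data are the 3 x 2 matrix printed above, and it uses base R's dist() for comparison rather than gdist itself. The names x and d12 are chosen here for the example.

```r
# Dummy data as printed above: three rows, two columns
x <- matrix(1:6, ncol = 2, byrow = TRUE)

# Distance between rows 1 and 2, following the quoted 'euclidean' formula
d12 <- sqrt(sum((x[1, ] - x[2, ])^2))
print(d12)

# All pairwise Euclidean distances with base R, for comparison
print(dist(x, method = "euclidean"))
```

For rows 1 and 2 this gives sqrt(4 + 4), which agrees with the 2.828427 shown in the output.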
          1         2
2 0.7071068          
3 1.4142136 0.7071068
 
#######################################
Doing the calculations by hand according to the formula in the gdist help page, I cannot reproduce the same results. The formula given is:
'gower'       d[jk] = sum (abs(x[ij]-x[ik])/(max(i)-min(i))
##########################################
 
Could anybody please shed some light?
 
Regards,
 
Alessio Boattini
#
On Tue, 2004-11-09 at 12:59, Alessio Boattini wrote:
There seems to be a bug in the documentation. The function uses a different
calculation than the help page specifies. Look at the 'gdist' code. Just
to make things easier: in the function body, Gower is method 6, and
Euclidean distances are method 2.

Gower's original paper is available through http://www.jstor.org/
(Biometrics Vol. 27, No. 4, p. 857-871; 1971).

cheers, jari oksanen