
Very Slow Gower Similarity Function

Quoting Martin Maechler <maechler at stat.math.ethz.ch>:
The Gower coefficient I am referring to comes from his 1971 article in
Biometrics (27(4):857-871). It differs from most commonly used measures (but
not, apparently, daisy!) by allowing the incorporation of quantitative and
qualitative (binary or unordered multistate characters) variables, and also by
providing a mechanism for dropping missing values from similarity calculations.
This is also covered in Legendre and Legendre.
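
To make the description above concrete, here is a minimal sketch in Python
(not R) of a Gower-style similarity between two records. The function name,
the "num"/"cat" labels, and the use of None for missing values are all
illustrative assumptions, not anything from Gower's paper or from daisy. It
also treats binary variables as ordinary two-state categories, which
simplifies Gower's handling of presence/absence characters.

```python
def gower_similarity(x, y, kinds, ranges):
    """Gower-style similarity between two records x and y.

    kinds[k] is "num" (quantitative) or "cat" (unordered categories).
    ranges[k] is the observed range of variable k (used for "num" only).
    None marks a missing value (an assumption of this sketch).
    """
    total = 0.0   # sum of per-variable similarity scores
    used = 0      # number of variables that could be compared
    for xk, yk, kind, rk in zip(x, y, kinds, ranges):
        if xk is None or yk is None:
            continue  # missing in either record: drop this variable
        if kind == "num":
            # "ranging": scale the absolute difference by the variable's range
            score = 1.0 - abs(xk - yk) / rk if rk > 0 else 1.0
        else:
            # qualitative variable: simple match / mismatch
            score = 1.0 if xk == yk else 0.0
        total += score
        used += 1
    # undefined if no variable could be compared
    return total / used if used > 0 else float("nan")


a = (1.0, "red", None)
b = (3.0, "red", 5.0)
kinds = ("num", "cat", "num")
ranges = (4.0, None, 10.0)
print(gower_similarity(a, b, kinds, ranges))
```

In this example the third variable is skipped because it is missing in one
record, so the result is the average of the two remaining scores.
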
I was unaware of the daisy function. Looking it over now, it seems to differ
from the Gower coefficient primarily in the method of standardization. Gower
standardized each variable by dividing it by its range ("ranging"), whereas
daisy does a more conventional standardization (subtract the mean, divide by
the SD). As I understand it, there isn't much to recommend standardizing over
ranging (or vice versa), so daisy may provide a useful alternative for my
project. I'll have to look into it!
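
For comparison, the two rescalings contrasted above can be sketched in a few
lines of Python. The function names are hypothetical, and this only
illustrates the arithmetic; it makes no claim about how daisy is implemented
internally.

```python
from statistics import mean, stdev


def ranged(values):
    """Divide each value by the variable's range ("ranging")."""
    span = max(values) - min(values)
    return [v / span for v in values]


def standardized(values):
    """Subtract the mean and divide by the sample standard deviation."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


heights = [2.0, 4.0, 6.0]
print(ranged(heights))
print(standardized(heights))
```

After ranging, the largest possible difference between two values is exactly
1, which is what keeps each quantitative variable's contribution to the
similarity between 0 and 1. After mean/SD standardization the differences are
not bounded that way.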

Thanks,

Tyler