Multidimensional scaling and distance matrices

Thu, Feb 26, 2004 6:10 AM

A few comments:

MDS is normally done on a dissimilarity matrix, not necessarily a distance 
matrix (no need for the triangle inequality to be enforced).

Some MDS software will automatically map similarity matrices to
corresponding dissimilarity matrices if told to do so (but not all by the
same mapping, usually D = 1-S or D = sqrt(1-S)).  It looks like a
`kinship' matrix is a cousin of a similarity matrix, which usually has
entries between 0 and 1, with 1 on the diagonal.
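
The two mappings mentioned above can be sketched in R as follows. This is
only an illustration: the matrix S is made up, and sim2dis() is a
hypothetical helper, not a function from any package or from Statistica.

```r
# Made-up 3 x 3 similarity matrix: entries in [0, 1], ones on the diagonal.
S <- matrix(c(1.0, 0.8, 0.3,
              0.8, 1.0, 0.5,
              0.3, 0.5, 1.0), nrow = 3, byrow = TRUE,
            dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

# Hypothetical helper: the two common similarity-to-dissimilarity mappings.
sim2dis <- function(S, method = c("linear", "sqrt")) {
  method <- match.arg(method)
  D <- 1 - S                          # D = 1 - S
  if (method == "sqrt") D <- sqrt(D)  # D = sqrt(1 - S)
  as.dist(D)                          # keep the lower triangle as a "dist" object
}

d_lin  <- sim2dis(S, "linear")
d_sqrt <- sim2dis(S, "sqrt")
```

Both results can be passed to cmdscale(), isoMDS() or sammon(), but the two
mappings will in general give different configurations, which is one reason
results from different packages may not agree.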

The description of MDS in Statistica at

http://www.statsoftinc.com/textbook/stmulsca.html

is entirely in terms of `observed distances', and Kruskal-type MDS.

Note that a non-metric MDS fit is almost impossible to reproduce exactly
because of local minima in the stress function, although hopefully one
could get a similar solution in a different implementation of the same
method.
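
One way to see the effect of local minima is to run isoMDS() from several
starting configurations and compare the final stress. The sketch below
assumes the MASS package is installed and uses the built-in eurodist data
purely as a stand-in; the number of random starts and their scale are
arbitrary choices.

```r
library(MASS)  # provides isoMDS()

d <- eurodist            # built-in road distances between European cities
n <- attr(d, "Size")     # number of objects in the "dist" object

# Default start: the classical (metric) solution from cmdscale().
fit_default <- isoMDS(d, k = 2, trace = FALSE)

# A few random starts; each may end in a different local minimum.
set.seed(1)
fits_random <- lapply(1:5, function(i) {
  init <- matrix(rnorm(n * 2, sd = 1000), n, 2)
  isoMDS(d, y = init, k = 2, trace = FALSE)
})

# Final stress (in percent) for each run; runs whose values differ have
# converged to different local minima.
stress <- c(default = fit_default$stress,
            sapply(fits_random, function(f) f$stress))
print(round(stress, 2))
```

Even runs with similar stress can differ by a rotation or reflection, so
configurations are best compared after a Procrustes-type alignment rather
than coordinate by coordinate.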

Faced with your example, I would treat it as a covariance matrix, turn it 
into a correlation matrix and take the distances as 1 - correlations, and 
cross my fingers.
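
In R that suggestion could look like the sketch below, using the toy matrix
from the original message. Treating the kinship matrix as a covariance
matrix is an assumption, and there is no claim that this is what Statistica
did.

```r
# Toy kinship matrix from the original message.
test <- matrix(c(
  0.198716340, 0.003612042, 0.011926851, 0.019737349, 0.015021053,
  0.003612042, 0.066742885, 0.013809924, 0.005121996, 0.011175845,
  0.011926851, 0.013809924, 0.197337389, 0.013893087, 0.006405424,
  0.019737349, 0.005121996, 0.013893087, 0.216047450, 0.006218477,
  0.015021053, 0.011175845, 0.006405424, 0.006218477, 0.118812936),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("V", 1:5), paste0("V", 1:5)))

r <- cov2cor(test)    # treat as a covariance matrix, rescale to correlations
d <- as.dist(1 - r)   # dissimilarity = 1 - correlation

fit <- cmdscale(d, k = 2, eig = TRUE)  # classical (metric) MDS
fit$points                             # 2-D configuration
fit$eig                                # eigenvalues; check the leading ones are positive
```

The same d can be passed to isoMDS() or sammon() from MASS if a non-metric
or Sammon fit is wanted instead.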

On 26 Feb 2004, Federico Calboli wrote:

Dear All,

I am in the somewhat unfortunate position of having to reproduce the
results previously obtained from (non-metric?) MDS on a "kinship" matrix
using Statistica. A kinship matrix measures affinity between groups, and
has its maximum values on the diagonal. 

Apparently, starting with an n x n kinship matrix, all that was needed was
to feed it to Statistica, flagging that the matrix was NOT a distance
matrix but a kinship one. Whether Statistica transformed the kinship matrix
into a distance one (and how) is anybody's guess.

A quick search immediately showed that multidimensional scaling is
done on a distance matrix. See for instance:
MASS4, p. 304
"Elements of Computational Statistics", Gentle, p. 122
Edwards and Oman's article, pp. 2-7, R News 3/3

The fact that Statistica happily performs MDS on a "kinship" matrix is
puzzling. Indeed, I would expect errors, as in the following toy
example, without transforming the kinship matrix to distances:

test

           V1          V2          V3          V4          V5
1 0.198716340 0.003612042 0.011926851 0.019737349 0.015021053
2 0.003612042 0.066742885 0.013809924 0.005121996 0.011175845
3 0.011926851 0.013809924 0.197337389 0.013893087 0.006405424
4 0.019737349 0.005121996 0.013893087 0.216047450 0.006218477
5 0.015021053 0.011175845 0.006405424 0.006218477 0.118812936

cmdscale(test)
   [,1] [,2]
V1  NaN  NaN
V2  NaN  NaN
V3  NaN  NaN
V4  NaN  NaN
V5  NaN  NaN
Warning messages:
1: some of the first 2 eigenvalues are < 0 in: cmdscale(test)
2: NaNs produced in: sqrt(ev)

isoMDS(test)

Error in isoMDS(test) : NAs/Infs not allowed in d

sammon(test)

Error in sammon(test) : initial configuration must be complete
In addition: Warning messages:
1: some of the first 2 eigenvalues are < 0 in: cmdscale(d, k)
2: NaNs produced in: sqrt(ev)


The colleagues who used the above routine are unable to tell me with
certainty whether Statistica used metric or non-metric scaling, and, if
non-metric, whether Kruskal or Sammon scaling.

In any case, I would simply like to ask the members of the list if I am
correct in thinking that MDS can ONLY be performed on a distance matrix,
and I can therefore reasonably expect that some form of transformation
to a distance matrix has been performed by Statistica prior to the MDS.
It would at least be a first step to understand what exactly Statistica
did with the data.

Regards,

Federico Calboli

Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
