metaMDS: how does stress affect ordination distance?
Quoting Kim Milferstedt <milferst at uiuc.edu>:
Hello, I am using metaMDS in vegan 1.13-1 on R 2.6.1 for ordinating microbial sequence data. I've got three general questions about NMDS using metaMDS: 1) Is it fair to assess the range of the x and y axes in NMDS for comparable data with similar ranges of observed distance?
The scaling of NMDS is undefined, and in general you cannot compare axis scaling across ordinations. Indeed, you can multiply all scores by any nonzero constant without changing the configuration or the solution. However, metaMDS implements Minchin's (DECODA) half-change scaling, which fixes the scale: one unit means halving of similarity from the replicate similarity. Making some assumptions, you can compare the scale. The replicate similarity is one key concept here: you must assume that the solutions are comparable in that sense as well. Replicate similarity is the estimated dissimilarity among points at zero distance in ordination, or an estimate of dissimilarity of replicate samples from the same community, but found from the dissimilarity-distance plot of NMDS.
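The point about arbitrary scaling can be checked with a minimal base R sketch. The 12-point configuration and the constant 200 are made up for illustration and are not taken from the data discussed here; the sketch only shows that rescaling changes the axis ranges but not the rank order of the inter-point distances, which is all NMDS works with.

```r
# Hypothetical 12-point, two-dimensional configuration (random, illustration only)
set.seed(1)
conf <- matrix(rnorm(24), ncol = 2)

d1 <- dist(conf)        # inter-point distances on the original scale
d2 <- dist(conf * 200)  # same configuration with all scores multiplied by 200

# Rank correlation between the two sets of distances: the ordering is unchanged
rank_cor <- cor(as.vector(d1), as.vector(d2), method = "spearman")

# The axis range, however, grows by exactly the constant
range_ratio <- diff(range(conf[, 1] * 200)) / diff(range(conf[, 1]))

print(rank_cor)
print(range_ratio)
```

Without something like half-change scaling to pin the unit down, the axis range alone therefore says nothing about how tightly samples cluster.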
2) What effect does Kruskal's stress have on the scaling in metaMDS's analysis?
None.
3) Is Kruskal's stress multiplied by a factor of 100 in metaMDS as well, as metaMDS relies on isoMDS (see R-mailing list archive under ``isoMDS - high stress value and strange configuration'')?
metaMDS uses isoMDS and the same stress as isoMDS. That is, a "percent stress": the stress proportion multiplied by 100.
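A small sketch of the percent convention, calling isoMDS from the MASS package directly. The built-in eurodist distances are only stand-in data and have nothing to do with the sequence data in the question.

```r
library(MASS)  # provides isoMDS

# eurodist is a built-in distance matrix, used here purely as example input
fit <- isoMDS(eurodist, k = 2, trace = FALSE)

stress_percent    <- fit$stress        # isoMDS reports stress in percent
stress_proportion <- fit$stress / 100  # the same stress on a 0 to 1 scale

print(stress_percent)
print(stress_proportion)
```

When comparing against stress values reported on a 0 to 1 scale, divide by 100 first.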
Here's a description of my situation: My sequences come from 12 samples. Depending on their level of sequence similarity, I group them into 300 to 16 groups (300 unique sequence types to 16 sequence types that allow sequences to be 20% different). For all the groupings, the overall observed distance in the data remains quite similar. I now want to see at what level of similarity, samples start coalescing in an ordination plot. For this I use metaMDS for various levels of similarity. I assume that I can see the samples coalesce by observing the range of the x and y axes shrink in the nmds plot (i.e. the ordination distance). As I expected, in general, the range of the x and y axes of the nmds plot is decreasing the less stringent I group sequences together. But there's one exception that puzzles me: One plot has vastly different ranges for the x and y axes than the other plots (200 times wider than for all the others).
You may inspect the scaling plot by calling metaMDS with argument plot = TRUE which plots the half-change scaling regression. If you really have only 12 points, you may be stretching some underlying logic beyond its breaking point.
I noticed that for the exceptional grouping, the calculated Kruskal's stress was about three orders of magnitude smaller than for all the others, even though the raw data fed into metaMDS looks very much like its neighboring groupings. What is happening at this one very different analysis?
Three orders of magnitude is quite a lot for a value that is bound to be between 0 and 100, when values below one surely are artefacts: no stress but complete mapping. Possibly you don't have so many points that NMDS is worthwhile. You can always map three points in a plane with lighter machinery than NMDS:
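As one possible illustration of such lighter machinery (an assumption on my part, not necessarily what was meant above), classical scaling with base R's cmdscale places three points in a plane so that their distances are reproduced exactly. The dissimilarity values and sample names below are hypothetical.

```r
# Hypothetical dissimilarities among three samples (illustrative values only)
m <- matrix(c(0,   0.3, 0.5,
              0.3, 0,   0.4,
              0.5, 0.4, 0),
            nrow = 3,
            dimnames = list(c("A", "B", "C"), c("A", "B", "C")))
d <- as.dist(m)

# Classical (metric) scaling into two dimensions
pts <- cmdscale(d, k = 2)

# Largest difference between mapped distances and input dissimilarities
max_error <- max(abs(dist(pts) - d))

print(pts)
print(max_error)
```

With so few points the mapping is perfect, which is the same "no stress but complete mapping" situation: a near-zero stress then reflects too few points rather than a good ordination.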
I have not posted any sample data as it is a rather large amount of data. I tried producing a smaller dummy sample but those data did not reproduce the effect. Thanks for helping me out!
Cheers, jari