non-metric multidimensional scaling
Falk,
On 9/06/10 16:19 PM, "Falk Hildebrand" <hagen804 at yahoo.de> wrote:
Dear list,
I have been using the vegan package to do mds via the metaMDS function, but I
have some questions regarding the output.
1) First off about the rankindex function {vegan}: On my data I always get
values that I would consider as low, e.g. something in the range of 0.0344 as
best result (euclidean) and the mean being 0.028 over 7 other metrics. Do
results as low as this have any relevance? Are there some guidelines as to
what absolute (or relative) values one should at least obtain to make a
distinction?
The rankindex() function only ranks (or orders) the indices; it does not pretend to do any testing. However, a value of 0.0344 is low. Typically there are two alternative explanations: your environmental variables are weak, or you have too many of them in one analysis. If you combine many environmental variables, noise will dominate over signal. If you use only some of the important variables, the signal will be stronger and the correlations higher. See the bioenv() function in vegan for further details. Another alternative is to run a standardized PCA of your environmental variables and base rankindex() on some of the first axes.
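A minimal sketch of these suggestions, assuming vegan's built-in varespec and varechem data sets stand in for your own community and environmental tables. The choice of N, P and K, the limit of three variables in bioenv(), and the use of three PCA axes are purely illustrative assumptions, not recommendations:

```r
library(vegan)
data(varespec)   # example community data shipped with vegan
data(varechem)   # matching environmental measurements

# Rank correlations when all environmental variables are used at once
r_all <- rankindex(scale(varechem), varespec)

# The same with only a few variables (an arbitrary illustrative subset)
r_few <- rankindex(scale(varechem[, c("N", "P", "K")]), varespec)

# Let bioenv() search for the best-correlated subset of up to 3 variables
be <- bioenv(varespec, varechem, upto = 3)

# Standardized PCA of the environment, then rankindex() on the first axes
pc <- prcomp(varechem, scale. = TRUE)
r_pca <- rankindex(pc$x[, 1:3], varespec)

print(r_all)
print(r_few)
print(be)
print(r_pca)
```

Comparing r_all with r_few and r_pca shows how much the correlations change when the set of environmental variables is reduced.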
2) Is there a way to estimate what percentage of the variation within the data can be explained by the mds?
This does not make sense, because NMDS does not try to explain the
variation. Moreover, it is a non-linear method, and a good (= clearly
non-linear) NMDS will always "explain less of the variation" than the
corresponding linear analysis. However, you can use the function
stressplot() to see the usual goodness-of-fit statistics. This is
documented in the metaMDS help page and in the vegan FAQ, which you can
read in an R session using the command vegandocs("FAQ").
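A short sketch of what that looks like, again assuming the built-in varespec data and two dimensions as placeholders for your own data and settings. The relation between stress and the "non-metric fit" statistic in the last line follows my reading of the vegan help pages:

```r
library(vegan)
data(varespec)

set.seed(42)  # metaMDS uses random starts, so fix the seed for repeatability
nmds <- metaMDS(varespec, distance = "bray", k = 2, trace = FALSE)

# Stress is the quantity that NMDS actually minimises
print(nmds$stress)

# Shepard diagram: observed dissimilarities against ordination distances,
# annotated with the non-metric and linear fit statistics
stressplot(nmds)

# The non-metric fit shown on the plot is defined from stress S as 1 - S^2
print(1 - nmds$stress^2)
```

These are goodness-of-fit statistics for the rank-order relationship, not a proportion of variance explained in the sense of PCA or CA.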
3) using envfit {vegan} I get significant p-values for 5 out of 14 env.
variables/factors (which is of course very nice). However, if I do a CCA and an
ANOVA (call: anova(cca,by="terms",permu=200)) with the same environmental
values, usually only one of these same variables/factors ends up being
significant. I am aware that these are different techniques, but I always
thought that CCA was supposed to "force" the ordination on the env. vars, so
why then would I get much better p-values for the unconstrained nmds (I use 5
dimensions in the nmds)?
The analyses are different. In constrained ordination you predict species abundances from the environmental variables with multiple regression. In envfit() you predict each environmental variable separately from your ordination scores. In particular, when your environmental variables are correlated, only one or a few of them will be important in constrained ordination, but all of them separately will be nearly equally important in envfit().
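The contrast can be sketched as follows, assuming the built-in varespec and varechem data and an arbitrary set of five variables (N, P, K, Ca, Mg) in place of your own; the number of permutations is also just an illustrative choice:

```r
library(vegan)
data(varespec)
data(varechem)

env <- varechem[, c("N", "P", "K", "Ca", "Mg")]  # illustrative subset

set.seed(1)
nmds <- metaMDS(varespec, k = 2, trace = FALSE)

# envfit(): every variable is fitted on its own to the ordination scores,
# so correlated variables can all come out as significant
ef <- envfit(nmds, env, permutations = 199)
print(ef)

# Constrained ordination: all variables enter one model. With by = "terms"
# each term is tested sequentially, after the terms that precede it.
mod <- cca(varespec ~ N + P + K + Ca + Mg, data = env)
print(anova(mod, by = "terms", permutations = 199))
```

Because the by = "terms" tests are sequential, a variable that is correlated with an earlier term has little left to explain, which is one way the two sets of p-values can differ so much.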
4) How can I interpret the relation between species and the environmental fit in an NMDS plot call? The same as sites and env. fit? e.g. ef=envfit(nmds,environment); plot(ef); points(nmds, dis = "species");
The species scores are weighted averages, so they have a similar interpretation to species scores in *CA. This is documented in the metaMDS help.
Cheers, Jari Oksanen