ordination and clustering with continuous and categorical variables
On Sun, 2010-12-19 at 21:07 +0100, Mario Brusadin wrote:
Dear all, Apologies in advance for the total beginner's question. I would like to perform a simple ordination on a dataset of species traits, containing both continuous and categorical variables, as well as a cluster analysis on the same dataset. My aim is to identify species niche breadth using the ordination, and to identify any clusters of similar species with the cluster analysis. If possible, I would like to overlay the results of the cluster analysis onto an ordination plot. I have seen from previous discussions on this mailing list that it is possible to apply hierarchical clustering to a distance matrix created with the function daisy() from the cluster package. This function can be set to use the Gower dissimilarity coefficient (1971) to create a distance matrix from a dataset that contains both continuous and categorical variables. Is this the best option available?
Etienne Laliberte's FD package contains gowdis(), which includes several extensions to Gower's coefficient motivated from an ecological viewpoint. See it and the references cited there for more information. On the other hand, daisy() is robust and well tested. Which to use will depend on whether you need Podani's extensions.
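A minimal sketch of the daisy() route, in case it helps. The trait table below is entirely made up (six hypothetical species, two continuous traits and one categorical trait), and average linkage with k = 2 is only an assumption for illustration, not a recommendation.

```r
library(cluster)  # provides daisy()

# Hypothetical trait table: two continuous traits and one categorical trait
traits <- data.frame(
  body_mass = c(12.1, 15.3, 11.8, 40.2, 38.7, 42.5),
  wing_len  = c(5.2, 5.9, 5.0, 11.3, 10.8, 12.0),
  diet      = factor(c("seed", "seed", "insect", "insect", "fruit", "fruit")),
  row.names = paste0("sp", 1:6)
)

# Gower dissimilarity copes with the mix of numeric and factor columns
d_gower <- daisy(traits, metric = "gower")

# Hierarchical clustering on the dissimilarities
# (average linkage is an arbitrary choice for this sketch)
hc <- hclust(d_gower, method = "average")

# Cut the tree into two groups (k = 2 is an assumption for this toy data)
groups <- cutree(hc, k = 2)
print(groups)
```

If Podani's extensions are needed, FD::gowdis(traits) should be usable in place of the daisy() call, as it also returns a dissimilarity object that hclust() accepts.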
As for the ordination, I am not entirely sure which method I should use. Is Correspondence Analysis suitable, or would it be better to use Principal Coordinates Analysis? Any suggestions/help would be greatly appreciated!
I wouldn't use CA for this. Principal Coordinates Analysis (PCoA) would be a starting point, but nMDS (metaMDS() in package vegan) would be my preferred method if ordinating sites.

One problem, or rather issue, I foresee is that neither of these techniques uses the original species information; it is effectively lost when we convert to dissimilarities. Species scores can still be located in the ordination space, where they are the weighted averages of the site scores.

How were you planning on investigating niche breadth from the ordination? What would niche breadth relate to in terms of traits? With nMDS you can't treat the "axes" separately: there aren't two independent gradients in a 2-d nMDS solution, so you have to work with the configuration in 2-d space. You can fit a model to this configuration using ordisurf() (or do it by hand using gam()), but extracting niche widths from a smoother-based model is problematic. See Heegaard's paper on "borders", although I'm not aware of a generally-available R implementation:

Heegaard E. 2002. The outer border and central border for species-environmental relationships estimated by non-parametric generalised additive models. Ecological Modelling 157: 131-139.
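As a rough sketch of the PCoA starting point with the cluster groups overlaid: the trait table is the same made-up example of six hypothetical species, cmdscale() from base R does the PCoA, and the linkage method and k = 2 are assumptions for illustration only.

```r
library(cluster)  # provides daisy()

# Hypothetical trait table: two continuous traits and one categorical trait
traits <- data.frame(
  body_mass = c(12.1, 15.3, 11.8, 40.2, 38.7, 42.5),
  wing_len  = c(5.2, 5.9, 5.0, 11.3, 10.8, 12.0),
  diet      = factor(c("seed", "seed", "insect", "insect", "fruit", "fruit")),
  row.names = paste0("sp", 1:6)
)

# Gower dissimilarities and a two-group hierarchical clustering
d_gower <- daisy(traits, metric = "gower")
groups  <- cutree(hclust(d_gower, method = "average"), k = 2)

# PCoA (classical scaling) of the dissimilarities, keeping two axes
pcoa   <- cmdscale(d_gower, k = 2, eig = TRUE)
scores <- pcoa$points

# Plot the species in ordination space, coloured by cluster membership
plot(scores, type = "n", asp = 1, xlab = "PCoA 1", ylab = "PCoA 2")
text(scores, labels = rownames(scores), col = groups)
```

For nMDS, vegan's metaMDS() also accepts a dissimilarity object such as d_gower in place of raw data, so the same overlay should work on its scores, though a dataset this small is too sparse for a meaningful nMDS solution.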
Gower, J. C. (1971) A general coefficient of similarity and some of its properties. Biometrics 27, 857-874.
HTH G
Cheers Mario -- Padova Italy
_______________________________________________ R-sig-ecology mailing list R-sig-ecology at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%