
pca or nmds (with which normalization and distance) for abundance data?

9 messages · claire della vedova, stephen sefick, Alan Haynes +2 more

#
On Thu 13 Dec 2012 09:24:41 AM CST, claire della vedova wrote:
Is the Hellinger transform done on relative proportions?
You could use something like the broken stick model or others to assess 
how many axes are necessary, but two axes explaining <40% of the 
variation seems low.
My opinion is that a PCA on Hellinger-transformed relative proportions 
"means" more than an NMDS.
It sounds like the normalization you are referring to is the relative 
proportion, i.e. s_i/sum(s), where s is the vector of taxon abundances 
at a site.
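For example, with vegan (a sketch only; the bundled dune data stand in 
for real abundances):

library(vegan)
data(dune)

## Hellinger transformation: square root of the relative proportions,
## i.e. sqrt(s_i/sum(s)); decostand() does this in one step
dune.hel <- decostand(dune, method = "hellinger")

## unconstrained rda() is a PCA; compare eigenvalues against the
## broken stick model to judge how many axes are worth keeping
pca <- rda(dune.hel)
screeplot(pca, bstick = TRUE)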
I am reading about co-inertia analysis now as it may be useful for some 
of the things that I am planning on doing.  This method looks promising.

You are going to have to decide on what type of ordination to use with 
COIA...

HTH,

Stephen
#
Hello Folks,

Kruskal's "rule of thumb" really is a rule of thumb, that is, it is intended as a rough guideline. In that sense it is no different from Clarke's rules. However, I wouldn't judge usability simply by stress: solutions with very low stress can be useless, and solutions with fairly high stress can be usable. Stress is a question of many things, but a large part of it behaves like a signal-to-noise ratio. The signal is more difficult to detect with high noise, but if you do detect the signal, the amount of noise does not matter. I have quite often seen pretty usable solutions with stress around or above 0.20 (20%), at least when using external explanatory variables.

There are limits, though. If you trace single runs, you may see that random starting configurations typically start with stress 0.4 (40%) or a bit higher. If you cannot improve from that, the solution probably is pretty useless (and in metaMDS you will probably have no convergent solutions). However, instead of discarding the results, you may first try stricter convergence criteria for monoMDS (if you use monoMDS). See its help pages (the next version of vegan will have a stricter limit for the "scale factor of gradient", sfgrmin).
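In code that might look like the following (a sketch; the data set and argument values are only illustrative):

library(vegan)
data(dune)

## metaMDS() passes extra arguments on to monoMDS(); sfgrmin is the
## "scale factor of gradient" convergence limit mentioned above
ord <- metaMDS(dune, k = 2, trymax = 50, sfgrmin = 1e-7)
ord$stress  ## judge against the rough 0.2 and 0.4 guidelines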

There is also a limit for low stress. In fact, the current vegan warns of too low stress (Kruskal's "perfect" fit). This is usually a symptom of insufficient data (too many dimensions for too few points, dissimilarities found from too few variables).

In my opinion, ecologists are often too obsessed with goodness-of-fit values. This is true in general, but it is also very manifest with multivariate methods. I do think that if you, say, "explain" something like >50% in PCA or RDA, there is something suspect in your analysis. Typical reasons are insufficient data (too few rows or columns) or data that are not really multivariate. Sometimes there are some very dominant species (high variance) so that the analysis need not care about more than a couple of species, and that is an easy task. If you transform your data so that high abundances are squashed down and variances are evened out, or even made equal, the data become more multivariate (= all species count). Typically this means that a lower proportion of variance is "explained", but often the results are more interpretable.

This also happens when you change models: unscaled PCA/RDA using variances "explains" much of the variance, scaled PCA/RDA using correlations "explains" much less, and CA/CCA studying deviations from expectations "explains" the least. Typically the usability and interpretability of the results improve as the "explanatory power" decreases. The same also often holds for NMDS: Euclidean distances often give lower stress and poorer results than dissimilarities that treat all species more equally.
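The pattern is easy to check with a few lines of vegan (a sketch; dune stands in for real data, and the exact proportions will differ):

library(vegan)
data(dune)

pca.cov <- rda(dune)                ## variances: dominant species rule
pca.cor <- rda(dune, scale = TRUE)  ## correlations: species equalized
ca      <- cca(dune)                ## deviations from expectations

## proportion of total inertia on the first two axes of each model;
## expect it to drop from left to right
sapply(list(pca.cov, pca.cor, ca),
       function(m) sum(eigenvals(m)[1:2]) / sum(eigenvals(m)))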

Not really R, but perhaps I'm forgiven (this time),

Cheers, Jari Oksanen
#
Claire, Here some small comments
On 13/12/2012, at 17:24, claire della vedova wrote:

These numbers cannot be used to say which of these methods is better. You need other criteria. Some people may have strong opinions on the choice here, but those opinions cannot be based on these numbers -- they are based on something else (I do have such an opinion, but I abstain from expressing it).
Hellinger transformation was suggested for Euclidean metric, and normally it is used in PCA/RDA (which are based on Euclidean metric although they do not explicitly calculate Euclidean distances). I haven't heard of any advantages of Hellinger transformation with Bray-Curtis dissimilarity. I suggest you don't use it with Bray-Curtis. I don't know if Kulczyński dissimilarity is any better than, say, Sørensen dissimilarity (and both seem to be difficult to spell), but certainly it belongs to the same group of usually well behaving dissimilarities as variants of Bray-Curtis or Jaccard.
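For what it's worth, these are all one-liners in vegan (illustrative only; dune is a stand-in):

library(vegan)
data(dune)

## Hellinger transform paired with the Euclidean metric, its intended use
d.hel  <- dist(decostand(dune, "hellinger"))
## Bray-Curtis and Kulczynski on the untransformed counts
d.bray <- vegdist(dune, method = "bray")
d.kulc <- vegdist(dune, method = "kulczynski")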
Certainly there is a high number of methods you can apply, but why? What are you trying to analyse? What are your questions?

Cheers, Jari Oksanen
#
On Thu, 2012-12-13 at 14:03 -0600, Stephen Sefick wrote:
<snip />
The transformation includes division by the row sum and hence
conversion to proportions. As such it can be applied to count data or
relative abundance data; with the latter the division by row sum will
have no effect and then the transformation collapses to a simple square
root transformation of the proportional abundance data.
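A quick numerical check of that collapse (a sketch; dune stands in for count data):

library(vegan)
data(dune)

props <- dune / rowSums(dune)  ## relative abundance data
## on proportions the row sums are all 1, so decostand's division is a
## no-op and the transformation is just the square root
all.equal(as.matrix(decostand(props, "hellinger")),
          as.matrix(sqrt(props)), check.attributes = FALSE)  ## TRUE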

This is one of the reasons for the apparent contradictions over the
utility of the chord distance in ecological and palaeoecological
disciplines. In the latter we commonly use proportional data whilst
count abundances are common in the former. Directly applying the chord
distance to count abundances carries with it the baggage of the
Euclidean distance (squared differences emphasise the big things). But
chord distance applied to proportional data *is* the Hellinger distance
and hence palaeoecologists have found the chord distance a useful
dissimilarity coefficient in their field.
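That identity is easy to verify (a sketch; dune stands in for counts, and "chord" here is the palaeoecological form, i.e. Euclidean distance between square-rooted proportions):

library(vegan)
data(dune)
props <- dune / rowSums(dune)

## chord distance applied to proportional data ...
d.chord <- dist(sqrt(props))
## ... is exactly the Hellinger distance of the raw counts
d.hel <- dist(decostand(dune, "hellinger"))
all.equal(c(d.chord), c(d.hel))  ## TRUE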

<snip />
NMDS with Hellinger distances could optimise a k-D PCA with Hellinger
transform.

Given that NMDS essentially subsumes PCA, I'm not sure what you are
getting at.

G
#
On Fri 14 Dec 2012 05:08:32 AM CST, Gavin Simpson wrote:
Gavin, maybe I have spoken beyond my knowledge.  My thought was that a 
PCA has a unique solution and is therefore "better" (as long as an 
appropriate distance is used that deals with the double zero problem 
effectively).  I am sure that this is too simple for the reality of the 
situation.  I don't know what a k-D PCA is.  Would you mind explaining 
or directing me to some reading material?
I don't understand.  Would you mind explaining this?
many thanks,

Stephen

#
On Fri, 2012-12-14 at 06:22 -0600, Stephen Sefick wrote:
<snip />
By k-D PCA I meant that in nMDS you need to state the dimensionality; in
metaMDS() we start the process from a Principal Coordinates analysis of
the data (PCoA == PCA when Euclidean distances are used). I meant that an
nMDS for, say, 2-d solutions can optimise the configuration arising from
the first two PCA axes.
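Something along these lines, for instance (a sketch using MASS::isoMDS, whose default start is in fact the PCoA configuration; dune is a stand-in):

library(vegan)
library(MASS)
data(dune)

d <- vegdist(dune)                   ## Bray-Curtis dissimilarities
start <- cmdscale(d, k = 2)          ## first two PCoA axes
nmds <- isoMDS(d, y = start, k = 2)  ## nMDS optimises that configuration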

I don't see the unique solution of PCA as an inherent advantage of that
method. It has a unique solution because the possible solutions are
constrained by the approach: linear combinations of the variables which
best approximate the Euclidean distances between samples. NMDS
generalises this idea extensively into the problem of best preserving the
mapping of the dissimilarities. As such it can do a better job of
drawing the map, but that comes at a price.

Again though: horses for courses.

I meant it in the sense that PCA is a special case of Principal Coordinates
and that nMDS generalises Principal Coordinates.
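That special case is easy to check numerically (a sketch; dune is a stand-in, and the signs of the axes are arbitrary):

library(vegan)
data(dune)

## PCoA of Euclidean distances recovers the PCA site scores
## (up to reflection of individual axes)
pcoa <- cmdscale(dist(dune), k = 2)
pca  <- prcomp(dune)$x[, 1:2]
all.equal(abs(pcoa), abs(pca), check.attributes = FALSE)  ## TRUE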

I don't get the point of saying one method is "better" than any other.
Each has its uses, etc. I certainly don't think any one method "means" more
than the other.

G
#
On Fri 14 Dec 2012 06:51:56 AM CST, Gavin Simpson wrote:
Point taken.  As always, it depends on the question that you are trying 
to answer.  Thank you for the discussion and clarification.