Data transformation prior to RDA

7 messages · Mariano Devoto, Michael Denslow, Etienne Laliberté +2 more

#
Hi Mariano,
Have a look at the vegan function ?decostand with method =
'hellinger'. I believe that it is discussed and recommended in:

Legendre, P. & Gallagher, E.D. (2001) Ecologically meaningful
transformations for ordination of species data. Oecologia 129:
271-280.
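
For readers who want to see what the transformation actually computes, here is a minimal sketch in plain Python (not vegan's own code; in R you would simply call decostand(x, method = "hellinger")). Each abundance is divided by its site total and then square-rooted. The function name, the example counts, and the handling of empty sites are my own assumptions for illustration.

```python
import math

def hellinger(abundances):
    """Hellinger-transform a sites-by-species table given as a list of rows."""
    transformed = []
    for row in abundances:
        total = sum(row)
        if total == 0:
            # Assumption: an empty site is returned as all zeros.
            transformed.append([0.0 for _ in row])
        else:
            # Square root of the relative abundance within the site.
            transformed.append([math.sqrt(x / total) for x in row])
    return transformed

# Hypothetical counts: 3 sites (rows) by 3 species (columns).
counts = [
    [10, 0, 90],
    [1, 1, 2],
    [0, 5, 5],
]
hell = hellinger(counts)
for row in hell:
    print([round(v, 3) for v in row])
```

After the transformation the squared values in each non-empty row sum to 1, so very abundant species no longer dominate the distances between sites.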

Hope this helps,

Michael

  
    
#
Are your variables species abundances, or other types of descriptors? If
the former, standardization by column may not be ideal. Transformations
such as the Hellinger, as suggested by Michael, were developed for
species abundance data (Legendre & Gallagher 2001).

There are many ways to transform variables to normalize them, if that's
what you're after; see chapter 1 of Legendre & Legendre (1998). The
Box-Cox method is possibly the closest thing to what you're asking, i.e.
the "best possible transformation for each of the variables". But I'm
convinced there are as many opinions on the subject as there are
different methods.
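
To make the Box-Cox idea concrete, here is a rough sketch in plain Python of choosing a separate exponent (lambda) for one variable by maximising the usual normal-theory profile log-likelihood over a coarse grid. The function names, the grid, and the example values are assumptions of mine, and the data must be strictly positive and not all identical; treat it as an illustration rather than a recommended workflow.

```python
import math

def box_cox(values, lam):
    """Box-Cox transform of strictly positive values for a given lambda."""
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]

def profile_loglik(values, lam):
    """Profile log-likelihood of lambda under a normal model (up to a constant)."""
    n = len(values)
    y = box_cox(values, lam)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / n
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum(math.log(v) for v in values)

def best_lambda(values, grid=None):
    """Return the grid value of lambda with the highest profile log-likelihood."""
    if grid is None:
        grid = [i / 10.0 for i in range(-20, 21)]  # -2.0, -1.9, ..., 2.0
    return max(grid, key=lambda lam: profile_loglik(values, lam))

# Hypothetical right-skewed environmental variable.
x = [0.5, 1.0, 1.2, 2.0, 3.5, 8.0, 20.0, 55.0]
lam = best_lambda(x)
print("chosen lambda:", lam)
print([round(v, 3) for v in box_cox(x, lam)])
```

You would run this once per variable. Note that it only targets normality of each variable on its own, which is not the same thing as a better relationship with the species data.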

Cheers

Etienne

On Monday 19 April 2010 at 20:02 -0300, Devoto Mariano wrote:

  
    
#
Dear Devoto Mariano,
On 20/04/10 02:02 AM, "Devoto Mariano" <mdevoto at agro.uba.ar> wrote:

            
You do not need to do this in vegan. Vegan uses methods that cope nicely
with non-centred constraints in original scale. Pierre Legendre explains the
"projection matrix" method where centring is necessary and standardization
useful, but vegan uses different methods (QR decomposition).

This is a difficult question, and there is no easy answer. RDA is basically
a linear method and linear combination scores (LC scores) indeed are linear
combinations of constraints. Nonlinear transformation will change the LC
scores and hence the ordination. Selecting an optimal transformation for
multivariate explanatory variables (constraints) for multivariate response
(species) is a tricky thing, and people usually do not try to do this. I
have no idea how to do this. For instance, I have no idea what would be a
criterion of a "good" model -- the only thing I'm sure of is that goodness of fit
(eigenvalue) is not a good criterion. What you may do is inspect the
constraints with pairs() plots, and see if there are some strange distribution
patterns in the pairwise panels. That is a completely different question from
whether your constraints jointly have a good linear relationship with all
species simultaneously, though.

If you intended to ask about transformation of species data, read the other
answers.

Cheers, Jari Oksanen
#
On Tue, 2010-04-20 at 11:48 +1200, Etienne Laliberté wrote:
I think this needs a little clarification - or a different take on it.
Standardising the species (response) data in PCA/RDA results in each
species having unit variance and hence contributing an equal amount to
the "inertia" measure. This tends to give a more balanced ordination of
abundance data.

In unstandardised PCA/RDA, abundant species with high variance tend to
dominate the resulting ordination.

Standardisation is called for when response data are measured in
different units (i.e. when they are not species abundances), but may be
desirable for species abundances and in my experience is quite often warranted.
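
A small sketch in plain Python of the point about inertia, using made-up abundances (the table and function names are assumptions for illustration only). It compares each species' share of the total variance before and after standardising the columns to unit variance; it assumes no species has constant abundance, since that would give a zero standard deviation.

```python
import statistics

def column_variances(table):
    """Sample variance of each column (species) of a sites-by-species table."""
    return [statistics.variance(col) for col in zip(*table)]

def standardise_columns(table):
    """Centre each column and scale it to unit sample variance."""
    cols = list(zip(*table))
    means = [statistics.mean(c) for c in cols]
    sds = [statistics.stdev(c) for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in table]

# Hypothetical abundances: 4 sites (rows) by 3 species (columns);
# the first species is far more abundant and variable than the others.
counts = [
    [120.0, 3.0, 0.0],
    [80.0, 1.0, 2.0],
    [200.0, 4.0, 1.0],
    [40.0, 2.0, 3.0],
]

raw_var = column_variances(counts)
raw_share = [v / sum(raw_var) for v in raw_var]

std_var = column_variances(standardise_columns(counts))
std_share = [v / sum(std_var) for v in std_var]

print("share of total variance, raw:         ", [round(s, 4) for s in raw_share])
print("share of total variance, standardised:", [round(s, 4) for s in std_share])
```

In the raw table nearly all of the variance comes from the first species; after standardisation each species contributes one third.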

G

  
    
1 day later
#
On 22/04/10 02:21 AM, "Devoto Mariano" <mdevoto at agro.uba.ar> wrote:

            
Dear Devoto Mariano,

Not "just like Canoco does", but there is an improved way. Check ordistep()
for Canoco-style analysis, and add1.cca and drop1.cca for other approaches.

Cheers, Jari Oksanen