Partitioning spatial effects using trend surface analysis or PCNM

6 messages · Tim Richter-Heitmann, Gavin Simpson, François Gillet

#
Hi,

I am working on a bacterial dataset from hundreds of plots, distributed over a
10x10 meter grid.

I was using the varpart function of vegan to identify the pure spatial
effect, with the classical approach of calculating CCAs using orthogonal
polynomials of my x,y coordinates. I have a significant community shift in
some of the plots, and CCA depicted this general behaviour in the
ordination space better than RDA. Therefore, I am using CCA.

My main way to deal with it is a generalized additive model using spatial
coordinates, but I would like to use varpart in CCA as a supporting
analysis as well.

I wanted to compare that to the PCNM approach, but I guess I have messed it
up.
I followed tutorials with the mite dataset, as given in the vegan vignette
and as found on the web (written by Benoit Gendreau-Berthiaume).

Usually I am using my relative bacteria counts, as CCA is supposed to do its
own data transformation anyway. I am also limiting my columns (OTUs) to
obtain a matrix balanced between observations and samples, so rare species
are usually not a problem.

Here is what I do:

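library(vegan)

# third-degree orthogonal polynomials of the x,y coordinates (9 terms)
spat <- as.data.frame(poly(as.matrix(spatxy), degree=3))

cca1_s <- cca(OTU~., data=spat)
#significances
anova(cca1_s)
anova(cca1_s, by="term", permutations=999)

#forward selection for most parsimonious model
#(response is OTU, as in cca1_s; recent vegan versions take "permutations")
cca1_s.f <- ordistep(cca(OTU~1, data=spat), scope=formula(cca1_s),
direction="forward", permutations=999)
sig1_s.f <- anova(cca1_s.f, by="term", permutations=999)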

The result is a significant CCA object. spat is usable in varpart and
yields a low but significant value for overall autocorrelation.

For PCNM I do

rs <- rowSums(OTU)/sum(OTU)
pcnmw <- pcnm(dist(spatxy), w = rs)
cca1_pcnm <- cca(acido1 ~ scores(pcnmw))

pcnmw consists of 250 vectors, and the result is a non-significant CCA
object, where I expected a "finer" spatial decomposition.

The same is true if I am using total count data (Hellinger-transformed or
not).

I am sure I am doing it wrong, so if you have advice on how to do the
calculation properly, please let me know. Thank you for the help.





--
View this message in context: http://r-sig-ecology.471788.n2.nabble.com/Partitioning-spatial-effects-using-trend-surface-analysis-or-PCNM-tp7579427.html
Sent from the r-sig-ecology mailing list archive at Nabble.com.
#
Hi "trichter"
On 5 May 2015 at 13:34, trichter <trichter at uni-bremen.de> wrote:
<snip />
I don't think the last analysis makes much sense; if you have cubic polynomials
plus interactions, you should consider only the interactions for removal first,
and then decide whether quadratic rather than cubic terms are needed.

Again, as above, you have to be very careful with this. Just because you
made a matrix with 9 "covariates" doesn't mean it makes sense to cherry-pick
from these terms.

You are supposed to choose from among the set of PCNMs those which explain the
species data best, not use them all in the model. The problem appears to be
that you have a model that is far too complex with lots of redundant axes
(or more likely too few constraints).

One suggestion is to use only those PCNMs that have positive spatial
correlation. Compute that using Moran's I of which there are a few
implementations around in various R packages. You can do CCA analysis with
the positive spatial correlation PCNMs separately from the negatively
correlated PCNMs if you wish.

You will probably need to do some type of forward selection but the
preferred method seems to be limited to RDA (because the adjusted R2
measure used in the global significance test isn't worked out for CCA). If
you skip the global test, you could just do forward selection on the
positive PCNMs, but you probably want to try to control for accepting too
many PCNMs by having a low entry threshold for significance.
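
For readers following along, the suggestion above might look roughly like the sketch below. It runs on vegan's `mite` data rather than the original OTU table, and it makes several assumptions: Moran's I is computed by hand with binary neighbour weights at the PCNM truncation distance (the packages mentioned above have their own implementations), the response is Hellinger-transformed so that RDA can be used, and the entry threshold is set to 0.01. The helper `moran_i` and all object names are illustrative only.

```r
library(vegan)
data(mite); data(mite.xy)
set.seed(42)                        # permutation tests are random

Y  <- decostand(mite, "hellinger")  # Hellinger transform, suitable for RDA
pc <- pcnm(dist(mite.xy))           # PCNM vectors with positive eigenvalues

# Moran's I of a vector x for a symmetric weights matrix W (hypothetical helper)
moran_i <- function(x, W) {
  z <- x - mean(x)
  (length(x) / sum(W)) * sum(W * outer(z, z)) / sum(z^2)
}

# Assumed neighbourhood: pairs of sites within the PCNM truncation distance
D <- as.matrix(dist(mite.xy))
W <- (D > 0 & D <= pc$threshold) * 1
I <- apply(pc$vectors, 2, moran_i, W = W)

# Keep only the PCNMs with positive spatial correlation
V <- as.data.frame(pc$vectors[, I > 0, drop = FALSE])

# Forward selection with a strict entry threshold
m0  <- rda(Y ~ 1, data = V)
m1  <- rda(Y ~ ., data = V)
sel <- ordistep(m0, scope = formula(m1), direction = "forward",
                Pin = 0.01, permutations = 999, trace = FALSE)
sel$anova                           # PCNMs that entered the model, if any
```

The negatively correlated PCNMs could be analysed separately in the same way by selecting the columns with `I < 0`.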

HTH

G

  
    
#
Thank you very much for addressing my problem.

Maybe I can reformulate it in a different way, moving away from 
specific partial CCA/RDAs to the core of my task.

If said task is the approximate quantification of partial effects on my 
bacterial counts (for example edaphic soil properties, above-ground 
plant diversity, and spatial autocorrelation) via the varpart function 
in vegan, is the classical way of using orthogonal polynomials of the x,y 
coordinates as CCA/RDA constraints still considered valid?

If I understand you correctly, I would alternatively:
- generate PCNMs from my x,y coordinate matrices
- extract those with a Moran's I > 0
- perform an RDA/CCA with forward selection
- use the PCNMs found to be significant in the varpart function?

I think the only forward selection for CCA would be the ordistep function in 
vegan. What would be an acceptable threshold for entering into my final 
set of accepted significant PCNMs?

The other problem is that, on the one hand, RDA is not able to separate 
my community shifts as well as CCA; on the other hand, varpart is based 
on RDA. I wonder if I can justify using varpart when my ordination of 
choice is based on CCA. But I have never seen a dedicated variance 
partitioning function for CCA. I just read an old answer of yours:
http://r.789695.n4.nabble.com/partitioning-variation-using-the-Vegan-CCA-routine-td823966.html

So, I can basically apply a chi-square transformation to my raw data and use 
them in an RDA as a CCA proxy?


Thank you very much! As you can see, I am not really trained in statistics.

Tim
On 06.05.2015 02:45, Gavin Simpson-2 [via r-sig-ecology] wrote:

  
    
#
Hi Tim
On 6 May 2015 at 03:47, trichter <trichter at uni-bremen.de> wrote:

            
It is and remains valid; it just isn't set-up to find as wide a range of
spatial patterns as PCNM. Now that may not be a bad thing in its entirety;
we don't have a good way of doing feature selection in constrained
ordination (and I don't consider the global R2 test followed by forward
selection via R2 as a "good" method, it's just better than bog standard
forward selection). Throwing a large set of PCNMs at an ordination sounds
like a recipe for data dredging, *unless* you are very careful.

Correct; the "newer" approach uses an adjusted R2 measure which has only
been worked out and implemented for the RDA case.

Rather than a 0.05 threshold as the baseline, I would go to say 0.01 as the
threshold for inclusion. Then you also need to account for multiple testing
so you adjust this p-value at each step in the forward selection process.
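
One way to read this advice is sketched below; it is an interpretation, not a procedure given in the thread. The sketch assumes RDA on Hellinger-transformed `mite` data, uses the first ten columns of vegan's `mite.pcnm` as candidates, and applies a Holm adjustment across the candidate terms at each step (the choice of Holm is an assumption) before comparing against the 0.01 threshold.

```r
library(vegan)
data(mite); data(mite.pcnm)
set.seed(1)                          # permutation tests are random

Y <- decostand(mite, "hellinger")
X <- mite.pcnm[, 1:10]               # first ten PCNMs, to keep the example small
scope <- formula(rda(Y ~ ., data = X))
mod   <- rda(Y ~ 1, data = X)
alpha <- 0.01                        # entry threshold suggested in the text

# At most one term is added per pass, so the scope is never empty
for (step in seq_len(ncol(X))) {
  a <- add1(mod, scope = scope, test = "permutation", permutations = 999)
  a <- a[rownames(a) != "<none>", , drop = FALSE]
  p_adj <- p.adjust(a[["Pr(>F)"]], method = "holm")  # adjust across candidates
  best  <- which.max(a[["F"]])                       # strongest candidate
  if (p_adj[best] > alpha) break                     # nothing passes: stop
  mod <- update(mod, as.formula(paste(". ~ . +", rownames(a)[best])))
}
formula(mod)                         # terms that survived the adjusted threshold
```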

Yes; Pierre Legendre & Eugene Gallagher showed how this could be done in
their 2001 Oecologia paper on Ecologically Meaningful Transformations. You
won't get exactly a CCA by doing RDA on chi-square transformed data, but it
will be close. You can also use the Hellinger transformation which worked
well in the tests that Legendre & Gallagher did in their paper.
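
As a rough illustration of the two transformations feeding into varpart, here is a sketch on vegan's `mite` data. The choice of environmental variables (`SubsDens`, `WatrCont`), the cubic trend surface as the spatial set, and all object names are assumptions made for the example.

```r
library(vegan)
data(mite); data(mite.env); data(mite.xy)

Y_hel <- decostand(mite, "hellinger")   # square root of site profiles
Y_chi <- decostand(mite, "chi.square")  # double profiles, close to CCA

# Cubic trend surface: 9 orthogonal polynomial terms of x and y
spat <- as.data.frame(poly(as.matrix(mite.xy), degree = 3))
names(spat) <- paste0("poly", seq_along(spat))

# Partition the variation between two environmental variables and space
vp_hel <- varpart(Y_hel, ~ SubsDens + WatrCont, spat, data = mite.env)
vp_chi <- varpart(Y_chi, ~ SubsDens + WatrCont, spat, data = mite.env)

# Adjusted R2 of fractions [a] environment, [b] shared, [c] space, [d] residual
cbind(hellinger  = vp_hel$part$indfract$Adj.R.square,
      chi.square = vp_chi$part$indfract$Adj.R.square)
```

If the two columns tell broadly the same story, the choice of transformation is probably not driving the conclusions.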
You're welcome,

G

  
    
#
Hi Tim,

Like CA, CCA is known to be very sensitive to rare species, and this may
partly explain the "shift" you observe in your communities, which could be
driven by a few OTUs. RDA is less sensitive to rare species and there is no
need to remove them.
To benefit from the advantages of RDA mentioned by Gavin, you should
pre-transform your community data frame with Hellinger (site profiles) or
chi-squared (double profiles, close to what is achieved in CCA) and check
with a PCA whether you still observe a major shift along the first axis.
The choice of a method must not be guided by the presupposed better
results you want to get ;-)
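
A minimal sketch of that check, run on vegan's `mite` data as a stand-in for the OTU table (object names are illustrative): an unconstrained `rda()` call is a PCA, so it can be applied directly to the transformed data and compared with a CA.

```r
library(vegan)
data(mite)

pca_hel <- rda(decostand(mite, "hellinger"))   # PCA of Hellinger-transformed data
pca_chi <- rda(decostand(mite, "chi.square"))  # PCA of chi-square-transformed data
ca      <- cca(mite)                           # CA, for comparison

# Share of the total variation carried by the first axis of each ordination
sapply(list(hellinger = pca_hel, chi.square = pca_chi, CA = ca),
       function(m) eigenvals(m)[1] / sum(eigenvals(m)))

# Site scores on the first axes; a block of outlying samples would show up
# as a clear gap along axis 1
s1 <- scores(pca_hel, display = "sites", choices = 1:2)
head(sort(s1[, 1]))
```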

All the best,

François



-------------------------------------------------------------------------------
Prof. *François Gillet*
Université de Franche-Comté - CNRS
UMR 6249 Chrono-environnement
UFR Sciences et Techniques
16, Route de Gray
F-25030 Besançon cedex
France
http://chrono-environnement.univ-fcomte.fr/
Phone: +33 (0)3 81 66 62 81
iPhone: +33 (0)7 88 37 07 76
Location: La Bouloie, Bât. Propédeutique, -114L
-------------------------------------------------------------------------------
Associate Editor of *Plant Ecology and Evolution*
http://www.plecevo.eu
-------------------------------------------------------------------------------
Homepage: http://chrono-environnement.univ-fcomte.fr/spip.php?article530
ResearchID: http://www.researcherid.com/rid/B-6160-2008
Google Scholar: http://scholar.google.com.au/citations?user=a5xiIfQAAAAJ

-------------------------------------------------------------------------------

2015-05-06 11:47 GMT+02:00 trichter <trichter at uni-bremen.de>:

  
  
#
Hi,


Thank you very much for your advice as well.
The point was that I already saw the community shift by simple  
clustering/heatmapping of the abundance tables. In fact, less than 20% of  
OTUs are shared between the 10 samples featuring that dramatic  
community shift and the vast majority of the rest of the samples. So,  
indeed, the dominating OTUs in these 10 samples represent rare species  
in the entire dataset, but they are very dominant in those 10.
I started with the rule of thumb of checking the axis length of the  
DCA; it was long enough to decide on CCA for the further downstream  
analysis. I have to admit that I never touched RDA until I realized  
that my variance partitioning approach (vegan's varpart) was actually  
using RDA rather than CCA; I thus calculated partial RDAs as well, but  
I realized that RDA couldn't separate my "outlying samples" from the  
rest. However, the separation is real in terms of OTU abundances, so I  
figured that RDA is not the best tool to visualize my data.
I have now learned that CCA was just very sensitive to these highly  
different samples (in fact, if you look at the ordination plot, cca1 is  
only separating the 10 samples from the rest, but is not able to  
resolve the remaining samples; these are separated by cca2).
I will now do a varpart with RDA on both Hellinger- and chi-square-  
transformed OTU tables and see if the results are dramatically  
different.

Thank you again!

Tim



Quoting François Gillet <francois.gillet at univ-fcomte.fr>: