Hi,

I am working on a bacterial dataset from hundreds of plots distributed over a 10 x 10 meter grid. I was using the varpart function of vegan to identify the pure spatial effect, with the classical approach of calculating CCAs using orthogonal polynomials of my x,y coordinates. I have a significant community shift in some of the plots, and CCA depicted this general behaviour in ordination space better than RDA. Therefore, I am using CCA. My main way to deal with it is a generalized additive model using spatial coordinates, but I would like to use varpart with CCA as a supporting analysis as well. I wanted to compare that to the PCNM approach, but I guess I have messed it up.

I followed tutorials with the mite dataset, as given in the vegan vignette and as found on the web (written by Benoit Gendreau-Berthiaume). Usually I use my relative bacterial counts, as CCA is supposed to do its own data transformation anyway. I also limit my columns (OTUs) to obtain a matrix balanced between observations and samples, so rare species are usually not a problem.

Here is what I do:

spat <- as.data.frame(poly(as.matrix(spatxy), degree=3))
cca1_s <- cca(OTU~., data=spat)
# significances
anova(cca1_s)
anova(cca1_s, by="term", perm=999)
# forward selection for most parsimonious model
cca1_s.f <- ordistep(cca(OTU~1, data=spat), scope=formula(cca1_s), direction="forward", pstep=1000)
sig1_s.f <- anova(cca1_s.f, by="term", perm=999)

The result is a significant CCA object. spat is usable in varpart and yields a low but significant value for overall autocorrelation.

For PCNM I do:

rs <- rowSums(OTU)/sum(OTU)
pcnmw <- pcnm(dist(spatxy), w = rs)
cca1_pcnm <- cca(acido1 ~ scores(pcnmw))

pcnmw consists of 250 vectors, and the result is a non-significant CCA object, where I expected a "finer" spatial decomposition. The same is true if I use total count data (Hellinger-transformed or not).
I am sure I am doing it wrong, so if you have advice on how to do the calculation properly, please let me know. Thank you for the help.

--
View this message in context: http://r-sig-ecology.471788.n2.nabble.com/Partitioning-spatial-effects-using-trend-surface-analysis-or-PCNM-tp7579427.html
Sent from the r-sig-ecology mailing list archive at Nabble.com.
Partitioning spatial effects using trend surface analysis or PCNM
6 messages · Tim Richter-Heitmann, Gavin Simpson, François Gillet
Hi "trichter"
On 5 May 2015 at 13:34, trichter <trichter at uni-bremen.de> wrote:
<snip />
Here is what I do:
spat <- as.data.frame(poly(as.matrix(spatxy), degree=3))
cca1_s <- cca(OTU~., data=spat)
# significances
anova(cca1_s)
anova(cca1_s, by="term", perm=999)
I don't think the last analysis makes much sense; if you have cubic polynomials plus interactions, you should consider only the interactions for removal first, then decide whether quadratic rather than cubic terms are needed.
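To make the hierarchy among the trend-surface terms concrete, here is a minimal base-R sketch. It assumes a small regular grid standing in for `spatxy` (the real coordinates are not shown in the thread) and only inspects the terms that `poly()` generates; the names `total_degree`, `is_interaction` and `terms_overview` are illustrative.

```r
# Hypothetical coordinates: a 6 x 6 regular grid standing in for spatxy.
spatxy <- expand.grid(x = 1:6, y = 1:6)

# Same call as in the original post: orthogonal polynomials up to degree 3.
trend <- poly(as.matrix(spatxy), degree = 3)

# Column names encode the exponents of x and y, e.g. "2.1" is x^2 * y.
exponents <- strsplit(colnames(trend), ".", fixed = TRUE)
total_degree <- sapply(exponents, function(e) sum(as.integer(e)))
is_interaction <- sapply(exponents, function(e) all(as.integer(e) > 0))

# Overview of the terms, so that models can be compared degree by degree
# instead of testing every column as if it were an independent covariate.
terms_overview <- data.frame(
  term = colnames(trend),
  total_degree = total_degree,
  is_interaction = is_interaction
)
print(terms_overview)
```

Grouping the columns this way suggests comparing nested models (linear, quadratic, cubic, with or without interactions) rather than picking single terms.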
# forward selection for most parsimonious model
cca1_s.f <- ordistep(cca(OTU~1, data=spat), scope=formula(cca1_s), direction="forward", pstep=1000)
sig1_s.f <- anova(cca1_s.f, by="term", perm=999)
Again, as above, you have to be very careful with this. Just because you made a matrix with 9 "covariates" it doesn't mean it makes sense to cherry pick from these terms.
The result is a significant CCA object. spat is usable in varpart and yields a low but significant value for overall autocorrelation. For PCNM I do:
rs <- rowSums(OTU)/sum(OTU)
pcnmw <- pcnm(dist(spatxy), w = rs)
cca1_pcnm <- cca(acido1 ~ scores(pcnmw))
pcnmw consists of 250 vectors, and the result is a non-significant CCA object, where I expected a "finer" spatial decomposition.
You are supposed to choose from among the set of PCNMs those which explain the species data best, not use them all in the model. The problem appears to be that you have a model that is far too complex with lots of redundant axes (or more likely too few constraints).

One suggestion is to use only those PCNMs that have positive spatial correlation. Compute that using Moran's I, of which there are a few implementations around in various R packages. You can do the CCA analysis with the positive spatial correlation PCNMs separately from the negatively correlated PCNMs if you wish.

You will probably need to do some type of forward selection, but the preferred method seems to be limited to RDA (because the adjusted R2 measure used in the global significance test isn't worked out for CCA). If you skip the global test, you could just do forward selection on the positive PCNMs, but you probably want to try to control for accepting too many PCNMs by having a low entry threshold for significance.

HTH

G
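The suggestion to keep only PCNMs with positive spatial correlation can be sketched in base R. This is a minimal illustration under stated assumptions, not vegan's `pcnm()` itself: the 6 x 6 grid is a hypothetical stand-in for `spatxy`, the truncation threshold is simply the grid spacing, and Moran's I is computed by hand with binary neighbour weights.

```r
# Hypothetical sampling design: a 6 x 6 grid with 1 m spacing (stand-in for spatxy).
spatxy <- expand.grid(x = 1:6, y = 1:6)
n <- nrow(spatxy)
d <- as.matrix(dist(spatxy))

# Truncation threshold: on a regular grid, the nearest-neighbour distance.
thr <- 1
neighbours <- d > 0 & d <= thr + 1e-9

# PCNM step 1: replace all distances above the threshold by 4 * threshold.
d_trunc <- d
d_trunc[d > thr + 1e-9] <- 4 * thr

# PCNM step 2: principal coordinates analysis of the truncated distances,
# keeping only the axes with positive eigenvalues.
centring <- diag(n) - matrix(1 / n, n, n)
gower <- -0.5 * centring %*% d_trunc^2 %*% centring
eig <- eigen(gower, symmetric = TRUE)
keep <- eig$values > 1e-8 * max(eig$values)
pcnm_vectors <- eig$vectors[, keep, drop = FALSE]

# Moran's I of each retained vector, using binary neighbour weights.
w <- neighbours * 1
morans_i <- apply(pcnm_vectors, 2, function(v) {
  z <- v - mean(v)
  (n / sum(w)) * sum(w * outer(z, z)) / sum(z^2)
})

# Expected value of Moran's I in the absence of spatial autocorrelation.
expected_i <- -1 / (n - 1)

# Candidate set for forward selection: positively autocorrelated vectors only.
positive <- which(morans_i > expected_i)
cat(length(positive), "of", ncol(pcnm_vectors), "vectors have Moran's I above", round(expected_i, 3), "\n")
```

Only the columns indexed by `positive` would then go into a forward selection, which keeps the candidate set well below the 250 vectors mentioned in the original post.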
_______________________________________________ R-sig-ecology mailing list R-sig-ecology at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
Gavin Simpson, PhD
Thank you very much for addressing my problem. Maybe I can reformulate it in a different way, moving away from specific partial CCA/RDAs to the core of my task.

If said task is the approximate quantification of partial effects on my bacterial counts (for example edaphic soil properties, above-ground plant diversity, and spatial autocorrelation) via the varpart function in vegan, is the classical way of using orthogonal polynomials of the x,y axes as CCA/RDA constraints still considered valid?

If I understand you correctly, I would alternatively:
- generate PCNMs from my x,y coordinate matrices
- extract those with a Moran's I > 0
- perform an RDA/CCA with forward selection
- use the PCNMs found to be significant in the varpart function?

I think the only forward selection for CCA would be the ordistep function in vegan. What would be an acceptable threshold for entering into my final set of accepted significant PCNMs?

The other problem is that, on the one hand, RDA is not able to separate my community shifts as well as CCA; on the other hand, varpart is based on RDA. I wonder if I can justify using varpart when my ordination of choice is based on CCA. But I have never seen a dedicated variance partitioning function for CCA.

I just read an old answer of yours:
http://r.789695.n4.nabble.com/partitioning-variation-using-the-Vegan-CCA-routine-td823966.html

So, I can basically transform my raw data to chi2 and use them in an RDA to have a CCA proxy?

Thank you very much! As you can see, I am not really trained in statistics.

Tim
--
Tim Richter-Heitmann (M.Sc.)
PhD Candidate
International Max-Planck Research School for Marine Microbiology
University of Bremen
Microbial Ecophysiology Group (AG Friedrich)
FB02 - Biologie/Chemie
Leobener Straße (NW2 A2130)
D-28359 Bremen
Tel.: 0049(0)421 218-63062
Fax: 0049(0)421 218-63069
Hi Tim
On 6 May 2015 at 03:47, trichter <trichter at uni-bremen.de> wrote:
Thank you very much for addressing my problem. Maybe I can reformulate it in a different way, moving away from specific partial CCA/RDAs to the core of my task. If said task is the approximate quantification of partial effects on my bacterial counts (for example edaphic soil properties, above-ground plant diversity, and spatial autocorrelation) via the varpart function in vegan, is the classical way of using orthogonal polynomials of the x,y axes as CCA/RDA constraints still considered valid?
It is and remains valid; it just isn't set-up to find as wide a range of spatial patterns as PCNM. Now that may not be a bad thing in its entirety; we don't have a good way of doing feature selection in constrained ordination (and I don't consider the global R2 test followed by forward selection via R2 as a "good" method, it's just better than bog standard forward selection). Throwing a large set of PCNMs at an ordination sounds like a recipe for data dredging, *unless* you are very careful.
If I understand you correctly, I would alternatively:
- generate PCNMs from my x,y coordinate matrices
- extract those with a Moran's I > 0
- perform an RDA/CCA with forward selection
- use the PCNMs found to be significant in the varpart function?
I think the only forward selection for CCA would be the ordistep function in vegan. What would be an acceptable threshold for entering into my final set of accepted significant PCNMs?
Correct; the "newer" approach uses an adjusted R2 measure which has only been worked out and implemented for the RDA case. Rather than a 0.05 threshold as the baseline, I would go to say 0.01 as the threshold for inclusion. Then you also need to account for multiple testing so you adjust this p-value at each step in the forward selection process.
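A stricter entry threshold combined with a multiple-testing adjustment can be illustrated with base R's `p.adjust()`. This is only a sketch: the p-values are invented for illustration, and the Holm method is an assumption on my part, since the reply does not name a specific correction.

```r
# Hypothetical permutation p-values for four candidate PCNMs (illustrative only).
p_raw <- c(PCNM1 = 0.001, PCNM2 = 0.004, PCNM3 = 0.012, PCNM4 = 0.030)

# Holm adjustment for the number of tests performed (assumed method).
p_holm <- p.adjust(p_raw, method = "holm")

# Stricter entry threshold than the usual 0.05.
entry_threshold <- 0.01
accepted <- names(p_holm)[p_holm <= entry_threshold]

print(p_holm)
print(accepted)
```

Note that with 999 permutations the smallest attainable p-value is 0.001, so a heavily adjusted threshold may require more permutations to be reachable at all.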
The other problem is that, on the one hand, RDA is not able to separate my community shifts as well as CCA; on the other hand, varpart is based on RDA. I wonder if I can justify using varpart when my ordination of choice is based on CCA. But I have never seen a dedicated variance partitioning function for CCA. I just read an old answer of yours:
http://r.789695.n4.nabble.com/partitioning-variation-using-the-Vegan-CCA-routine-td823966.html
So, I can basically transform my raw data to chi2 and use them in an RDA to have a CCA proxy?
Yes; Pierre Legendre & Eugene Gallagher showed how this could be done in their 2001 Oecologia paper on Ecologically Meaningful Transformations. You won't get exactly a CCA by doing RDA on chi-square transformed data, but it will be close. You can also use the Hellinger transformation which worked well in the tests that Legendre & Gallagher did in their paper.
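The two transformations can be written out in a few lines of base R. This is a minimal sketch on a made-up count table (`otu` is hypothetical); in practice vegan's `decostand()` with `method = "hellinger"` or `method = "chi.square"` should give equivalent tables.

```r
# Hypothetical count table: 3 samples x 4 OTUs (illustrative numbers only).
otu <- matrix(c(10, 0, 5,  5,
                 2, 8, 0, 10,
                 0, 1, 9, 10),
              nrow = 3, byrow = TRUE,
              dimnames = list(paste0("site", 1:3), paste0("OTU", 1:4)))

# Hellinger: square root of the row-wise relative abundances.
hellinger <- sqrt(otu / rowSums(otu))

# Chi-square transformation: row-wise relative abundances divided by the
# square root of the column totals, scaled by the square root of the grand total.
chi_square <- sqrt(sum(otu)) *
  sweep(otu / rowSums(otu), 2, sqrt(colSums(otu)), "/")

# Euclidean distances on the transformed tables correspond to the
# Hellinger and chi-square distances, which is why RDA/PCA can be used on them.
print(dist(hellinger))
print(dist(chi_square))
```

An RDA on `chi_square` preserves the chi-square distance among sites, which is what makes it a close (but not exact) stand-in for CCA.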
Thank you very much! As you can see, I am not really trained in statistics.
You're welcome, G
Hi Tim,

Like CA, CCA is known to be very sensitive to rare species, and this may be part of the reason for the "shift" you observe in your communities, due to some OTUs. RDA is less sensitive to rare species and there is no need to remove them.

To benefit from the advantages of RDA mentioned by Gavin, you should pre-transform your community data frame with Hellinger (site profiles) or chi-square (double profiles, close to what is achieved in CCA) and check with a PCA whether you still observe a major shift along the first axis.

The choice of a method must not be guided by the presupposed better results you want to get ;-)

All the best,
François

-------------------------------------------------------------------------------
Prof. François Gillet
Université de Franche-Comté - CNRS
UMR 6249 Chrono-environnement
UFR Sciences et Techniques
16, Route de Gray
F-25030 Besançon cedex
France
http://chrono-environnement.univ-fcomte.fr/
Phone: +33 (0)3 81 66 62 81
iPhone: +33 (0)7 88 37 07 76
Location: La Bouloie, Bât. Propédeutique, -114L
-------------------------------------------------------------------------------
Associate Editor of Plant Ecology and Evolution
http://www.plecevo.eu
-------------------------------------------------------------------------------
Homepage: http://chrono-environnement.univ-fcomte.fr/spip.php?article530
ResearchID: http://www.researcherid.com/rid/B-6160-2008
Google Scholar: http://scholar.google.com.au/citations?user=a5xiIfQAAAAJ
-------------------------------------------------------------------------------
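The suggested check (transform, then look at a PCA) might look like the following base-R sketch. The data are simulated purely for illustration: 15 "majority" samples and 5 "shifted" samples with different dominant OTUs, loosely mimicking the situation described in the thread; all object names are hypothetical.

```r
set.seed(1)

# Simulated counts (illustrative only): 20 samples x 8 OTUs.
# Samples 1-15 are dominated by OTUs 1-4, samples 16-20 by OTUs 5-8.
lambda <- rbind(
  matrix(rep(c(50, 50, 50, 50, 1, 1, 1, 1), 15), nrow = 15, byrow = TRUE),
  matrix(rep(c(1, 1, 1, 1, 50, 50, 50, 50), 5), nrow = 5, byrow = TRUE)
)
otu <- matrix(rpois(length(lambda), lambda), nrow = nrow(lambda))
group <- rep(c("majority", "shifted"), times = c(15, 5))

# Hellinger transformation followed by an ordinary PCA.
hellinger <- sqrt(otu / rowSums(otu))
pca <- prcomp(hellinger, center = TRUE, scale. = FALSE)

# Share of variance per axis and the range of PC1 scores in each group.
explained <- pca$sdev^2 / sum(pca$sdev^2)
pc1 <- pca$x[, 1]
pc1_ranges <- tapply(pc1, group, range)

print(round(explained[1:2], 3))
print(pc1_ranges)
```

If the PC1 ranges of the two groups do not overlap, the shift is still visible after the transformation and is not merely a by-product of how CCA weights rare species.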
Hi,

thank you very much for your advice as well.

The point was that I had already seen the community shift by simple clustering/heatmapping of the abundance tables. In fact, less than 20% of OTUs are shared between the 10 samples featuring that dramatic community shift and the vast majority of the remaining samples. So the dominating OTUs in these 10 samples are indeed rare species in the dataset as a whole, but they are very dominant in those 10.

I started with the rule of thumb of checking the axis length of the DCA; it was long enough to decide on CCA for the downstream analysis. I have to admit that I never touched RDA until I realized that my variance partitioning approach (vegan's varpart) was actually using RDA rather than CCA. I thus calculated partial RDAs as well, but found that RDA could not separate my "outlying samples" from the rest. However, the separation is real in terms of OTU abundances, so I figured that RDA is not the best tool to visualize my data. I have now learned that CCA was just very sensitive to these highly different samples (in fact, in the ordination plot, CCA1 only separates the 10 samples from the rest and does not resolve the remaining samples; these are separated by CCA2).

I will now run varpart with RDA on both Hellinger- and chi2-transformed OTU tables and see whether the results are dramatically different.

Thank you again!
Tim
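The planned comparison could be sketched as follows with vegan. Since the OTU table is not available here, the example uses the mite data shipped with vegan (the same data used in the tutorials mentioned at the start of the thread); the choice of `SubsDens + WatrCont` as the environmental table and the helper `adj_r2()` are assumptions made for illustration.

```r
library(vegan)

# Example data shipped with vegan: oribatid mite counts, environment, PCNMs.
data(mite)
data(mite.env)
data(mite.pcnm)

# Two pre-transformations of the community table.
mite_hel <- decostand(mite, method = "hellinger")
mite_chi <- decostand(mite, method = "chi.square")

# Variation partitioning between environment and space for each version.
vp_hel <- varpart(mite_hel, ~ SubsDens + WatrCont, mite.pcnm, data = mite.env)
vp_chi <- varpart(mite_chi, ~ SubsDens + WatrCont, mite.pcnm, data = mite.env)

# Hypothetical helper: pull the adjusted R2 of the individual fractions.
# The column is located by pattern to avoid depending on its exact name.
adj_r2 <- function(vp) {
  tab <- vp$part$indfract
  setNames(tab[[grep("^Adj", names(tab))]], rownames(tab))
}

comparison <- cbind(hellinger = adj_r2(vp_hel), chi_square = adj_r2(vp_chi))
print(round(comparison, 3))
```

Large differences between the two columns would indicate that the partition depends strongly on how rare taxa are weighted by the transformation.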