ANOSIM in vegan
7 messages · Soumi Ray, Jari Oksanen, Michael Gerisch +1 more
On 12/11/10 02:23 AM, "Soumi Ray" <soumiray74 at gmail.com> wrote:
Hi, I have a dataset consisting of species collected from the same location during 2 time periods, and I want to see if the community composition is similar during the two time periods. My entire dataset is presence/absence (0/1) data. There are around 23 species and 400 samples during each time period, so 800 samples in total. Would ANOSIM from the vegan package be the right test to apply? I was going through some papers online that used methods like db-RDA in similar situations. Would it be right to use it for qualitative data? Any suggestion would be of great help.
Soumi, I will only comment on the db-RDA/anosim choice: if you can use one, you can use the other. They are very similar and have the same limitations and "assumptions". Both are based on dissimilarity measures, and you can use the same dissimilarities in both methods. They also handle the dissimilarities very similarly. Overall tests for db-RDA by terms (as implemented in anova(..., by = "term") for vegan::capscale) and adonis tests give very similar results. However, they are not identical. The difference is that non-Euclidean dissimilarities produce some negative eigenvalues. These are ignored in db-RDA (capscale), but they are taken into account in adonis. Which method to use depends on your questions, and on what else you want to do with your data besides getting the test statistics. Cheers, Jari Oksanen
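Both approaches start from the same sample-by-sample dissimilarity matrix. As a rough illustration of that first step for presence/absence data, here is a minimal pure-Python sketch of the Jaccard dissimilarity. It is not vegan's code (in vegan the matrix would normally come from vegdist()); the function names and the toy samples are made up for illustration, and returning 0 for two empty samples is an arbitrary choice made here.

```python
from itertools import combinations


def jaccard_dissimilarity(a, b):
    """Jaccard dissimilarity between two 0/1 vectors of equal length."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Assumption: two samples with no species at all are treated as identical.
    return 0.0 if either == 0 else 1.0 - shared / either


def dissimilarity_matrix(samples):
    """Symmetric matrix of pairwise Jaccard dissimilarities."""
    n = len(samples)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = jaccard_dissimilarity(samples[i], samples[j])
    return d


# Hypothetical data: 3 samples (rows) by 5 species (columns).
samples = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 0, 1, 1, 0],
]
d = dissimilarity_matrix(samples)
for row in d:
    print([round(value, 3) for value in row])
```

Samples 1 and 2 share two of their three species and so end up close together, while sample 3 shares nothing with either and sits at the maximum dissimilarity of 1.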
On Mon, 2010-11-15 at 10:09 -0500, Soumi Ray wrote:
Thanks Jari, for your explanation!
Could anyone explain what the axis scores ("points" in the metaMDS
results) represent? Are they similar to PCA factor loadings? What do these
numerical values represent?
The best example I can come up with is that these are coordinates, like map coordinates, in the ordination space; that is all. They are certainly not "axis scores" in the PCA sense, which would imply that the axes are independent. You need both coordinates in a 2-D solution (all k of them in a k-D solution) to represent the "distances" between your samples in terms of species composition. The scores are the "best" mapping of the n-dimensional dissimilarity matrix (n == number of sites or samples) into a k-dimensional space, where "best" means i) subject to the algorithm converging to a suitable global minimum, and ii) the mapping is with regard to the /rank/ ordering of the dissimilarities, not their actual values. HTH G
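To make point ii) concrete, here is a small Python sketch, using made-up dissimilarities, showing that a strictly increasing transformation changes the values but leaves their rank order untouched. It does not run an ordination; it only illustrates the information that a rank-based mapping works from.

```python
import math


def rank_order(values):
    """Indices of `values`, sorted from smallest to largest dissimilarity."""
    return sorted(range(len(values)), key=lambda i: values[i])


# Hypothetical dissimilarities for the sample pairs
# (A,B), (A,C), (A,D), (B,C), (B,D), (C,D).
dissimilarities = [0.10, 0.80, 0.55, 0.75, 0.40, 0.20]

# A strictly increasing transformation (here the square root) changes
# every value but not which pair is closest, second closest, and so on.
transformed = [math.sqrt(d) for d in dissimilarities]

print(rank_order(dissimilarities))
print(rank_order(transformed))
```

Both calls give the same ordering of pairs, which is why the coordinates should be read as a map of relative positions rather than as values on independent axes.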
Thank you once again, Soumi
_______________________________________________ R-sig-ecology mailing list R-sig-ecology at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Hi Soumi, this just came to mind, and I may be wrong: it sounds to me like your samples are not independent, since you sample 400 plots twice. There is a high probability that the two communities from the same site are more alike simply because they come from the same site, and not because of some environmental factor. Call it pseudoreplication or something else, but I think a large part of the "similarity" will be due to this fact, unless you have something like an extreme event between the dates which sort of "resets" your communities... maybe then it's appropriate. But I don't know whether anosim assumes independence anyway. Maybe there is a "repeated measures" variant? Or maybe I am totally wrong... Sorry, I did not want to confuse, but this would also be interesting for me. cheers michael
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Michael Gerisch Helmholtz-Zentrum für Umweltforschung - UFZ Helmholtz Centre for Environmental Research - UFZ Department Naturschutzforschung (Conservation Biology) Permoserstraße 15 04318 Leipzig, Germany Phone: ++49 - (0)341-235 1643 Fax: ++49 - (0)341-235 3191 E-mail: michael.gerisch at ufz.de http://www.ufz.de/index.php?en=15479 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Please use encrypted mail transfer whenever possible. Public key is stored on my webpage. Try linux.
On Wed, 2010-11-24 at 10:21 +0100, Michael Gerisch wrote:
Hi Soumi, this just came to mind, and I may be wrong: it sounds to me like your samples are not independent, since you sample 400 plots twice. There is a high probability that the two communities from the same site are more alike simply because they come from the same site, and not because of some environmental factor. Call it pseudoreplication or something else, but I think a large part of the "similarity" will be due to this fact, unless you have something like an extreme event between the dates which sort of "resets" your communities... maybe then it's appropriate. But I don't know whether anosim assumes independence anyway. Maybe there is a "repeated measures" variant? Or maybe I am totally wrong...
I don't think ANOSIM has many assumptions at all. We test using permutations, and so long as you can generate an appropriate Null distribution that respects the dependence structures in the data, we can provide a test. Generating the appropriate Null requires doing a lot by hand at the moment if you want anything but random permutations (or random permutations within blocks).

Think of Soumi's question the other way round. We want to test whether there is a difference in community composition in the samples between times A and B. The TIME variable would explain all the temporal dependence structure. If there is no other dependence structure (say spatial, or spatially correlated environmental dependence, i.e. the effect of an environmental variable across the survey sites), then we could generate the Null distribution via random permutation; the residuals are assumed random in that case.

The issue at hand (in your email) is dependence in the residuals, which is an issue for statistical methods in general. If you model this dependence such that the residuals meet the assumptions of the method, you are fine. It is only an issue when you fail to (or can't) model the dependence structure as part of the statistical method itself, because then the residuals will not be independent, identically distributed, etc.

HTH G
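As a rough sketch of what "random within blocks" means for a design where each site is sampled at both times, the following Python function shuffles the time labels only among samples from the same site. The site names, labels and the function name are hypothetical, and this is not how any particular package implements restricted permutations.

```python
import random


def permute_within_blocks(labels, blocks, rng):
    """Shuffle `labels` only among samples that share the same block."""
    members = {}
    for index, block in enumerate(blocks):
        members.setdefault(block, []).append(index)

    permuted = list(labels)
    for indices in members.values():
        shuffled = [labels[i] for i in indices]
        rng.shuffle(shuffled)
        for i, label in zip(indices, shuffled):
            permuted[i] = label
    return permuted


# Hypothetical layout: three sites, each sampled at times "A" and "B".
sites = ["s1", "s1", "s2", "s2", "s3", "s3"]
times = ["A", "B", "A", "B", "A", "B"]

rng = random.Random(42)
permuted_times = permute_within_blocks(times, sites, rng)
print(permuted_times)
```

Every permutation keeps one "A" and one "B" at each site, so the pairing of samples within sites is preserved in the Null distribution instead of being broken up by unrestricted shuffling.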
Soumi, you can't do what you want with these methods. Think about what you are trying to do. You want to know if the community composition is the same (or similar enough) at two time points. This is a test of equivalence and is the exact opposite of what we normally test in classical statistics.

We normally test for a difference in the response due to one or more covariates. In other words, we ask whether the covariates can explain variation in the response, and we test whether the amount of variation explained is large relative to the unexplained variation. If there is no difference, then there is nothing to explain.

If you were to run your analysis through adonis() or similar, we would test whether the amount of variation in the response explained by TIME is as large as or larger than some extreme quantile of a permutation-derived null distribution of the variation explained when there is no difference between the community composition of the two time periods. This means we set up the Null of no difference and the Alternative of some difference. We see whether the test statistic is likely to have arisen under the Null. If it is not, we say we have evidence *against* the Null and reject it in favour of the Alternative. We have not tested the Alternative.

You are interested in the opposite: that there *isn't* a difference. But you can't use a non-significant result from the above test as evidence in support of the Null, because the permutation p-value (just like any other p-value) tells you nothing about the probability of the Null hypothesis (the thing you are actually interested in) being TRUE; when the Null is true, the p-value is a uniform random variable. In short, you can test for a difference, but not for no difference, in community composition.

Andrew Robinson has a package on CRAN to do equivalence tests: http://cran.r-project.org/web/packages/equivalence/index.html but it is not set up to do the sort of analysis you want with ANOSIM. Whether you could process your species data in some way so that you could use his code is another matter, but beyond the scope of an email list for help.

Please note I have no idea what papers you referred to above, and the comments I make are not comments on their methodology. For all I know, their Alternative was one of difference and thus they were right to use normal methods.

HTH G
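The logic of testing for a difference can be sketched in a few lines of Python. The statistic used here (mean between-group minus mean within-group Jaccard dissimilarity) is a deliberately simple stand-in chosen for illustration; it is neither the ANOSIM R statistic nor the pseudo-F used by adonis(), and the data, function names and default seed are invented.

```python
import random
from itertools import combinations


def jaccard(a, b):
    """Jaccard dissimilarity between two 0/1 vectors."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if either == 0 else 1.0 - shared / either


def between_minus_within(samples, groups):
    """Mean between-group minus mean within-group dissimilarity."""
    between, within = [], []
    for i, j in combinations(range(len(samples)), 2):
        d = jaccard(samples[i], samples[j])
        (within if groups[i] == groups[j] else between).append(d)
    return sum(between) / len(between) - sum(within) / len(within)


def permutation_p_value(samples, groups, n_perm=999, seed=1):
    """One-sided permutation p-value under the Null of no group difference."""
    rng = random.Random(seed)
    observed = between_minus_within(samples, groups)
    shuffled = list(groups)
    as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # unrestricted permutation of the labels
        if between_minus_within(samples, shuffled) >= observed:
            as_extreme += 1
    # The observed arrangement counts as one of the permutations.
    return observed, (as_extreme + 1) / (n_perm + 1)


# Hypothetical presence/absence data: 4 samples per time period.
samples = [
    [1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0], [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1], [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1], [0, 0, 0, 0, 1, 1],
]
groups = ["A"] * 4 + ["B"] * 4

observed, p_value = permutation_p_value(samples, groups)
print(observed, p_value)
```

A small p-value here is evidence against the Null of no difference. A large one would only mean that this test failed to detect a difference, which is not the same as showing that the two periods are equivalent.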