Testing difference between diversity indices with vegan::oecosimu
9 messages · Kay Cichini, Chris Howden, David Valentim Dias +3 more
Why not try some type of ANOVA style glm?
Chris Howden
Founding Partner, Tricky Solutions
chris at trickysolutions.com.au
On 26/04/2012, at 7:19, Kay Cichini <kay.cichini at gmail.com> wrote:
Hello all,
I'd like to test whether total diversity differs between two communities. For
each community several samples were taken, and abundances were collapsed over
groups to compute total diversity for each group. I tried to use
vegan::oecosimu to test non-randomness of my statistic (difference in
Simpson diversity indices of collapsed abundances). However, I am not
quite sure whether I am overlooking possible pitfalls:
library(vegan)
data(dune)
# a grouping variable:
gr <- gl(2, nrow(dune)/2)
divdiff <- function(x) abs(diversity(colSums(x[gr == "1", ]), "simp") -
                           diversity(colSums(x[gr == "2", ]), "simp"))
# testing function:
divdiff(dune)
oecosimu(dune, divdiff, "r2dtable", nsimul = 1999)
# oecosimu with 1999 simulations
# simulation method r2dtable
# alternative hypothesis: true mean is not equal to the statistic
#           statistic        z     2.5%      50% 97.5% Pr(sim.)
# statistic   0.00275 -0.20996  0.00013  0.00280  0.01     0.98
_______________________________________________ R-sig-ecology mailing list R-sig-ecology at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
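The ANOVA-style suggestion above could be sketched roughly as follows. This is only an illustration under one assumption that is not in the original post: the response is the Simpson index of each individual plot, not of the pooled abundances. Pooling leaves a single value per group, so there would be nothing for an ANOVA to work with. lm() is used here because it fits the same model as a Gaussian glm().

```r
library(vegan)
data(dune)

# Grouping variable as in the original post: first half vs second half of plots
gr <- gl(2, nrow(dune) / 2)

# ASSUMPTION: per-plot Simpson index as the response (not the pooled index)
simp <- diversity(dune, index = "simpson")

# One-way ANOVA of per-plot diversity on group
fit <- lm(simp ~ gr)
tab <- anova(fit)
print(tab)

# Effect size: difference in mean per-plot diversity (group 2 minus group 1)
coef(fit)["gr2"]
```

Note that this tests a difference in mean per-plot diversity, which is a different question from a difference in the total diversity of each community.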
Standard statistical hypothesis testing often starts with the null hypothesis that two things are identical, for example that two population means are equal. The p-value is then used to decide whether to reject this null in favour of the alternative, that they are indeed different. In practice we are asking whether we have enough information to say they are different.

I do agree, though, that stopping there is a bit silly. If there is a statistically significant difference, we next need to look at the effect size, in other words the magnitude of the difference, and decide whether it is ecologically meaningful.

Chris Howden B.Sc. (Hons) GStat.
Founding Partner, Tricky Solutions
chris at trickysolutions.com.au
-----Original Message-----
From: r-sig-ecology-bounces at r-project.org [mailto:r-sig-ecology-bounces at r-project.org] On Behalf Of David Valentim Dias
Sent: Thursday, 26 April 2012 2:36 PM
To: r-sig-ecology at r-project.org
Subject: Re: [R-sig-eco] Testing difference between diversity indices with vegan::oecosimu

Hello Cichini,
I cannot help with your code, but it seems like you have a silly hypothesis. Think about it: what is the probability of two communities being identical? You need to restate it in some more useful way. We already know most things are different, but with what magnitude? Which factors are causing these changes? How do these changes matter for the environment and for us?
--
CV: http://lattes.cnpq.br/7541377569511492
On Thu, 2012-04-26 at 00:36 -0400, David Valentim Dias wrote:
Hello Cichini, I cannot help with your code, but it seems like you have a silly hypothesis. Think about it: what is the probability of two communities being identical? You need to restate it in some more useful way. We already know most things are different, but with what magnitude? Which factors are causing these changes? How do these changes matter for the environment and for us?
Surely if we knew the two things were different there would be no need to test whether they were? Most statistical tests assume a null model because under it we can say something specific about the magnitude of the difference (it is zero), and we can then see whether the observations are consistent with that model. I agree that subsequent analysis is required to understand why there are differences, but we still need a mechanism to say, given the data collected and the error processes, are the diversities of these two "samples" the same?
G
Dr. Gavin Simpson
ECRC, UCL Geography, Pearson Building, Gower Street, London, UK. WC1E 6BT.
[t] +44 (0)20 7679 0522
[f] +44 (0)20 7679 0565
[e] gavin.simpsonATNOSPAMucl.ac.uk
[w] http://www.ucl.ac.uk/~ucfagls/
[w] http://www.freshwaters.org.uk
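One way to obtain a null distribution in which the difference is zero, and the one favoured elsewhere in this thread, is to permute the grouping vector and recompute the statistic each time. The following is a minimal sketch of that idea, not code posted by any of the participants. The two-argument form of divdiff(), the seed, and the names obs, perm and p_value are assumptions made for illustration. Results need not match the numbers quoted in the thread, which came from a 2012 version of vegan.

```r
library(vegan)
data(dune)

# Grouping variable as in the original post
gr <- gl(2, nrow(dune) / 2)

# Absolute difference in Simpson diversity of the pooled abundances.
# The grouping is an explicit argument so that it can be permuted.
divdiff <- function(x, g) {
  abs(diversity(colSums(x[g == "1", ]), "simpson") -
      diversity(colSums(x[g == "2", ]), "simpson"))
}

set.seed(1)    # arbitrary seed, only for reproducibility
nperm <- 1999

# Observed statistic
obs <- divdiff(dune, gr)

# Null distribution: shuffle plot labels, keep the community data untouched
perm <- replicate(nperm, divdiff(dune, sample(gr)))

# Permutation P-value; the observed value counts as one of the permutations
p_value <- (sum(perm >= obs) + 1) / (nperm + 1)
p_value
```

Because the statistic is an absolute difference, large values in either direction count against the null, so a single upper-tail comparison is enough.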
Dear Kay,

I am not sure about any possible pitfalls with your approach, but I have tested the same data using the randomisation functions of the "rich" library, and found that neither the Simpson diversity nor the simple species richness differ significantly among the defined groups. Here are the results following your example:

library(rich)
# prepare data
one <- as.data.frame(dune[gr == "1", ])
two <- as.data.frame(dune[gr == "2", ])
data <- list(one, two)
# compare cumulative species richness
c2cv(com1=data[[1]],com2=data[[2]],nrandom=1999)
#$res
#
#cv1 27.0000
#cv2 28.0000
#cv1-cv2 -1.0000
#p 0.4220 # N.S.
#quantile 0.025 -4.0000
#quantile 0.975 4.0000
#randomized cv1-cv2 0.0225
#nrandom 1999.0000
# compare the Simpson diversity
simp.one <- diversity(dune[gr == "1", ], "simp")
simp.two <- diversity(dune[gr == "2", ], "simp")
c2m(pop1=simp.one,pop2=simp.two,nrandom=1999,verbose=FALSE)
#done.
#$res
#
#mv1 8.630e-01
#mv2 8.773e-01
#mv1-mv2 -1.439e-02
#p 2.440e-01 # N.S.
#quantile 0.025 -3.456e-02
#quantile 0.975 3.351e-02
#randomized mv1-mv2 3.899e-04
#nrandom 1.999e+03
#########################

The possible pitfalls might be hidden under the different results ;-)

Cheers,
Ivailo
UBUNTU: a person is a person through other persons.
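Part of the difference between the two sets of results may simply be that they use different statistics. The original post takes the Simpson index of the pooled abundances of each group, whereas the c2m() call above is given one Simpson index per plot and, judging by its output, appears to compare group means. A small sketch using only vegan puts the two quantities side by side. The object names are illustrative, and the values need not match those quoted above, which came from a 2012 version of vegan.

```r
library(vegan)
data(dune)

# Grouping variable as in the original post
gr <- gl(2, nrow(dune) / 2)

# Statistic from the original post: Simpson index of pooled abundances per group
pooled <- sapply(levels(gr), function(l) {
  diversity(colSums(dune[gr == l, ]), "simpson")
})

# Quantity passed to c2m(): one Simpson index per plot, summarised by group mean
per_plot <- diversity(dune, "simpson")
plot_means <- tapply(per_plot, gr, mean)

# Rows: the two statistics; columns: the two groups
rbind(pooled = pooled, mean_per_plot = plot_means)
```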
Kay,

I think that Gav's suggestion is the most natural one: permute your classification vector and compare your observed difference to the permutation values. Null models can be problematic, and you must think very carefully about what kind of null model you need and what the null hypothesis is under each null model. Quantitative null models are even trickier. I see the following possible problems with your idea:

- You used the "r2dtable" null model, which fixes both row and column totals (but not frequencies). This means that the overall gamma diversity is fixed in all simulations: the Simpson index is found from species totals, and these are fixed. When you also fix row totals, the generated null communities can be too similar to each other, and this in turn gives too low P-values. I think that when analysing overall diversities from marginal sums, you should use a null model that allows those marginal sums to vary. This may not be possible with the release version of vegan, but the development version on R-Forge has a completely redesigned null model engine with several new quantitative null models, and it allows plugging in your own null models (which could even include permutation models).

- If usual null models can be painful, the quantitative null models give you double trouble. One problem is that they produce too evenly distributed data. For "r2dtable" this holds in two ways: the method fixes marginal totals, but not marginal frequencies (= number of non-zero cells). Typically the number of zeros is much lower than in real data, and the variance of rows and columns is lower than in any real data. Moreover, the simulated samples are often much more similar to each other than real re-sampling of Nature. This is like using a Poisson glm for abundance data: the data are regularly over-dispersed relative to the Poisson, and therefore the P-values are too low. You have just the same danger with these null models: the simulation variation is too low, and therefore your P-values are too low.

- The "r2dtable" method requires that your data are counts of individuals: it is individuals that are swapped between cells. You used the Dutch dune meadow data in your example. Technically this works, since the data are integers, but they are cover class values and not individuals, and therefore swapping integer pieces of cover classes has no meaning. If you want to consider null models, you should again switch to the R-Forge version of vegan (currently at version 2.1-15), which allows some models that apply to data not made of individuals, and also some methods that can retain the original marginal variances of the data.

There are many things that you need to consider if you want to use null models. However, I think that permutation of the classification vector saves a lot of trouble, and is more easily understood and communicated.

Cheers,
Jari Oksanen
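The first two points above can be checked directly with stats::r2dtable(), the base R function of the same name as the null model. The sketch below is illustrative only: the number of simulations, the seed and the object names are assumptions. It confirms that species totals, and hence the overall Simpson index, are identical in every simulated table, and it counts zero cells so that the fill of the simulated tables can be compared with the real data.

```r
library(vegan)
data(dune)

m <- as.matrix(dune)

set.seed(1)    # arbitrary seed, only for reproducibility
nsim <- 99

# Random tables with the same row and column totals as the data
sims <- r2dtable(nsim, rowSums(m), colSums(m))

# Species (column) totals are the same in every simulated table ...
same_cols <- all(vapply(sims, function(s) all(colSums(s) == colSums(m)),
                        logical(1)))

# ... so the Simpson index of the species totals never changes
gamma_obs <- diversity(colSums(m), "simpson")
gamma_sim <- vapply(sims, function(s) diversity(colSums(s), "simpson"),
                    numeric(1))
range(gamma_sim - gamma_obs)

# Number of zero cells: observed data versus simulated tables
zeros_obs <- sum(m == 0)
zeros_sim <- vapply(sims, function(s) sum(s == 0), numeric(1))
c(observed = zeros_obs, simulated_mean = mean(zeros_sim))
```

Only the split of each species total between the two groups varies across simulations, which is one reason the simulated differences can be less variable than real resampling would be.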