Relating abundance and cover data
Dear Karen,

seconding the comments of Phil and Etienne: one key question is whether you can assume no error in the values of your predictors (i.e. run a model I regression). If you can, Ben Bolker's comments point in the right direction; if you cannot, I would favour Etienne's "simplistic" approach and back up your results with a bit of "robustness testing".

E.g. perturb/jitter your values and see whether it makes a difference to your regression. This may not be "official" stats, but it should show clear differences when the pattern is not robust. For example, the many 0s in your data may be caused by detection problems (rather than true absences), and hence giving them a random low cover/abundance (e.g. 1/2 of the respective minimum value) should NOT change your results. If it does, I would interpret this as the data not supporting a clear correlation between abundance and cover.

HTH,
Carsten
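The perturbation check suggested above could be sketched in R roughly as follows. This is only an illustration under stated assumptions: the data are simulated stand-ins for real abundance and cover values, the helper perturb_zeros() is hypothetical, and Spearman's rank correlation is an assumed choice of statistic (the same loop would work with a regression slope instead).

```r
# Simulated stand-in data (an assumption): relative abundance and cover (%)
# with many zeros and a right-skewed distribution.
set.seed(1)
n <- 100
abundance <- ifelse(runif(n) < 0.3, 0, rgamma(n, shape = 1, scale = 10))
cover <- ifelse(abundance == 0 | runif(n) < 0.1, 0,
                abundance * exp(rnorm(n, sd = 0.5)))

# Hypothetical helper: replace each zero by a random low value that is at
# most half of the smallest positive value in the vector.
perturb_zeros <- function(x) {
  upper <- min(x[x > 0]) / 2
  zero <- x == 0
  x[zero] <- runif(sum(zero), min = upper / 10, max = upper)
  x
}

# Correlation on the data as recorded (zeros kept as zeros).
rho_orig <- cor(abundance, cover, method = "spearman")

# Correlation after repeatedly perturbing the zeros.
n_rep <- 200
rho_pert <- replicate(
  n_rep,
  cor(perturb_zeros(abundance), perturb_zeros(cover), method = "spearman")
)

# Compare the original value with the spread of the perturbed values.
print(rho_orig)
print(quantile(rho_pert, c(0.025, 0.5, 0.975)))
```

If the perturbed correlations scatter tightly around the original value, the pattern does not hinge on how the zeros are treated; a wide or shifted spread would suggest the opposite.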
On 26.10.10 11:27, Karen Kotschy wrote:
Dear list

This seems like something I really should know by now, but I'm getting so confused, I'd really appreciate a little help!

I am trying to model the relationship between relative abundance (%) and relative cover (%) data for plant species. I want to know to what extent the 2 measures correlate, and to compare the extent of this correlation at different sites.

Obviously, both sets of data are zero-inflated and highly skewed. The "traditional" thing to do would be to log-transform both of them and use lm(). However, a recent paper (O'Hara & Kotze, 2010) argues that a much better approach is to use glm() and to specify Poisson or negative binomial models, rather than using transformations. This does make a lot of sense, I think!

I have tried using "quasipoisson" and "quasibinomial" families in glm(), but I am left with a number of questions:

1) Should relative abundance and relative cover be treated as "count" data, given that the values are not actually integers but rather percentages?
2) Which parts of the output of glm(...family=quasipoisson(link=log)) do I use to evaluate the fit? Just residual deviance and the p value?
3) How do I plot the data so as to graphically represent the model? If I am using a log link should I use log axes for x and y?

Thanks so much for any help!
Karen

---
Karen Kotschy
Centre for Water in the Environment
University of the Witwatersrand, Johannesburg
Tel: +2711 717-6425
Dr. Carsten F. Dormann
Department of Computational Landscape Ecology
Helmholtz Centre for Environmental Research-UFZ
(Department Landschaftsökologie)
(Helmholtz Zentrum für Umweltforschung - UFZ)
Permoserstr. 15
04318 Leipzig
Germany
Tel: ++49(0)341 2351946
Fax: ++49(0)341 2351939
Email: carsten.dormann at ufz.de
internet: http://www.ufz.de/index.php?de=4205