No No No No No!
The log likelihoods of the Poisson and the Gaussian are not comparable.
One is a discrete distribution and the other is continuous, so the first
is a sum of log probabilities and the second a sum of log densities. You
can get into all sorts of trouble there, and not just in pathological
cases. They are on totally different scales.
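A minimal R sketch of the scale problem, on simulated counts (the data and variable names are made up for illustration, not Joao's data). It shows that dividing the response by a constant shifts the Gaussian log likelihood by n*log(constant), because it is built from densities, while the Poisson log likelihood is a sum of log probabilities and can never exceed zero:

```r
set.seed(1)
n <- 50
x <- runif(n)
y <- rpois(n, lambda = exp(2 + x))   # simulated counts, purely illustrative
y_scaled <- y / 100                  # same data expressed in different units

ll_gauss        <- as.numeric(logLik(glm(y ~ x, family = gaussian)))
ll_gauss_scaled <- as.numeric(logLik(glm(y_scaled ~ x, family = gaussian)))
ll_pois         <- as.numeric(logLik(glm(y ~ x, family = poisson)))

# The Gaussian log likelihood moves by n * log(100) under a change of units
shift <- ll_gauss_scaled - ll_gauss
print(c(shift = shift, expected = n * log(100)))

# The Poisson log likelihood is a sum of log probabilities, hence <= 0
print(ll_pois)
```

Since the Gaussian value depends on the units of y and the Poisson value does not, ranking the two by logLik is not meaningful.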
You need to decide whether you want to model the MEAN species richness
as continuous, and not worry about answers like 3.1 species: you are
modeling the mean.
Or go with a discrete distribution like the Poisson or quasi-Poisson;
you can test for overdispersion within a discrete family of
distributions. As someone mentioned before, if your counts are away from
zero, the Poisson is very symmetric and goes asymptotically to a normal,
so for practical purposes your results should be similar. For small
samples, i.e. with categorical predictors and few counts per cell, it
can make a difference.
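One common way to check for overdispersion, sketched here on simulated data (the data are an assumption for illustration), is to divide the Pearson chi-square of a Poisson fit by its residual degrees of freedom. Values well above 1 suggest overdispersion; the quasi-Poisson fit reports the same quantity as its dispersion estimate:

```r
set.seed(2)
n <- 200
x <- runif(n)
# Simulated counts that are overdispersed relative to a Poisson
y <- rnbinom(n, mu = exp(1 + x), size = 1)

fit_pois <- glm(y ~ x, family = poisson)

# Pearson chi-square divided by residual degrees of freedom
dispersion <- sum(residuals(fit_pois, type = "pearson")^2) /
  df.residual(fit_pois)

# The quasi-Poisson fit estimates the same dispersion parameter
fit_quasi <- glm(y ~ x, family = quasipoisson)
dispersion_quasi <- summary(fit_quasi)$dispersion

print(c(pearson = dispersion, quasipoisson = dispersion_quasi))
```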
So, if you want to do model selection, you first have to choose
discrete or continuous, and then compare log likelihoods within that set.
(You are on firmer ground if the models are somehow nested.)
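As a sketch of a comparison that stays within the discrete family, assuming simulated data and package MASS: the Poisson is the limiting case of the negative binomial as theta grows large, so these two models are nested in that sense and their log likelihoods are on the same scale.

```r
library(MASS)  # provides glm.nb

set.seed(3)
n <- 200
x <- runif(n)
y <- rnbinom(n, mu = exp(1 + x), size = 2)  # simulated, overdispersed counts

fit_pois <- glm(y ~ x, family = poisson)
fit_nb   <- glm.nb(y ~ x)

# Both are log probabilities of the same counts, so they are comparable
ll  <- c(poisson = as.numeric(logLik(fit_pois)),
         negbin  = as.numeric(logLik(fit_nb)))
aic <- c(poisson = AIC(fit_pois), negbin = AIC(fit_nb))

print(ll)
print(aic)
```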
Nicholas
Message: 10
Date: Fri, 02 Oct 2009 08:29:10 +0200
From: Carsten Dormann <carsten.dormann at ufz.de>
Subject: Re: [R-sig-eco] Negative binomial
To: "Canning-Clode, Joao" <Canning-ClodeJ at si.edu>
Cc: "r-sig-ecology at r-project.org" <r-sig-ecology at r-project.org>
Message-ID: <4AC59DB6.9030001 at ufz.de>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Dear Joao,
I propose you do the following (and wait for the outcry-responses to
this email to see if it is a reasonable proposal):
Fit your model with different types of distributions and compare their
logLik-values:
library(MASS) # for glm.nb
logLik(glm(y ~ x1+x2+x3+I(x1^2) + x1:x3, family=gaussian))
logLik(glm(y ~ x1+x2+x3+I(x1^2) + x1:x3, family=poisson))
logLik(glm(y ~ x1+x2+x3+I(x1^2) + x1:x3, family=quasipoisson)) # returns NA: quasi families define no likelihood
logLik(glm.nb(y ~ x1+x2+x3+I(x1^2) + x1:x3))
The model with the highest log-likelihood is the distribution of choice
and you can defend it against reviewers.
A few notes:
1. You obviously cannot do this when one of the models uses transformed
responses (e.g. log(y)), because the LL will then be completely
different.
2. When you use a more complex model (say a GLMM), you can approximate
the neg.bin through a two-step procedure. First, fit a (wrongly
structured) glm.nb and extract the theta value from the summary of the
model, say theta=4.5 (that is the second parameter of the neg.bin
distribution). Then fit the GLMM, giving as family the argument
negative.binomial(theta=4.5) (again from package MASS). The same holds
for GAMs and other models requiring a specification of family.
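A minimal sketch of the two-step procedure in note 2, on simulated data and with a plain glm standing in for the second-stage model (the data, the variable names and the use of glm instead of a GLMM are assumptions made to keep the example self-contained):

```r
library(MASS)  # provides glm.nb and negative.binomial

set.seed(4)
n <- 200
x <- runif(n)
y <- rnbinom(n, mu = exp(1 + 0.5 * x), size = 4.5)  # simulated counts

# Step 1: fit glm.nb and extract the estimated theta
fit_nb    <- glm.nb(y ~ x)
theta_hat <- fit_nb$theta

# Step 2: refit with theta held fixed, passing the family object explicitly
fit_fixed <- glm(y ~ x, family = negative.binomial(theta = theta_hat))

print(theta_hat)
print(rbind(glm.nb = coef(fit_nb), fixed_theta = coef(fit_fixed)))
```

In the second step theta is treated as known, so the standard errors of the refitted model do not reflect the uncertainty in estimating it.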
3. You may want to dig around for books recommending the above
procedure. I think I got this as advice from someone else, but haven't
bothered yet to look it up (obviously MASS would be a good starting
place, in their description of the neg.bin). I saw a paper that does
this (using the minimum AIC, but otherwise this approach). It is an
ecological rather than a statistical paper, although the analyst in
the author group is a biometrician whom I fully trust: Weigelt, A.,
Schumacher, J., Walther, T., Bartelheimer, M., Steinlein, T., Beyschlag,
W. (2006) Identifying mechanisms of competition in multispecies
communities. Journal of Ecology 95:53-64
HTH,
Carsten
Canning-Clode, Joao wrote:
Hi all,
1st time user here!
I am an ecologist working with marine fouling assemblages. I just got a paper back for revision. I am working with count data (species richness). I have used a linear model, but the reviewers are recommending the use of negative binomial or Poisson. As far as I could understand from the literature, these more complex models should be used when the distribution is skewed towards zero (lots of zeros). Well, my data are perfectly normally distributed. My main question is: can I still use negative binomial or Poisson even if my data are normal? Does that make sense?
Thanks in advance
João Canning Clode, PhD
Postdoctoral Fellow
Marine Invasions Research Lab
Smithsonian Environmental Research Center
647 Contees Wharf Road
Edgewater, MD 21037
Email: canning-clodej at si.edu
Web: www.canning-clode.com
Tel: 443-482-2354