R-sig-ecology Digest, Vol 23, Issue 2

1. Low counts (Tore Chr Michaelsen)
2. Re: Low counts (Miltinho Astronauta)
3. Re: Low counts (Maarten de Groot)

----------------------------------------------------------------------

Message: 1
Date: Mon, 1 Feb 2010 15:04:22 +0100
From: "Tore Chr Michaelsen"<tore.michaelsen at bio.uib.no>
To:<r-sig-ecology at r-project.org>
Subject: [R-sig-eco] Low counts
Message-ID:<000801caa347$74c62a70$5e527f50$@michaelsen at bio.uib.no>
Content-Type: text/plain;	charset="us-ascii"

Dear members;

1) I have fitted a GLM to count data (using quasipoisson to correct for
dispersion). In the final model, the residuals vs. fitted plot (i.e. the
line going through the plot) and the QQ plot look fine, but I am worried
that low counts (one to five) could violate some assumption of the Poisson
GLM. Although the line in the residuals vs. fitted plot looks nice, the
points show a clear pattern (five diagonal lines, one per count value).
Crawley's R book says the plot should look like the sky at night, with no
patterns. I assume such patterns are not visible with large counts (e.g.
0-100) but are highly visible with low counts, as in this case. Is this a
reason for concern about the model, or is the concern not justified?

Tore,

I'm actually trying to write a paper on exactly the same (low numbers)
problem, but it isn't going very fast. The first thing you have to ask
yourself is whether there are no zeros because you cannot have zeros, or
whether it is just by chance. In the first case, consider zero-truncated
GLMs. The problem that I face myself with clutch size data with values
between 1 and 5 is underdispersion. Hence: underdispersed zero-truncated
GLMs. And that brings you to generalized Poisson GLMs. Yes, there is
always more shit. Now, I noticed that the zero truncation is not a real
problem (i.e. similar SEs) as long as the fitted values are around 4 or 5
(or higher). In the snake carcasses data in Chapter 11 of our mixed
modelling book, the mean was between 1 and 2, and in that case the SEs of
the Poisson GLM and the truncated Poisson GLM differed by about a factor
of 3.
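
A minimal sketch of that comparison, in base R on simulated data. Everything
here is an assumption made for illustration (the covariate x, the
coefficients, the sample size); it is not the snake carcasses analysis from
the book, and the zero-truncated likelihood is written out by hand rather
than taken from any particular package.

```r
# Simulated illustration; all names and parameter values are hypothetical.
set.seed(42)
n <- 300
x <- runif(n)
lambda_true <- exp(0.3 + 0.5 * x)   # Poisson means of roughly 1.3 to 2.2

# Draw Poisson counts and drop the zeros, which leaves zero-truncated data
y_all <- rpois(n, lambda_true)
keep  <- y_all > 0
y <- y_all[keep]
x <- x[keep]

# Ordinary Poisson GLM that ignores the truncation
fit_pois <- glm(y ~ x, family = poisson)

# Negative log-likelihood of a zero-truncated Poisson with a log link:
# log P(Y = y | Y > 0) = log dpois(y, lambda) - log(1 - exp(-lambda))
negll_ztp <- function(beta) {
  lambda <- exp(beta[1] + beta[2] * x)
  -sum(dpois(y, lambda, log = TRUE) - log(-expm1(-lambda)))
}

# Maximise the likelihood, starting from the Poisson GLM estimates
fit_ztp <- optim(coef(fit_pois), negll_ztp, method = "BFGS", hessian = TRUE)

# Standard errors: GLM output versus inverse Hessian of the truncated fit
se_pois <- sqrt(diag(vcov(fit_pois)))
se_ztp  <- sqrt(diag(solve(fit_ztp$hessian)))
print(cbind(poisson = se_pois, truncated = se_ztp))
```

Raising the intercept so that the means sit around 4 or 5 is one way to see
how the two columns behave when zeros are rare anyway.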

As to your diagonal lines: those are due to your discrete values. In
fact, those "lines" are always present, also in linear regression, but
then you don't notice them. The extreme case is binary data.
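
To see where the lines come from, here is a small sketch on simulated low
counts (the data, the covariate x and the coefficients are made up). It
assumes Pearson residuals; for a Poisson GLM these equal (y - mu) / sqrt(mu),
so every observation with the same count sits on the same curve in the
fitted value mu.

```r
# Simulated illustration; all names and parameter values are hypothetical.
set.seed(1)
n <- 200
x <- runif(n)
y <- rpois(n, exp(0.2 + 0.8 * x))   # low counts, mostly 0 to 5

fit <- glm(y ~ x, family = poisson)
mu  <- fitted(fit)
res <- residuals(fit, type = "pearson")

# Largest difference between the reported residuals and (y - mu) / sqrt(mu);
# it should be zero up to rounding error.
max_gap <- max(abs(res - (y - mu) / sqrt(mu)))
print(max_gap)

# Plot only in an interactive session: one dashed curve per count value
if (interactive()) {
  plot(mu, res, col = y + 1, pch = 16,
       xlab = "Fitted values", ylab = "Pearson residuals")
  for (k in sort(unique(y))) {
    curve((k - x) / sqrt(x), add = TRUE, lty = 2)
  }
}
```

The bands are therefore a property of discrete data, not by themselves
evidence of a bad model.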

So, summarising: think first about truncation, then check for
underdispersion because you have a small range of observed values.
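
One common way to check for underdispersion is the Pearson dispersion
statistic, which is also what summary() reports for a quasipoisson fit;
values well below 1 point to underdispersion. The sketch below uses simulated
counts between 1 and 5, built as 1 + Binomial(4, p) purely for illustration
because that construction has a variance below its mean.

```r
# Simulated illustration; all names and parameter values are hypothetical.
set.seed(7)
n <- 200
x <- runif(n)
p <- plogis(-0.5 + 1.5 * x)
y <- 1 + rbinom(n, size = 4, prob = p)   # counts in 1..5, variance < mean

fit_qp <- glm(y ~ x, family = quasipoisson)

# Pearson dispersion statistic: sum of squared Pearson residuals / residual df
pearson_disp <- sum(residuals(fit_qp, type = "pearson")^2) / df.residual(fit_qp)
print(pearson_disp)

# The same quantity as reported by summary() for a quasi family
print(summary(fit_qp)$dispersion)
```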

Alain

2) Any recommendations on literature regarding model inspection in R.

Thank you for reading this mail!

Best wishes;
Tore

------------------------------

Message: 2
Date: Mon, 1 Feb 2010 12:54:46 -0500
From: Miltinho Astronauta<milton.reco at gmail.com>
To: Tore Chr Michaelsen<tore.michaelsen at bio.uib.no>
Cc: r-sig-ecology at r-project.org
Subject: Re: [R-sig-eco] Low counts
Message-ID:
	<30c7555b1002010954p76f125f4jf07fa121936886e0 at mail.gmail.com>
Content-Type: text/plain

Hi Tore,

My 2 cents: Zuur et al. (2009), Mixed effects models and extensions in
ecology with R. See the zero-inflated examples in there.

cheers

milton

_______________________________________________
R-sig-ecology mailing list
R-sig-ecology at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

------------------------------

Message: 3
Date: Tue, 02 Feb 2010 08:06:46 +0100
From: Maarten de Groot<Maarten.deGroot at nib.si>
To: Miltinho Astronauta<milton.reco at gmail.com>
Cc: r-sig-ecology at r-project.org
Subject: Re: [R-sig-eco] Low counts
Message-ID:<4B67CF06.4010905 at nib.si>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Hi Tore,

What does your count distribution look like? Do you have more zeros
than expected (use zero-inflated models), no zeros (use zero-truncated
models), or is there no problem with zeros? If it is the latter, it
might be useful to try a negative binomial model (glm.nb() in the MASS
package). Zuur et al. (2009) give a nice example in which they still find
a pattern in the residuals with a quasi-Poisson model but no pattern with
a negative binomial model.
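
A rough sketch of those checks on simulated overdispersed counts (the data
and parameter values are invented; this is not the example from Zuur et al.).
It compares the observed number of zeros with the number each model expects,
and the Pearson dispersion of a Poisson GLM with that of a glm.nb() fit.

```r
library(MASS)   # provides glm.nb()

# Simulated illustration; all names and parameter values are hypothetical.
set.seed(3)
n <- 300
x <- runif(n)
y <- rnbinom(n, mu = exp(0.5 + 1.0 * x), size = 1.5)   # overdispersed counts

fit_pois <- glm(y ~ x, family = poisson)
fit_nb   <- glm.nb(y ~ x)

# Observed zeros versus the number of zeros expected under each fitted model
zeros <- c(observed = sum(y == 0),
           poisson  = sum(dpois(0, fitted(fit_pois))),
           negbin   = sum(dnbinom(0, mu = fitted(fit_nb), size = fit_nb$theta)))
print(round(zeros, 1))

# Pearson dispersion statistic for each model (values near 1 are the target)
disp <- function(fit) {
  sum(residuals(fit, type = "pearson")^2) / df.residual(fit)
}
print(c(poisson = disp(fit_pois), negbin = disp(fit_nb)))
```

If the observed zeros still clearly exceed what the negative binomial
expects, that is the point at which a zero-inflated model becomes worth
trying.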

Kind regards,

Maarten


------------------------------

End of R-sig-ecology Digest, Vol 23, Issue 2
********************************************

Dr. Alain F. Zuur
First author of:

1. Analysing Ecological Data (2007).
Zuur, AF, Ieno, EN and Smith, GM. Springer. 680 p.
URL: www.springer.com/0-387-45967-7

2. Mixed effects models and extensions in ecology with R. (2009).
Zuur, AF, Ieno, EN, Walker, N, Saveliev, AA, and Smith, GM. Springer.
http://www.springer.com/life+sci/ecology/book/978-0-387-87457-9

3. A Beginner's Guide to R (2009).
Zuur, AF, Ieno, EN, Meesters, EHWG. Springer
http://www.springer.com/statistics/computational/book/978-0-387-93836-3

Other books: http://www.highstat.com/books.htm

Statistical consultancy, courses, data analysis and software
Highland Statistics Ltd.
6 Laverock road
UK - AB41 6FN Newburgh
Tel: 0044 1358 788177
Email: highstat at highstat.com
URL: www.highstat.com
URL: www.brodgar.com