1. Low counts (Tore Chr Michaelsen)
2. Re: Low counts (Miltinho Astronauta)
3. Re: Low counts (Maarten de Groot)
----------------------------------------------------------------------
Message: 1
Date: Mon, 1 Feb 2010 15:04:22 +0100
From: "Tore Chr Michaelsen"<tore.michaelsen at bio.uib.no>
To:<r-sig-ecology at r-project.org>
Subject: [R-sig-eco] Low counts
Message-ID:<000801caa347$74c62a70$5e527f50$@michaelsen at bio.uib.no>
Content-Type: text/plain; charset="us-ascii"
Dear members;
1) I have fitted a glm to count data (using quasipoisson to correct for
overdispersion). In the final model, the relationship between Res and Fitted (i.e.
the line going through the plot) and QQ looks fine, but I am worried that
low counts (one to five) could violate some assumption of the glm/poisson:
Although the line in the Res vs Fitted plot looks nice, the values show a
clear pattern (five diagonal lines = the counts). Crawley/R book says it
should look like the sky at night with no patterns. I assume patterns are
not visible with large counts (e.g. 0-100), but highly visible with low
counts as in this case. I still assume this is reason for some concern about
the model, or is the concern not justified?
Tore,
I'm actually trying to write a paper on exactly the same (low numbers) problem. But it doesn't go very fast.
The first thing you have to ask yourself is whether the fact that there are no zeros is because you cannot have zeros...or is it just by chance? In the first case, consider zero truncated GLMs. The problem that I face myself with clutch size data with values between 1 and 5 is underdispersion. Hence....underdispersed zero truncated GLMs. And that brings you to generalized Poisson GLMs. Yes....there is always more shit.
Now...I noticed that the zero truncation is not a real problem (i.e. similar SEs) as long as the fitted values are around 4 or 5 (or higher). In the snake carcasses data in Chapter 11 of our mixed modelling book, the mean was between 1 and 2..and in that case the SEs of the Poisson GLM and the truncated Poisson GLM differed by about a factor 3.
As to your diagonal lines...those are due to your discrete values... In fact...those "lines" are always present..also in linear regression..but then you don't notice them. The extreme case is binary data.
So...summarising...think first about truncation....then check for underdispersion because you have a small range of observed values.
Alain
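A minimal sketch in R of why the diagonal lines appear. It uses simulated low counts (an assumption, not Tore's data) and a plain Poisson GLM. The variable names are only for illustration. For a Poisson GLM the Pearson residual is (y - mu) / sqrt(mu), so every observation with the same count y falls on one decreasing curve in the fitted value mu, and with only a handful of distinct counts you see a handful of bands.

```r
# Simulated low counts; the coefficients are arbitrary illustration values.
set.seed(1)
n <- 200
x <- runif(n)
y <- rpois(n, lambda = exp(0.2 + 0.8 * x))

fit <- glm(y ~ x, family = poisson)
mu  <- fitted(fit)                       # fitted means on the response scale
res <- residuals(fit, type = "pearson")  # Pearson residuals

# Recompute the residuals by hand: for a fixed count y this is a smooth,
# decreasing function of mu, which is what draws the "lines".
band     <- (y - mu) / sqrt(mu)
max_diff <- max(abs(res - band))

# One colour per observed count makes the bands visible.
if (interactive()) {
  plot(mu, res, col = y + 1, pch = 16,
       xlab = "Fitted values", ylab = "Pearson residuals")
}
```

The default plot(fit) shows deviance residuals against the linear predictor, but the same banding appears there for the same reason. The bands by themselves say nothing about whether the model fits.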
2) Any recommendations on literature regarding model inspection in R.
Thank you for reading this mail!
Best wishes;
Tore
------------------------------
Message: 2
Date: Mon, 1 Feb 2010 12:54:46 -0500
From: Miltinho Astronauta<milton.reco at gmail.com>
To: Tore Chr Michaelsen<tore.michaelsen at bio.uib.no>
Cc: r-sig-ecology at r-project.org
Subject: Re: [R-sig-eco] Low counts
Message-ID: <30c7555b1002010954p76f125f4jf07fa121936886e0 at mail.gmail.com>
Content-Type: text/plain
Hi Tore,
I put my 2cents on Zuur et al 2009's book - Mixed effect models...
See Zero-Inflated examples in there.
cheers
milton
_______________________________________________
R-sig-ecology mailing list
R-sig-ecology at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
[[alternative HTML version deleted]]
------------------------------
Message: 3
Date: Tue, 02 Feb 2010 08:06:46 +0100
From: Maarten de Groot<Maarten.deGroot at nib.si>
To: Miltinho Astronauta<milton.reco at gmail.com>
Cc: r-sig-ecology at r-project.org
Subject: Re: [R-sig-eco] Low counts
Message-ID:<4B67CF06.4010905 at nib.si>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Hi Tore,
What does your count distribution look like? Do you have more zeros than expected (use zero inflated models), no zeros (use zero truncated models), or is there no problem with zeros? If it is the latter, it might be useful to try negative binomial models (glm.nb() in the MASS package). Zuur et al (2009) give a nice example in which they still find a pattern in the residuals with a quasi-Poisson model but no pattern with a negative binomial model.
Kind regards,
Maarten
Miltinho Astronauta wrote:
Hi Tore,
I put my 2cents on Zuur et al 2009's book - Mixed effect models...
See Zero-Inflated examples in there.
cheers
milton
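The checks suggested in this thread (how many zeros, how much dispersion, and whether truncation matters) can be sketched in base R as follows. This is only an illustration under stated assumptions: the data are simulated with the zeros removed, the names are hypothetical, and the zero truncated Poisson model is fitted by direct maximum likelihood with optim() rather than with any particular package.

```r
# Hypothetical data: Poisson counts where zeros can never be observed.
set.seed(3)
n     <- 400
x_all <- runif(n)
y_all <- rpois(n, lambda = exp(0.1 + 0.6 * x_all))
keep  <- y_all > 0
y <- y_all[keep]
x <- x_all[keep]

# Ordinary Poisson GLM that ignores the truncation.
fit_pois <- glm(y ~ x, family = poisson)

# Zeros: observed count versus what the Poisson fit expects,
# i.e. the sum over observations of P(Y = 0) = exp(-mu).
obs_zeros <- sum(y == 0)
exp_zeros <- sum(dpois(0, fitted(fit_pois)))

# Dispersion: Pearson chi-square divided by residual degrees of freedom.
# Values well below 1 point to underdispersion, well above 1 to overdispersion.
dispersion <- sum(residuals(fit_pois, type = "pearson")^2) / df.residual(fit_pois)

# Zero truncated Poisson: P(Y = y | Y > 0) = dpois(y, mu) / (1 - exp(-mu)).
X <- model.matrix(~ x)
negloglik <- function(beta) {
  mu <- exp(drop(X %*% beta))
  -sum(dpois(y, mu, log = TRUE) - log1p(-exp(-mu)))
}
fit_zt <- optim(coef(fit_pois), negloglik, method = "BFGS", hessian = TRUE)

# Standard errors from the inverse Hessian of the negative log-likelihood.
se_zt   <- sqrt(diag(solve(fit_zt$hessian)))
se_pois <- sqrt(diag(vcov(fit_pois)))

rbind(poisson = coef(fit_pois), truncated = fit_zt$par)
rbind(poisson = se_pois, truncated = se_zt)
```

Comparing the two rows of each table shows how much ignoring the truncation changes estimates and standard errors for a given data set. With real data, x and y would be replaced by the observed covariate and counts.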
------------------------------
End of R-sig-ecology Digest, Vol 23, Issue 2
********************************************
Dr. Alain F. Zuur
First author of:
1. Analysing Ecological Data (2007). Zuur, AF, Ieno, EN and Smith, GM. Springer. 680 p.
URL: www.springer.com/0-387-45967-7
2. Mixed effects models and extensions in ecology with R. (2009). Zuur, AF, Ieno, EN, Walker, N, Saveliev, AA, and Smith, GM. Springer.
http://www.springer.com/life+sci/ecology/book/978-0-387-87457-9
3. A Beginner's Guide to R (2009). Zuur, AF, Ieno, EN, Meesters, EHWG. Springer
http://www.springer.com/statistics/computational/book/978-0-387-93836-3
Other books: http://www.highstat.com/books.htm
Statistical consultancy, courses, data analysis and software
Highland Statistics Ltd.
6 Laverock road
UK - AB41 6FN Newburgh
Tel: 0044 1358 788177
Email: highstat at highstat.com
URL: www.highstat.com
URL: www.brodgar.com