Dear list: this is a continuation of a previous question. To recap,after
input from several listserv members, I have been using glmmADMB to model
snag (dead tree) densities before, after, and 5 years after forest (factor
variable 'time'). (I apologize for disregarding the advice to use 'BUGS'
approaches -- I didn't have $ for new books or time for assimilation of new
paradigm). The data sampling was stratified by 'Ecozone' ('East' densities
are much lower than 'West'). Because there are many plots with no snags, I
evaluated use of Poisson vs. Negative binomial distribution by finding mean
& variance for each combination of time & Ecozone. AIC for the NB
distribution was vastly smaller (more favorable) than for the Poisson.
Variance tended to be much higher prior to treatment (because a few plots
have large numbers of snags and loggers are instructed to knock most of
these down and simply retain a few).
My statement is glmmadmb(snag~time+Ecozone + (1|SettingID), data=.,
family="nbinom1"). ('SettingID' is a plot identifier).
So, I have these observations/questions. First, when I experiment with
'family="poisson", AIC for the glmm is only 2 AIC units greater than for the
negative binomial model, yet with AIC estimated for individual cells, AIC
Poisson >> AIC NegBinom. Any idea why this might be? (I've tried
zero-inflation for poisson & NB, doesn't seem to make much difference)
It's tricky to interpret the log-transformed coefficients and associated
confidence intervals, but they don't seem to reflect the decrease in snags
after treatment that I see when I examine the cell means (e.g., mean East
density at time1 (pre-treatment) is 1.7, dropping to 0.7 at time2 (after
treatment).) Here's part of the summary statement:
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.764 0.359 -2.13 0.0334 *
time2 -0.588 0.186 -3.16 0.0016 **
time3 -0.604 0.189 -3.20 0.0014 **
EcozoneW 1.249 0.487 2.57 0.0103 *
exp(coef(.))
(Intercept) time2 time3 EcozoneW
0.466 0.555 0.547 3.488
I have tried the full-interaction model ("time*Ecozone"), although it is not
justified from the standpoint of AIC values. This improves the situation
slightly, although confidence intervals do not indicate a significant
decrease in snags after treatment, contrary to what common sense suggests.
Do I need to somehow be modeling the big decrease in variance between time1
and time2 in order to make sense of these data? Data below.
--Seth W. Bigelow
"SettingID","Ecozone","time","snag"
"0506530270068","E","1",0
"0506530290166","E","1",0
"0506530290181","E","1",0
"0506530300115","E","1",0
"0506530300124","E","1",0
"0506530300136","E","1",2
"0506530380015","E","1",0
"0506530380037","E","1",0
"0506530390006","E","1",0
"0506530390119","E","1",0
"0506530400036","E","1",0
"0506530400049","E","1",0
"050653040280049","E","1",0
"0506530430040","E","1",0
"0506530430047","E","1",0
"0506530430065","E","1",0
"0506530430089","E","1",0
"0506580130060","E","1",2
"0506580130155","E","1",0
"0506580180062","E","1",2
"0506581900343","E","1",2
"0506581900346","E","1",4
"0511511060010","E","1",0
"0511511210049","E","1",0
"0511515070003","E","1",6
"0511515070018","E","1",0
"0511515070034","E","1",6
"0511515070054","E","1",0
"0511515160010","E","1",0
"0511515160013","E","1",4
"0511522070132","E","1",0
"0511522070151","E","1",1
"0511522070152","E","1",2
"0511525010008","E","1",0
"0511525010022","E","1",4
"0511525170005","E","1",4
"0517566200042","E","1",12
"0517566200044","E","1",2
"0517566300087","E","1",0
"0517566300095","E","1",4
"0517566350026","E","1",6
"0517566400015","E","1",8
"0517566400086","E","1",0
"0506512840152","W","1",0
"0506512840186","W","1",0
"0506513810068","W","1",0
"0511526022084","W","1",0
"0511526022085","W","1",0
"0511526030034","W","1",12
"0511526050084","W","1",12
"05115260516B0084","W","1",3
"05115260516G0084","W","1",7
"05115260516H0084","W","1",2
"051152605A0084","W","1",8
"0511526130477","W","1",0
"0511526130543","W","1",5
"0511526130619","W","1",2
"0511526190087","W","1",0
"0511534210306","W","1",0
"0511534210307","W","1",2
"0511534210308","W","1",2
"0511534210312","W","1",10
"0517567400160","W","1",4
"0517567400182","W","1",3
"0506530270068","E","2",0
"0506530290166","E","2",0
"0506530290181","E","2",0
"0506530300115","E","2",0
"0506530300124","E","2",0
"0506530300136","E","2",0
"0506530380015","E","2",0
"0506530380037","E","2",0
"0506530390006","E","2",0
"0506530390119","E","2",0
"0506530400036","E","2",0
"0506530400049","E","2",0
"050653040280049","E","2",0
"0506530430040","E","2",0
"0506530430047","E","2",0
"0506530430065","E","2",0
"0506530430089","E","2",0
"0506580130060","E","2",0
"0506580130155","E","2",0
"0506580180062","E","2",0
"0506581900343","E","2",0
"0506581900346","E","2",0
"0511511060010","E","2",0
"0511511210049","E","2",0
"0511515070003","E","2",4
"0511515070018","E","2",2
"0511515070034","E","2",4
"0511515070054","E","2",0
"0511515160010","E","2",0
"0511515160013","E","2",4
"0511522070132","E","2",0
"0511522070151","E","2",1
"0511522070152","E","2",2
"0511525010008","E","2",0
"0511525010022","E","2",2
"0511525170005","E","2",1
"0517566200042","E","2",2
"0517566200044","E","2",2
"0517566300087","E","2",0
"0517566300095","E","2",0
"0517566350026","E","2",0
"0517566400015","E","2",4
"0517566400086","E","2",0
"0506512840152","W","2",0
"0506512840186","W","2",0
"0506513810068","W","2",0
"0511526022084","W","2",0
"0511526022085","W","2",0
"0511526030034","W","2",4
"0511526050084","W","2",12
"05115260516B0084","W","2",2
"05115260516G0084","W","2",2
"05115260516H0084","W","2",2
"051152605A0084","W","2",2
"0511526130477","W","2",0
"0511526130543","W","2",5
"0511526130619","W","2",2
"0511526190087","W","2",0
"0511534210306","W","2",2
"0511534210307","W","2",0
"0511534210308","W","2",0
"0511534210312","W","2",8
"0517567400160","W","2",2
"0517567400182","W","2",4
"0506530270068","E","3",0
"0506530290166","E","3",0
"0506530290181","E","3",0
"0506530300115","E","3",0
"0506530300124","E","3",0
"0506530300136","E","3",0
"0506530380015","E","3",0
"0506530380037","E","3",0
"0506530390006","E","3",0
"0506530390119","E","3",0
"0506530400036","E","3",0
"0506530400049","E","3",0
"050653040280049","E","3",0
"0506530430040","E","3",2
"0506530430047","E","3",0
"0506530430065","E","3",0
"0506530430089","E","3",0
"0506580130060","E","3",0
"0506580130155","E","3",1
"0506580180062","E","3",0
"0506581900343","E","3",0
"0506581900346","E","3",0
"0511511060010","E","3",0
"0511511210049","E","3",0
"0511515070003","E","3",2
"0511515070018","E","3",0
"0511515070034","E","3",4
"0511515070054","E","3",0
"0511515160010","E","3",0
"0511515160013","E","3",4
"0511522070132","E","3",1
"0511522070151","E","3",0
"0511522070152","E","3",0
"0511525010008","E","3",3
"0511525010022","E","3",2
"0511525170005","E","3",0
"0517566200042","E","3",2
"0517566200044","E","3",0
"0517566300087","E","3",0
"0517566300095","E","3",0
"0517566350026","E","3",0
"0517566400015","E","3",4
"0517566400086","E","3",0
"0506512840152","W","3",0
"0506512840186","W","3",0
"0506513810068","W","3",0
"0511526022084","W","3",0
"0511526022085","W","3",0
"0511526030034","W","3",4
"0511526050084","W","3",10
"05115260516B0084","W","3",3
"05115260516G0084","W","3",0
"05115260516H0084","W","3",6
"051152605A0084","W","3",4
"0511526130477","W","3",0
"0511526130543","W","3",9
"0511526130619","W","3",0
"0511526190087","W","3",0
"0511534210306","W","3",0
"0511534210307","W","3",2
"0511534210308","W","3",0
"0511534210312","W","3",10
"0517567400160","W","3",2
"0517567400182","W","3",3
-----Original Message-----
From: Seth W. Bigelow [mailto:seth at swbigelow.net]
Sent: Friday, October 05, 2012 6:44 PM
To: 'Highland Statistics Ltd'; 'r-sig-mixed-models at r-project.org'
Subject: RE: [R-sig-ME] Fwd: Re: Mixed model and negative binomial
distribution
Alain, thank you very much for the advice regarding analysis of my (negative
binomial vs. poisson-distributed) snag data, and my warmest congratulations
on your nuptials
-Seth W. Bigelow
-----Original Message-----
From: r-sig-mixed-models-bounces at r-project.org
[mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf Of Highland
Statistics Ltd
Sent: Friday, October 05, 2012 1:04 PM
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] Fwd: Re: Mixed model and negative binomial distribution
-------- Original Message --------
Subject: Re: Mixed model and negative binomial distribution
Date: Fri, 05 Oct 2012 12:37:28 +0200
From: Highland Statistics Ltd <highstat at highstat.com>
To: r-sig-mixed-models at r-project.org <r-sig-mixed-models at r-project.org>
------------------------------
Message: 2
Date: Fri, 5 Oct 2012 04:32:36 +0000 (UTC)
From: Ben Bolker <bbolker at gmail.com>
To: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] Mixed model and negative binomial distribution
Message-ID: <loom.20121005T062854-986 at post.gmane.org>
Content-Type: text/plain; charset=us-ascii
Seth W. Bigelow <seth at ...> writes:
Dear mixed-model brain trust:
I am comparing snag (dead tree) densities 1 year and 5 years after
silvicultural treatment in forest plots to densities prior to treatment.
In
nlme, my model is
lme(snagnum~treatment, random=(~1|plot), correlation=corExp(form=~year)).
{Treatment is a factor with values of Pre/1-year post/5-year post}. This
gives reasonable output, but I'm having a niggling doubt that I should be
using something akin to a negative binomial distribution, since about half
of the values are zeros (i.e., many plots had no snags prior to treatment,
and did not gain additional snags as a result of treatment). Can anyone
suggest an appropriate package and associated syntax for doing this mixed
model based on an alternative probability density function?
Negative binomial would be a reasonable distribution; the other
answer gives you some methods for doing this. *However*, incorporating
both serial correlation and non-Gaussian errors in a model of this form
is a bit of a nuisance.
The model you want might be something like
where the variance-covariance matrix gives you both some extra-Poisson
variation (to handle overdispersion) and some correlation between
observations. I'm hoping Alain Zuur will pop up shortly to point you
to a reference in his new book that will tell you how to do this in
WinBUGS ...
-----------
Yesterday, Alain was on a nice beach in the Caribbean enjoying his wedding.
Now he is close to the north pole.
Yes....this sounds MCMC. And code for NB GLMM with AR1 correlation and zero
inflation is indeed in the 2012 book.
However...if the zeros are sequential in time then the correlation and zero
inflation components may fight
for the same information. And the NB distribution may try to take its share
of zeros as well. Better start with
the Poisson version of it....and if a Poisson + correlation + ZIP still
shows problems, then consider the NB.
For irregular time series consider a CAR correlation.
As to the implementation..try WinBUGS, OpenBUGS, JAGS....code is nearly the
same. I recently downloaded STAN...looks promising too.
You may also want to have a look at "Generalized Additive Mixed Model
Analysis via gammSlice" by Pham and Wand.
GAMM is essentially a GLMM.... But I don;t think it can do zero
inflation....but perhaps it does NB GLMM. Not sure.
Alain
Dr. Alain F. Zuur
First author of:
1. Analysing Ecological Data (2007).
Zuur, AF, Ieno, EN and Smith, GM. Springer. 680 p.
URL: www.springer.com/0-387-45967-7
2. Mixed effects models and extensions in ecology with R. (2009).
Zuur, AF, Ieno, EN, Walker, N, Saveliev, AA, and Smith, GM. Springer.
http://www.springer.com/life+sci/ecology/book/978-0-387-87457-9
3. A Beginner's Guide to R (2009).
Zuur, AF, Ieno, EN, Meesters, EHWG. Springer
http://www.springer.com/statistics/computational/book/978-0-387-93836-3
4. Zero Inflated Models and Generalized Linear Mixed Models with R. (2012)
Zuur, Saveliev, Ieno.
http://www.highstat.com/book4.htm
Other books: http://www.highstat.com/books.htm
Statistical consultancy, courses, data analysis and software
Highland Statistics Ltd.
6 Laverock road
UK - AB41 6FN Newburgh
Tel: 0044 1358 788177
Email: highstat at highstat.com
URL: www.highstat.com
URL: www.brodgar.com
[[alternative HTML version deleted]]
_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Seth--
It is not clear to me what the disconnect is between the GLMM results
and what you expect? For example:
"It's tricky to interpret the log-transformed coefficients and associated
confidence intervals, but they don't seem to reflect the decrease in snags
after treatment that I see when I examine the cell means (e.g., mean East
density at time1 (pre-treatment) is 1.7, dropping to 0.7 at time2 (after
treatment).)"
I fit your model, and here are the results including random-effects:
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.764 0.359 -2.13 0.0334 *
as.factor(time)2 -0.588 0.186 -3.16 0.0016 **
as.factor(time)3 -0.604 0.189 -3.20 0.0014 **
EcozoneW 1.249 0.487 2.57 0.0103 *
---
Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
Number of observations: total=192, SettingID=64
Random effect variance(s):
Group=SettingID
Variance StdDev
(Intercept) 2.37 1.54
Negative binomial dispersion parameter: 1.812 (std. err.: 0.33774)
So, we can exponentiate these coefficients to get *conditional* rate
ratios, or we can include the random-effects to get marginal predictions
of snag from the model.
Conditional RR:
> cbind(exp(fixef(my.admb)))
[,1]
(Intercept) 0.4660185
as.factor(time)2 0.5551761
as.factor(time)3 0.5467793
EcozoneW 3.4879006
Thus, conditional on the random-effects, there is a 45% decrease in
snags at time 2. That seems reasonable given the data, I think?
Marginal predictions:
> exp(-0.764 + 2.37/2) # marginal intercept
[1] 1.523484
> exp(-0.764 - 0.588 + 2.37/2) # marginal prediction at time2
[1] 0.8461996
Again, seem sort of reasonable, with the caveat that I am a psychologist
who studies drug addiction and marital therapy... ;)
If some of your questions are general questions about how to interpret
the output from Poisson / NB mixed models, then you might take a look at
a tutorial article that we wrote (with R code), which also points to
other resources:
http://depts.washington.edu/cshrb/newweb/statstutorials.html
Atkins, D. C., Baldwin, S., Zheng, C., Gallop, R. J., & Neighbors, C.
(in press). A tutorial on count regression and zero-altered count models
for longitudinal addictions data.
Granted, the example data come from alcohol focused studies (but
interestingly enough, it seems the addictions researchers and ecologists
have similar types of outcomes / models).
If I have missed your question, please let me know.
cheers, Dave
Dave Atkins, PhD
University of Washington
datkins at u.washington.edu
http://depts.washington.edu/cshrb/
August 1 - October 30:
Universitat Zurich
Psychologisches Institut
Klinische Psychologie
Binzmuhlestrasse 14/23
CH-8050 Zurich
+41 44 635 71 75
Original post:
Dear list: this is a continuation of a previous question. To recap,after
input from several listserv members, I have been using glmmADMB to model
snag (dead tree) densities before, after, and 5 years after forest (factor
variable 'time'). (I apologize for disregarding the advice to use 'BUGS'
approaches -- I didn't have $ for new books or time for assimilation of new
paradigm). The data sampling was stratified by 'Ecozone' ('East' densities
are much lower than 'West'). Because there are many plots with no snags, I
evaluated use of Poisson vs. Negative binomial distribution by finding mean
& variance for each combination of time & Ecozone. AIC for the NB
distribution was vastly smaller (more favorable) than for the Poisson.
Variance tended to be much higher prior to treatment (because a few plots
have large numbers of snags and loggers are instructed to knock most of
these down and simply retain a few).
My statement is glmmadmb(snag~time+Ecozone + (1|SettingID), data=.,
family="nbinom1"). ('SettingID' is a plot identifier).
So, I have these observations/questions. First, when I experiment with
'family="poisson", AIC for the glmm is only 2 AIC units greater than for the
negative binomial model, yet with AIC estimated for individual cells, AIC
Poisson >> AIC NegBinom. Any idea why this might be? (I've tried
zero-inflation for poisson & NB, doesn't seem to make much difference)
It's tricky to interpret the log-transformed coefficients and associated
confidence intervals, but they don't seem to reflect the decrease in snags
after treatment that I see when I examine the cell means (e.g., mean East
density at time1 (pre-treatment) is 1.7, dropping to 0.7 at time2 (after
treatment).) Here's part of the summary statement:
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.764 0.359 -2.13 0.0334 *
time2 -0.588 0.186 -3.16 0.0016 **
time3 -0.604 0.189 -3.20 0.0014 **
EcozoneW 1.249 0.487 2.57 0.0103 *
>> exp(coef(.))
(Intercept) time2 time3 EcozoneW
0.466 0.555 0.547 3.488
I have tried the full-interaction model ("time*Ecozone"), although it is not
justified from the standpoint of AIC values. This improves the situation
slightly, although confidence intervals do not indicate a significant
decrease in snags after treatment, contrary to what common sense suggests.
Do I need to somehow be modeling the big decrease in variance between time1
and time2 in order to make sense of these data? Data below.
--Seth W. Bigelow
Dave,
I skimmed your tutorial after you first mentioned it but now am going
through line by line. It explains everything beautifully and simply, I
strongly recommend
"Atkins, D. C., Baldwin, S., Zheng, C., Gallop, R. J., & Neighbors, C. (in
press). A tutorial on count regression and zero-altered count models for
longitudinal addictions data."
...for anyone seeking a compact introduction to poisson & negative
binomial-based mixed models, regardless of the field they are in.
Thank you for running my snag data and explaining how to interpret
coefficients of mixed-models based on poisson and negative binomial
distributions.
--Seth W. Bigelow
-----Original Message-----
From: r-sig-mixed-models-bounces at r-project.org
[mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf Of David Atkins
Sent: Tuesday, October 23, 2012 1:35 AM
To: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] Mixed model and negative binomial distribution
Seth--
It is not clear to me what the disconnect is between the GLMM results
and what you expect? For example:
"It's tricky to interpret the log-transformed coefficients and associated
confidence intervals, but they don't seem to reflect the decrease in snags
after treatment that I see when I examine the cell means (e.g., mean East
density at time1 (pre-treatment) is 1.7, dropping to 0.7 at time2 (after
treatment).)"
I fit your model, and here are the results including random-effects:
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.764 0.359 -2.13 0.0334 *
as.factor(time)2 -0.588 0.186 -3.16 0.0016 **
as.factor(time)3 -0.604 0.189 -3.20 0.0014 **
EcozoneW 1.249 0.487 2.57 0.0103 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Number of observations: total=192, SettingID=64
Random effect variance(s):
Group=SettingID
Variance StdDev
(Intercept) 2.37 1.54
Negative binomial dispersion parameter: 1.812 (std. err.: 0.33774)
So, we can exponentiate these coefficients to get *conditional* rate
ratios, or we can include the random-effects to get marginal predictions
of snag from the model.
Conditional RR:
> cbind(exp(fixef(my.admb)))
[,1]
(Intercept) 0.4660185
as.factor(time)2 0.5551761
as.factor(time)3 0.5467793
EcozoneW 3.4879006
Thus, conditional on the random-effects, there is a 45% decrease in
snags at time 2. That seems reasonable given the data, I think?
Marginal predictions:
> exp(-0.764 + 2.37/2) # marginal intercept
[1] 1.523484
> exp(-0.764 - 0.588 + 2.37/2) # marginal prediction at time2
[1] 0.8461996
Again, seem sort of reasonable, with the caveat that I am a psychologist
who studies drug addiction and marital therapy... ;)
If some of your questions are general questions about how to interpret
the output from Poisson / NB mixed models, then you might take a look at
a tutorial article that we wrote (with R code), which also points to
other resources:
http://depts.washington.edu/cshrb/newweb/statstutorials.html
Atkins, D. C., Baldwin, S., Zheng, C., Gallop, R. J., & Neighbors, C.
(in press). A tutorial on count regression and zero-altered count models
for longitudinal addictions data.
Granted, the example data come from alcohol focused studies (but
interestingly enough, it seems the addictions researchers and ecologists
have similar types of outcomes / models).
If I have missed your question, please let me know.
cheers, Dave
Dave Atkins, PhD
University of Washington
datkins at u.washington.edu
http://depts.washington.edu/cshrb/
August 1 - October 30:
Universitat Zurich
Psychologisches Institut
Klinische Psychologie
Binzmuhlestrasse 14/23
CH-8050 Zurich
+41 44 635 71 75
Original post:
Dear list: this is a continuation of a previous question. To recap,after
input from several listserv members, I have been using glmmADMB to model
snag (dead tree) densities before, after, and 5 years after forest (factor
variable 'time'). (I apologize for disregarding the advice to use 'BUGS'
approaches -- I didn't have $ for new books or time for assimilation of new
paradigm). The data sampling was stratified by 'Ecozone' ('East' densities
are much lower than 'West'). Because there are many plots with no snags, I
evaluated use of Poisson vs. Negative binomial distribution by finding mean
& variance for each combination of time & Ecozone. AIC for the NB
distribution was vastly smaller (more favorable) than for the Poisson.
Variance tended to be much higher prior to treatment (because a few plots
have large numbers of snags and loggers are instructed to knock most of
these down and simply retain a few).
My statement is glmmadmb(snag~time+Ecozone + (1|SettingID), data=.,
family="nbinom1"). ('SettingID' is a plot identifier).
So, I have these observations/questions. First, when I experiment with
'family="poisson", AIC for the glmm is only 2 AIC units greater than for the
negative binomial model, yet with AIC estimated for individual cells, AIC
Poisson >> AIC NegBinom. Any idea why this might be? (I've tried
zero-inflation for poisson & NB, doesn't seem to make much difference)
It's tricky to interpret the log-transformed coefficients and associated
confidence intervals, but they don't seem to reflect the decrease in snags
after treatment that I see when I examine the cell means (e.g., mean East
density at time1 (pre-treatment) is 1.7, dropping to 0.7 at time2 (after
treatment).) Here's part of the summary statement:
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.764 0.359 -2.13 0.0334 *
time2 -0.588 0.186 -3.16 0.0016 **
time3 -0.604 0.189 -3.20 0.0014 **
EcozoneW 1.249 0.487 2.57 0.0103 *
>> exp(coef(.))
(Intercept) time2 time3 EcozoneW
0.466 0.555 0.547 3.488
I have tried the full-interaction model ("time*Ecozone"), although it is not
justified from the standpoint of AIC values. This improves the situation
slightly, although confidence intervals do not indicate a significant
decrease in snags after treatment, contrary to what common sense suggests.
Do I need to somehow be modeling the big decrease in variance between time1
and time2 in order to make sense of these data? Data below.
--Seth W. Bigelow
_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models