Question regarding the discrepancy between count model parameter estimates between "pscl" and "MASS"
On 29/08/2014 17:46, Nick Livingston wrote:
Thank you for your responses.
Since my previous attempt to manually truncate my DV didn't work, I'm very interested in trying again using the zerotrun() function. However, I attempted to install "countreg" but received the following notification:
Warning in install.packages :
unable to access index for repository http://R-Forge.R-project.org/bin/macosx/contrib/3.0
package ?countreg? is available as a source package but not as a binary
Warning in install.packages :
package ?countreg? is not available (for R version 3.0.3)
I received the same message when attempting to install it in version 3.1.0, and the latest version, 3.1.1. Am I missing something?
As the package does not contain compiled code, you can simply install from the sources. (I just did so in the GUI: just untick 'binary'.)
Thank you again. I appreciate your input. -Nick -------------------------------------------- On Fri, 8/29/14, Achim Zeileis <Achim.Zeileis at uibk.ac.at> wrote: Subject: Re: [R] Question regarding the discrepancy between count model parameter estimates between "pscl" and "MASS" To: "peter dalgaard" <pdalgd at gmail.com> Cc: "Nick Livingston" <nlivingston at ymail.com>, r-help at r-project.org Date: Friday, August 29, 2014, 5:26 AM On Fri, 29 Aug 2014, peter dalgaard wrote:
>
I'm no expert on hurdle models, but it seems that you are unaware that
> the negative binomial
and the truncated negative binomial are quite
> different things.
Yes. You can replicate the truncated count part of the hurdle model with the zerotrunc() function from the "countreg" package. The package is not yet on CRAN but can be easily installed from R-Forge.
> -pd > > > On 29 Aug 2014, at
05:57 , Nick Livingston <nlivingston at ymail.com> wrote:
>
>> I have
sought consultation online and in person, to no avail. I hope someone
>> on here might have
some insight. Any feedback would be most welcome.
>> >> I am
attempting to plot predicted values from a two-component hurdle model
>> (logistic [suicide
attempt yes/no] and negative binomial count [number of
>> attempts thereafter]). To do so, I
estimated each component separately using
>> glm (MASS). While I am able to
reproduce hurdle results for the logit
>> portion in glm, estimates for the
negative binomial count component are
>> different. >> >> Call: >>
hurdle(formula = Suicide. ~ Age + gender + Victimization * FamilySupport |
>> Age + gender +
Victimization * FamilySupport, dist = "negbin", link =
>> "logit") >> >> Pearson
residuals:
>> Min
1Q Median 3Q Max
>> -0.9816 -0.5187 -0.4094 0.2974
5.8820
>> >>
Count model coefficients (truncated negbin with log link):
>>
Estimate Std. Error z value
>> Pr(>|z|) >>
(Intercept) -0.29150
0.33127 -0.880 0.3789
>> Age
0.17068 0.07556 2.259 0.0239
>> * >> gender
0.28273
0.31614 0.894 0.3712
>> Victimization
1.08405 0.18157 5.971 2.36e-09
>>
***
>> FamilySupport
0.33629 0.29302 1.148 0.2511
>> Victimization:FamilySupport -0.96831
0.46841 -2.067 0.0387 *
>> Log(theta)
0.12245 0.54102 0.226 0.8209
>> Zero hurdle model coefficients
(binomial with logit link):
>>
Estimate Std. Error z value
>>
Pr(>|z|)
>> (Intercept)
-0.547051 0.215981 -2.533
0.01131
>> * >>
Age
-0.154493 0.063994 -2.414
>> 0.01577 * >>
gender
-0.030942 0.284868 -0.109
0.91350
>> Victimization
1.073956 0.338015 3.177 0.00149
>> ** >>
FamilySupport
-0.380360 0.247530 -1.537
0.12439
>>
Victimization\:FamilySupport -0.813329 0.399905 -2.034 0.04197 *
>> --- >> Signif.
codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>> >> Theta: count
= 1.1303
>> Number of iterations in
BFGS optimization: 23
>>
Log-likelihood: -374.3 on 25 Df
>>>
summary(logistic)
>> >> >> >> >> Call: >> glm(formula = SuicideBinary ~ Age +
gender = Victimization * FamilySupport,
>> family = "binomial") >> >> Deviance
Residuals:
>> Min
1Q Median 3Q
Max
>> -1.9948 -0.8470
-0.6686 1.1160 2.0805
>> >>
Coefficients:
>>
Estimate Std. Error z value
>>
Pr(>|z|)
>> (Intercept)
-0.547051 0.215981 -2.533 0.01131 *
>> Age
-0.154493 0.063994 -2.414 0.01577
>> * >> gender
-0.030942 0.284868 -0.109 0.91350
>> Victimization
1.073956 0.338014 3.177
0.00149
>> ** >>
FamilySupport -0.380360 0.247530 -1.537 0.12439
>> Victimization:FamilySupport
-0.813329 0.399904 -2.034 0.04197 *
>> --- >> Signif.
codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
>> >> (Dispersion
parameter for binomial family taken to be 1)
>> >>
Null deviance: 452.54 on 359 degrees of freedom
>> Residual deviance: 408.24
on 348 degrees of freedom
>> (52 observations deleted
due to missingness)
>> AIC: 432.24 >> >> Number of
Fisher Scoring iterations: 4
>>
>>> summary(Count1)
>> >> >> >> >> >> >> Call: >>
glm(formula = NegBinSuicide ~ Age + gender + Victimization * FamilySupport,
>> family =
negative.binomial(theta = 1.1303))
>> >> Deviance
Residuals:
>> Min
1Q Median 3Q
Max
>> -1.6393 -0.4504
-0.1679 0.2350 2.1676
>> >>
Coefficients:
>>
Estimate Std. Error t value
>> Pr(>|t|) >>
(Intercept) 0.60820 0.13779 4.414 2.49e-05
>> *** >> Age
0.08836
0.04189 2.109 0.0373
>> * >> gender
0.10983 0.17873 0.615 0.5402
>> Victimization
0.73270 0.10776 6.799 6.82e-10
>> *** >> FamilySupport
0.10213 0.15979 0.639 0.5241
>>
Victimization:FamilySupport -0.60146 0.24532 -2.452 0.0159 *
>> --- >> Signif.
codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
>> >> (Dispersion
parameter for Negative Binomial(1.1303) family taken to be
>> 0.4549082) >> >>
Null deviance: 76.159 on 115 degrees of freedom
>> Residual deviance: 35.101
on 104 degrees of freedom
>> (296 observations deleted
due to missingness)
>> AIC: 480.6 >> >> Number of
Fisher Scoring iterations: 15
>> >> >>
Alternatively, if there is a simpler way to plot hurdle regression output, or if anyone is award of another means of estimating NB models (I haven't had much luck with vglm from VGAM either), I would be happy to hear about that as well. I'm currently using the "visreg"
>> package for plotting. >> >> Thanks! >> >> >> >> >> >>
______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > -- > Peter Dalgaard, Professor, > Center for Statistics, Copenhagen Business School > Solbjerg Plads 3, 2000 Frederiksberg, Denmark > Phone: (+45)38153501 > Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Brian D. Ripley, ripley at stats.ox.ac.uk Emeritus Professor of Applied Statistics, University of Oxford 1 South Parks Road, Oxford OX1 3TG, UK