Correcting for overdispersion


Hello,

I am trying to determine LD50 and LD95 using dose.p in MASS; however, in some cases the residual deviance is larger than the residual degrees of freedom. Can anyone advise how I can correct for this?

Er, in what sense is that a problem? Your code is not reproducible; at least some output to look at might help.

-pd
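Here is the model as entered into R:

library(MASS)  # provides dose.p()

# dead = number killed, n = number exposed, conc = concentration
y <- cbind(dead, n - dead)

# logistic regression of mortality on log concentration
model <- glm(y ~ log(conc), family = binomial)
summary(model)

# fitted curve on the log(conc) scale; lines() adds to an existing plot
xv <- seq(min(log(conc)) - 1, max(log(conc)) + 1, by = 0.01)
lines(xv, predict(model, list(conc = exp(xv)), type = "response"))

# doses (as log(conc)) at the requested mortality levels
dose.p(model, p = c(0.10, 0.25, 0.5, 0.75, 0.90, 0.95))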

Thanks

Adaku


______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com

Hello,
Thanks for getting back to me. I was under the impression that once the residual deviance is larger than the residual df, the data are overdispersed and the model is therefore not a good fit. Is this true?

Not without qualification. There are various schools of thought, but if you ask me, overdispersion models are used a bit too often without proper attention to what they actually mean. Sometimes the effect is (unwittingly) to paper over systematic lack of fit in the model (judging by your residuals, that's not likely the case here, though).

To use such models you should have evidence of lack of fit and/or a plausible reason for the extra variation. 

Re. evidence, you have a deviance of 7.31 on 4 df which corresponds to a p value of 0.12 in the asymptotic chi-square distribution. So, not exactly convincing; also, you need to consider whether the expected counts are large enough for the asymptotics to hold.
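The p value quoted above can be reproduced with a short sketch. It only uses the residual deviance and df printed by summary(model); the variable names are just for illustration.

```r
dev <- 7.311  # residual deviance from summary(model)
df  <- 4      # residual degrees of freedom

# upper tail of the asymptotic chi-square distribution
p_value <- pchisq(dev, df = df, lower.tail = FALSE)
round(p_value, 2)   # 0.12

# crude deviance-based dispersion estimate, about 1.83
dev / df
```

Whether the chi-square approximation is trustworthy still depends on the expected counts, as noted above.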

Re. plausibility, you should ask yourself whether there is good reason to have an extra random effect operating at the level of the individual binomial distributions. This could be the case if you have an experiment of the sort where you give, say, a dose of pesticide to containers of 50 flies and count the dead ones. In that case, there could be effects of getting the dose slightly wrong, the temperature of the container, and whatnot. If, on the other hand, you inject a batch of rats with a dose from a randomly chosen vial, each of which contains a carefully and individually measured-out dose, then it could be quite hard to think of a reason for something increasing or decreasing the probability for all rats at the same dose.
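To make the container scenario concrete, here is a small simulation sketch. Everything in it is hypothetical: the dose levels, the "true" curve, and the size of the container-level noise (sd 0.5 on the logit scale) are made up for illustration and are not taken from the data in this thread.

```r
set.seed(1)

# Hypothetical design: 6 containers of 50 flies, one per dose
log_conc <- log(c(1, 2, 4, 8, 16, 32))
n        <- rep(50, 6)
eta      <- -2.5 + 1.5 * log_conc   # assumed true linear predictor

# Simulate one experiment and return the residual deviance of the
# ordinary binomial fit. sd_container = 0 gives pure binomial data.
sim_deviance <- function(sd_container) {
  # container-level random effect on the logit scale
  p    <- plogis(eta + rnorm(length(eta), sd = sd_container))
  dead <- rbinom(length(eta), size = n, prob = p)
  fit  <- glm(cbind(dead, n - dead) ~ log_conc, family = binomial)
  deviance(fit)
}

dev_plain <- replicate(200, sim_deviance(0))
dev_extra <- replicate(200, sim_deviance(0.5))

# Residual df is 4 in both cases; compare the average deviances
c(plain = mean(dev_plain), extra = mean(dev_extra))
```

Without the container effect the average residual deviance should sit near the 4 residual df; with it, the average is pushed well above that, which is the kind of extra variation an overdispersion model is meant to describe.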

That being said, as far as I can tell, there's no problem in principle with using dose.p on an overdispersed model, because it only depends on vcov(obj). The most worrying bit is that the overdispersion parameter would be estimated from only 4 df.
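A minimal sketch of what this looks like in practice, assuming a quasibinomial fit is the overdispersed model in question. The data below are invented purely so the example runs; they are not the poster's data.

```r
library(MASS)  # provides dose.p()

# Made-up illustrative data (NOT the data from this thread)
conc <- c(1, 2, 4, 8, 16, 32)
n    <- rep(50, 6)
dead <- c(2, 9, 14, 34, 38, 48)
y    <- cbind(dead, n - dead)

fit_bin   <- glm(y ~ log(conc), family = binomial)
fit_quasi <- glm(y ~ log(conc), family = quasibinomial)

# Pearson-based dispersion estimate used by the quasibinomial fit
phi <- summary(fit_quasi)$dispersion

ld_bin   <- dose.p(fit_bin,   p = c(0.5, 0.95))
ld_quasi <- dose.p(fit_quasi, p = c(0.5, 0.95))

# Same point estimates; only the standard errors change
cbind(dose        = as.numeric(ld_bin),
      se_binomial = as.numeric(attr(ld_bin, "SE")),
      se_quasi    = as.numeric(attr(ld_quasi, "SE")))
```

The coefficients are identical in the two fits, so the estimated doses agree; vcov() of the quasibinomial fit is scaled by phi, so the standard errors from dose.p are multiplied by sqrt(phi).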

-pd
Here is an example of the output from R:

Call:
glm(formula = y ~ log(conc), family = binomial)

Deviance Residuals: 
       1         2         3         4         5         6  
 0.54568   1.08474   0.04561  -2.00959   0.05772   1.33891  

Coefficients:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept) -5.52815    0.85916  -6.434 1.24e-10 ***
log(conc)    0.40457    0.05938   6.813 9.56e-12 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 78.811  on 5  degrees of freedom
Residual deviance:  7.311  on 4  degrees of freedom
AIC: 30.45

Number of Fisher Scoring iterations: 4

xv<-seq(min(log(conc)-1),max(log(conc)+1),0.01)
lines(xv,predict(model,list(conc=exp(xv)),type="response"))

dose.p(model,p=c(0.10,0.25,0.5,0.75,0.90))
               Dose        SE
p = 0.10:  8.233179 0.9810446
p = 0.25: 10.948665 0.6580127
p = 0.50: 13.664152 0.4703530
p = 0.75: 16.379638 0.5720159
p = 0.90: 19.095125 0.8665399
exp(13.664152)
[1] 859539.4
exp(13.664152+(1.96*0.4703530))
[1] 2160918
exp(13.664152-(1.96*0.04703530))
[1] 783842
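Note that the lower limit above was computed with 0.04703530, whereas the SE reported by dose.p for p = 0.50 is 0.4703530 (the value used for the upper limit). A sketch of the same calculation with one SE for both limits, using only numbers copied from the output above; the variable names are illustrative.

```r
# Values copied from the summary() and dose.p() output in this thread
b0 <- -5.52815          # intercept
b1 <-  0.40457          # slope for log(conc)
ld50_log <- 13.664152   # dose.p estimate at p = 0.50, on the log(conc) scale
se_log   <- 0.4703530   # its standard error

# dose.p solves logit(p) = b0 + b1 * log(conc) for log(conc);
# with the rounded coefficients this agrees with 13.664 to about 3 decimals
ld50_check <- (qlogis(0.5) - b0) / b1

# Wald interval on the log scale, then back-transformed to concentration
z       <- qnorm(0.975)
ci_log  <- ld50_log + c(-1, 1) * z * se_log
ci_conc <- exp(ci_log)

exp(ld50_log)  # LD50 on the concentration scale
ci_conc        # lower and upper 95% limits
```

With the same SE on both sides, the lower limit comes out at roughly 3.4e5 rather than 783842, and the interval is symmetric around the LD50 on the log scale.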
BW
Adaku
________________________________________
From: peter dalgaard [pdalgd at gmail.com]
Sent: 09 July 2012 20:03
To: Lawrence, Adaku
Cc: r-help at r-project.org
Subject: Re: [R] Correcting for overdispersion
