GLM model selection

2 messages · Antonio Olinto, Brian Ripley

#
Dear R list members,

I'd like to know whether the AIC statistic used by step( ) in R is suitable for selecting GLM models with gamma error distribution and log link function.

A statistician friend of mine (an S-PLUS user) told me to take care because in the gamma distribution phi (the scale?) is not constant. Also, the step( ) help page says that "there is a potential problem in using glm fits with a variable scale as in that case the deviance is not simply related to the maximized log-likelihood".

If it's not the most appropriate way to select the model, what would be the best way to perform the selection?

Thanks,

Antônio Olinto
Fisheries Institute
Sao Paulo - BRAZIL
www.institutopesca.sp.gov.br
#
On Fri, 29 Jun 2001, Antonio Olinto wrote:

            
> I'd like to know whether the AIC statistic used by step( ) in R is suitable
> for selecting GLM models with gamma error distribution and log link function.

Look at the extractAIC method for glm fits:
function (fit, scale = 0, k = 2, ...)
{
    n <- length(fit$residuals)
    edf <- n - fit$df.residual
    aic <- fit$aic
    c(edf, aic + (k - 2) * edf)
}

which uses the aic value from the Gamma family object:
function (y, n, mu, wt, dev)
{
    n <- sum(wt)
    disp <- dev/n
    -2 * sum(dgamma(y, 1/disp, scale = mu * disp, log = TRUE) * wt) +
        2
}

That's not the correct AIC with unknown scale, since the scale estimate is
not the MLE.
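
To make the gap concrete, here is a minimal sketch on simulated data. The data, the seed, and every variable name are assumptions for illustration and do not come from the thread. It compares the dispersion deviance/n used in the aic code above with the maximum-likelihood value, assuming unit prior weights.

```r
# Simulated data; all names and values here are illustrative assumptions.
set.seed(42)
n  <- 500
x  <- runif(n)
mu <- exp(0.5 + x)
y  <- rgamma(n, shape = 2, scale = mu / 2)   # true dispersion is 1/2
fit <- glm(y ~ x, family = Gamma(link = "log"))

# Dispersion plugged into the family's aic function: deviance / n.
disp_dev <- deviance(fit) / n

# ML estimate of the shape with the means fixed at the fitted values.
# For unit prior weights the score equation for the shape is
#   log(shape) - digamma(shape) = deviance / (2 * n).
score <- function(shape) log(shape) - digamma(shape) - deviance(fit) / (2 * n)
shape_mle <- uniroot(score, interval = c(1e-3, 1e3), tol = 1e-10)$root

# The two dispersion estimates side by side.
c(deviance_based = disp_dev, mle = 1 / shape_mle)
```

Because log(a) - digamma(a) exceeds 1/(2a) for every a > 0, the deviance-based dispersion should come out larger than the ML one, so the two AIC values cannot coincide in general.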
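
> A statistician friend of mine (S plus user) told me to take care because in
> gamma distribution phi (scale?) is not constant. Also in step( ) help page is
> written that "there is a potential problem in using glm fits with a variable
> scale as in that case the deviance is not simply related to the maximized
> log-likelihood".
>
> If it's not the most appropriate way to select the model, which would be the
> best way to perform the selection?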

You could rewrite the AIC to use the MLE for the scale and the correct
formulae.
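
One way this could look is sketched below. It is not a drop-in replacement for the code in R itself: `gamma_aic_mle` is a hypothetical helper name, the sketch assumes unit prior weights, and it finds the ML shape by one-dimensional numerical optimisation with the fitted means held fixed (the coefficient estimates of a gamma GLM do not depend on the shape). The shape is counted as one extra parameter.

```r
# Hypothetical helper (not part of R): AIC for a Gamma glm fit using the
# maximum-likelihood estimate of the shape instead of deviance / n.
gamma_aic_mle <- function(fit, k = 2) {
  stopifnot(inherits(fit, "glm"), fit$family$family == "Gamma")
  if (any(fit$prior.weights != 1))
    stop("this sketch assumes unit prior weights")
  y  <- fit$y
  mu <- fitted(fit)
  # Negative log-likelihood in the shape, optimised on the log scale so the
  # search stays within positive values.
  negll <- function(log_shape) {
    shape <- exp(log_shape)
    -sum(dgamma(y, shape = shape, scale = mu / shape, log = TRUE))
  }
  opt <- optimize(negll, interval = c(-10, 10))
  edf <- fit$rank + 1                      # regression coefficients + shape
  c(edf = edf, aic = 2 * opt$objective + k * edf, shape = exp(opt$minimum))
}

# Usage on simulated data (values are illustrative assumptions).
set.seed(1)
x   <- runif(200)
y   <- rgamma(200, shape = 3, scale = exp(1 + 2 * x) / 3)
fit <- glm(y ~ x, family = Gamma(link = "log"))
res <- gamma_aic_mle(fit)
res
AIC(fit)   # the built-in value, for comparison
```

Passing `k = log(n)` gives a BIC-style penalty, mirroring the `k` argument of the function shown earlier. To use such a criterion for stepwise selection, one would still have to arrange for the selection routine to call it, which this sketch does not attempt.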