How to read the summary - R-help

Tue, Apr 28, 2009 6:09 AM

How can I tell from the summary output which glm (fit1, fit2 or fit3)
fits the data best? I don't know what to look for, so could someone
please explain the important parts of the output?

Call:
glm(formula = Y ~ X, family = gaussian(link = "identity"))

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-3.6619  -1.9693  -0.4119   2.0787   3.9664  

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  -0.4285     1.6213  -0.264 0.798258    
X             4.3952     0.7089   6.200 0.000259 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

(Dispersion parameter for gaussian family taken to be 6.784605)

    Null deviance: 315.081  on 9  degrees of freedom
Residual deviance:  54.277  on 8  degrees of freedom
AIC: 51.294

Number of Fisher Scoring iterations: 2

Call:
glm(formula = Y ~ X, family = gaussian(link = "log"))

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-1.5489  -0.2960   0.4776   0.6353   1.2773  

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.50537    0.16562   3.051   0.0158 *  
X            0.66352    0.05083  13.055 1.13e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

(Dispersion parameter for gaussian family taken to be 1.083989)

    Null deviance: 315.0810  on 9  degrees of freedom
Residual deviance:   8.6718  on 8  degrees of freedom
AIC: 32.954

Number of Fisher Scoring iterations: 6

Call:
glm(formula = Y ~ X, family = Gamma(link = "log"))

Deviance Residuals: 
     Min        1Q    Median        3Q       Max  
-0.35269  -0.09272   0.02550   0.13625   0.18018  

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.85959    0.11244   7.645 6.04e-05 ***
X            0.53134    0.04916  10.808 4.74e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

(Dispersion parameter for Gamma family taken to be 0.03262828)

    Null deviance: 4.31315  on 9  degrees of freedom
Residual deviance: 0.28385  on 8  degrees of freedom
AIC: 36.65

Number of Fisher Scoring iterations: 5

K. Elo

Tue, Apr 28, 2009 9:13 AM

Hi!

mathallan wrote:

Start with the AIC value (Akaike Information Criterion). The model
with the lowest AIC is the best (of the fitted models, of course).

So, in your case, the AICs are:

fit1: 51.294
fit2: 32.954
fit3: 36.65

Hence, the best model seems to be 'fit2'.
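
The same comparison can be made in one call rather than by reading each summary. This is only a sketch: the original Y and X were not posted, so the data below are simulated (the seed, the X grid and the gamma parameters are all assumptions), and the AIC values will not match the ones quoted in this thread.

```r
# Simulated stand-in data: hypothetical values, not the poster's Y and X.
set.seed(1)
X <- seq(0.4, 4, by = 0.4)
Y <- rgamma(length(X), shape = 30, rate = 30 / exp(0.85 + 0.55 * X))

# The three candidate models, with the formulas and families from the question.
fit1 <- glm(Y ~ X, family = gaussian(link = "identity"))
fit2 <- glm(Y ~ X, family = gaussian(link = "log"))
fit3 <- glm(Y ~ X, family = Gamma(link = "log"))

# AIC() accepts several fitted models and returns a data frame
# with one row per model and the columns 'df' and 'AIC'.
aic_table <- AIC(fit1, fit2, fit3)
print(aic_table)

# Name of the model with the smallest AIC among those fitted.
best <- rownames(aic_table)[which.min(aic_table$AIC)]
print(best)
```

A small gap between the two lowest values should be read as "no clear winner" rather than as a ranking.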

Kind regards,
Kimmo

Brian Ripley

Wed, Apr 29, 2009 12:00 AM

On Tue, 28 Apr 2009, K. Elo wrote:

Except that fit3 did not use maximum likelihood to estimate the shape 
parameter, so that is not really a valid AIC value (and the actual 
AIC will be smaller, since the maximized likelihood will be larger). 
Given that, and that AIC differences between non-nested models are 
highly variable, I would see no clear-cut difference between fit2 and 
fit3.  (Even for nested models, an AIC difference of not more than 3.7 
would not be seen as a large difference.)

This is not really about the subject line at all: 'AIC' as printed 
here is computed by glm() and not summary.glm().  There is a warning 
about it on the ?glm help page (all the 'AIC' values quoted here do 
not take account of the estimation of the dispersion parameter), and 
AIC() does a slightly better job.
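
To illustrate the point about the shape parameter, here is a rough sketch that recomputes the Gamma AIC twice: once with the shape implied by deviance/n (which, as far as I can tell, is what the Gamma family's aic function plugs in) and once with a maximum-likelihood shape from gamma.shape() in the MASS package. The data are simulated because the original Y and X were not posted, and the helper gamma_loglik() is a hypothetical name introduced here.

```r
library(MASS)  # provides gamma.shape() for ML estimation of the shape

# Simulated stand-in data: hypothetical values, not the poster's Y and X.
set.seed(1)
X <- seq(0.4, 4, by = 0.4)
Y <- rgamma(length(X), shape = 30, rate = 30 / exp(0.85 + 0.55 * X))

fit3 <- glm(Y ~ X, family = Gamma(link = "log"))

n  <- length(Y)
mu <- fitted(fit3)
k  <- length(coef(fit3)) + 1  # regression coefficients plus one shape parameter

# Gamma log-likelihood at the fitted means for a given shape (hypothetical helper).
gamma_loglik <- function(shape) {
  sum(dgamma(Y, shape = shape, rate = shape / mu, log = TRUE))
}

shape_dev <- n / deviance(fit3)       # shape implied by dispersion = deviance / n
shape_ml  <- gamma.shape(fit3)$alpha  # ML estimate of the shape

aic_dev <- -2 * gamma_loglik(shape_dev) + 2 * k  # should match fit3$aic
aic_ml  <- -2 * gamma_loglik(shape_ml)  + 2 * k  # AIC at the ML shape

print(c(printed = fit3$aic, deviance_based = aic_dev, ml_based = aic_ml))
```

Because the ML shape maximizes the likelihood at the fitted means, the ML-based value cannot exceed the deviance-based one, which is the direction of the correction described above.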

Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595