How to read the summary

3 messages · mathallan, K. Elo, Brian Ripley

#
How can I decide from the output of the summary function which glm
(fit1, fit2 or fit3) fits the data best? I don't know what to look for,
so please explain the important parts of the output.
Call:
glm(formula = Y ~ X, family = gaussian(link = "identity"))

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-3.6619  -1.9693  -0.4119   2.0787   3.9664  

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  -0.4285     1.6213  -0.264 0.798258    
X             4.3952     0.7089   6.200 0.000259 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for gaussian family taken to be 6.784605)

    Null deviance: 315.081  on 9  degrees of freedom
Residual deviance:  54.277  on 8  degrees of freedom
AIC: 51.294

Number of Fisher Scoring iterations: 2
Call:
glm(formula = Y ~ X, family = gaussian(link = "log"))

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-1.5489  -0.2960   0.4776   0.6353   1.2773  

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.50537    0.16562   3.051   0.0158 *  
X            0.66352    0.05083  13.055 1.13e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for gaussian family taken to be 1.083989)

    Null deviance: 315.0810  on 9  degrees of freedom
Residual deviance:   8.6718  on 8  degrees of freedom
AIC: 32.954

Number of Fisher Scoring iterations: 6
Call:
glm(formula = Y ~ X, family = Gamma(link = "log"))

Deviance Residuals: 
     Min        1Q    Median        3Q       Max  
-0.35269  -0.09272   0.02550   0.13625   0.18018  

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.85959    0.11244   7.645 6.04e-05 ***
X            0.53134    0.04916  10.808 4.74e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for Gamma family taken to be 0.03262828)

    Null deviance: 4.31315  on 9  degrees of freedom
Residual deviance: 0.28385  on 8  degrees of freedom
AIC: 36.65

Number of Fisher Scoring iterations: 5
#
Hi!
mathallan wrote:
Start with the AIC value (Akaike Information Criterion). The model
having the lowest AIC is the best (of the fitted models, of course).

So, in your case, the AICs are:

fit1: 51.294
fit2: 32.954
fit3: 36.65

Hence, the best model seems to be 'fit2'.

Kind regards,
Kimmo
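
A small sketch of the comparison described above, using only the AIC values printed in the three summaries. It assumes the outputs were posted in the order fit1, fit2, fit3; the names `aics`, `best` and `delta` are just illustrative.

```r
# AIC values copied from the three summaries in this thread
aics <- c(fit1 = 51.294, fit2 = 32.954, fit3 = 36.65)

sort(aics)                      # smallest AIC first
best <- names(which.min(aics))  # model with the lowest AIC
delta <- aics - min(aics)       # distance of each model from the best one
delta
```

Looking at `delta` rather than the raw values makes it easier to judge whether the gap between two models is large enough to matter.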
#
On Tue, 28 Apr 2009, K. Elo wrote:

            
Except that fit3 did not use maximum likelihood to estimate the shape 
parameter, so that is not really a valid AIC value (and the actual 
AIC will be smaller, since the maximized likelihood will be larger). 
Given that, and given that AIC differences between non-nested models 
are highly variable, I would see no clear-cut difference between fit2 
and fit3.  (Even for nested models, an AIC difference of no more than 
3.7 would not be seen as large.)
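
A rough sketch of the point about the shape parameter. The poster's X and Y were not given, so the data here are simulated and purely hypothetical. It assumes the MASS package is available and uses its gamma.shape() to get a maximum-likelihood estimate of the shape with the fitted means held fixed; `aic_glm` and `aic_ml` are illustrative names.

```r
library(MASS)  # provides gamma.shape()

set.seed(1)
# Hypothetical data standing in for the poster's X and Y (not the original data)
X <- seq(0.5, 5, length.out = 10)
Y <- rgamma(10, shape = 5, rate = 5 / exp(0.9 + 0.5 * X))

fit3 <- glm(Y ~ X, family = Gamma(link = "log"))

# AIC as stored by glm(): the dispersion comes from the deviance, not from ML
aic_glm <- fit3$aic

# Maximum-likelihood estimate of the Gamma shape, fitted means held fixed
alpha <- gamma.shape(fit3)$alpha
mu <- fitted(fit3)

# Log-likelihood at the ML shape; parameters = coefficients + 1 for the shape
ll_ml <- sum(dgamma(Y, shape = alpha, scale = mu / alpha, log = TRUE))
aic_ml <- -2 * ll_ml + 2 * (length(coef(fit3)) + 1)

c(glm = aic_glm, ml_shape = aic_ml)
```

Both values use the same parameter count, so any difference between them comes from the likelihood alone.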

This is not really about the subject line at all: 'AIC' as printed 
here is computed by glm() and not summary.glm().  There is a warning 
about it on the ?glm help page (all the 'AIC' values quoted here do 
not take account of the estimation of the dispersion parameter), and 
AIC() does a slightly better job.
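
To see where the printed value lives, here is a minimal sketch on simulated (hypothetical) data, not the poster's. It also redoes the Gaussian arithmetic by hand, assuming unit weights and a parameter count of coefficients plus one; the last line applies the same arithmetic to the deviance printed for fit1 in this thread.

```r
set.seed(1)
# Hypothetical data (not the poster's)
X <- seq(0.5, 5, length.out = 10)
Y <- 1 + 4 * X + rnorm(10, sd = 2)

fit1 <- glm(Y ~ X, family = gaussian(link = "identity"))

fit1$aic                  # stored on the fitted object by glm() itself
summary(fit1)$aic         # summary.glm() only carries the stored value along
AIC(fit1)                 # generic, computed from logLik(fit1)
attr(logLik(fit1), "df")  # parameter count used by AIC()

# By hand for the Gaussian family with unit weights
n <- length(Y)
p <- length(coef(fit1))
aic_by_hand <- n * (log(2 * pi * deviance(fit1) / n) + 1) + 2 * (p + 1)

# Same arithmetic with the numbers printed for fit1 in this thread
aic_thread <- 10 * (log(2 * pi * 54.277 / 10) + 1) + 2 * (2 + 1)
```

Comparing `aic_thread` with the 'AIC' line of the first summary is a quick way to check which deviance and parameter count went into the printed figure.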