akaike's information criterion

3 messages · Brian Ripley, Frank E Harrell Jr, Thomas Dick

#
On Thu, 13 Sep 2001, Thomas Dick wrote:

            
There's a book

     Sakamoto, Y., Ishiguro, M., and Kitagawa G. (1986). Akaike
     Information Criterion Statistics. D. Reidel Publishing Company.

for example.  And complete derivations and comments on the whole
family in chapter 2 of

     Ripley, B. D. (1996) Pattern Recognition and Neural Networks.
     Cambridge.

Those are extra parameters: add them in (unless the maximum occurs at
a range boundary).

Not at all: that is done for you in creating a regression model.

It's a minimum over a finite set of models.  Finite sets have no
concept of local minima.  However, one can have several models from which
all one-step changes (suitably defined) increase AIC.
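
The last point can be illustrated with a small sketch. It is not from the
thread: the data are made up, "one-step change" is assumed to mean adding or
dropping a single variable, and the helper names (solve, gaussian_aic) are
hypothetical. Two nearly collinear predictors are only useful together, so
both the empty model and the two-variable model are stuck points for a
one-step search.

```python
import itertools
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x

def gaussian_aic(columns, y):
    """AIC up to an additive constant for least squares with an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(design[0])
    xtx = [[sum(row[j] * row[k] for row in design) for k in range(p)]
           for j in range(p)]
    xty = [sum(row[j] * yi for row, yi in zip(design, y)) for j in range(p)]
    beta = solve(xtx, xty)
    rss = sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2
              for row, yi in zip(design, y))
    # k = p regression coefficients + 1 error variance
    return n * math.log(rss / n) + 2 * (p + 1)

# Made-up data: mutually orthogonal +/-1 patterns, all orthogonal to the intercept.
u = [1, -1, 1, -1, 1, -1, 1, -1]
v = [1, 1, -1, -1, 1, 1, -1, -1]
w = [1, 1, 1, 1, -1, -1, -1, -1]
e = [1, -1, -1, 1, -1, 1, 1, -1]

x1 = [float(a) for a in w]                 # nearly collinear with x2
x2 = [a + 0.1 * b for a, b in zip(w, u)]
x3 = [float(a) for a in v]                 # irrelevant predictor
y = [a + 0.2 * b for a, b in zip(u, e)]    # y depends on x2 - x1 plus "noise"
candidates = [x1, x2, x3]

# Score every subset of the candidate variables.
scores = {}
for size in range(len(candidates) + 1):
    for combo in itertools.combinations(range(len(candidates)), size):
        subset = frozenset(combo)
        scores[subset] = gaussian_aic([candidates[j] for j in sorted(subset)], y)

def is_stuck(subset):
    """True if adding or dropping any single variable increases AIC."""
    return all(scores[subset ^ frozenset([j])] > scores[subset]
               for j in range(len(candidates)))

local_minima = sorted(sorted(s) for s in scores if is_stuck(s))
best = sorted(min(scores, key=scores.get))
print("stuck models (variable indices):", local_minima)
print("overall minimum:", best)
```

A stepwise search started from the empty model would stop there, although the
minimum over the whole set is the model with the first two variables.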

AIC is a very general concept which arose in time series/signal
processing (and was published by name in the IEEE Trans on Automatic
Control).  It's clear how to define it for regular maximum likelihood
problems (hence the boundary restriction above).
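
For reference, the usual form of the criterion for a regular maximum
likelihood problem, followed by the Gaussian linear regression special case.
The notation is chosen here for illustration and is not from the thread:
\hat\theta is the maximum likelihood estimate, k the number of estimated
parameters, n the number of observations and RSS the residual sum of squares.
Under the advice above, k counts every estimated parameter, including any
transformation parameters estimated from the data and the dummy-variable
coefficients of a categorical predictor.

```latex
\[
  \mathrm{AIC} = -2 \log L(\hat\theta) + 2k
\]
\[
  \mathrm{AIC} = n \log\!\left(\frac{\mathrm{RSS}}{n}\right) + 2k + \text{const}
  \qquad \text{(Gaussian linear regression)}
\]
```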
#
Especially if you are going to be doing formal statistical
inference, but often even just for prediction,
model uncertainty of all types needs to be taken
into account.  The use of AIC to select from among
a small set of competing models or to select a single
"tuning constant" such as an overall shrinkage or
penalty factor does not cause many problems.  For
what you have suggested, it is possible to be
misled by unrecognized model uncertainty when
entertaining many models and transformations.
The formula for AIC in many ways assumes that
the model specification was non-stochastic.
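
A rough simulation sketch of this point, not part of the original message:
the response and all candidate predictors are pure noise, and the function
names, sample size and number of candidates are arbitrary choices made for
illustration. It compares how often AIC prefers one predictor fixed in
advance over the intercept-only model with how often it prefers the best of
20 predictors picked from the same data.

```python
import math
import random

def aic_gain(x, y):
    """AIC(intercept only) minus AIC(intercept + x); positive means x is preferred."""
    n = len(y)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    rss0 = syy                     # residual sum of squares, intercept only
    rss1 = syy - sxy ** 2 / sxx    # residual sum of squares with the slope added
    # Gaussian AIC is n*log(RSS/n) + 2k up to a constant; the slope costs 2.
    return n * math.log(rss0 / rss1) - 2.0

def simulate(n=50, n_candidates=20, n_reps=300, seed=1):
    rng = random.Random(seed)
    fixed_wins = 0
    selected_wins = 0
    for _ in range(n_reps):
        y = [rng.gauss(0, 1) for _ in range(n)]
        gains = [aic_gain([rng.gauss(0, 1) for _ in range(n)], y)
                 for _ in range(n_candidates)]
        fixed_wins += gains[0] > 0        # predictor chosen before seeing the data
        selected_wins += max(gains) > 0   # best predictor chosen using the data
    return fixed_wins / n_reps, selected_wins / n_reps

fixed_rate, selected_rate = simulate()
print("noise predictor preferred, prespecified:", fixed_rate)
print("noise predictor preferred, best of 20:  ", selected_rate)
```

The second rate should come out far larger than the first, even though no
predictor is related to the response: the AIC of a model found by searching
is not comparable to the AIC of a model specified in advance.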

See

@ARTICLE{far92cos,
  author = {Faraway, J. J.},
  year = 1992,
  title = {The cost of data analysis},
  journal = {J Comp Graphical Stat},
  volume = 1,
  pages = {213-229},
  annote = {bootstrap; validation; predictive accuracy; modeling strategy;
           regression diagnostics; model uncertainty}
}
and

@ARTICLE{cha95mod,
  author = {Chatfield, C.},
  year = 1995,
  title = {Model uncertainty, data mining and statistical inference
          (with discussion)},
  journal = JRSSA,
  volume = 158,
  pages = {419-466},
  annote = {bias by selecting model because it fits the data well; bias
           in standard errors; P. 420: ... need for a better balance in
           the literature and in statistical teaching between {\em
           techniques} and problem solving {\em strategies}. P. 421: It
           is `well known' to be `logically unsound and practically
           misleading' (Zhang, 1992) to make inferences as if a model is
           known to be true when it has, in fact, been selected from the
           {\em same} data to be used for estimation purposes. However,
           although statisticians may admit this privately (Breiman
           (1992) calls it a `quiet scandal'), they (we) continue to
           ignore the difficulties because it is not clear what else
           could or should be done. P. 421: Estimation errors for
           regression coefficients are usually smaller than errors from
           failing to take into account model specification. P. 422:
           Statisticians must stop pretending that model uncertainty
           does not exist and begin to find ways of coping with it.
           P. 426: It is indeed strange that we often admit model
           uncertainty by searching for a best model but then ignore
           this uncertainty by making inferences and predictions as if
           certain that the best fitting model is actually true. P. 427:
           The analyst needs to assess the model selection {\em process}
           and not just the best fitting model. P. 432: The use of
           subset selection methods is well known to introduce alarming
           biases. P. 433: ... the AIC can be highly biased in
           data-driven model selection situations. P. 434: Prediction
           intervals will generally be too narrow. In the discussion,
           Jamal R. M. Ameen states that a model should be (a)
           satisfactory in performance relative to the stated objective,
           (b) logically sound, (c) representative, (d) questionable and
           subject to on-line interrogation, (e) able to accommodate
           external or expert information and (f) able to convey
           information.}
}


Frank Harrell
#
Hello all,

I hope you don't mind my off-topic question. I want to use the Akaike criterion
for variable selection in a regression model. Does anyone know some basic
literature on that topic?

I'm especially interested in answers to the following questions:
1. Does the criterion have to be modified (and if so, how) if I also estimate
the transformations of the variables?

2. How is the criterion used if I use dummy variables (for categorical
data) in the model?

3. Does the criterion have only one minimum, or should I expect several local
minima?

Thank you in advance
Thomas