Paulo Inácio de Knegt López de Prado wrote:
Dear r-sig-ecology users
Here follow the messages I exchanged with Ben Bolker last week about the
likelihood and frequentist approaches. We both would like to open this
topic for discussion on the list.
Best wishes
Paulo
----------------------------------------------------------------------------
Dear Dr. Bolker,
I am puzzled why some authors treat likelihood approaches as
frequentist, as it seems you did on page 13 of your book 'Ecological
Models and Data'. This sounds odd to me because what drew my attention
to likelihood was Richard Royall's book 'Statistical Evidence'. His
framing of a paradigm based on the likelihood principle, and the clear
distinction he makes between this paradigm and the frequentist and
Bayesian approaches, look quite convincing to me.
Paulo,
The likelihood function is the central concept of statistical inference,
so working with the likelihood you can have Bayesian, frequentist
(better called sampling-distribution inference), or likelihoodist
inference, depending on what you do with your likelihoods. In
Bayesian inference the likelihood function updates prior opinion by
bringing the data into the inference, in sampling-distribution inference
(a.k.a. frequentist) it allows the building of better confidence
intervals by finding in the sample space likelihood values that could
have occurred if data similar to the data you have had been obtained,
and in the direct-likelihood approach the likelihood is directly used to
compare two hypotheses or equivalently to build direct-likelihood
intervals. For example, the likelihood ratio test (not to be confused
with the pure likelihood ratio, or differences in support) based on a
limiting Chi-square distribution is a likelihood-based frequentist
method. Frequentist statisticians evaluate the likelihood from the
sample, and then proceed to evaluate the likelihood for other potential
samples, thus building their confidence intervals and p-values. On the
other hand Bayesian and likelihoodist statisticians only use the
likelihood evaluated at the actual sample that was obtained. From that
point of view one can say that Bayesians and likelihoodists are closer
to each other than to frequentists. However, both Bayesians and
frequentists base their inference on probabilities (posterior
probabilities or error rates), whereas likelihoodists base their
inference on, well, likelihood only.
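To make the distinction between the pure likelihood ratio and the
likelihood ratio test concrete, here is a minimal Python sketch. The
data (14 heads in 20 tosses) and the two hypotheses (p = 0.5 and
p = 0.7) are made up for illustration, and the function name is
hypothetical. The same log-likelihood function feeds both calculations;
only the second one refers to a sampling distribution.

```python
import math

def binom_loglik(p, k, n):
    """Binomial log-likelihood for k successes in n trials.

    The binomial coefficient is dropped: it does not depend on p,
    so it cancels in every likelihood ratio.
    """
    return k * math.log(p) + (n - k) * math.log(1 - p)

# Hypothetical data, chosen only for illustration.
k, n = 14, 20

# Direct-likelihood inference: the ratio itself is the result.
# It says how much better p = 0.7 predicts the observed data than p = 0.5.
lr = math.exp(binom_loglik(0.7, k, n) - binom_loglik(0.5, k, n))

# Frequentist LRT: compare twice the log-likelihood difference between
# the maximum-likelihood estimate and the null value with its limiting
# chi-square distribution (1 degree of freedom).
p_hat = k / n
lrt_stat = 2 * (binom_loglik(p_hat, k, n) - binom_loglik(0.5, k, n))

# For 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(lrt_stat / 2))

print(f"likelihood ratio (0.7 vs 0.5): {lr:.2f}")
print(f"LRT statistic: {lrt_stat:.2f}, asymptotic p-value: {p_value:.3f}")
```

The first number is a statement about the observed sample only; the
p-value depends on what the statistic would do over repeated samples.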
Royall's points are very convincing indeed, at least they were for me
too. Royall's concept of evidence in the sample about competing
hypotheses and on approximate likelihoods for problems with nuisance
parameters, plus Edwards' mathematical proofs of the properties of the
support function, plus Jim Lindsey's arguments about Akaike's index in
model selection, provide a complete theory of statistical inference,
based exclusively on the likelihood, IMHO.
I agree with him that we use likelihood criteria to identify, among
competing hypotheses, which one attributes the highest probability to a
given dataset. If I understood correctly, this is what Royall calls the
'evidence value' of a data set to a hypothesis 'vis a vis' other
hypotheses. I also like his idea that the role of statistics in science
is just to gauge this evidence value, no less, no more.
This approach differs from the frequentist one because the sample
space is irrelevant, that is, other datasets that might have been
observed do not affect the evidence value of the observed data set. My
favourite example is the comparison of binomial and negative binomial
experiments on coin tossing, in sections 1.11 and 1.12 of his book.
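A small Python sketch of this kind of comparison follows. The counts
(9 heads and 3 tails) and the alternative value p = 0.75 are
illustrative assumptions, not numbers taken from Royall's book. Under
design A the number of tosses is fixed at 12 (binomial); under design B
tossing continues until the 3rd tail (negative binomial). The
likelihood ratio is the same under both designs, but the one-sided
p-values for p = 0.5 differ because the sample spaces differ.

```python
from math import comb

# Hypothetical outcome, used only for illustration.
heads, tails = 9, 3
n = heads + tails

def binom_lik(p):
    """Likelihood of p (probability of heads) when n tosses were fixed in advance."""
    return comb(n, heads) * p**heads * (1 - p)**tails

def negbinom_lik(p):
    """Likelihood of p when tossing stopped at the `tails`-th tail.

    The last toss must be a tail, so the other tails - 1 tails are
    placed among the first n - 1 tosses.
    """
    return comb(n - 1, tails - 1) * p**heads * (1 - p)**tails

# Evidence for p = 0.75 against p = 0.5: the design-specific
# constants cancel, so both designs give the same ratio.
lr_binom = binom_lik(0.75) / binom_lik(0.5)
lr_negbinom = negbinom_lik(0.75) / negbinom_lik(0.5)

# One-sided p-values under p = 0.5 (evidence of too many heads).
# Design A: P(at least 9 heads in 12 tosses).
p_binom = sum(comb(n, x) for x in range(heads, n + 1)) / 2**n
# Design B: P(at least 9 heads before the 3rd tail), which equals
# P(at most 2 tails in the first 11 tosses).
p_negbinom = sum(comb(n - 1, t) for t in range(tails)) / 2**(n - 1)

print(f"likelihood ratio, binomial design:          {lr_binom:.3f}")
print(f"likelihood ratio, negative binomial design: {lr_negbinom:.3f}")
print(f"p-value, binomial design:          {p_binom:.4f}")
print(f"p-value, negative binomial design: {p_negbinom:.4f}")
```

With these counts the two p-values fall on opposite sides of 0.05,
while the likelihood ratio does not depend on the stopping rule.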
I am not an "orthodox likelihoodist"; on the contrary, I agree with the
pragmatic view you express in your book. I'd just like to understand
the key differences among the available statistical tools, in order to
make good pragmatic use of them. I'd really appreciate it if you could
help me with this.
Best wishes
Paulo
"There is nothing more practical than a good theory". I'm not sure who
was the original author of that quote (in a book I read long ago it was
said that the author was Einstein) but it applies here. Likelihoodist,
frequentist, and Bayesian inferences are not compatible, especially
likelihoodist and Bayesian versus frequentist, so the pragmatist who
changes allegiance is making an error at some point.
Very well put. Royall, and Edwards (author of _Likelihood_, Johns
Hopkins 1992) are what I would call "pure", or "hard-core",
or "orthodox", likelihoodists. They are satisfied with a statement
of relative likelihood, and don't feel the need to attach a p-value
to the result in order to have a decision rule for hypothesis rejection.
Far more commonly, however, people impose (? add ?) an additional
layer of frequentist procedure on top of this basic structure, namely
using the likelihood ratio test to assess the statistical significance
of a given observed likelihood ratio and/or to set a cutoff value
for profile confidence intervals. Using the LRT puts the inference
back squarely into the frequentist domain, although the sample space
we are now dealing with (sample space of likelihoods derived from
coin-tossing experiments) is quite different from the one
we started with (sample space of outcomes of coin-tossing experiments).
As far as I can see, Edwards and Royall are almost alone in their
adherence to "pure" likelihood -- most of the rest of us pander
to the desire for p-values (or, less cynically, to the desire
for a probabilistically sound decision rule).
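As a rough Python illustration of the two ways of setting an interval
cutoff, the sketch below computes a likelihood interval for a binomial
probability twice: once with the drop in log-likelihood taken from the
chi-square distribution (half of its 0.95 quantile with 1 degree of
freedom, about 1.92), and once with a fixed likelihood ratio of 1/8,
used here only as an example of a "pure" cutoff. The data (14 heads in
20 tosses), the grid search, and the function names are assumptions
made for illustration. With a single parameter there are no nuisance
parameters, so the profile likelihood is just the likelihood.

```python
import math
from statistics import NormalDist

def binom_loglik(p, k, n):
    """Binomial log-likelihood, dropping the constant that does not depend on p."""
    return k * math.log(p) + (n - k) * math.log(1 - p)

def likelihood_interval(k, n, drop, steps=10000):
    """Range of p whose log-likelihood is within `drop` of the maximum.

    Uses a plain grid search over (0, 1); adequate for a sketch, and it
    assumes 0 < k < n so that the maximum lies inside the grid.
    """
    max_ll = binom_loglik(k / n, k, n)
    inside = [i / steps for i in range(1, steps)
              if binom_loglik(i / steps, k, n) >= max_ll - drop]
    return inside[0], inside[-1]

# Hypothetical data, chosen only for illustration.
k, n = 14, 20

# 0.95 quantile of the chi-square distribution with 1 degree of freedom,
# obtained as the square of the 0.975 standard normal quantile.
chi2_cutoff = NormalDist().inv_cdf(0.975) ** 2

# Frequentist use: the cutoff comes from the LRT's limiting distribution.
lrt_interval = likelihood_interval(k, n, chi2_cutoff / 2)

# Pure-likelihood use: all p with likelihood at least 1/8 of the maximum.
pure_interval = likelihood_interval(k, n, math.log(8))

print(f"drop of {chi2_cutoff / 2:.2f} log-likelihood units: {lrt_interval}")
print(f"drop of {math.log(8):.2f} log-likelihood units: {pure_interval}")
```

The arithmetic is identical in both cases. What changes is the
justification for the cutoff: long-run coverage in the first case, a
stated strength of evidence in the second.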
Two other great statisticians who subscribe to the likelihoodist school
of inference are Jim Lindsey and John Nelder.
At least once a year I hear someone at a meeting say that there are two
modes of inference: frequentist and Bayesian. That this sort of nonsense
should be so regularly propagated shows how much we have to do. To begin
with there is a flourishing school of likelihood inference, to which I
belong.
I would also add that different scientists have different
goals (belief, prediction, decision, assessing evidence). I too
think Royall makes a good case for the primacy of
assessing strength-of-evidence, and he gives the clearest
explanation I have seen, but I wouldn't completely
rule out the other frameworks.
I tend to think there is a place for Bayesian inference in prediction.
Rubén