Are likelihood approaches frequentist?
Paulo Inácio de Knegt López de Prado wrote:
Dear r-sig-ecology users,

Here follow the messages I exchanged with Ben Bolker last week about the likelihood and frequentist approaches. We would both like to open this topic for discussion on the list.

Best wishes,
Paulo

----------------------------------------------------------------------------

Dear Dr. Bolker,

I am puzzled why some authors treat likelihood approaches as frequentist, as it seems you did on page 13 of your book 'Ecological Models and Data'. This sounds odd to me because what drew my attention to likelihood was Richard Royall's book 'Statistical Evidence'. His framing of a paradigm based on the likelihood principle, and the clear distinction he makes between this paradigm and the frequentist and Bayesian approaches, look quite convincing to me.
Paulo,

The likelihood function is the central concept of statistical inference, so working with the likelihood you can have Bayesian, frequentist (better called sampling-distribution inference), or likelihoodist inference, depending on what you do with your likelihoods. In Bayesian inference the likelihood function updates prior opinion by bringing the data into the inference. In sampling-distribution (a.k.a. frequentist) inference it allows the building of better confidence intervals, by finding in the sample space the likelihood values that could have occurred had other data similar to yours been obtained. In the direct-likelihood approach the likelihood is used directly to compare two hypotheses or, equivalently, to build direct-likelihood intervals. For example, the likelihood ratio test (not to be confused with the pure likelihood ratio, or differences in support) based on a limiting chi-square distribution is a likelihood-based frequentist method.

Frequentist statisticians evaluate the likelihood from the sample, and then proceed to evaluate the likelihood for other potential samples, thus building their confidence intervals and p-values. Bayesian and likelihoodist statisticians, on the other hand, only use the likelihood evaluated at the actual sample that was obtained. From that point of view one can say that Bayesians and likelihoodists are closer to each other than to frequentists. However, both Bayesians and frequentists base their inference on probabilities (posterior probabilities or error rates), whereas likelihoodists base their inference on, well, likelihood only. Royall's points are very convincing indeed; at least they were for me too.
Royall's concept of evidence in the sample about competing hypotheses and on approximate likelihoods for problems with nuisance parameters, plus Edwards' mathematical proofs of the properties of the support function, plus Jim Lindsey's arguments about Akaike's index in model selection, provide a complete theory of statistical inference, based exclusively on the likelihood, IMHO.
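
As a rough illustration of the three uses of one likelihood described above, here is a minimal Python sketch. The data (7 successes in 20 trials), the null value 0.5, the flat prior, and all variable names are assumptions made up for this example; none of them come from the messages in this thread.

```python
import math


def log_lik(p, k, n):
    """Binomial log-likelihood for success probability p.

    The binomial coefficient is dropped: it does not depend on p and
    cancels in every ratio computed below.
    """
    return k * math.log(p) + (n - k) * math.log(1.0 - p)


# Hypothetical data: k successes in n trials.
k, n = 7, 20
p_hat = k / n    # maximum likelihood estimate
p_null = 0.5     # the competing hypothesis

# 1. Direct likelihood: the ratio itself is the reported strength of
#    evidence for p_hat over p_null. No sample space is involved.
log_ratio = log_lik(p_hat, k, n) - log_lik(p_null, k, n)
likelihood_ratio = math.exp(log_ratio)

# 2. Sampling-distribution (frequentist): the likelihood ratio test refers
#    2 * log(ratio) to its limiting chi-square distribution with 1 degree
#    of freedom, whose survival function is erfc(sqrt(x / 2)).
lrt_statistic = 2.0 * log_ratio
p_value = math.erfc(math.sqrt(lrt_statistic / 2.0))

# 3. Bayesian: with a flat prior the posterior is proportional to the
#    likelihood, so a midpoint grid over (0, 1) approximates it.
grid = [(i + 0.5) / 2000 for i in range(2000)]
weights = [math.exp(log_lik(p, k, n)) for p in grid]
posterior_below_null = (
    sum(w for p, w in zip(grid, weights) if p < p_null) / sum(weights)
)

print(f"likelihood ratio (p_hat vs 0.5): {likelihood_ratio:.2f}")
print(f"LRT p-value:                     {p_value:.3f}")
print(f"posterior P(p < 0.5 | data):     {posterior_below_null:.3f}")
```

All three numbers come from the same likelihood function; only step 2 appeals to what other samples might have produced.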
I agree with him that we use likelihood criteria to identify, among competing hypotheses, which one attributes the highest probability to a given dataset. If I understood correctly, this is what Royall calls the 'evidence value' of a data set to a hypothesis 'vis a vis' other hypotheses. I also like his idea that the role of statistics in science is just to gauge this evidence value, no less, no more. This approach differs from the frequentist one because the sample space is irrelevant; that is, other datasets that might have been observed do not affect the evidence value of the observed data set. My favourite example is the comparison of binomial and negative binomial experiments on coin tossing, in sections 1.11 and 1.12 of his book. I am not an "orthodox likelihoodist"; on the contrary, I agree with the pragmatic view you express in your book. I'd just like to understand the key differences among the available statistical tools, in order to make good pragmatic use of them. I'd really appreciate it if you could help me with this.

Best wishes,
Paulo
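
The coin-tossing comparison mentioned above can be sketched in a few lines of Python. The numbers here (9 heads and 3 tails, hypotheses p = 0.5 and p = 0.75) are hypothetical choices for illustration and are not claimed to be the ones used in Royall's book. The same sequence is analysed under two stopping rules: a fixed number of tosses (binomial) and tossing until the third tail (negative binomial).

```python
from math import comb

# Hypothetical outcome: 9 heads and 3 tails in 12 tosses.
heads, tails = 9, 3
n = heads + tails


def binom_lik(p):
    """Likelihood of P(heads) = p when the design fixed n = 12 tosses."""
    return comb(n, heads) * p**heads * (1 - p) ** tails


def negbinom_lik(p):
    """Likelihood of p when the design was 'toss until the 3rd tail'.

    The last toss must be a tail, so only the first n - 1 tosses are free.
    """
    return comb(n - 1, tails - 1) * p**heads * (1 - p) ** tails


p0, p1 = 0.5, 0.75   # two competing hypotheses (illustrative values)

# Likelihood ratios: the design-specific constants cancel, so the evidence
# for p1 over p0 is the same under both stopping rules.
lr_binom = binom_lik(p1) / binom_lik(p0)
lr_negbinom = negbinom_lik(p1) / negbinom_lik(p0)

# One-sided p-values under p0 depend on the sample space of each design.
# Binomial: probability of 9 or more heads in 12 tosses.
p_binom = sum(
    comb(n, h) * p0**h * (1 - p0) ** (n - h) for h in range(heads, n + 1)
)
# Negative binomial: 9 or more heads before the 3rd tail, which is the
# same event as at most 2 tails among the first 11 tosses.
p_negbinom = sum(
    comb(n - 1, t) * (1 - p0) ** t * p0 ** (n - 1 - t) for t in range(tails)
)

print(f"likelihood ratio, binomial design:          {lr_binom:.3f}")
print(f"likelihood ratio, negative binomial design: {lr_negbinom:.3f}")
print(f"p-value, binomial design:                   {p_binom:.4f}")
print(f"p-value, negative binomial design:          {p_negbinom:.4f}")
```

With these made-up numbers the two likelihood ratios coincide, while the two p-values fall on opposite sides of 0.05. That is the sense in which the sample space matters to the frequentist and not to the likelihoodist.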
"There is nothing more practical than a good theory". I'm not sure who the original author of that quote was (a book I read long ago said it was Einstein), but it applies here. Likelihoodist, frequentist, and Bayesian inferences are not compatible, especially likelihoodist and Bayesian versus frequentist, so the pragmatist who changes allegiance is making an error at some point.
Very well put. Royall, and Edwards (author of _Likelihood_, Johns
Hopkins 1992) are what I would call "pure", or "hard-core",
or "orthodox", likelihoodists. They are satisfied with a statement
of relative likelihood, and don't feel the need to attach a p-value
to the result in order to have a decision rule for hypothesis rejection.
Far more commonly, however, people impose (? add ?) an additional
layer of frequentist procedure on top of this basic structure, namely
using the likelihood ratio test to assess the statistical significance
of a given observed likelihood ratio and/or to set a cutoff value
for profile confidence intervals. Using the LRT puts the inference
back squarely into the frequentist domain, although the sample space
we are now dealing with (sample space of likelihoods derived from
coin-tossing experiments) is quite different from the one
we started with (sample space of outcomes of coin-tossing experiments).
As far as I can see, Edwards and Royall are almost alone in their
adherence to "pure" likelihood -- most of the rest of us pander
to the desire for p-values (or, less cynically, to the desire
for a probabilistically sound decision rule).
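
To make the "cutoff value" point concrete, here is a small Python sketch comparing two intervals built from the same relative likelihood curve. The data (7 successes in 20 trials) and the 1/8 cutoff for the pure likelihood interval are illustrative assumptions, not values taken from this thread. With a single parameter there is nothing to profile out, so the profile likelihood is just the likelihood itself.

```python
import math

# Hypothetical binomial data: k successes in n trials.
k, n = 7, 20
p_hat = k / n   # maximum likelihood estimate


def rel_lik(p):
    """Likelihood at p divided by the likelihood at the MLE."""
    ll = k * math.log(p) + (n - k) * math.log(1 - p)
    ll_hat = k * math.log(p_hat) + (n - k) * math.log(1 - p_hat)
    return math.exp(ll - ll_hat)


def interval(cutoff, steps=9999):
    """Grid search for the range of p whose relative likelihood >= cutoff."""
    grid = [(i + 1) / (steps + 1) for i in range(steps)]
    inside = [p for p in grid if rel_lik(p) >= cutoff]
    return min(inside), max(inside)


# Pure likelihood interval: the cutoff is chosen directly on the
# relative-likelihood scale (1/8 is an assumed, illustrative choice).
support_interval = interval(1 / 8)

# LRT-based 95% interval: the cutoff comes from the chi-square(1)
# distribution. Its 0.95 quantile is about 3.841459, so the interval keeps
# every p with 2 * log(L(p_hat) / L(p)) <= 3.841459.
chi2_95 = 3.841459
lrt_cutoff = math.exp(-chi2_95 / 2)
lrt_interval = interval(lrt_cutoff)

print(f"1/8 likelihood interval: {support_interval}")
print(f"LRT cutoff on the relative-likelihood scale: {lrt_cutoff:.4f}")
print(f"LRT-based 95% interval:  {lrt_interval}")
```

The two intervals are computed in exactly the same way. The only difference is where the cutoff comes from: a stated evidence threshold in one case, a sampling distribution in the other.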
Two other great statisticians who subscribe to the likelihoodist school of inference are Jim Lindsey and John Nelder. At least once a year I hear someone at a meeting say that there are two modes of inference: frequentist and Bayesian. That this sort of nonsense should be so regularly propagated shows how much we have to do. To begin with there is a flourishing school of likelihood inference, to which I belong.
I would also add that different scientists have different
goals (belief, prediction, decision, assessing evidence). I too
think Royall makes a good case for the primacy of
assessing strength-of-evidence, and he gives the clearest
explanation I have seen, but I wouldn't completely
rule out the other frameworks.
I tend to think there is a place for Bayesian inference in prediction.

Rubén