terminology for binomial regression

3 messages · Matthew Forister, Ben Bolker

#
On 11-03-05 02:59 PM, Matthew Forister wrote:
In my opinion, it would be reasonable to use 'logistic regression' to
mean any GLM (generalized linear model) with a logit link, although
most probably with the binomial family. My impression is that people
most commonly use 'logistic regression' to mean a GLM with
*binary* data and a logit link and 'binomial regression' to denote
non-binary data, but I don't have any references.
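
To make the distinction concrete, here is a minimal sketch in R using simulated data. The variable names, the number of sites, and the assumed true coefficients are illustrative assumptions, not taken from the original question. Both fits use glm() with family = binomial and its default logit link; only the shape of the response differs.

```r
set.seed(1)
year <- 0:19
n_sites <- 30                    # hypothetical number of site visits per year
p <- plogis(1 - 0.14 * year)     # assumed true probability of occurrence

## binary response: one 0/1 observation per site visit
d_bin <- data.frame(year = rep(year, each = n_sites))
d_bin$present <- rbinom(nrow(d_bin), size = 1, prob = rep(p, each = n_sites))
fit_binary <- glm(present ~ year, family = binomial, data = d_bin)

## non-binary response: counts of presences and absences per year
d_agg <- data.frame(year = year,
                    n_present = as.vector(tapply(d_bin$present, d_bin$year, sum)))
d_agg$n_absent <- n_sites - d_agg$n_present
fit_binom <- glm(cbind(n_present, n_absent) ~ year,
                 family = binomial, data = d_agg)

## same data, same model: the logit-scale coefficients agree
coef(fit_binary)
coef(fit_binom)
```

Whichever name is used, the year coefficient is on the logit scale in both fits.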
I would suggest Gelman and Hill for this, but these are statements of
changes on the logit scale ("log-odds" is a synonym).  Unfortunately,
the interpretation in terms of probability outcomes depends on the
baseline probability.  Rules of thumb are:

 (1) for small (near zero) baseline probabilities, the logistic
resembles an exponential and so the interpretations of logit-scale and
log-scale coefficients are similar, i.e. for small changes they can be
interpreted as proportional changes.  For your example above, this would
correspond to a PROPORTIONAL decline of approximately 14% per year for a
species that was already fairly rare.  (More precisely a decline of
(1-exp(-0.14))=0.13.)  (I want to emphasize that this is a change
relative to the original frequency of the species.)

 (2) for baseline probabilities near 0.5, the rule of thumb is that the
change in probability of occurrence is about r/4 (where r is the
logit-scale coefficient), so if your species
were originally present in about half of the samples a coefficient of
-0.14 would correspond to a decline of about 3.5% per year (this is
absolute rather than proportional).

 (3) For baseline probabilities near 1.0 (common species), #1 applies
but this time to the probability of non-occurrence. For example, suppose
we have a species that occurs 95% of the time.

## transform to logit scale
qlogis(0.95)  ## 2.944, call it approx 2.95
plogis(2.95-0.14) ## 0.943

## compare this with the change in the original probability of
## non-occurrence (0.05), which *increases* by 14%
1-0.05*1.14  ## 0.943
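
Rules of thumb (1) and (2) can be checked numerically in the same way. This is a small sketch, not part of the original analysis: the helper names step_prob and odds and the baseline values 0.02 and 0.3 are made up for illustration.

```r
r <- -0.14   # logit-scale coefficient from the example above

## hypothetical helper: probability after a change of r on the logit scale
step_prob <- function(p0, r) plogis(qlogis(p0) + r)

## (1) rare species (assumed baseline 0.02): the proportional change
## is close to exp(r) - 1
step_prob(0.02, r) / 0.02 - 1
exp(r) - 1

## (2) baseline near 0.5: the absolute change is close to r/4
step_prob(0.5, r) - 0.5
r / 4

## at any baseline, the odds p/(1-p) are multiplied by exp(r)
odds <- function(p) p / (1 - p)
odds(step_prob(0.3, r)) / odds(0.3)
exp(r)
```

The last comparison holds at every baseline, which is why the coefficient is constant on the logit scale even though the change in probability is not.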
#
Ben, thank you.  I did not realize the interpretation was dependent on the
baseline probabilities, but I think I get it now.  One follow-up question...

Assume for a minute that I'm not interested in converting those values into
statements of probability.  Rather, I'm interested in making comparisons
among species.  For example, a species with a value of -0.25 (for the
coefficient associated with years) is in more severe decline than a species
with a value of -0.14.

Empirically, this seems to work out just fine.  If you take a look at the
attached pdf, you'll see examples of the fit of the binomial regression
models.  The numbers on the outside are the years-coefficients.  Seems to me
that those numbers do a good job at indicating the rate of decline, even
though the starting frequencies are different for different species.

Am I making any mistake in thinking about comparisons among species based on
the years-coefficient like this?

thanks!
Matt
On Sat, Mar 5, 2011 at 12:31 PM, Ben Bolker <bbolker at gmail.com> wrote: