Hi there,
I'm pretty new to the field of fitting (anything). I am trying to fit a
distribution with mle, because my real data seem to follow a
zero-inflated Poisson distribution. So far, I have tried a simple example
to see whether I understand how to do it:
library(stats4)  # provides mle()
# example count data
x <- 0:10
y <- dpois(x, lambda = 1.4)
# zero-inflated Poisson
zip <- function(x, lambda, prop) {
  (1 - prop) * dpois(x, 0) + prop * dpois(x, lambda)
}
# objective passed to mle(): squared differences between y and the ZIP fit
ll <- function(lambda = 2, prop = 0.9) {
  y.fit <- zip(x, lambda, prop)
  sum((y - y.fit)^2)
}
fit <- mle(ll)
So far, so good. The result gives me
lambda prop
1.4 1.0
which is pretty nice.
But what goes wrong if I want to display confidence intervals? I get a
lot of warnings but I simply don't know why...
confint(fit)
Does it have something to do with constraints on my parameters (lambda
should be greater than zero and prop should range from 0 to 1)? Do I have to
put them into the ll-function?
Is there any general comment on what I'm doing?
Antje
Warning with mle
2 messages · Antje, Ben Bolker
Antje Niederlein <niederlein-rstat <at> yahoo.de> writes:
[snip]
> But what goes wrong if I want to display confidence intervals? I get a
> lot of warnings but I simply don't know why...
> confint(fit)
> Does it have something to do with constraints on my parameters (lambda
> should be greater than zero and prop should range from 0 to 1)? Do I have to
> put them into the ll-function?
> Is there any general comment on what I'm doing?
You are exactly right, it is caused by violations of the
bounds on the parameters. The reason you see warnings when
you ask for confidence intervals and not for the original fit
is that confint() is computing profile confidence intervals,
which force it to evaluate the likelihood function over a
much wider range of possibilities.
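As a small illustration of where such warnings come from (a sketch, not taken from the original fit): dpois() returns NaN, with a warning, as soon as it is handed a negative lambda, and the log of a negative mixture weight is NaN as well. The values -0.1 and 1.2 below are made up just to show the effect of stepping outside the bounds.

```r
# dpois() with a negative rate: NaN plus a "NaNs produced" warning.
# suppressWarnings() only hides the warning so the values can be inspected.
val <- suppressWarnings(dpois(0:3, lambda = -0.1))
print(val)

# A mixture weight outside [0, 1] makes (1 - prop) negative,
# so its log is NaN as well (again with a warning).
bad_weight <- suppressWarnings(log(1 - 1.2))
print(bad_weight)
```

A profile that wanders past the boundary of either parameter will hit cases like these many times, which would explain a long list of similar warnings.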
There are four reasonable solutions to your problem:
1. Ignore the warnings, as long as they are all of the
same type (NaNs/NAs being produced by dbinom or dpois),
and as long as the final results look sensible.
2. Use method="L-BFGS-B" and set lower and upper bounds
on your parameters. This can be a little bit finicky because
L-BFGS-B will often try parameters *on* the boundary, and
it can't handle NAs or infinities, so you may have to set
the lower and upper bounds a little bit in from their theoretical
limits (e.g. 0.002 instead of 0).
3. Fit your parameters on the transformed scale (typically logit
for probabilities, log for Poisson intensities). This will cause
problems if the parameter really lies on the boundary, e.g.
if the best estimate of your zero-inflation parameter is zero
or very close to it.
4. Use the pscl package, which has reasonably robust and
efficient built-in functions for fitting zero-inflated (and
hurdle) models.
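Options 2 to 4 could look roughly like the sketch below. It is only an illustration under some assumptions: the data are simulated (counts, n and the seed are made up), the objective is a real negative log-likelihood rather than the least-squares function from the question, and the names dzip, nll, nll_t, log_lambda and logit_prop are hypothetical. The upper bound of 50 on lambda is an arbitrary choice to keep L-BFGS-B away from infinite values.

```r
library(stats4)

# Simulated zero-inflated counts (hypothetical data): with probability 0.7
# a draw from Poisson(1.4), otherwise a structural zero.
set.seed(1)
n <- 500
counts <- ifelse(runif(n) < 0.7, rpois(n, 1.4), 0)

# ZIP density in the question's parameterisation:
# prop is the weight of the Poisson component.
dzip <- function(x, lambda, prop) {
  (1 - prop) * (x == 0) + prop * dpois(x, lambda)
}

# Option 2: bounded optimisation, bounds set slightly inside the limits.
nll <- function(lambda = 1, prop = 0.5) {
  -sum(log(dzip(counts, lambda, prop)))
}
fit_b <- mle(nll, method = "L-BFGS-B",
             lower = c(lambda = 0.002, prop = 0.002),
             upper = c(lambda = 50, prop = 0.998))
print(coef(fit_b))

# Option 3: fit on the transformed scale (log for lambda, logit for prop),
# so every value the optimiser tries maps back into the valid range.
nll_t <- function(log_lambda = 0, logit_prop = 0) {
  -sum(log(dzip(counts, exp(log_lambda), plogis(logit_prop))))
}
fit_t <- mle(nll_t)
est <- coef(fit_t)
back <- c(lambda = exp(est[["log_lambda"]]),
          prop = plogis(est[["logit_prop"]]))
print(back)

# Profile confidence intervals on the transformed scale, then back-transformed.
ci <- confint(fit_t)
ci_back <- rbind(lambda = exp(ci["log_lambda", ]),
                 prop = plogis(ci["logit_prop", ]))
print(ci_back)

# Option 4: pscl, only run if the package is installed.
# The count intercept is log(lambda); the zero intercept is the logit of the
# zero-inflation probability, i.e. of 1 - prop in the notation used here.
if (requireNamespace("pscl", quietly = TRUE)) {
  fit_p <- pscl::zeroinfl(counts ~ 1, data = data.frame(counts = counts),
                          dist = "poisson")
  print(coef(fit_p))
}
```

Because the transformed fit never leaves the valid parameter range, confint() should run there without the boundary warnings, at the cost mentioned in point 3 when an estimate sits on the boundary.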
Good luck,
Ben Bolker