
Problem with MASS::fitdistr().

13 messages · Abby Spurdle, Rolf Turner, Peter Dalgaard +3 more

#
For some reason fitdistr() does not seem to be passing on the "..." 
argument "lower" to optim() in the proper manner, and as a result
falls over.

Here is my example; note that data are attached in the file "x.txt".

dhse <- function(i,alpha,beta,topn) {
    x <- seq(0,1,length=topn+2)[-c(1,topn+2)]
    p <- dbeta(x,alpha,beta)
    if(any(!is.finite(p))) browser()
    (p/sum(p))[i]
}

lwr  <- rep(sqrt(.Machine$double.eps),2)
par0 <- c(alpha=1.010652,beta=1.929018)
x    <- dget("x.txt")
fit  <- MASS::fitdistr(x,densfun=dhse,topn=5,start=as.list(par0),
                       lower=lwr)

The browser() in dhse() allows you to see that alpha has gone negative,
taking a value:
Continuing causes fitdistr() to fall over with the error message:

If I eschew using fitdistr() and "roll-my-own" as follows:

foo <- function(par,x,topn){-sum(log(dhse(i=x,alpha=par[1],
                                           beta=par[2],
                                           topn=topn)))}

fit <- optim(par0,fn=foo,method="L-BFGS-B",lower=lwr,topn=5,x=x)

then optim() returns a result without complaint.

Am I somehow messing up the syntax for fitdistr()?

cheers,

Rolf Turner

P. S. I've tried supplying the "method" argument, method="L-BFGS-B" 
explicitly to fitdistr(); doesn't seem to help.

R.T.
#
I haven't run your example.
I may try tomorrow-ish if no one else answers.

But one question: Are you sure the "x" and "i" are correct in your function?
It looks like a typo...
On Sun, Apr 26, 2020 at 2:14 PM Rolf Turner <r.turner at auckland.ac.nz> wrote:
#
I ran your example.
It's possible that it's another bug in the optim function.

Here's the optim call (from within fitdistr):

stats::optim(x = c(1, 4, 1, 2, 3, 1, 1, 1, 2, 2, 2, 2, 1, 1,
1, 1, 1, 4, 4, 3, 1, 2, 2, 1, 1, 3, 1, 1, 1, 4, 1, 1, 1, 1, 1, #more lines...
1, 4, 1, 1, 1, 5, 5, 5, 4, 5, 2, 5, 5, 5, 5, 3, 3, 5, 4, 5, 2, #removed...
4, 5, 5), topn = 5, lower = lwr, par = list(alpha = 1.010652,
    beta = 1.929018), fn = function(parm, ...) -sum(log(dens(parm, ...))),
    hessian = TRUE, method = "L-BFGS-B")

And here's dens:

function (parm, x, ...)
densfun(x, parm[1], parm[2], ...)

I can't see any reason why it should call dens with parm[1] < lower[1].
On Sun, Apr 26, 2020 at 5:50 PM Abby Spurdle <spurdle.a at gmail.com> wrote:
#
fitdistr computes a Hessian matrix.
I think optim ignores the lower value computing the Hessian.

Here's the result after removing the Hessian and Hessian-dependent info:
List of 3
 $ estimate: Named num [1:2] 0.0000000149 1.0797684972
  ..- attr(*, "names")= chr [1:2] "alpha" "beta"
 $ loglik  : num -2039
 $ n       : int 1529
 - attr(*, "class")= chr "fitdistr"
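
A self-contained toy reproduction of this (made-up quadratic objective and bounds, not the thread's beta model) is to record every parameter vector optim() evaluates and then check whether any fell below "lower" once hessian = TRUE is requested:

```r
# Toy sketch: track all evaluation points; the Hessian finite-difference
# step is the only place evaluations below 'lower' can come from, since
# L-BFGS-B itself respects the bounds.
evals <- NULL
fn <- function(p) {
  evals <<- rbind(evals, p)
  sum((p - 1)^2)          # unconstrained minimum at (1, 1)
}
opt <- optim(c(2, 2), fn, method = "L-BFGS-B",
             lower = c(1.5, 0), hessian = TRUE)
opt$par                    # constrained minimum, first coordinate on the bound
any(evals[, 1] < 1.5)      # evaluations below 'lower' indicate the Hessian step ignored the bound
```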
On Sun, Apr 26, 2020 at 7:02 PM Abby Spurdle <spurdle.a at gmail.com> wrote:
#
Dear Ms. Spurdle,

Thanks for looking into this.  I think that you are correct in that it 
is a problem with the hessian calculation.  It seems that fitdistr() 
explicitly sets hessian=TRUE, with no possibility of opting out.

It also seems that optim() ignores the "lower" argument when computing 
the hessian.  I tried setting hessian=TRUE in my roll-your-own call to 
optim() and got the same crash, with alpha = -0.001999985, which of 
course results in disaster.

Prof. Nash from time to time points out, on this and similar lists, that 
optim() --- which I believe he wrote --- is subject to problems. 
Perhaps this is one of them.  It makes sense that there would be 
difficulties with computing a hessian when the parameters are subject to 
constraints.

It's not at all clear to me how or if these difficulties can be 
circumvented.

I figured that it was kind of nice that fitdistr() provides standard 
errors for the parameter estimates, but this isn't really vital for what 
I am trying to do --- and if trying to find these standard errors makes 
the estimation procedure fall over, then clearly standard errors have 
to be ditched.

Thanks again for looking into my problem.

cheers,

Rolf
On 26/04/20 7:29 pm, Abby Spurdle wrote:

#
The optim() function has trouble calculating derivatives at/near the boundary, because it is using a simplistic finite-difference formula centered on the parameter. optimx::optimr() may work better.

-pd
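
Peter's point can be illustrated in a few lines of base R (a toy one-parameter function, not the thread's model): a central difference centered on a boundary estimate must evaluate the function outside the feasible region, while a one-sided difference need not.

```r
# Toy illustration: the objective is only defined for a >= 0,
# and the estimate sits exactly on the lower bound.
f <- function(a) {
  if (a < 0) stop("evaluated outside the feasible region")
  -log(a + 1)
}
h  <- 1e-3
a0 <- 0                                    # estimate on the boundary
# the centered formula steps to a0 - h < 0 and fails:
central <- try((f(a0 + h) - f(a0 - h)) / (2 * h), silent = TRUE)
inherits(central, "try-error")
# a one-sided (forward) difference stays feasible:
forward <- (f(a0 + h) - f(a0)) / h
```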

#
Peter is correct. I was about to reply when I saw his post.

It should be possible to suppress the Hessian call. I try to do this
generally in my optimx package as computing the Hessian by finite differences
uses a lot more compute-time than solving the optimization problem that
precedes the usual Hessian calculation.

It may be time to consider some update to optim() in that regard, or to
put in some exception handling. It isn't likely in any main-line part
of optim() but in the post-solution phase of the routines. I'd welcome
some discussion on that with a view to improvement, but am not sure if
it should be R-devel or R-package-dev lists or off-list.

John Nash
On 2020-04-26 4:53 a.m., peter dalgaard wrote:
#
I usually refer to myself as "He".
(But then, that's not the whole story...)

I'm not an expert on maximum likelihood approaches.
So, I apologize if the following suggestion is a poor one.

Does your likelihood function have a limit as alpha approaches the boundary (here zero)?
If so, the limit of the log-likelihood would be -Inf, would it not?

You could create a function representing the likelihood or
log-likelihood by wrapping your density function.
The function could allow alpha/beta values equal to or below zero, and
return the limit.
This is mathematically incorrect, but may be sufficient for
permissible estimates of the second-order partial derivatives.
Depending on the shape of the likelihood function, these
pseudo-likelihoods may be able to be improved...?

You could then do a small modification on the source code for
MASS::fitdistr, such that the user specifies the likelihood function
or log-likelihood function, rather than the density...

The fitdistr function is relatively complex; however, you would only
need to modify a couple of lines: the lines that create the fn
function...
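
A rough sketch of the wrapper idea (the names safe_negloglik and xx are invented, the large finite penalty standing in for the +Inf limit is an assumption, and dhse is copied from Rolf's first post):

```r
dhse <- function(i, alpha, beta, topn) {    # copied from earlier in the thread
  x <- seq(0, 1, length = topn + 2)[-c(1, topn + 2)]
  p <- dbeta(x, alpha, beta)
  (p / sum(p))[i]
}

# Hypothetical wrapper: outside the feasible region, return a large finite
# penalty in place of the +Inf limit of the negative log-likelihood, so
# finite-difference steps across zero no longer produce NaN.
safe_negloglik <- function(par, x, topn) {
  if (any(par <= 0)) return(1e10)
  -sum(log(dhse(i = x, alpha = par[1], beta = par[2], topn = topn)))
}

xx <- c(1, 2, 3, 5, 5)                      # made-up data, not x.txt
safe_negloglik(c(-0.1, 1.9), xx, topn = 5)  # penalty, no NaN off the boundary
safe_negloglik(c(1.0, 1.9), xx, topn = 5)   # ordinary finite value
```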
#
On 27/04/20 3:01 am, J C Nash wrote:
Thanks to Peter and John for this advice.

As to "suppressing the Hessian call" --- if I understand you correctly, 
this is not a problem with optim(); in fact by default optim() does not 
attempt to calculate the hessian.  The problem arises with 
MASS::fitdistr() which sets hessian=TRUE in the call to optim(), 
willy-nilly.

cheers,

Rolf
#
I thought about this some more and realized my last suggestion is
unlikely to work.
Another possibility would be to create a new function to compute the
Hessian with a smaller step size, but I suspect there will be more
problems.

Possibly a much simpler approach would be to:

Modify the source for fitdistr.
(Copy the source and create a new function, say fitdistr2).

Modify it to not compute the Hessian in the optim call.
Then after the optim call, test the parameter estimates.
If they're very close to the boundaries (here zero), then they're
flagged as near-boundary cases and the fitdistr2 function returns the
parameter estimates without the Hessian and related info.
(Possibly generating a warning).

If they're sufficiently distant, the Hessian and related info can be
computed in separate steps and returned.
(Equivalent to what it does currently).

I note that there's at least one R package (numDeriv), and maybe more,
for computing the Hessian numerically.
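
The post-fit logic could look something like this sketch (the toy objective and the near-boundary tolerance are assumptions; stats::optimHess is used here as a stand-in for a numDeriv-based Hessian):

```r
# Toy stand-in for the two-step logic inside a hypothetical fitdistr2():
# optimize without the Hessian, then compute it only when the estimates
# are safely away from the lower bounds.
negll <- function(par) sum((par - c(2, 3))^2)   # made-up objective, min at (2, 3)
lower <- c(0, 0)
fit <- optim(c(1, 1), negll, method = "L-BFGS-B", lower = lower)

tol <- 1e-4                                     # assumed near-boundary tolerance
if (any(fit$par - lower < tol)) {
  warning("estimates near the boundary; returning fit without Hessian")
} else {
  fit$hessian <- optimHess(fit$par, negll)      # safe: differences stay feasible
}
```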
On Mon, Apr 27, 2020 at 9:31 AM Abby Spurdle <spurdle.a at gmail.com> wrote:
#
It's been a looooooooong time, but I vaguely remember Rvmminb computing
gradients (and possibly hessians) subject to constraints. John can say
more about this, but if one is going to go through the anguish of
creating a fitdistr2, then you may want to have it call Rvmminb instead
of whatever is currently being called.
On Sun, Apr 26, 2020 at 8:55 PM Abby Spurdle <spurdle.a at gmail.com> wrote:

#
After looking at MASS::fitdistr and fitdistrplus::fitdist, the latter seems to have
code to detect a (near-)singular hessian, which is almost certainly the "crash site" for
this thread. Was that package tried in this work?

I agree with Mark that writing one's own code for this is a lot of work, and I know
the folk who worked on fitdistrplus did a lot more distribution fitting problems
than I ever did, and I suspect they encountered this issue on occasions.

JN
On 2020-04-26 9:18 p.m., Mark Leeds wrote:
#
Hello,

Inline.

At 21:30 on 27/04/20, J C Nash wrote:
I tried it. I didn't post the results because I thought that others had 
probably also tried it and I had no real contribution to give to this 
thread. Except maybe that fitdistrplus::fitdist also hardcodes

hessian = TRUE

in the calls to optim.

And that I had to add 'topn' in the 'start' list, not just 'par0'.
Here is what I got:


library(fitdistrplus)

fitdist(x, distr = dhse, start = as.list(c(par0, topn = 5)))
Called from: dhse(c(1, 4, 1, 2, 3, 1, 1, 1, 2, 2, 2, 2, 1, 1, 1, 1, 1, 4,
4, 3, 1, 2, 2, 1, 1, 3, 1, 1, 1, 4, 1, 1, 1, 1, 1, 1, 3, 4, 1,

[...]

5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 3, 5, 5, 5, 1, 2, 2, 2, 3, 1,
3, 2, 3, 1, 4, 1, 1, 3, 4, 2, 1, 1, 4, 3, 5, 5, 1, 1, 4, 1, 1,
1, 5, 5, 5, 4, 5, 2, 5, 5, 5, 5, 3, 3, 5, 4, 5, 2, 4, 5, 5),
     alpha = -0.0865702222222222, beta = 1.7577217037037, topn = 
5.77314814814815)
Browse[1]> Q


Hope this helps,

Rui Barradas