
nls problem: singular gradient

6 messages · Jonas Stein, Duncan Murdoch, Peter Dalgaard

#
Why does nls fail with "singular gradient" here?
I've posted a minimal example at the bottom and would be very
happy if someone could help me.
Kind regards,

###########

# define some constants
smallc <- 0.0001
t <- seq(0,1,0.001)
t0 <- 0.5
tau1 <- 0.02

# generate yy(t)

yy <- 1/2 * (1 - tanh((t - t0)/smallc) * exp(-t/tau1)) + rnorm(length(t)) * 0.01

# show the curve

plot(x=t, y=yy, pch=18)

# prepare data

dd <- data.frame(y=yy, x=t)

nlsfit <- nls(data = dd, y ~ 1/2 * (1 - tanh((x - ttt)/smallc) * exp(-x/tau2)),
              start = list(ttt = 0.4, tau2 = 0.1), trace = TRUE)

# get error:
# Error in nls(data = dd, y ~ 1/2 * (1 - tanh((x - ttt)/smallc) * exp(-x/tau2)),  : 
#   singular gradient
#
On 11/07/2012 11:04 AM, Jonas Stein wrote:
Take a look at the predicted values at your starting fit:  there's a 
discontinuity at 0.4, which sure makes it look as though overflow is 
occurring.  I'd recommend expanding tanh() in terms of exponentials and
rewriting the prediction in a way that won't overflow.

Duncan Murdoch
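
For instance, a version of tanh written out in exponentials, so that no
intermediate value can overflow, might look like the following sketch (the
helper name stable_tanh is illustrative, not code from the thread):

stable_tanh <- function(z) {
  # algebraically identical to tanh(z); exp(-2*abs(z)) always lies in [0, 1],
  # so nothing here can overflow, even for huge |z|
  e <- exp(-2 * abs(z))
  sign(z) * (1 - e) / (1 + e)
}

# sanity check against the built-in
all.equal(stable_tanh(c(-50, -1, 0, 1, 50)), tanh(c(-50, -1, 0, 1, 50)))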
#
Hi Duncan,
Thank you for your suggestion. I wrote a function "mytanh" and 
nls terminates a bit later with another error message:

Error in nls(data = dd, y ~ 1/2 * (1 - mytanh((x - ttt)/1e-04) * exp(-x/tau2)),  : 
  number of iterations exceeded maximum of 50

How can I fix that?
Kind regards,
Jonas

============================ R CODE STARTS HERE =======

# truncated Taylor expansion of tanh(x) about 0 (up to the x^7 term);
# only a good approximation for |x| well below 1
mytanh <- function(x) {
  x - x^3/3 + 2*x^5/15 - 17*x^7/315
}

t <- seq(0,1,0.001)
t0 <- 0.5
tau1 <- 0.02

yy <- 1/2 * (1 - tanh((t - t0)/0.0001) * exp(-t/tau1)) + rnorm(length(t)) * 0.001

plot(x=t, y=yy, pch=18)

dd <- data.frame(y=yy, x=t)

nlsfit <- nls(data = dd, y ~ 1/2 * (1 - mytanh((x - ttt)/0.0001) * exp(-x/tau2)),
              start = list(ttt = 0.5, tau2 = 0.02), trace = TRUE)

============================ R CODE ENDS HERE =======
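
A quick check (an addition for this writeup, not part of the thread) shows why
the truncated Taylor series cannot stand in for tanh here: it only tracks tanh
near zero, while (x - ttt)/0.0001 ranges over thousands for this data.

z <- c(0.1, 1, 5, 100)
cbind(z, tanh = tanh(z), mytanh = mytanh(z))
# near z = 0 the two agree, but mytanh(5) is about -3836 (tanh(5) is 0.99991),
# and by z = 100 the polynomial has run off to about -5.4e12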
#
On 12-07-11 2:34 PM, Jonas Stein wrote:
That looks like it would overflow as soon as abs(x-ttt) got large, just 
like the original.  You might be able to fix it by following the advice 
I gave last time, or maybe you need to rescale the parameters.  In most 
cases optimizers work best when the uncertainty in the parameters is all 
on the same scale, typically around 1.

Duncan Murdoch
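
As an illustration of that advice (a sketch, not code from the thread): tau2
is expected to be near 0.02, so substituting tau2 = s/50 puts the fitted
parameter s on a scale of roughly 1. Note that this alone does not remove the
change-point problem Peter explains further down the thread:

# same model, reparameterized so that both parameters are O(1);
# tau2 = s/50, so s is expected to come out near 1
nlsfit <- nls(y ~ 1/2 * (1 - tanh((x - ttt)/1e-4) * exp(-50 * x / s)),
              data = dd, start = list(ttt = 0.4, s = 5), trace = TRUE)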
#
On 07/12/2012 01:39 AM, Duncan Murdoch wrote:
I am not sure what you mean by "rescale parameters", but I changed 
ttt and tau2 to 1 and nls still fails. Do you mean I can only use 
functions with ttt and tau2 close to 1?

Is there a better fitting function than nls for R? Even "Origin" can find 
the parameters without any problems.

nlsfit <- nls(data = dd, y ~ 1/2 * (1 - mytanh((x - ttt)/0.0001) * exp(-x/tau2)),
              start = list(ttt = 1, tau2 = 1), trace = TRUE,
              control = list(maxiter = 100))
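
One alternative worth trying (a suggestion for this writeup; it is not raised
in the thread itself) is nlsLM from the minpack.lm package, which uses the
Levenberg-Marquardt algorithm and tends to be more forgiving than plain
Gauss-Newton, although it cannot repair a gradient that is genuinely singular:

# install.packages("minpack.lm")  # if not already installed
library(minpack.lm)
nlsfit <- nlsLM(y ~ 1/2 * (1 - tanh((x - ttt)/1e-4) * exp(-x/tau2)),
                data = dd, start = list(ttt = 0.4, tau2 = 0.1), trace = TRUE)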
#
On Jul 11, 2012, at 20:34, Jonas Stein wrote:

Ouch! Your original had something that was very nearly a Heaviside function, i.e. a function that instantaneously switches from 1 to 0 at x=ttt. Replace that with a polynomial and I bet that your fitted curve has no resemblance to the original.

Change-point estimation is notoriously tricky, because the sum of squares changes discontinuously as points cross "to the other side", and there will be regions where the objective function is constant, because you can move the change point and still have the same set of points on each side of it. Your function tries to remedy this by using a soft threshold, but it is still quite an abrupt change: if you put ttt in the middle of an interval of length 0.001, then abs((x - ttt)/0.0001) will be at least 5, and

> tanh(5)
[1] 0.9999092

Furthermore, the derivative of tanh at that point is roughly

[1] -0.0001815834

I.e. moving the change point yields only a very small change in the fitted value at the neighboring x values and hardly a change at all at any other point. This is where your singular gradient comes from. 
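
The numbers are easy to verify (a check added for this writeup, using Peter's
setup): with the change point midway between two sample points, the tanh
factor is already saturated at every data point, so nudging ttt barely moves
any fitted value:

t   <- seq(0, 1, 0.001)
ttt <- 0.4005                        # midway between two sample points
th  <- tanh((t - ttt)/1e-4)
range(abs(th))                       # from 0.9999092 (nearest points) up to 1

# shifting ttt by a tenth of a grid step changes the tanh factor by less
# than 1e-3, and only at the two nearest sample points
th2 <- tanh((t - ttt - 1e-4)/1e-4)
max(abs(th2 - th))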

Pragmatically, you could try a larger value than 0.0001, but I suspect it would be wise to supplement any gradient technique with a more direct search procedure.

Overall, I'd say that you need a bit more "Fingerspitzengefühl" with this sort of optimization problem. Try this, for instance:
-pd
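
The example Peter refers to is not preserved here, but a direct search in the
spirit of his advice might look like this sketch (an illustrative
construction, not his code): hold the change point fixed, let nls fit only the
smooth parameter tau2, and scan ttt over a grid of candidate values:

rss_at <- function(ttt, data) {
  # with ttt held fixed, the model is smooth in tau2 and nls converges easily
  fit <- try(nls(y ~ 1/2 * (1 - tanh((x - ttt)/1e-4) * exp(-x/tau2)),
                 data = data, start = list(tau2 = 0.1)),
             silent = TRUE)
  if (inherits(fit, "try-error")) Inf else deviance(fit)
}

grid <- seq(0.4, 0.6, by = 0.001)
rss  <- sapply(grid, rss_at, data = dd)
grid[which.min(rss)]   # best change point on the grid; should land near 0.5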