Hi,
I'm trying to fit some data using a logistic function defined as
y ~ a * (1+m*exp(-x/tau)) / (1+n*exp(-x/tau)
My data is below:
x <- 1:100
y <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,1,1,1,2,2,2,2,2,3,4,4,4,5,
5,5,5,6,6,6,6,6,8,8,9,9,10,13,14,16,19,21,
24,28,33,40,42,44,50,54,69,70,93,96,110,127,127,141,157,169,
178,187,206,216,227,236,238,244,246,250,255,255,257,260,261,262,266,268,
268,270,272,272,272,273,275,275,275,276)
My first attempt was to use nls as below:
d <- data.frame(x=x, y=y)
model <- nls(y ~ a * (1+m*exp(-x/tau)) / (1+n*exp(-x/tau)), data=d,
start=list(a=277,m=100,n=101,tau=10),
algorithm='port', trace=TRUE,
control=nls.control(maxiter=5000, minFactor=1/2048))
Running the above code I get the following error message:
Convergence failure: function evaluation limit reached without
convergence (9).
To investigate this further I used nlminb() to get a set of starting
parameters. Thus I did:
func <- function( par ) {
a = par[1]
m = par[2]
n = par[3]
tau = par[4]
a * (1+m*exp(-x/tau)) / (1+n*exp(-x/tau))
}
est <- nlminb( c(277, 100,101, 10), objective=func,
control=list(eval.max=400, iter.max=1000))
I get absolute convergence and a set of parameter values. Plugging these
into the nls call and trying again still gives me
Convergence failure: function evaluation limit reached without
convergence (9)
I have tried a number of different starting values for the nls() call
but I usually end up getting the following error:
Convergence failure: false convergence (8)
After reading the PORT library docs, I see that this error can mean
1) gradient is calculated incorrectly
2) stopping tolerances are too tight
3) gradient is discontinous near some iterate
However, since nls() usually reports the above error after 30 to 40
iterations, the PORT docs suggest that it is not problem 1. I'm not sure
about (3) - I have other data which are somewhat similar to the above
data, but they lead to a straightforward fit.
In the end I tried a different starting value and lowered the tolerances
a little, and I got a valid fit
My questions are:
1) Why would the parameters that lead nlminb() to converge to work in
nls() (since I'm using the PORT algorithm in both cases)?
2) Is there a strategy to obtain starting values? In my case I know that
a should be around 277, but for the others I'm not sure.
3) Is there a quick way to check whether the gradient is discontinous at
a point, numerically (rather than calculating the analytical
derivatives)? I did
plot(diff(y))
and it certainly looks messy, but I also have other y vectors which look
equally jagged (though the jaggedness occurs at lower x)
Any suggestions would be appreciated.
Thanks,
-------------------------------------------------------------------
Rajarshi Guha <rxg218 at psu.edu> <http://jijo.cjb.net>
GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04 06F7 1BB9 E634 9B87 56EE
-------------------------------------------------------------------
After a number of decimal places, nobody gives a damn.
suggestions for nls error: false convergence
6 messages · Rajarshi Guha, Spencer Graves, Christian Ritz +1 more
4 days later
I generally prefer "optim" to "nlminb", because it will optionally
return the hessian, and a review of the eigenvalues and vectors can help
diagnose problems like this. When I tried "optim" with your "func", I
got an error message:
> est. <- optim( c(277, 100,101, 10), func,
+ hessian=TRUE)
Error in optim(c(277, 100, 101, 10), func, hessian = TRUE) :
objective function in optim evaluates to length 100 not 1
For both "optim" and "nlminb", the second argument is a "function to
be minimized", which must return a scalar. Your "func" returned a
vector of length 100 = length(x); "nlminb" tried to minimize only the
first component of that vector without reference to y!
I therefore modified your "func" as follows and tried it with
"nlminb" and with three methods of "optim":
func1 <- function( par,y, x ) {
a = par[1]
m = par[2]
n = par[3]
tau = par[4]
y. <- a * (1+m*exp(-x/tau)) / (1+n*exp(-x/tau))
sum((y-y.)^2)
}
est.n <- nlminb( c(277, 100,101, 10), objective=func1,
control=list(eval.max=400, iter.max=1000), y=y, x=x)
est.o1 <- optim( c(277, 100,101, 10), func1,
hessian=TRUE, y=y, x=x)
est.o2 <- optim( c(277, 100,101, 10), func1,
method="BFGS", hessian=TRUE, y=y, x=x)
est.o3 <- optim( c(277, 100,101, 10), func1,
method="CG", hessian=TRUE, y=y, x=x)
est.o4 <- optim( c(277, 100,101, 10), func1,
method="SANN", hessian=TRUE, y=y, x=x)
These 5 optimizers found the following "minima", returning
"convergence" messages as follows:
nlminb: $objective=808.7282; $convergence=0, "relative convergence (4)"
optim(method="Nelder-Mead"(default)): $value=9032.814 (11 times
nlminb); $convergence=1 (the iteration limit 'maxit' had been reached.)
optim(method="BFGS"): $value=1189625 (1471 times nlminb);
$convergence=0 ("successful convergence")
optim(method="CG"): $value=1189574 (1471 times nlminb); $convergence=1
(the iteration limit 'maxit' had been reached.)
optim(method="SANN"): $value=42819.06 (53 times nlminb);
$convergence=0 ("successful convergence")
Clearly, nlminb got the smallest residual sum of squares. To get the
hessian there, I fed the nlminb output into optim as follows:
est.no1 <- optim(est.n$par, func1,
hessian=TRUE, y=y, x=x)
est.no.hess.eig <- eigen(est.no1$hessian, symmetric=TRUE)
signif(est.no.hess.eig$values, 2)
[1] 6.9e+05 3.2e+01 1.6e-04 -1.0e-07
Since the smallest eigenvalue is negative, we might think that the
hessian is indefinite. However, the smallest eigenvalue is less than
1e-12 times the largest in absolute value, which really means that the
smallest eigenvalue is essentially zero. Moreover, since the third
eigenvalue is less than 1e-9 times the largest, the true rank of the
hessian is at most 2. If we also compare the first and second
eigenvalues, we that the first is over 2000 times the second. This says
that with your model and your data, you really can only estimate one
parameter; everything else looks like noise, to me.
The eigenvectors provide more detail:
> round(est.no.hess.eig$vectors, 2)
[,1] [,2] [,3] [,4]
[1,] -0.01 1.00 0.00 0.00
[2,] 0.00 0.00 1.00 0.01
[3,] 0.00 0.00 -0.01 1.00
[4,] 1.00 0.01 0.00 0.00
>
This says that your data can estimate the fourth parameter very well,
as the largest eigenvalue is associated almost exclusively with that
parameter. You might also be able to estimate the second parameter
somewhat, as it is associated almost exclusively with the second
eigenvalue. However, you have to move the second and third parameters
much more than the first and fourth to see much change in the residual
sum of squares.
For more information, I suggest you consult the following:
Seber and Wild (1988) Nonlinear Regression (Wiley; reprinted in 2003,
esp. ch. 7 on "Growth Models")
Bates and Watts (1988) Nonlinear Regression Analysis and Its
Applications (Wiley, esp. ch. 6 on "Graphical Summaries")
hope this helps.
spencer graves
Rajarshi Guha wrote:
Hi,
I'm trying to fit some data using a logistic function defined as
y ~ a * (1+m*exp(-x/tau)) / (1+n*exp(-x/tau)
My data is below:
x <- 1:100
y <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,1,1,1,2,2,2,2,2,3,4,4,4,5,
5,5,5,6,6,6,6,6,8,8,9,9,10,13,14,16,19,21,
24,28,33,40,42,44,50,54,69,70,93,96,110,127,127,141,157,169,
178,187,206,216,227,236,238,244,246,250,255,255,257,260,261,262,266,268,
268,270,272,272,272,273,275,275,275,276)
My first attempt was to use nls as below:
d <- data.frame(x=x, y=y)
model <- nls(y ~ a * (1+m*exp(-x/tau)) / (1+n*exp(-x/tau)), data=d,
start=list(a=277,m=100,n=101,tau=10),
algorithm='port', trace=TRUE,
control=nls.control(maxiter=5000, minFactor=1/2048))
Running the above code I get the following error message:
Convergence failure: function evaluation limit reached without
convergence (9).
To investigate this further I used nlminb() to get a set of starting
parameters. Thus I did:
func <- function( par ) {
a = par[1]
m = par[2]
n = par[3]
tau = par[4]
a * (1+m*exp(-x/tau)) / (1+n*exp(-x/tau))
}
est <- nlminb( c(277, 100,101, 10), objective=func,
control=list(eval.max=400, iter.max=1000))
I get absolute convergence and a set of parameter values. Plugging these
into the nls call and trying again still gives me
Convergence failure: function evaluation limit reached without
convergence (9)
I have tried a number of different starting values for the nls() call
but I usually end up getting the following error:
Convergence failure: false convergence (8)
After reading the PORT library docs, I see that this error can mean
1) gradient is calculated incorrectly
2) stopping tolerances are too tight
3) gradient is discontinous near some iterate
However, since nls() usually reports the above error after 30 to 40
iterations, the PORT docs suggest that it is not problem 1. I'm not sure
about (3) - I have other data which are somewhat similar to the above
data, but they lead to a straightforward fit.
In the end I tried a different starting value and lowered the tolerances
a little, and I got a valid fit
My questions are:
1) Why would the parameters that lead nlminb() to converge to work in
nls() (since I'm using the PORT algorithm in both cases)?
2) Is there a strategy to obtain starting values? In my case I know that
a should be around 277, but for the others I'm not sure.
3) Is there a quick way to check whether the gradient is discontinous at
a point, numerically (rather than calculating the analytical
derivatives)? I did
plot(diff(y))
and it certainly looks messy, but I also have other y vectors which look
equally jagged (though the jaggedness occurs at lower x)
Any suggestions would be appreciated.
Thanks,
-------------------------------------------------------------------
Rajarshi Guha <rxg218 at psu.edu> <http://jijo.cjb.net>
GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04 06F7 1BB9 E634 9B87 56EE
-------------------------------------------------------------------
After a number of decimal places, nobody gives a damn.
______________________________________________ R-help at stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Spencer Graves, PhD Senior Development Engineer PDF Solutions, Inc. 333 West San Carlos Street Suite 700 San Jose, CA 95110, USA spencer.graves at pdf.com www.pdf.com <http://www.pdf.com> Tel: 408-938-4420 Fax: 408-280-7915
Hi.
An alternative is to use the package 'drc' on CRAN to fit your data!
x <- 1:100
y <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,1,1,1,2,2,2,2,2,3,4,4,4,5,
5,5,5,6,6,6,6,6,8,8,9,9,10,13,14,16,19,21,
24,28,33,40,42,44,50,54,69,70,93,96,110,127,127,141,157,169,
178,187,206,216,227,236,238,244,246,250,255,255,257,260,261,262,266,268,
268,270,272,272,272,273,275,275,275,276)
## Defining the function (in a bit different notation)
logi <- function(dose, parm){parm[, 1] * (1+parm[, 2]*exp(-dose/parm[, 3])) / (1+parm[, 4]*exp(-dose/parm[, 3]))}
## Fitting the function to the data (see ?multdrc for details)
library(drc)
model <- multdrc(y~x, fct = list(logi, NULL, c("a", "m", "tau", "n")), startVal=c(277, 100, 10, 101))
summary(model)
plot(model, log="")
Hope this helps?
Christian
Hi, Christian:
That's very interesting. May I ask how "multdrc" converged and found
all 4 parameters statistically significant when "nls" failed and "optim"
told me the hessian was singular?
To try to study this myself, I first compared your numbers with what
I got and found that you had swapped the third and fourth parameters.
When I swapped them back, I got the same estimates but substantially
different standard errors from your function:
logi <- function(dose, parm){
parm[, 1] * (1+parm[, 2]*exp(-dose/parm[, 3])) / (1+parm[,
4]*exp(-dose/parm[, 3]))}
logi. <- function(dose, parm){
parm[, 1] * (1+parm[, 2]*exp(-dose/parm[, 4])) / (1+parm[,
3]*exp(-dose/parm[, 4]))}
## Fitting the function to the data (see ?multdrc for details)
library(drc)
model <- multdrc(y~x, fct = list(logi, NULL, c("a", "m", "tau", "n")),
startVal=c(277, 100, 10, 101))
model. <- multdrc(y~x, fct = list(logi., NULL, c("a", "m", "n", "tau")),
startVal=c(277, 100, 101, 10))
signif(summary(model)$estimates, 2)
Estimate Std. Error t-value p-value
a:(Intercept) 2.8e+02 9.6e-01 290.0 4.5e-143
m:(Intercept) 1.5e+03 6.0e+02 2.6 1.2e-02
tau:(Intercept) 5.6e+00 9.6e-02 59.0 2.7e-77
n:(Intercept) 2.3e+05 4.7e+04 4.8 5.1e-06
> signif(summary(model.)$estimates, 2)
Estimate Std. Error t-value p-value
a:(Intercept) 2.8e+02 8.5e-01 320.0 4.2e-148
m:(Intercept) 1.5e+03 4.8e+02 3.2 1.7e-03
n:(Intercept) 2.3e+05 3.3e+04 6.9 6.0e-10
tau:(Intercept) 5.6e+00 6.8e-02 83.0 1.2e-91
I then tried "optim" again, rescaling all parameters by your
estimated standard errors. If your code does what I thought it might,
would expect that the residual variance times the main diagonal of the
inverse of the Hessian should be all 1's. That's not what I got, so I'm
confused. Here's what I did:
rescale <- summary(model.)$estimates[,2]
func2 <- function( par,y, x, rescale ) {
par <- rescale*par
a = par[1]
m = par[2]
n = par[3]
tau = par[4]
y. <- a * (1+m*exp(-x/tau)) / (1+n*exp(-x/tau))
sum((y-y.)^2)
}
est.no2 <- optim(est.n$par/rescale,
func2, hessian=TRUE, y=y, x=x, rescale=rescale)
round(var.resid <- est.no2$value/96, 2)
[1] 8.42
est.no2.hessEig <-
eigen(est.no2$hessian, symmetric=TRUE)
> round(est.no2.hessEig$values, 1)
[1] 6134.1 52.4 24.2 4.7
# MUCH better than before.
hessInv <-
with(est.no2.hessEig, vectors %*%(t(vectors)/values))
> var.par <- (est.no2$value/96)*hessInv
> round(sqrt(diag(var.par)), 2)
[1] 0.72 0.76 0.79 0.77
What am I missing?
Thanks again for your work in the "drc" package and for your earlier
reply to my comments on this example.
Spencer Graves
Christian Ritz wrote:
Hi.
An alternative is to use the package 'drc' on CRAN to fit your data!
x <- 1:100
y <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,1,1,1,2,2,2,2,2,3,4,4,4,5,
5,5,5,6,6,6,6,6,8,8,9,9,10,13,14,16,19,21,
24,28,33,40,42,44,50,54,69,70,93,96,110,127,127,141,157,169,
178,187,206,216,227,236,238,244,246,250,255,255,257,260,261,262,266,268,
268,270,272,272,272,273,275,275,275,276)
## Defining the function (in a bit different notation)
logi <- function(dose, parm){parm[, 1] * (1+parm[, 2]*exp(-dose/parm[, 3])) / (1+parm[, 4]*exp(-dose/parm[, 3]))}
## Fitting the function to the data (see ?multdrc for details)
library(drc)
model <- multdrc(y~x, fct = list(logi, NULL, c("a", "m", "tau", "n")), startVal=c(277, 100, 10, 101))
summary(model)
plot(model, log="")
Hope this helps?
Christian
______________________________________________ R-help at stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Spencer Graves, PhD Senior Development Engineer PDF Solutions, Inc. 333 West San Carlos Street Suite 700 San Jose, CA 95110, USA spencer.graves at pdf.com www.pdf.com <http://www.pdf.com> Tel: 408-938-4420 Fax: 408-280-7915
Hi Spencer.
When using 'optim' and the first try fails you could:
1) try some other methods: Nelder-Mead, BFGS, ...
2) increase the maximum number of iterations (argument maxit in the control list)
3) specify the argument parscale in the control list, in order to have all parameters of same magnitude during
optimisation (this is useful if the parameters are suspected to be of different magnitudes).
Using the default method (Nelder-Mead) with maxit=1000 results in convergence, and essentially the same estimates are
obtained if you use the method BFGS and set maxit=1000 and parscale=c(277, 100, 101, 10) (the initial starting values):
x <- 1:100
y <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,1,1,1,2,2,2,2,2,3,4,4,4,5,
5,5,5,6,6,6,6,6,8,8,9,9,10,13,14,16,19,21,
24,28,33,40,42,44,50,54,69,70,93,96,110,127,127,141,157,169,
178,187,206,216,227,236,238,244,246,250,255,255,257,260,261,262,266,268,
268,270,272,272,272,273,275,275,275,276)
func2 <- function( par,y, x, rescale ) {
par <- rescale*par
a = par[1]
m = par[2]
n = par[3]
tau = par[4]
y. <- a * (1+m*exp(-x/tau)) / (1+n*exp(-x/tau))
sum((y-y.)2)
}
est.no2 <- optim(c(277, 100, 101, 10), func2, hessian=TRUE, y=y, x=x, rescale=1, control=list(maxit=1000))
est.no3 <- optim(c(277, 100, 101, 10), func2, hessian=TRUE, method="BFGS", y=y, x=x, rescale=1,
control=list(maxit=1000, parscale=c(277, 100, 101, 10)))
The optimisation in the package 'drc' uses BFGS with the maxit and parscale arguments specified.
Best wishes
Christian
Sometimes its just one parameter that's the culprit so just use a grid to get the starting value. Since its known that in logistic growth the saturation level is a problem we conjecture that in this one m is the culprit and grid over it. Note that with this approach we did not need to extend the allowable iterations at all so the control arg has been removed simplifying the call:
for(m in seq(50, 150, 25)) {
+ model <- try(nls(y ~ a * (1+m*exp(-x/tau)) / (1+n*exp(-x/tau)),
+ data=d, start = list(a = 277, m = m, n = 101, tau = 10),
+ algorithm='port'))
+ if (!inherits(model, "try-error")) print(model)
+ }
Nonlinear regression model
model: y ~ a * (1 + m * exp(-x/tau))/(1 + n * exp(-x/tau))
data: d
a m n tau
2.757817e+02 1.594652e+03 2.312661e+05 5.617307e+00
residual sum-of-squares: 808.7282
Nonlinear regression model
model: y ~ a * (1 + m * exp(-x/tau))/(1 + n * exp(-x/tau))
data: d
a m n tau
2.757817e+02 1.594652e+03 2.312661e+05 5.617307e+00
residual sum-of-squares: 808.7282
Error in nls(y ~ a * (1 + m * exp(-x/tau))/(1 + n * exp(-x/tau)), data = d, :
Convergence failure: iteration limit reached without convergence (9)
Error in nls(y ~ a * (1 + m * exp(-x/tau))/(1 + n * exp(-x/tau)), data = d, :
Convergence failure: singular convergence (7)
Nonlinear regression model
model: y ~ a * (1 + m * exp(-x/tau))/(1 + n * exp(-x/tau))
data: d
a m n tau
2.757817e+02 1.594652e+03 2.312661e+05 5.617307e+00
residual sum-of-squares: 808.7282
On 12/19/05, Christian Ritz <ritz at bioassay.dk> wrote:
Hi Spencer.
When using 'optim' and the first try fails you could:
1) try some other methods: Nelder-Mead, BFGS, ...
2) increase the maximum number of iterations (argument maxit in the control list)
3) specify the argument parscale in the control list, in order to have all parameters of same magnitude during
optimisation (this is useful if the parameters are suspected to be of different magnitudes).
Using the default method (Nelder-Mead) with maxit=1000 results in convergence, and essentially the same estimates are
obtained if you use the method BFGS and set maxit=1000 and parscale=c(277, 100, 101, 10) (the initial starting values):
x <- 1:100
y <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,1,1,1,2,2,2,2,2,3,4,4,4,5,
5,5,5,6,6,6,6,6,8,8,9,9,10,13,14,16,19,21,
24,28,33,40,42,44,50,54,69,70,93,96,110,127,127,141,157,169,
178,187,206,216,227,236,238,244,246,250,255,255,257,260,261,262,266,268,
268,270,272,272,272,273,275,275,275,276)
func2 <- function( par,y, x, rescale ) {
par <- rescale*par
a = par[1]
m = par[2]
n = par[3]
tau = par[4]
y. <- a * (1+m*exp(-x/tau)) / (1+n*exp(-x/tau))
sum((y-y.)2)
}
est.no2 <- optim(c(277, 100, 101, 10), func2, hessian=TRUE, y=y, x=x, rescale=1, control=list(maxit=1000))
est.no3 <- optim(c(277, 100, 101, 10), func2, hessian=TRUE, method="BFGS", y=y, x=x, rescale=1,
control=list(maxit=1000, parscale=c(277, 100, 101, 10)))
The optimisation in the package 'drc' uses BFGS with the maxit and parscale arguments specified.
Best wishes
Christian
______________________________________________ R-help at stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html