
problem with nls starting values

9 messages · Ben Bolker, Bert Gunter, Berend Hasselman +1 more

#
On 12-09-27 05:34 PM, Bert Gunter wrote:
I absolutely agree that overparameterization can lead to nonsense
results, either because one quotes point estimates without noting that
the confidence intervals are effectively infinite, or because the
optimizer does something weird (mis-converging, mis-diagnosing the CI)
without warning. I've certainly seen lots of bad examples.

  On the other hand: there's an important difference between 'true'
overparameterization (strong unidentifiability) and more general weak
overparameterization. I claim that there do exist times when it's useful
to be able to fit, say, a 3-parameter model to 4 or 5 data points. In
addition to the bad overparameterization examples cited above, I've also
seen lots of examples where people (although lacking in numerical
sophistication/chops) had trouble fitting with nls and were told "well,
you're just trying to do something silly" -- when they weren't necessarily.
1 day later
#
Hi

I would like to fit a non-linear regression to the following data:

quantiles <- seq(0.05, 0.95, 0.05)
slopes <- c( 0.000000e+00,  1.622074e-04,  3.103918e-03,  2.169135e-03,
             9.585523e-04,  1.412327e-03,  4.288103e-05, -1.351171e-04,
             2.885810e-04, -4.574773e-04, -2.368968e-03, -3.104634e-03,
            -5.833970e-03, -6.011945e-03, -7.737697e-03, -8.203058e-03,
            -7.809603e-03, -6.623985e-03, -9.414477e-03)
plot(slopes~quantiles)

I want to fit two models: asymptotic decay and logistic decay (S-shaped).
I tried self-starting functions (SSlogis and SSasymp) like this:

dframe <- data.frame(slopes, quantiles)
summary(mod1 <- nls(slopes ~ SSlogis(quantiles, Asym, xmid, scal), data = dframe))
summary(mod2 <- nls(slopes ~ SSasymp(quantiles, Asym, resp0, lrc), data = dframe))

and I tried to specify the starting values myself, but I usually can't 
even get nls() started: it's always a singular-gradient error or 
some other related message (stopped after 50 iterations, etc.). 
When I leave out some values from the middle quantiles I manage to fit a 
3-parameter logistic model, but if I use all the values it no longer 
works.
Then I simulated perfect asymptotic-decay data and tried to fit an 
nls() with the correct parameter values, but that won't work either. What 
am I doing wrong?

Any help would be most appreciated

Best

benedikt
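[For readers hitting the same wall: getInitial() runs only the self-starting step of a selfStart model, so it is a quick way to see what initial values SSlogis/SSasymp would compute on these data, or exactly where they fail. A diagnostic sketch, using the data posted above; the try() wrappers are just so a script keeps going past any error:]

```r
quantiles <- seq(0.05, 0.95, 0.05)
slopes <- c( 0.000000e+00,  1.622074e-04,  3.103918e-03,  2.169135e-03,
             9.585523e-04,  1.412327e-03,  4.288103e-05, -1.351171e-04,
             2.885810e-04, -4.574773e-04, -2.368968e-03, -3.104634e-03,
            -5.833970e-03, -6.011945e-03, -7.737697e-03, -8.203058e-03,
            -7.809603e-03, -6.623985e-03, -9.414477e-03)
dframe <- data.frame(slopes, quantiles)

## getInitial() isolates the self-start step, so any failure here is a
## starting-value problem rather than an optimizer problem
try(getInitial(slopes ~ SSlogis(quantiles, Asym, xmid, scal), data = dframe))
try(getInitial(slopes ~ SSasymp(quantiles, Asym, resp0, lrc), data = dframe))
```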
#
My guess:

You probably are overfitting your data. A straight line does about as
well as anything except for the 3 high-leverage points, which the
minimization is probably having trouble with.

-- Bert



On Thu, Sep 27, 2012 at 10:43 AM, Benedikt Gehr
<benedikt.gehr at ieu.uzh.ch> wrote:
#
thanks for your reply

I agree that an lm model would fit just as well, however the expectation 
from a mechanistic point of view would be a non-linear relationship.

Also when I "simulate" data as in

y_val <- 115 - 118 * exp(-0.12 * (seq(1, 100) + rnorm(100, 0, 0.8)))
x_val <- 1:100
plot(y_val ~ x_val)
summary(mod1 <- nls(y_val ~ a - b * exp(-c * x_val), start = list(a = 115, b = 118, c = 0.12)))

I do not get convergence. I obviously must be doing something wrong.

Thanks for the help

Benedikt
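[Worth noting about this simulation: the rnorm() noise sits inside the exponential, i.e. on x rather than on the response. A variant with additive noise on y is the more conventional way to simulate data for least squares; a sketch, with a hypothetical seed added only for reproducibility:]

```r
set.seed(1)                        # hypothetical seed, for reproducibility
x_val <- 1:100
## additive noise on the response instead of inside exp()
y_val <- 115 - 118 * exp(-0.12 * x_val) + rnorm(100, 0, 0.8)
summary(nls(y_val ~ a - b * exp(-c * x_val),
            start = list(a = 115, b = 118, c = 0.12)))
```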

On 27.09.2012 20:00, Bert Gunter wrote:
#
On 27-09-2012, at 21:15, Benedikt Gehr <benedikt.gehr at ieu.uzh.ch> wrote:
I do get convergence:
Nonlinear regression model
  model:  y_val ~ a - b * exp(-c * x_val) 
   data:  parent.frame() 
       a        b        c 
115.0420 117.0529   0.1192 
 residual sum-of-squares: 181.6

Number of iterations to convergence: 3 
Achieved convergence tolerance: 1.436e-07
R version 2.15.1 Patched (2012-09-11 r60679)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     


Berend
#
Now I feel very silly! I swear I was trying this for a long time and it 
didn't work. Now that I've closed and restarted R, it works on my 
machine too.

So is the only problem that my model is over-parameterized for the data I 
have? However, shouldn't it still be possible to fit an nls to these data?

thanks for the help

On 27.09.2012 21:27, Berend Hasselman wrote:
#
On Thu, Sep 27, 2012 at 12:43 PM, Benedikt Gehr
<benedikt.gehr at ieu.uzh.ch> wrote:
Probably. [that the model is over-parameterized for these data]
(Obviously) no. [that it should nonetheless be possible to fit the nls]

I suggest you do a little reading up on optimization.
Over-parameterization creates high-dimensional ridges in the objective
surface.

-- Bert
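[A minimal sketch of the strict form of such a ridge: when two parameters enter the model only through their product, every point along the ridge fits equally well, the Jacobian is rank-deficient, and nls() should stop with exactly the singular-gradient error discussed upthread. Illustrative data and parameter names, not from the thread:]

```r
set.seed(42)                       # illustrative data, not from the thread
x <- 1:20
y <- 2 * exp(-0.1 * x) + rnorm(20, 0, 0.05)

## 'a' and 'k' appear only as the product a*k, so the model is
## unidentifiable; expect a "singular gradient" error from nls()
try(nls(y ~ a * k * exp(-c * x), start = list(a = 1, k = 2, c = 0.1)))
```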

#
Bert Gunter <gunter.berton <at> gene.com> writes:
However, I will also point out that (from my experience and
others') nls is not the most robust optimizer ... you might consider
nlsLM (in the minpack.lm package), the nls2 package, and/or doing nonlinear
least squares by brute force using bbmle::mle2 as a convenient wrapper
for optim() or optimx().

  cheers
    Ben Bolker
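[A hedged sketch of the nlsLM() alternative mentioned above, applied to the asymptotic-decay simulation from earlier in the thread, with deliberately rough starting values; the seed is hypothetical, added only for reproducibility:]

```r
library(minpack.lm)                # provides nlsLM()

set.seed(1)                        # hypothetical seed, for reproducibility
dd <- data.frame(x = 1:100)
dd$y <- 115 - 118 * exp(-0.12 * dd$x) + rnorm(100, 0, 0.8)

## nlsLM() uses Levenberg-Marquardt, which is often (not always) more
## forgiving of rough starting values than nls()'s default Gauss-Newton
fit <- nlsLM(y ~ a - b * exp(-c * x), data = dd,
             start = list(a = 50, b = 50, c = 0.5))
coef(fit)
```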
#
Good point, Ben.

I followed up my earlier reply offline with a brief note to Benedikt
pointing out that "No" was the wrong answer: "maybe, maybe not" would
have been better.

Nevertheless, the important point here is that even if you do get
convergence, the over-parameterization means that the estimators don't
mean anything: they are poorly determined/imprecise. This is a
tautology, of course, but it is an important one. My experience is, as
here, the poster wants to fit the over-parameterized model because
"theory" demands it. That is, he wants to interpret the parameters
mechanistically. But the message of the data is: "Sorry about that
guys. Your theory may be fine, but the data do not contain the
information to tell you what the parameters are in any useful way."
We gloss over this distinction at our peril, as well as that of the
science.

Cheers,
Bert
On Thu, Sep 27, 2012 at 2:17 PM, Ben Bolker <bbolker at gmail.com> wrote: