
Nonlinear fitting when both x and y have measurement error?

8 messages · Etsushi Kato, Brian Ripley, Martyn Plummer +2 more

#
Dear r-help,

I want to conduct nonlinear fitting on a data frame with x and y
variables.  Because both x and y have measurement error, I want to
include an error term for the x variable in the model.  I'm not sure,
but I think the ordinary nls model only considers an error term for
the y variable.

How can I do this kind of nonlinear fitting in R?  Are there any
examples in the nls package?

Thanks in advance,
#
On Wed, 12 Sep 2001, Etsushi Kato wrote:

That is not a least-squares problem.  Even in the simple linear case (one
x, one y) it's a hard problem, one that cannot be solved without more
information (for example, the ratio of the error variances, or knowing
one of them).  As far as I know there is no software available in R for
that case (although it would not be hard to write).

I think you need to write down a suitable likelihood and optimize it
numerically (with optim).
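A minimal sketch of that likelihood-plus-numerical-optimization idea. The thread's natural tool is R's optim; here scipy.optimize is used analogously. The model y = theta*x^2, the known unit error variances, and the data are all made-up assumptions for illustration: each latent true x value is profiled out by an inner one-dimensional minimization.

```python
# Profile likelihood for a nonlinear errors-in-variables fit.
# Assumed model: y = theta * x^2, with N(0, sx2) error on x and
# N(0, sy2) error on y (both variances treated as known).
import numpy as np
from scipy.optimize import minimize_scalar

def f(xi, theta):
    return theta * xi**2

def nll(theta, x, y, sx2=1.0, sy2=1.0):
    # Up to constants, -2*log-likelihood under normality: profile out
    # each latent true x value by an inner 1-D minimization.
    total = 0.0
    for xo, yo in zip(x, y):
        inner = minimize_scalar(
            lambda xi: (xo - xi)**2 / sx2 + (yo - f(xi, theta))**2 / sy2,
            bounds=(xo - 3.0, xo + 3.0), method="bounded")
        total += inner.fun
    return total

# Made-up noiseless data generated at theta = 2, just to exercise the code.
x = np.array([0.5, 1.0, 1.5, 2.0])
y = 2.0 * x**2
fit = minimize_scalar(lambda t: nll(t, x, y), bounds=(0.1, 5.0), method="bounded")
theta_hat = fit.x
```

In R the same structure would be two nested calls to optimize/optim over the profiled negative log-likelihood.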

This topic is not well covered in regression texts, although it is in that
by Sprent, for example.  There is a specialist text by W. A. Fuller (1987)
Measurement Error Models.
#
On 12-Sep-2001 Prof Brian Ripley wrote:
Non-linear measurement error models are covered by:

Carroll, R.J., Ruppert, D. and Stefanski, L.A.  Measurement Error
in Nonlinear Models, Chapman & Hall 1995 

Martyn
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
#
Thank you very much, Prof. Ripley and Martyn Plummer.
I also got a suggestion to use ODRPACK, at http://www.netlib.org/.
My data is an allometric relationship of biological data, and it is
perhaps OK to assume both variables have equal accuracies.  So I was
just considering using orthogonal distance to the nonlinear curve.
It's a bit difficult for me to construct the likelihood function,
because right now none of the books you suggested are available to me...
For the moment, I'm considering using ODRPACK.  I'll try to construct
R code later.


Best Regards,
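The ODRPACK suggestion maps directly onto Python's scipy.odr, which wraps ODRPACK. A sketch for an allometric power law y = a*x^b; the data, the noise levels, and the starting values are all made up for illustration:

```python
# Orthogonal distance regression (ODRPACK via scipy.odr) for an
# assumed allometric power law y = a * x^b with errors in both variables.
import numpy as np
from scipy.odr import Model, RealData, ODR

def power_law(beta, x):
    a, b = beta
    return a * x**b

# Hypothetical data: true curve a = 2, b = 1.5, small equal errors on x and y.
rng = np.random.default_rng(0)
x_true = np.linspace(1.0, 10.0, 30)
x_obs = x_true + rng.normal(0.0, 0.05, x_true.size)
y_obs = 2.0 * x_true**1.5 + rng.normal(0.0, 0.05, x_true.size)

# sx and sy are the (assumed known) error standard deviations.
data = RealData(x_obs, y_obs, sx=0.05, sy=0.05)
out = ODR(data, Model(power_law), beta0=[1.0, 1.0]).run()
a_hat, b_hat = out.beta
```

With equal sx and sy this minimizes the sum of squared orthogonal distances to the curve, which matches the equal-accuracies assumption above.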
3 days later
#
I've seen the answers that point in the measurement model
direction, but I wonder if there is not a more direct approach.

My copy of Pindyck and Rubinfeld's Econometric Models and
Economic Forecasts outlines an instrumental-variable
approach in which the x with error is replaced by an instrument:
a predicted value from an auxiliary model in which x is
regressed on other exogenous/predetermined variables.  They
prove the parameter estimates are consistent, which (I believe)
is about the best we can hope for.

One advantage of that strategy is that one need not assume a
specific distribution for the error terms involved, only
something general like E(e) = 0 and constant variance.  The ML
approach will require the choice of a precise distribution.  Not
so?

They don't show that the approach works when the relationship
between x and y is nonlinear.  Come to think of it, I don't
recall a treatment of IV applications for nonlinear equations.

This is a great question and I'm interested to hear more about
how the project works out in the end.
pj
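The IV idea in the simple linear case can be shown with a hypothetical simulation: measurement error in x attenuates the OLS slope toward zero, while an instrument z (correlated with the true x but not with its measurement error) recovers the true slope. Everything here is made up; the simplest just-identified IV estimator cov(z,y)/cov(z,x) is used.

```python
# Linear errors-in-variables: OLS is attenuated, IV is consistent.
# Hypothetical simulation; true slope is 2.
import random

random.seed(42)
n = 50000
xs = [random.gauss(0, 1) for _ in range(n)]        # latent true x*
x = [xi + random.gauss(0, 1) for xi in xs]         # observed x = x* + error
z = [xi + random.gauss(0, 1) for xi in xs]         # instrument: tied to x*, not to x's error
y = [2.0 * xi + random.gauss(0, 1) for xi in xs]   # y = 2*x* + equation error

def cov(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

beta_ols = cov(x, y) / cov(x, x)   # biased toward 0 (here roughly 1.0)
beta_iv = cov(z, y) / cov(z, x)    # consistent (here roughly 2.0)
```

As Ripley notes below, all of this depends on having a plausible instrument in the first place.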
#
On Sat, 15 Sep 2001 pauljohn at ukans.edu wrote:

It's an asymptotic notion.  It may be better to be a little biased and a
lot more precise in small samples.
Not so!  There are sum-of-squares formulations and those were thought of
first (in the 19th century).  If the error variances are known to be
constant and known to be equal, the MLE under normality minimizes the sum
of squared perpendicular distances to the curve.  ML theory helps when
those assumptions are not true.
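In the linear case, that equal-and-known-variances MLE (orthogonal, or Deming, regression) even has a closed form. A pure-Python sketch; the slope formula is the larger root of the quadratic from setting the derivative to zero, and it assumes the cross-moment sxy is nonzero:

```python
# Orthogonal (perpendicular-distance) regression for y = a + b*x:
# the MLE under normality when both error variances are equal and known.
import math

def orthogonal_line_fit(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx)**2 for xi in x)
    syy = sum((yi - my)**2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    # Closed-form slope; assumes sxy != 0.
    b = (syy - sxx + math.sqrt((syy - sxx)**2 + 4.0 * sxy**2)) / (2.0 * sxy)
    a = my - b * mx
    return a, b

# Points exactly on y = 2x + 1 should be recovered exactly.
a, b = orthogonal_line_fit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

Unlike ordinary least squares, this fit treats x and y symmetrically, which is exactly what the equal-variances assumption demands.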

But these arguments are slippery. Least-squares regression does not assume
a normal model.  But it is optimal for a normal model (and for nothing
else), and seriously sub-optimal for very small departures from the normal
model (enter Robust Statistics).  So you can only justify using it if
normality has been tested to be a plausible working assumption and the
damaging departures have been guarded against. Or of course, if no better
tools are available.  That's why I believe projects like R are so
important: to make better tools available.


One major problem with the IV approach as I understand it is that you need
a good instrument. Both in this problem (allometric relationship of
biological data, in my understanding) and the ones I am familiar with
(calibration in chemistry, line-fitting in radio astronomy) I do not think
there are any plausible instruments: everything known is `known' with
considerable error, and all errors on one unit will be correlated.

1 day later
#
If you post your question on sci.math.num-analysis you might get pointers
to free code (probably fortran). This might be what you want:
http://www.netlib.org/odrpack/

For the linear errors-in-variables problem, you might want to look at:
http://www.pma.caltech.edu/~glenn/glove/glove.html
I think the source code is available.


They say:
"[Glove gives  the] Only correct linear fitting analysis, that we could
find, for data with errors in both variables. "
They don't actually say what algorithm they implement.

I think someone already pointed to a book on nonlinear errors-in-variables
models.

Here are some refs on the linear case.

Press et al., Numerical Recipes in C, Cambridge Univ Press, pp. 666-670.
They also give code and some refs.

R. Lupton (1993), Statistics in Theory and Practice, Princeton Univ Press,
pp. 92-97.  He also gives some further refs.

Bill

#
Here are some programs:

http://astro.u-strasbg.fr/~fmurtagh//mda-sw/
Errors-in-variables regression, 2-dimensional (Fortran) 
      leiv1.f, York (Can. J. Phys. 44, 1079-1086, 1966) 
      leiv2.f, Fasano and Vio 
      leiv3.f, Ripley [R's own BD Ripley!]
All these programs are set up with sample driver routines and sample
data. They should perform exactly the same task. 

Least squares linear and nonlinear parameter estimation with errors in the
predictor variables and the dependent variable:
http://lib.stat.cmu.edu/apstat/286

Bill
