Coefficient of Determination for nonlinear function

6 messages · Uwe Wolfram, Bert Gunter, Liaw, Andy +1 more

#
Dear Subscribers,

I fitted an equation of the form 1 = f(x1,x2,x3) using a minimization
scheme. Now I want to compute the coefficient of determination. Normally
I would compute it as

r_square = 1 - sserr/sstot with sserr = sum_i (y_i - f_i)^2 and sstot =
sum_i (y_i - mean(y))^2

sserr is clear to me, but how can I compute sstot when the y_i do not
differ? They are all 1, so mean(y) = 1 and sstot is therefore 0.
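
To make the problem concrete, here is a minimal sketch of the usual
computation in plain Python. The function name r_squared and the sample
numbers are made up for illustration. It returns None when sstot is zero,
which is exactly the case where every y_i equals 1.

```python
def r_squared(y, f):
    """Return 1 - sserr/sstot, or None when sstot is zero."""
    mean_y = sum(y) / len(y)
    # Sum of squared residuals between observed and fitted values.
    ss_err = sum((yi - fi) ** 2 for yi, fi in zip(y, f))
    # Sum of squared deviations of the observations from their mean.
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    if ss_tot == 0.0:
        return None  # every y_i equals mean(y), so the ratio is undefined
    return 1.0 - ss_err / ss_tot


# Ordinary case: the responses vary, so sstot > 0 (made-up numbers).
print(r_squared([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))

# The case from the question: all responses are 1, so sstot = 0.
print(r_squared([1.0, 1.0, 1.0], [0.9, 1.05, 1.1]))  # None
```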

Thank you very much for your efforts,

Uwe
#
The coefficient of determination, R^2, is a measure of how well your
model fits versus a "NULL" model, which is that the data are constant.
In nonlinear models, as opposed to linear models, such a null model
rarely makes sense. Therefore the coefficient of determination is
generally not meaningful in nonlinear modeling.

Yet another way in which linear and nonlinear models fundamentally differ.

-- Bert
On Fri, Mar 4, 2011 at 5:40 AM, Uwe Wolfram <uwe.wolfram at uni-ulm.de> wrote:

  
    
#
As far as I can tell, Uwe is not even fitting a model, but instead just
solving a nonlinear equation, so I don't know why he wants an R^2.  I
don't see a statistical model here, so I don't know why one would want a
statistical measure.

Andy
#
Andy,

You may well be right. I assumed "fitting an equation" means that he
had data to which the equation was being fitted. Maybe that's wrong --
re-reading the post still does not clarify the point for me. In any
case, either way, computing R^2 makes no sense.

-- Bert
On Fri, Mar 4, 2011 at 8:44 AM, Liaw, Andy <andy_liaw at merck.com> wrote:

  
    
#
Dear Bert, dear Andy,

thanks for your answers! I am quite aware that I do not fit a linear
model, so r^2 in Pearson's sense is indeed meaningless. Instead, I am
"fitting" an equation - or rather using an optimisation - where the
experimentally derived point cloud (x1, x2, x3) should deliver something
like 1 = f(x1, x2, x3). What I am trying to estimate is the quality of
the fit. One thing I have computed so far is the standard error of the
equation (SEE), which is fine. My earlier question was about how I could
compute a coefficient of determination to estimate the goodness of fit.
Calling it r^2 may be misleading, but there must be something similar
for nonlinear regressions.
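
For reference, a rough sketch of how such an SEE could be computed in plain
Python. It assumes the common definition sqrt(sserr / (n - p)) with n points
and p fitted parameters, which may differ from the one used here. The model
form f and all numbers are hypothetical stand-ins, not the actual equation.

```python
import math


def standard_error_of_equation(residuals, n_params):
    """Assumed definition: sqrt(sum of squared residuals / (n - p))."""
    n = len(residuals)
    if n <= n_params:
        raise ValueError("need more points than fitted parameters")
    ss_err = sum(r * r for r in residuals)
    return math.sqrt(ss_err / (n - n_params))


def f(x1, x2, x3, a, b, c):
    # Hypothetical model form, used only for illustration.
    return a * x1 ** 2 + b * x2 ** 2 + c * x3 ** 2


# Made-up point cloud and made-up "fitted" parameters.
points = [
    (1.0, 0.1, 0.0),
    (0.0, 0.95, 0.2),
    (0.1, 0.0, 1.05),
    (0.7, 0.7, 0.1),
    (0.6, 0.5, 0.6),
]
params = (1.0, 1.0, 1.0)

# The target value is 1 for every point, so the residual is 1 - f(...).
residuals = [1.0 - f(*p, *params) for p in points]
see = standard_error_of_equation(residuals, n_params=len(params))
print(see)
```

Because the target is the constant 1, the SEE is already on the scale of the
response and can be read directly as a typical deviation from 1.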

Thanks for your efforts,

Uwe


On Friday, 04.03.2011 at 11:44 -0500, Liaw, Andy wrote:
#
----------------------------------------
The quality of the fit is determined by how much additional funding it allows
you to secure :) Obviously I'm being facetious, but there are two real issues
here. You may in fact be modeling revenue numbers, as another poster here
explicitly intended. Money or not, the quality is related to some underlying
system you are presumably attempting to understand. "Nonlinear" is a classification
by exclusion, so it is quite open-ended, and any generic goodness measure may not
be of much use to you. The other side of my first sentence is that it is always easy
to shop for a result you want for some purpose other than understanding your data.
You may not say so, but you will likely find many ways to measure your
results and then end up picking the one that agrees most with what you
want to believe.

The great thing about R is that ad hoc exploratory work is easy, and you
may find that simply plotting residuals and doing simple sensitivity tests by
perturbing the data is of some use. Or you may want a specific test: if, say,
your nonlinear equation is a fit to a spectrum of some kind, a test to determine
whether you have a bunch of Gaussian or Lorentzian lines, for example.
I think I can say with reasonable certainty, "it depends."
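
As an illustration of the perturbation idea, here is a rough sketch in plain
Python rather than R. Everything in it is assumed for the example: the
one-parameter model 1 = a * (x1^2 + x2^2 + x3^2), the made-up point cloud,
the noise level, and the helper names fit_scale and perturbation_spread.
It refits the model on noisy copies of the data and reports how much the
estimate moves.

```python
import random
import statistics


def fit_scale(points):
    """Least-squares a for the hypothetical model 1 = a * (x1^2 + x2^2 + x3^2).

    Minimizing sum_i (1 - a * s_i)^2 gives a = sum(s_i) / sum(s_i^2).
    """
    s = [x1 * x1 + x2 * x2 + x3 * x3 for x1, x2, x3 in points]
    return sum(s) / sum(si * si for si in s)


def perturbation_spread(points, noise_sd, n_trials, seed=0):
    """Refit on noisy copies of the data; return mean and sd of the estimates."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    estimates = []
    for _ in range(n_trials):
        noisy = [tuple(x + rng.gauss(0.0, noise_sd) for x in p) for p in points]
        estimates.append(fit_scale(noisy))
    return statistics.mean(estimates), statistics.stdev(estimates)


# Made-up point cloud lying roughly on the unit sphere.
points = [
    (1.0, 0.1, 0.0),
    (0.0, 0.95, 0.2),
    (0.1, 0.0, 1.05),
    (0.7, 0.7, 0.1),
    (0.6, 0.5, 0.6),
]

a_hat = fit_scale(points)
mean_a, sd_a = perturbation_spread(points, noise_sd=0.01, n_trials=200)
print(a_hat, mean_a, sd_a)
```

A spread that is large relative to the estimate itself would suggest the fit
is sensitive to measurement noise of that size. What counts as "large" again
depends on the underlying system.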