Skip to content

Constrained non linear regression using ML

11 messages · Ravi Varadhan, Gabor Grothendieck, Corrado Topi +1 more

#
Dear R users,

I have to fit the non linear regression:

y~1-exp(-(k0+k1*p1+k2*p2+ .... +kn*pn))

where ki>=0 for each i in [1 .... n] and pi are on R+.

I am using, at the moment, nls, but I would rather use a Maximum 
Likelhood based algorithm. The error is not necessarily normally 
distributed.

y is approximately beta distributed, and the volume of data is medium to 
large (the y,pi may have ~ 40,000 elements).

I have studied the packages in the task views Optimisation and Robust 
Statistical Methods, but I did look like what I was looking for was 
there. Maybe I am wrong.

The nearest thing was nlrob, but even that does not allow for 
constraints, as far as I can understand.

Any suggestion?

Regards
#
I have an algorithm that can perform nonlinear optimization, with
linear/nonlinear, equality and/or inequality constraints.  From your
description, it seems like this algorithm would work for your problem.
Contact me if you are interested and I will send you the code.

Ravi.

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Corrado
Sent: Tuesday, March 16, 2010 2:59 PM
To: r-help at r-project.org
Subject: [R] Constrained non linear regression using ML

Dear R users,

I have to fit the non linear regression:

y~1-exp(-(k0+k1*p1+k2*p2+ .... +kn*pn))

where ki>=0 for each i in [1 .... n] and pi are on R+.

I am using, at the moment, nls, but I would rather use a Maximum 
Likelhood based algorithm. The error is not necessarily normally 
distributed.

y is approximately beta distributed, and the volume of data is medium to 
large (the y,pi may have ~ 40,000 elements).

I have studied the packages in the task views Optimisation and Robust 
Statistical Methods, but I did look like what I was looking for was 
there. Maybe I am wrong.

The nearest thing was nlrob, but even that does not allow for 
constraints, as far as I can understand.

Any suggestion?

Regards
#
Check out the betareg package.
On Tue, Mar 16, 2010 at 2:58 PM, Corrado <ct529 at york.ac.uk> wrote:
#
Dear Gabor, dear R users,

I had already read the betareg documentation. As far as I can understand 
from the help, it does not allow for constrained regression.

Regards
Gabor Grothendieck wrote:

  
    
#
Try it anyways -- maybe none of your constraints are active.
On Wed, Mar 17, 2010 at 6:01 AM, Corrado <ct529 at york.ac.uk> wrote:
#
Dear Gabor,

1) The constraints are active, at least from a formal point view.
3) I have tried several times to run betareg.fit on the data, and the 
only thing I can obtain is the very strange error:

Error in dimnames(x) <- dn :  length of 'dimnames' [2] not equal to 
array extent

The error is strange because, because the function dimnames is not 
called anywhere.  

Regards
Gabor Grothendieck wrote:

  
    
#
Contact the maintainer regarding problems with the package.  Not sure
if this is acceptable but if you get it to run you could consider just
dropping the variables from your model that correspond to active
constraints.

Also try the maxLik package.  You will have to define the likelihood
yourself but it does support constraints.
On Wed, Mar 17, 2010 at 9:07 AM, Corrado <ct529 at york.ac.uk> wrote:
#
On 17 March 2010 14:22, Gabor Grothendieck <ggrothendieck at gmail.com> wrote:
Yes. And specifying the likelihood function is probably (depending on
your distributional assumptions) not too complicated.

BTW: Even if your y follows a beta distribution, it does not mean that
your error term also follows a beta distribution. And it the
distribution of the error term which is crucial for specifying the
likelihood function.

/Arne

  
    
#
Dear Arne, Gabor,

I solved the problem with betareg (downloaded the package). I run it on 
my data, and unfortunately the  constraint is definitively active, if I 
remove the active variables, I then remove the most significant variables!

Of course the error is important, not the distribution of the variable.

In this case, one of the assumptions is that the error may be 
distributed ~ beta. I think that betareg makes this assumption, am I right?

I am finding it difficult to solve two problems:

1) write the maximum likelihood function (what do you suggest?)
2) deal with the fact that a few factors actually have values of y (the 
response) at the extremes: that is 0 and 1. But that mean that the link 
function returns Infinite values in that case ....
3) the error is dependent on E(y).

PS: Additional silly question: what is the discrete equivalent of beta? 
binomial?
Arne Henningsen wrote:
#
For specific questions on the betareg package contact the maintainer.
If the likelihood based approaches are giving too much difficulty try
moving to a Bayesian framework (WinBUGS/R2WinBUGS, JAGS/r2jags, etc.)
On Wed, Mar 17, 2010 at 10:03 AM, Corrado <ct529 at york.ac.uk> wrote:
1 day later
#
Dear Gabor, Arne, Ravi, R users,

I am firstly trying the maximum likelihood approach, then will try the 
Bayesian approach.

The likelihood function, and the log likelihood function, will depend on 
the pdf of the error e in the formula:

y=f(theta*x)+e

Now let's say that e is  Gaussian distributed, then I can use LS which 
is the same as ML in this case, and the residuals would be distributed 
Gaussian. Is that right?

If e is distributed differently (for example: beta, in the continuous 
case,  or binomial, in the discrete case), then I am better off by using 
Maximum Likelihood. How would the residual be distributed? Should they 
not be distributed the same as e?

Best,
Gabor Grothendieck wrote: