Bad optimization solution

13 messages · apjaworski@mmm.com, Paul Smith, Sundar Dorai-Raj +3 more

#
Dear All

I am trying to solve the optimization problem below, but I am getting
(0.5,0.5) as the optimal solution, which is wrong; the correct solution
should be (1,0) or (0,1).

Am I doing something wrong? I am using R 2.5.0 on Fedora Core 6 (Linux).

Thanks in advance,

Paul

------------------------------------------------------
myfunc <- function(x) {
  x1 <- x[1]
  x2 <- x[2]
  abs(x1-x2)
}

optim(c(0.5,0.5),myfunc,lower=c(0,0),upper=c(1,1),method="L-BFGS-B",control=list(fnscale=-1))
#
Paul,

I think the problem is the starting point.  I do not remember the details
of the BFGS method, but I am almost sure the (.5, .5) starting point is
suspect, since the abs function is not differentiable at 0.  If you perturb
the starting point even slightly you will have no problem.
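A quick way to check this (a hypothetical restart, not part of the original post) is to rerun optim from a point just off the ridge x1 == x2:

```r
myfunc <- function(x) {
  x1 <- x[1]
  x2 <- x[2]
  abs(x1 - x2)
}

# Starting exactly on the non-differentiable ridge x1 == x2: optim stalls there,
# because the central-difference gradient estimate is exactly zero.
res_bad <- optim(c(0.5, 0.5), myfunc, lower = c(0, 0), upper = c(1, 1),
                 method = "L-BFGS-B", control = list(fnscale = -1))

# A slight perturbation off the ridge lets the search reach the boundary.
res_good <- optim(c(0.49, 0.51), myfunc, lower = c(0, 0), upper = c(1, 1),
                  method = "L-BFGS-B", control = list(fnscale = -1))

res_bad$par    # stays at the starting point (0.5, 0.5)
res_good$value # 1, attained at a corner such as (0, 1)
```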

Andy

__________________________________
Andy Jaworski
518-1-01
Process Laboratory
3M Corporate Research Laboratory
-----
E-mail: apjaworski at mmm.com
Tel:  (651) 733-6092
Fax:  (651) 736-3122


                                                                           
On 05/07/2007 04:30 PM, "Paul Smith" <phhs80 at gmail.com> wrote to R-help <r-help at stat.math.ethz.ch>, Subject: [R] Bad optimization solution.
______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
#
On 5/7/07, apjaworski at mmm.com <apjaworski at mmm.com> wrote:
Yes, with (0.2,0.9), a correct solution comes out. However, how can
one be sure in general that the solution obtained by optim is correct?
The help page for optim says:

     Method '"L-BFGS-B"' is that of Byrd _et. al._ (1995) which allows
     _box constraints_, that is each variable can be given a lower
     and/or upper bound. The initial value must satisfy the
     constraints. This uses a limited-memory modification of the BFGS
     quasi-Newton method. If non-trivial bounds are supplied, this
     method will be selected, with a warning.

which only demands that "the initial value must satisfy the constraints".
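The help page makes no promise of global optimality, so one pragmatic (if informal) safeguard is to restart optim from several random starting points and keep the best run -- a sketch, not a guarantee:

```r
myfunc <- function(x) abs(x[1] - x[2])

set.seed(1)
starts <- matrix(runif(20), ncol = 2)  # 10 random starting points in [0,1]^2

# One L-BFGS-B run per starting point, same bounds and fnscale as before.
fits <- lapply(seq_len(nrow(starts)), function(i)
  optim(starts[i, ], myfunc, lower = c(0, 0), upper = c(1, 1),
        method = "L-BFGS-B", control = list(fnscale = -1)))

# Keep the run with the largest objective value.
best <- fits[[which.max(sapply(fits, `[[`, "value"))]]
best$value  # 1, attained at a corner such as (0, 1) or (1, 0)
```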

Paul
#
On 5/7/07, Paul Smith <phhs80 at gmail.com> wrote:
Furthermore, (x1-x2)^2 is everywhere differentiable, and yet the
reported problem still occurs with

myfunc <- function(x) {
  x1 <- x[1]
  x2 <- x[2]
  (x1-x2)^2
}

optim(c(0.2,0.2),myfunc,lower=c(0,0),upper=c(1,1),method="L-BFGS-B",control=list(fnscale=-1))

Paul
#
Paul Smith said the following on 5/7/2007 3:25 PM:
Then perhaps supply the gradient:

mygrad <- function(x) {
   x1 <- x[1]
   x2 <- x[2]
   c(2, -2) * c(x1, x2)
}

optim(c(0.2,0.2),myfunc,mygrad,lower=c(0,0),upper=c(1,1),
       method="L-BFGS-B",control=list(fnscale=-1))

HTH,

--sundar
#
Your function, (x1-x2)^2, has zero gradient at all the starting values such
that x1 = x2, which means that the gradient-based search methods will
terminate there because they have found a critical point, i.e. a point at
which the gradient is zero (which can be a maximum or a minimum or a saddle
point).  
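Ravi's point is easy to verify numerically; the gradient function below is the analytic gradient of (x1-x2)^2, written out here for illustration:

```r
f <- function(x) (x[1] - x[2])^2
g <- function(x) c(2 * (x[1] - x[2]), -2 * (x[1] - x[2]))  # analytic gradient

g(c(0.2, 0.2))  # c(0, 0): every point with x1 == x2 is a critical point
g(c(0.2, 0.9))  # nonzero, so the search can make progress from here
```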

However, I do not know why optim converges to the boundary maximum when the
analytic gradient is supplied (as shown by Sundar).

Ravi.

-----------------------------------------------------------------------------------

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: rvaradhan at jhmi.edu

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 

-----------------------------------------------------------------------------------


#
G'day Paul,

On Mon, 7 May 2007 22:30:32 +0100
"Paul Smith" <phhs80 at gmail.com> wrote:

            
> (0.5,0.5) as optimal solution, which is wrong;

Why?

As far as I can tell you are trying to minimize |x1-x2| where x1 and x2
are between 0 and 1.  The minimal value that the absolute function can
take is zero and any point (x1,x2)=(x,1-x) where x is between 0 and 1
will achieve this value and also respect the constraints that you have
imposed.  Hence, any such point, including (0.5,0.5) is a solution to
your problem.
> the correct solution should be (1,0) or (0,1).

Why? Unless there are some additional constraints that you have not
told optim() (and us) about, these are just two possible solutions from an
infinite set of solutions. As I said, any point of the form (x, 1-x)
with x between 0 and 1 is a solution to your problem, unless I am
missing something....

Cheers,

	Berwin
#
G'day Paul,

On Mon, 7 May 2007 23:25:52 +0100
"Paul Smith" <phhs80 at gmail.com> wrote:
[...]
Same argument as with abs(x1-x2) holds.  (x1-x2)^2 is non-negative for
all x1, x2.  All points of the form (x, 1-x) where x is between 0 and 1
satisfy the constraints and achieve a function value of 0.  Hence, all
such points are solutions.

There is no problem.  Except if there are further constraints that
reduce the set of possible solutions which you have not told us about.

Cheers,

	Berwin
#
G'day all,

On Tue, 8 May 2007 12:10:25 +0800
Berwin A Turlach <berwin at maths.uwa.edu.au> wrote:

            
It was pointed out to me that you are actually trying to maximise the
function, sorry, didn't read the command carefully enough.

And I also noticed that my comments were anyhow wrong, for
minimisation the points (x,x) and not (x, 1-x) would be the solutions;
must have also misread the function as |x1+x2|...

Looks as if I need more coffee to wake up and get my brain going...

Please ignore my last two mails on this issue; my apologies to the
list....

Cheers,

	Berwin
#
It seems that there is a reliability problem here, as one never
knows whether the solution provided by R is correct. In the
case that I reported, it is fairly simple to see that the solution
provided by R (without any warning!) is incorrect, but, in general,
that is not so simple, and one may take a wrong solution for a correct
one.
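One modest check (not a guarantee of global correctness) is to inspect the convergence code and message that optim returns; note that in this very example it reports success by its own local criterion, which is exactly the complaint:

```r
myfunc <- function(x) abs(x[1] - x[2])
res <- optim(c(0.5, 0.5), myfunc, lower = c(0, 0), upper = c(1, 1),
             method = "L-BFGS-B", control = list(fnscale = -1))

res$convergence  # 0, i.e. "converged" -- but only in optim's local sense
res$message      # the L-BFGS-B stopping message
```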

Paul
#
The issue is that you are using a derivative-based optimizer on a
problem for which it is well known that such optimizers will not
perform well. You should consider using a global optimizer. For
example, "rgenoud" combines a genetic search algorithm with a BFGS
optimizer, and it works well for your problem:

library(rgenoud)

myfunc <- function(x) {
  x1 <- x[1]
  x2 <- x[2]
  abs(x1-x2)
}

optim(c(0.5,0.5),myfunc,lower=c(0,0),upper=c(1,1),method="L-BFGS-B",control=list(fnscale=-1))

genoud(myfunc, nvars=2, Domains=rbind(c(0,1),c(0,1)),max=TRUE,boundary.enforcement=2)

myfunc <- function(x) {
  x1 <- x[1]
  x2 <- x[2]
  (x1-x2)^2
}

optim(c(0.2,0.2),myfunc,lower=c(0,0),upper=c(1,1),method="L-BFGS-B",control=list(fnscale=-1))
genoud(myfunc, nvars=2, Domains=rbind(c(0,1),c(0,1)),max=TRUE,boundary.enforcement=2)

Cheers,
Jas.

=======================================
Jasjeet S. Sekhon                     
                                      
Associate Professor             
Travers Department of Political Science
Survey Research Center          
UC Berkeley                     

http://sekhon.berkeley.edu/
V: 510-642-9974  F: 617-507-5524
=======================================






#
Paul,

The problem lies neither with R nor with numerical methods. The onus is always on the user to understand what a numerical scheme can and cannot do; one should never blindly take the results given by a numerical scheme and run with it. In your example, the optimization method is doing what it was designed to do: find a critical point of a function, i.e. a point where the gradient is zero. It is your responsibility to ensure that the result makes sense and, if it doesn't, to understand why it doesn't make sense.

In your problem, the maxima, (1,0) and (0,1), lie on the boundary of the parameter space, and the gradient at those maxima (defined as the limit from within the domain) is clearly not zero. Another problem with your example is that the Hessian of your function is singular: it has eigenvalues of 0 and 4. In short, there is no substitute for using your analytic powers!
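Both observations about (x1-x2)^2 can be checked directly; the gradient and Hessian below are written out analytically for illustration:

```r
grad <- function(x) c(2 * (x[1] - x[2]), -2 * (x[1] - x[2]))
grad(c(1, 0))  # c(2, -2): nonzero at the boundary maximum (1, 0)

H <- matrix(c(2, -2, -2, 2), nrow = 2)  # constant Hessian of (x1-x2)^2
eigen(H)$values  # 4 and 0: singular, so critical points are degenerate
```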

Ravi.

----- Original Message -----
From: Paul Smith <phhs80 at gmail.com>
Date: Tuesday, May 8, 2007 4:33 am
Subject: Re: [R] Bad optimization solution
To: R-help <r-help at stat.math.ethz.ch>
#
Thanks, Ravi, for your clear explanation!

Paul