Hello!
I am trying to create an R optimization routine for a task that's
currently being done using Excel (lots of tables, formulas, and
Solver).
However, otpim seems to be finding a local minimum.
Example data, functions, and comparison with the solution found in
Excel are below.
I am not experienced in optimizations so thanks a lot for your advice!
Dimitri
### 2 Inputs:
IV<-data.frame(IV=c(0,6672895.687,13345791.37,20018687.06,26691582.75,33364478.44,40037374.12,46710269.81,53383165.5,60056061.18,66728956.87,73401852.56,80074748.24,86747643.93,93420539.62,100093435.3,106766331,113439226.7,120112122.4,126785018.1,133457913.7,140130809.4,146803705.1,153476600.8,160149496.5,166822392.2,173495287.9,180168183.5,186841079.2,193513974.9,200186870.6))
DV<-data.frame(DV=c(0,439.8839775,829.7360945,1176.968757,1487.732038,1767.147276,2019.49499,2248.366401,2456.78592,2647.310413,2822.109854,2983.033036,3131.661246,3269.352233,3397.276321,3516.446162,3627.741311,3731.928591,3829.679009,3921.581866,4008.156537,4089.862363,4167.106955,4240.253215,4309.625263,4375.513474,4438.178766,4497.856259,4554.75841,4609.077705,4660.988983))
## Function "transformIV" transforms a data frame column "IV" using
parameters .alpha & .beta:
## It returns a data frame column IV_transf:
transformIV = function(.alpha,.beta) {
IV_transf <- as.data.frame(1 - (1/exp((IV/.beta)^.alpha)))
return(IV_transf)
}
### Function "mysum" calculates the sum of absolute residuals after a
regression with a single predictor:
mysum<- function(myIV,myDV){
regr<-lm(myDV[[1]] ~ 0 + myIV[[1]])
mysum<-sum(abs(regr$resid))
return(mysum)
}
### Function to be optimized;
### param is a vector of 2 values (.alpha and .beta)
myfunc <- function(param){
myalpha<-param[1]
mybeta<-param[2]
IVtransf<-transformIV(myalpha, mybeta)
sumofdevs<-mysum(myIV=IVtransf,myDV=DV)
return(sumofdevs)
}
# Optimizing using optim:
myopt <- optim(fn=myfunc, par=c(0.1,max(IV)), method="L-BFGS-B", lower=0)
(myopt)
myfunc(myopt$par)
## Comparing this solution to Excel Solver solution:
myfunc(c(0.888452533990788,94812732.0897449))
Just to add:
I also experimented with the starting parameters (par) under optim,
especially with the second one. I tried 1, 10, 100, 1000, etc.
When I tried 100,000,000 then I got a somewhat better solution (but
still not as good as in Excel). However, under message it said:
"ERROR: ABNORMAL_TERMINATION_IN_LNSRCH"
Dimitri
On Thu, Nov 10, 2011 at 1:50 PM, Dimitri Liakhovitski
<dimitri.liakhovitski at gmail.com> wrote:
Hello!
I am trying to create an R optimization routine for a task that's
currently being done using Excel (lots of tables, formulas, and
Solver).
However, otpim seems to be finding a local minimum.
Example data, functions, and comparison with the solution found in
Excel are below.
I am not experienced in optimizations so thanks a lot for your advice!
Dimitri
### 2 Inputs:
IV<-data.frame(IV=c(0,6672895.687,13345791.37,20018687.06,26691582.75,33364478.44,40037374.12,46710269.81,53383165.5,60056061.18,66728956.87,73401852.56,80074748.24,86747643.93,93420539.62,100093435.3,106766331,113439226.7,120112122.4,126785018.1,133457913.7,140130809.4,146803705.1,153476600.8,160149496.5,166822392.2,173495287.9,180168183.5,186841079.2,193513974.9,200186870.6))
DV<-data.frame(DV=c(0,439.8839775,829.7360945,1176.968757,1487.732038,1767.147276,2019.49499,2248.366401,2456.78592,2647.310413,2822.109854,2983.033036,3131.661246,3269.352233,3397.276321,3516.446162,3627.741311,3731.928591,3829.679009,3921.581866,4008.156537,4089.862363,4167.106955,4240.253215,4309.625263,4375.513474,4438.178766,4497.856259,4554.75841,4609.077705,4660.988983))
## Function "transformIV" transforms a data frame column "IV" using
parameters .alpha & .beta:
## It returns a data frame column IV_transf:
transformIV = function(.alpha,.beta) {
?IV_transf <- as.data.frame(1 - (1/exp((IV/.beta)^.alpha)))
?return(IV_transf)
}
### Function "mysum" calculates the sum of absolute residuals after a
regression with a single predictor:
mysum<- function(myIV,myDV){
?regr<-lm(myDV[[1]] ~ 0 + myIV[[1]])
?mysum<-sum(abs(regr$resid))
?return(mysum)
}
### Function to be optimized;
### param is a vector of 2 values (.alpha and .beta)
myfunc <- function(param){
?myalpha<-param[1]
?mybeta<-param[2]
?IVtransf<-transformIV(myalpha, mybeta)
?sumofdevs<-mysum(myIV=IVtransf,myDV=DV)
?return(sumofdevs)
}
# Optimizing using optim:
myopt <- optim(fn=myfunc, par=c(0.1,max(IV)), method="L-BFGS-B", lower=0)
(myopt)
myfunc(myopt$par)
## Comparing this solution to Excel Solver solution:
myfunc(c(0.888452533990788,94812732.0897449))
--
Dimitri Liakhovitski
marketfusionanalytics.com
Bert,
that's exactly where I started. I found optim in the first paragraph
under "General Purpose Continuous Solvers" and used bounded BFGS for a
constrained optimization for a situation with more than 1 parameters.
Again, not being an engineer / mathematician - would greatly
appreciate any pointers.
Thank you, everyone!
Dimitri
On Thu, Nov 10, 2011 at 2:39 PM, Bert Gunter <gunter.berton at gene.com> wrote:
Refer to the CRAN "Optimization" task view, please. That is a much more
appropriate place to begin than posting a query here.
All numerical optimizers only produce local optima.
-- Bert
On Thu, Nov 10, 2011 at 11:24 AM, Dimitri Liakhovitski
<dimitri.liakhovitski at gmail.com> wrote:
Just to add:
I also experimented with the starting parameters (par) under optim,
especially with the second one. I tried 1, 10, 100, 1000, etc.
When I tried 100,000,000 then I got a somewhat better solution (but
still not as good as in Excel). However, under message it said:
"ERROR: ABNORMAL_TERMINATION_IN_LNSRCH"
Dimitri
On Thu, Nov 10, 2011 at 1:50 PM, Dimitri Liakhovitski
<dimitri.liakhovitski at gmail.com> wrote:
Hello!
I am trying to create an R optimization routine for a task that's
currently being done using Excel (lots of tables, formulas, and
Solver).
However, otpim seems to be finding a local minimum.
Example data, functions, and comparison with the solution found in
Excel are below.
I am not experienced in optimizations so thanks a lot for your advice!
Dimitri
### 2 Inputs:
IV<-data.frame(IV=c(0,6672895.687,13345791.37,20018687.06,26691582.75,33364478.44,40037374.12,46710269.81,53383165.5,60056061.18,66728956.87,73401852.56,80074748.24,86747643.93,93420539.62,100093435.3,106766331,113439226.7,120112122.4,126785018.1,133457913.7,140130809.4,146803705.1,153476600.8,160149496.5,166822392.2,173495287.9,180168183.5,186841079.2,193513974.9,200186870.6))
DV<-data.frame(DV=c(0,439.8839775,829.7360945,1176.968757,1487.732038,1767.147276,2019.49499,2248.366401,2456.78592,2647.310413,2822.109854,2983.033036,3131.661246,3269.352233,3397.276321,3516.446162,3627.741311,3731.928591,3829.679009,3921.581866,4008.156537,4089.862363,4167.106955,4240.253215,4309.625263,4375.513474,4438.178766,4497.856259,4554.75841,4609.077705,4660.988983))
## Function "transformIV" transforms a data frame column "IV" using
parameters .alpha & .beta:
## It returns a data frame column IV_transf:
transformIV = function(.alpha,.beta) {
?IV_transf <- as.data.frame(1 - (1/exp((IV/.beta)^.alpha)))
?return(IV_transf)
}
### Function "mysum" calculates the sum of absolute residuals after a
regression with a single predictor:
mysum<- function(myIV,myDV){
?regr<-lm(myDV[[1]] ~ 0 + myIV[[1]])
?mysum<-sum(abs(regr$resid))
?return(mysum)
}
### Function to be optimized;
### param is a vector of 2 values (.alpha and .beta)
myfunc <- function(param){
?myalpha<-param[1]
?mybeta<-param[2]
?IVtransf<-transformIV(myalpha, mybeta)
?sumofdevs<-mysum(myIV=IVtransf,myDV=DV)
?return(sumofdevs)
}
# Optimizing using optim:
myopt <- optim(fn=myfunc, par=c(0.1,max(IV)), method="L-BFGS-B",
lower=0)
(myopt)
myfunc(myopt$par)
## Comparing this solution to Excel Solver solution:
myfunc(c(0.888452533990788,94812732.0897449))
--
Dimitri Liakhovitski
marketfusionanalytics.com
Bert,
that's exactly where I started. I found optim in the first paragraph
under "General Purpose Continuous Solvers" and used bounded BFGS for a
constrained optimization for a situation with more than 1 parameters.
Again, not being an engineer / mathematician - would greatly
appreciate any pointers.
I'm no expert on numerical optimisation (though I've done quite
a bit of it in my own ham-fisted way :-) ). Anyway, it seems to me
that the only strategy that anyone can use in seeking a global
optimum when there are multiple local optima is to use a wide
variety of starting values. Possibly placed on a (relatively coarse)
grid.
It can be a tough problem. Good luck.
cheers,
Rolf Turner
Bert,
that's exactly where I started. I found optim in the first paragraph
under "General Purpose Continuous Solvers" and used bounded BFGS for a
constrained optimization for a situation with more than 1 parameters.
Again, not being an engineer / mathematician - would greatly
appreciate any pointers.
I'm no expert on numerical optimisation (though I've done quite
a bit of it in my own ham-fisted way ). Anyway, it seems to me
that the only strategy that anyone can use in seeking a global
optimum when there are multiple local optima is to use a wide
variety of starting values. Possibly placed on a (relatively coarse)
grid.
Simulated annealing and other stochastic global optimization
methods are also possible solutions, although they may or may not
work better than the many-starting-points solution -- it depends
on the problem, and pretty much everything has to be tuned. Tabu
search <http://en.wikipedia.org/wiki/Tabu_search> is another possibility,
although I don't know much about it ...
Simulated annealing and other stochastic global optimization
methods are also possible solutions, although they may or may not
work better than the many-starting-points solution -- it depends
on the problem, and pretty much everything has to be tuned. Tabu
search <http://en.wikipedia.org/wiki/Tabu_search> is another possibility,
although I don't know much about it ...
It is known that the Excel Solver has much improved during recent years.
Still there are slightly better points such as
myfunc(c(0.889764228112319, 94701144.5712312)) # 334.18844
restricting the domain to [0, 1] x [0, 10^9] for an evolutionary approach,
for instance DEoptim::DEoptim().
Finding a global optimum in 2 dimensions is not so difficult. Here the scale
of the second variable could pose a problem as small local minima might be
overlooked easily.
Hans Werner
Hans W Borchers <hwborchers <at> googlemail.com> writes:
Ben Bolker <bbolker <at> gmail.com> writes:
Simulated annealing and other stochastic global optimization
methods are also possible solutions, although they may or may not
work better than the many-starting-points solution -- it depends
on the problem, and pretty much everything has to be tuned. Tabu
search <http://en.wikipedia.org/wiki/Tabu_search> is another possibility,
although I don't know much about it ...
It is known that the Excel Solver has much improved during recent years.
Still there are slightly better points such as
myfunc(c(0.889764228112319, 94701144.5712312)) # 334.18844
restricting the domain to [0, 1] x [0, 10^9] for an evolutionary approach,
for instance DEoptim::DEoptim().
Finding a global optimum in 2 dimensions is not so difficult. Here the scale
of the second variable could pose a problem as small local minima might be
overlooked easily.
Have taken a (quick) second look at the problem, I agree that scaling
and centering are more likely to be useful solutions than stochastic
global optimization stuff. Even using 'parscale' (see the optim
help page) may clear up the problem.
Ben Bolker