Optimization in R similar to MS Excel Solver
I do not know what algorithms the Excel solver function uses. See inline for how to do what you want in R. Forgive me if I have misinterpreted your request.
On 19-10-2012, at 16:25, Richard James wrote:
Dear Colleagues,
I am attempting to develop an optimization routine for a river suspended
sediment mixing model. I have Calcium and Magnesium concentrations (%) for
sediments from 4 potential source areas (Topsoil, Channel Banks, Roads,
Drains) and I want to work out, based on the suspended sediment calcium and
magnesium concentrations, what are the optimal contributions from each
source area to river suspended sediments. The dataset is as follows:
Topsoil Channel Bank
Roads Drains
Ca(%) 0.03 0.6
0.2 0.35
Mg(%) 0.0073 0.0102
0.0141 0.012
Contribution 0.25 0.25
0.25 0.25
and
Ca in river(%) 0.33
Mg in river (%) 0.0114
Read data (it would have been a lot more helpful if you had provided the result of dput for the data and the target).
basedat <- read.table(text="Topsoil Channel-Bank Roads Drains
Ca(%) 0.03 0.6 0.2 0.35
Mg(%) 0.0073 0.0102 0.0141 0.012
Contribution 0.25 0.25 0.25 0.25", header=TRUE)
basedat
target <- c("Ca in river(%)"=0.33,"Mg in river (%)"=0.0114)
# convert to matrix and vector for later use
bmat <- as.matrix(basedat[1:2,])
pstart <- as.numeric(basedat["Contribution",1:3])
I want to optimize the contribution values (currently set at 0.25) by minimizing the sum of squares of the relative errors of the mixing model, calculated as follows: SSRE =( (x1-((a1*c1)+(a2*c2)+(a3*c3)+(a4*c4))/x1)^2) + (x2-((b1*c1)+(b2*c2)+(b3*c3)+(b4*c4))/x2)^2)
I do not understand why you are dividing by x1 and x2. It make no sense to me to calculate (xi- (xi_calculated)/xi)^2 Given what your stated purpose then (xi- xi_calculated)^2 or something like (1-x1/xi_calculated)^2 seem more appropriate.
Where: x1 = calcium in river; x2 = magnesium in river; a1:a4 = Ca in topsoil, channel banks, roads, drains b1:b4 = Mg in topsoil, channel banks, roads, drains c1:4 = Contribution to be optimized I can generate a solution very quickly using the MS Excel Solver function, however later I want to be able to run a Monte-Carlo simulation on to generate confidence intervals based on variance in the source area concentrations ? hence my desire to use R to develop the mixing model. So far I have been using the ?optim? function, however I?m confused as to what form the ?par? and ?fn? arguments should take from my data above. I am also unsure of how to write the model constraints ? i.e. total contribution should sum to 1, and all values must be non-negative.
The 'par' should be a vector of starting values of the parameters for your objective function.
The 'fn' is the function that calculates a scalar given the parameter values.
Your 'par' is a vector with all elements between 0 and 1 and with a sum == 1.
That can't be done with optim but you can simulate the requirements by letting optim work with a three element vector and defining the fourth value as
1-sum(first three params). The requirement that all params must lie between 0 and 1 can be met by making 'fn' return a large value when the requirement is not met.
Some code:
SSRE <- function(parx) {
par<- c(parx,1-sum(parx))
if(all(par > 0 & par < 1)) { # parameters meet requirements
sum((target - (bmat %*% par))^2) # this is a linear algebra version of your objective without the division by xi
# or if you want to divide by target (see above) see below in the benchmark section for comment
# sum(((target - (bmat %*% par))/target)^2)
} else 1e7 # penalty for parameters not satisfying constraints
}
SSRE(pstart)
z <- optim(pstart,SSRE)
z
c(z$par, 1-sum(z$par)) # final contributions
Results:
# > z <- optim(pstart,SSRE)
# > z
# $par
# [1] 0.1492113 0.2880078 0.2950191
#
# $value
# [1] 2.157446e-12
#
# $counts
# function gradient
# 108 NA
#
# $convergence
# [1] 0
#
# $message
# NULL
You can also try this:
z <- optim(pstart,SSRE,method="BFGS")
z
c(z$par, 1-sum(z$par))
z <- nlminb(pstart,SSRE)
z
c(z$par, 1-sum(z$par))
If you intend to do run these optimizations many times I wouldn't use optim without specifying the method argument.
See this benchmark.
library(rbenchmark) benchmark(optim(pstart,SSRE),optim(pstart,SSRE,method="BFGS"), nlminb(pstart,SSRE),
+ replications=1000, columns=c("test","replications","elapsed","relative"))
test replications elapsed relative
3 nlminb(pstart, SSRE) 1000 0.829 1.264
1 optim(pstart, SSRE) 1000 1.647 2.511
2 optim(pstart, SSRE, method = "BFGS") 1000 0.656 1.000
If you use sum(((target - (bmat %*% par))/target)^2) as objective you'll see different timings. nlminb turns out to be the fastest.
good luck,
Berend
Any help, advise, or suggestions to this problem would be very much appreciated. -- View this message in context: http://r.789695.n4.nabble.com/Optimization-in-R-similar-to-MS-Excel-Solver-tp4646759.html Sent from the R help mailing list archive at Nabble.com.
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.