Optimizing
5 messages · Sam Asin, Bert Gunter, R. Michael Weylandt +1 more
Sam:
1. Homework? R-help has a no-homework policy.
2. But in any case, check out the Optimization task view on CRAN. You should be able to find something there that meets your needs.
Of course, if something is "a little over your head," that's not an excuse, but rather an admission that you have to do some "homework" on your own.
Cheers,
Bert
On Wed, Nov 14, 2012 at 5:23 PM, Sam Asin <asin.sam at gmail.com> wrote:
Hello,
I am fairly new with R and am having trouble finding an optimal group. I
checked the help functions for the various optimize commands and it was a
little over my head.
I have a dataset with 4 columns: name, type, value, and wage. The set
consists of a list of people, each of one of 3 types. I want to choose 6
people, two of each type, and maximize the sum of their values. However,
I'm subject to the constraint that the wages of the six people have to sum
to less than 20 dollars. Here is some sample data.
people <- c("A", "B", "C", "D", "E", "F", "G", "H", "I")
type <- c(1, 1, 1, 1, 2, 2, 3, 3, 3)
value <- c(25.20, 24, 38, 20, 14, 20, 31, 11, 8)
wage <- c(4, 3.8, 5.1, 3.5, 2.4, 3, 6, 2.4, 2)
data <- data.frame(people, type, value, wage)
With this small dataset the question isn't very interesting, but the answer
would be something like person C, D, E, F, G, and I (I didn't check to see
that those prices sum to less than $20).
How can I write a program that will do this? Can I just use the optimize
command? Do I have to transform my dataset into something that is easier
to use the optimize command on? Or should I write my own code that does
the process?
Thanks,
Sam
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
1 day later
On Thu, Nov 15, 2012 at 7:49 PM, Sam Asin <asin.sam at gmail.com> wrote:
Hey, It's actually not homework, what gave you that impression?
Real data sets don't usually have people named A,B,C with wages 3,4,5. ;-)
To your question at hand, it's close to a classic problem in combinatorial optimization known as the knapsack problem, but there are some small differences. That's a difficult (in a technical sense) problem but well-studied so there are lots of good "almost solutions." I'd look into that and see if you can transform it to fit that framework, for which there is almost surely a CRAN-tested implementation available.
I graduated in May and studied math, economics, and international relations, so I don't have much of a programming background. This is a project that I'm working on out of personal interest. Obviously, I've tried doing some homework, but after 45 minutes of digging around without even really having any leads I figured I would post here.
The optimization problems I see more generally seem tailored towards maximizing a function subject to some constraint on that function. For example, maximizing U(x,y)=2x^2+4y s.t. x+y=3. I don't really see a way to frame my current maximization like that at all. We aren't choosing one observation in our dataset, we are choosing a group. And the constraints aren't something like the types sum to a number, but rather that our choice of observations meets some specific condition.
Sorry if I'm being clueless here, but when I look at http://cran.r-project.org/web/views/Optimization.html I just see a giant list of packages, most of which I believe are optimizing a function subject to constraints. Maybe my problem actually is a common one, there is some package out there that would do it, and I just need to find it. If so, my googling and looking on the help pages has been so far unsuccessful. I apologize if the answer really is sitting there on some help page, but I really haven't found it.
If this were some homework problem, then I would know that it's probably a common, normal thing, and feel more confident looking on the CRAN page. But, since it isn't, I wasn't even sure if the answer was out there and my searching had so far been fruitless. That's why I wanted to know if there would be some function that would do this sort of thing or if it's the sort of thing that I just need to manually work through.
The help pages are over my head insofar as the help pages are highly technical and focused on methods.
They assume a knowledge about the basic type of optimization problem that I don't have, and I think that's because there is a standard type of optimization that is most common.
Anyway, I am thinking I might just try to manually do the calculation because I can't follow the optimize commands. The general strategy I think is to somehow make a dataset that has all combinations of 6 people as rows. Then, for each row, make a vector that is just all the types of the people in that row, a vector for the sum of the values, and a vector for the sum of the wages. Then, keep only those rows which have the correct types and wages under the max. Then, sort by the sum of the values.
Let me know if you have any better ideas!
Sam
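The brute-force strategy Sam describes above (enumerate every group of six, keep the groups with two people of each type and a wage total within budget, then rank by total value) can be sketched in base R with combn(). This is only an illustrative sketch on the sample data from the original post: the names combos, ok, feasible, totals, and best are made up here, and the budget is treated as "at most 20". With 9 people there are only 84 groups, so enumeration is cheap, but the count grows very quickly with the size of the dataset.

```r
# Sample data from the original post
people <- c("A", "B", "C", "D", "E", "F", "G", "H", "I")
type   <- c(1, 1, 1, 1, 2, 2, 3, 3, 3)
value  <- c(25.20, 24, 38, 20, 14, 20, 31, 11, 8)
wage   <- c(4, 3.8, 5.1, 3.5, 2.4, 3, 6, 2.4, 2)

# Every way to pick 6 of the 9 people: one column of row indices per group
combos <- combn(length(people), 6)

# A group is kept if it has exactly two people of each type
# and its total wage is within the budget (assumed here: <= 20)
ok <- apply(combos, 2, function(idx) {
  all(tabulate(type[idx], nbins = 3) == 2) && sum(wage[idx]) <= 20
})
feasible <- combos[, ok, drop = FALSE]

# Total value of each feasible group, then take the largest
totals <- colSums(matrix(value[feasible], nrow = 6))
best   <- feasible[, which.max(totals)]

people[best]      # members of the best feasible group
sum(value[best])  # its total value
sum(wage[best])   # its total wage
```

If no group were feasible, feasible would have zero columns and which.max() would return nothing, so a real script would want to check ncol(feasible) first.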
On Wed, Nov 14, 2012 at 8:23 PM, Sam Asin <asin.sam at gmail.com> wrote:
Hello,
I am fairly new with R and am having trouble finding an optimal group. I
checked the help functions for the various optimize commands and it was a
little over my head.
I have a dataset with 4 columns: name, type, value, and wage. The set
consists of a list of people, each of one of 3 types. I want to choose 6
people, two of each type, and maximize the sum of their values. However,
I'm subject to the constraint that the wages of the six people have to sum
to less than 20 dollars. Here is some sample data.
people <- c("A", "B", "C", "D", "E", "F", "G", "H", "I")
type <- c(1, 1, 1, 1, 2, 2, 3, 3, 3)
value <- c(25.20, 24, 38, 20, 14, 20, 31, 11, 8)
wage <- c(4, 3.8, 5.1, 3.5, 2.4, 3, 6, 2.4, 2)
data <- data.frame(people, type, value, wage)
With this small dataset the question isn't very interesting, but the answer
would be something like person C, D, E, F, G, and I (I didn't check to see
that those prices sum to less than $20).
How can I write a program that will do this? Can I just use the optimize
command? Do I have to transform my dataset into something that is easier
to use the optimize command on? Or should I write my own code that does
the process?
This can be formulated as an integer programming problem. Note that
the proposed solution in your post is infeasible as it violates the
wage constraint.
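Written out, the integer program might look like the following. The symbols are notation assumed here for illustration, not taken from the thread: x_i is 1 if person i is chosen and 0 otherwise, v_i is that person's value, w_i the wage, t_i the type, and n = 9 is the number of people. The three equality constraints correspond to the type rows of con.mat and the inequality to its wage row.

```latex
\begin{aligned}
\max_{x \in \{0,1\}^{n}} \quad & \sum_{i=1}^{n} v_i x_i \\
\text{subject to} \quad & \sum_{i \,:\, t_i = k} x_i = 2, \qquad k = 1, 2, 3, \\
& \sum_{i=1}^{n} w_i x_i \le 20 .
\end{aligned}
```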
library(lpSolve)
people <- c("A", "B", "C", "D", "E", "F", "G", "H", "I")
type <- c(1, 1, 1, 1, 2, 2, 3, 3, 3)
value <- c(25.20, 24, 38, 20, 14, 20, 31, 11, 8)
wage <- c(4, 3.8, 5.1, 3.5, 2.4, 3, 6, 2.4, 2)
# one constraint row per type (exactly two of each) plus one row for the wage budget
con.mat <- rbind(type == 1, type == 2, type == 3, wage)
con.dir <- c("==", "==", "==", "<=")
con.rhs <- c(2, 2, 2, 20)
# every decision variable is binary: 1 = person chosen, 0 = not chosen
binary.vec <- seq_along(people)
out <- lp("max", value, con.mat, con.dir, con.rhs, binary.vec = binary.vec)
out$solution # 1 0 1 0 1 1 0 1 1
people[out$solution == 1] # "A" "C" "E" "F" "H" "I"
out # objective value: 116.2
# note: proposed solution in post violates wage constraint
proposed.soln <- c("C", "D", "E", "F", "G", "I")
crossprod(wage, people %in% proposed.soln) # 22
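To compare any candidate group against the constraints without rerunning the solver, a small helper along these lines may be handy. It is a sketch only: summarise_group() and its budget argument are hypothetical names introduced here, and it rebuilds the sample data frame from the original post so that it runs on its own.

```r
# Sample data from the original post, as a data frame
data <- data.frame(
  people = c("A", "B", "C", "D", "E", "F", "G", "H", "I"),
  type   = c(1, 1, 1, 1, 2, 2, 3, 3, 3),
  value  = c(25.20, 24, 38, 20, 14, 20, 31, 11, 8),
  wage   = c(4, 3.8, 5.1, 3.5, 2.4, 3, 6, 2.4, 2)
)

# Hypothetical helper: totals and constraint checks for a chosen group
summarise_group <- function(chosen, budget = 20) {
  picked <- data[data$people %in% chosen, ]
  list(
    total_value   = sum(picked$value),
    total_wage    = sum(picked$wage),
    per_type      = table(factor(picked$type, levels = 1:3)),
    within_budget = sum(picked$wage) <= budget
  )
}

summarise_group(c("A", "C", "E", "F", "H", "I"))  # the solver's group
summarise_group(c("C", "D", "E", "F", "G", "I"))  # the group proposed in the post
```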
--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com