An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/r-help/attachments/20080302/4fe5fc58/attachment.pl
Constrained regression
3 messages · Carlos Alzola, Mike Cheung, Berwin A Turlach
Dear Carlos, One approach is to use structural equation modeling (SEM). Some SEM packages, such as LISREL, Mplus and Mx, allow inequality and nonlinear constraints. Phantom variables (Rindskopf, 1984) may be used to impose inequality constraints. Your model is basically: y = b0 + b1*b1*x1 + b2*b2*x2 +...+ bp*bp*xp + e 1 = b1*b1 + b2*b2 +...+ bp*bp Alternatively, you can set some condition bounds on the parameter estimates. Then you only have to impose the second constraint. Rindskopf, D. (1984). Using phantom and imaginary latent variables to parameterize constraints in linear structural models. Psychometrika, 49, 37-47. Regards, Mike
--------------------------------------------------------------------- Mike W.L. Cheung Phone: (65) 6516-3702 Department of Psychology Fax: (65) 6773-1843 National University of Singapore http://courses.nus.edu.sg/course/psycwlm/internet/ --------------------------------------------------------------------- On Mon, Mar 3, 2008 at 11:52 AM, Carlos Alzola <calzola at cox.net> wrote: > Dear list members, > > I am trying to get information on how to fit a linear regression with > constrained parameters. Specifically, I have 8 predictors , their > coeffiecients should all be non-negative and add up to 1. I understand it is > a quadratic programming problem but I have no experience in the subject. I > searched the archives but the results were inconclusive. > > Could someone provide suggestions and references to the literature, please? > > Thank you very much. > > Carlos > > Carlos Alzola > calzola at cox.net > (703) 242-6747 > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >
G'day Carlos, On Mon, Mar 3, 2008 at 11:52 AM
Carlos Alzola <calzola at cox.net> wrote:
I am trying to get information on how to fit a linear regression with constrained parameters. Specifically, I have 8 predictors , their coeffiecients should all be non-negative and add up to 1. I understand it is a quadratic programming problem but I have no experience in the subject. I searched the archives but the results were inconclusive. Could someone provide suggestions and references to the literature, please?
A suggestion:
library(MASS) ## to access the Boston data designmat <- model.matrix(medv~., data=Boston) Dmat <- crossprod(designmat, designmat) dvec <- crossprod(designmat, Boston$medv) Amat <- cbind(1, diag(NROW(Dmat))) bvec <- c(1, rep(0,NROW(Dmat)) meq <- 1 library(quadprog) res <- solve.QP(Dmat, dvec, Amat, bvec, meq)
The solution seems to contain values that are, for all practical purposes, actually zero:
res$solution
[1] 4.535581e-16 2.661931e-18 1.016929e-01 -1.850699e-17 [5] 1.458219e-16 -3.892418e-15 8.544939e-01 0.000000e+00 [9] 2.410742e-16 2.905722e-17 -5.700600e-20 -4.227261e-17 [13] 4.381328e-02 -3.723065e-18 So perhaps better:
zapsmall(res$solution)
[1] 0.0000000 0.0000000 0.1016929 0.0000000 0.0000000 0.0000000 [7] 0.8544939 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 [13] 0.0438133 0.0000000 So the estimates seem to follow the constraints. And the unconstrained solution is:
res$unconstrainted.solution
[1] 3.645949e+01 -1.080114e-01 4.642046e-02 2.055863e-02 [5] 2.686734e+00 -1.776661e+01 3.809865e+00 6.922246e-04 [9] -1.475567e+00 3.060495e-01 -1.233459e-02 -9.527472e-01 [13] 9.311683e-03 -5.247584e-01 which seems to coincide with what lm() thinks it should be:
coef(lm(medv~., Boston))
(Intercept) crim zn indus chas
3.645949e+01 -1.080114e-01 4.642046e-02 2.055863e-02 2.686734e+00
nox rm age dis rad
-1.776661e+01 3.809865e+00 6.922246e-04 -1.475567e+00 3.060495e-01
tax ptratio black lstat
-1.233459e-02 -9.527472e-01 9.311683e-03 -5.247584e-01
So there seem to be no numeric problems. Otherwise we could have done
something else (e.g calculate the QR factorization of the design
matrix, say X, and give the R factor to solve.QP, instead of
calculating X'X and giving that one to solve.QP).
If the intercept is not supposed to be included in the set of
constrained estimates, then something like the following can be done:
Amat[1,] <- 0 res <- solve.QP(Dmat, dvec, Amat, bvec, meq) zapsmall(res$solution)
[1] 6.073972 0.000000 0.109124 0.000000 0.000000 0.000000 0.863421 [8] 0.000000 0.000000 0.000000 0.000000 0.000000 0.027455 0.000000 Of course, since after the first command in that last block the second column of Amat contains only zeros
Amat[,2]
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 we might as well have removed it (and the corresponding entry in bvec)
Amat <- Amat[, -2] bvec <- bvec[-2]
before calling solve.QP(). Note, the Boston data set was only used to illustrate how to fit such models, I do not want to imply that these models are sensible for these data. :-) Hope this helps. Cheers, Berwin =========================== Full address ============================= Berwin A Turlach Tel.: +65 6516 4416 (secr) Dept of Statistics and Applied Probability +65 6516 6650 (self) Faculty of Science FAX : +65 6872 3919 National University of Singapore 6 Science Drive 2, Blk S16, Level 7 e-mail: statba at nus.edu.sg Singapore 117546 http://www.stat.nus.edu.sg/~statba