How to write a loop in R to select a multiple regression model and validate it?

3 messages · beginner, Jeff Newmiller

#
I would like to run a loop in R. I have never done this before, so I would be
very grateful for your help!

1. I have a sample set: 25 objects. I would like to draw 1 object from it
and use it as a test set for my future external validation. The remaining 24
objects I would like to use as a training set (to select a model). I would
like to repeat this process until all 25 objects are used as a test set. 
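
A minimal sketch of the split described in point 1, assuming the 25 objects sit in a data frame called `dat` (a hypothetical name, with simulated values used only so the snippet runs on its own). Each pass of the loop holds out one row as the test set and keeps the other 24 as the training set:

```r
# Simulated stand-in for the real data: 25 objects, response Y, predictors X1..X3
set.seed(1)
dat <- data.frame(Y = rnorm(25), X1 = rnorm(25), X2 = rnorm(25), X3 = rnorm(25))

n <- nrow(dat)
train_sizes <- integer(n)
test_sizes  <- integer(n)

for (i in seq_len(n)) {
  test     <- dat[i, , drop = FALSE]    # the single held-out object
  training <- dat[-i, , drop = FALSE]   # the remaining 24 objects
  train_sizes[i] <- nrow(training)
  test_sizes[i]  <- nrow(test)
  # model selection on `training` and prediction for `test` would go here
}
```

Using `drop = FALSE` keeps `test` a one-row data frame, which is what `predict()` expects as `newdata` later on.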

2. For each of the training sets I would like to run the following code:


library(leaps)
forward <- regsubsets(Y ~ ., data = training, method = "forward", nbest = 1)
backward <- regsubsets(Y ~ ., data = training, method = "backward", nbest = 1)
stepwise <- regsubsets(Y ~ ., data = training, method = "seqrep", nbest = 1)
exhaustive <- regsubsets(Y ~ ., data = training, method = "exhaustive", nbest = 1)
summary(forward)
summary(backward)
summary(stepwise)
summary(exhaustive)

I would like R to select the best model (the one with the highest adjusted
R2) for each of the selection methods, so that there are 4 final best models
(e.g. the best model selected with forward selection, the best model
selected with backward selection and so on...).
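
One way this selection might be automated, as a sketch only: `summary()` of a `regsubsets` fit carries an `adjr2` vector and a logical `which` matrix, so the row with the largest adjusted R2 can be turned back into a formula. The helper name `best_formula` and the simulated `training` data are assumptions for illustration, and the sketch assumes all predictors are numeric (with factors, the column names of `which` no longer match the variable names).

```r
library(leaps)

# Simulated stand-in for one training set: 24 objects, predictors X1..X5
set.seed(1)
training <- data.frame(matrix(rnorm(24 * 5), ncol = 5))
names(training) <- paste0("X", 1:5)
training$Y <- 1 + 2 * training$X1 - training$X3 + rnorm(24)

# Hypothetical helper: run one selection method and return the formula of the
# subset with the highest adjusted R2
best_formula <- function(method, data) {
  fit  <- regsubsets(Y ~ ., data = data, method = method, nbest = 1)
  s    <- summary(fit)
  best <- which.max(s$adjr2)                 # row with the highest adjusted R2
  vars <- names(which(s$which[best, -1]))    # drop the intercept column
  reformulate(vars, response = "Y")
}

methods <- c("forward", "backward", "seqrep", "exhaustive")
best_models <- lapply(setNames(methods, methods), best_formula, data = training)
best_models
```

The result is a named list of 4 formulas, one per selection method, which can be passed on to the cross-validation step.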

 
Afterwards I would like to perform internal cross-validation of all 4
selected models and choose the 1 of the 4 with the lowest cross-validated
mean squared error (MSE). I used to do it using the code below:

library(DAAG)
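# one result per candidate model, so that earlier results are not overwritten
val.daag1 <- CVlm(df = training, m = 1, form.lm = formula(Y ~ X1 + X2 + X3))
val.daag2 <- CVlm(df = training, m = 1, form.lm = formula(Y ~ X1 + X2 + X4))
val.daag3 <- CVlm(df = training, m = 1, form.lm = formula(Y ~ X3 + X4 + X5))
val.daag4 <- CVlm(df = training, m = 1, form.lm = formula(Y ~ X4 + X5 + X7))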
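
To pick the winner programmatically rather than by reading the printed output, the comparison could be sketched in base R. Note that this swaps `CVlm` for a hand-written leave-one-out cross-validation, so it is not the DAAG procedure itself. The function name `loo_mse`, the labels `m1` to `m4`, and the simulated data are assumptions; the four formulas are the ones from the calls above, and the response is assumed to be called `Y`.

```r
# Simulated stand-in for one training set: 24 objects, predictors X1..X7
set.seed(2)
training <- data.frame(matrix(rnorm(24 * 7), ncol = 7))
names(training) <- paste0("X", 1:7)
training$Y <- 1 + training$X1 + 0.5 * training$X4 + rnorm(24)

# Leave-one-out cross-validated MSE of a linear model (response assumed to be Y)
loo_mse <- function(form, data) {
  errs <- vapply(seq_len(nrow(data)), function(i) {
    fit <- lm(form, data = data[-i, , drop = FALSE])
    as.numeric(data$Y[i] - predict(fit, newdata = data[i, , drop = FALSE]))
  }, numeric(1))
  mean(errs^2)
}

# The four candidate formulas; labels m1..m4 are arbitrary
candidates <- list(
  m1 = Y ~ X1 + X2 + X3,
  m2 = Y ~ X1 + X2 + X4,
  m3 = Y ~ X3 + X4 + X5,
  m4 = Y ~ X4 + X5 + X7
)

mse    <- vapply(candidates, loo_mse, numeric(1), data = training)
winner <- names(which.min(mse))   # label of the model with the lowest CV MSE
mse
winner
```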

For the best selected model (the one with the lowest MSE) I would like to
perform an external validation on the 1 object set aside at the beginning of
the study (please refer to point 1.).

3. Then repeat the whole procedure with the next training and test set, and so on.
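
Putting points 1 to 3 together, the whole procedure might be sketched as a nested loop in base R. This is an illustration under stated assumptions, not a definitive implementation: the data are simulated, `loo_mse` is a hypothetical helper, and a fixed list of candidate formulas stands in for the models that the `regsubsets` step would produce for each training set.

```r
# Simulated stand-in for the real data: 25 objects, predictors X1..X4
set.seed(42)
dat <- data.frame(matrix(rnorm(25 * 4), ncol = 4))
names(dat) <- paste0("X", 1:4)
dat$Y <- 1 + 2 * dat$X1 - dat$X2 + rnorm(25, sd = 0.5)

# Placeholder candidates; in the real workflow these would come from the
# selection step run on each training set
candidates <- list(m1 = Y ~ X1 + X2, m2 = Y ~ X1 + X3, m3 = Y ~ X3 + X4)

# Internal leave-one-out CV MSE on a training set (response assumed to be Y)
loo_mse <- function(form, data) {
  errs <- vapply(seq_len(nrow(data)), function(j) {
    fit <- lm(form, data = data[-j, , drop = FALSE])
    as.numeric(data$Y[j] - predict(fit, newdata = data[j, , drop = FALSE]))
  }, numeric(1))
  mean(errs^2)
}

n <- nrow(dat)
results <- data.frame(object = seq_len(n), chosen = NA_character_,
                      sq_error = NA_real_, stringsAsFactors = FALSE)

for (i in seq_len(n)) {
  training <- dat[-i, , drop = FALSE]   # 24 objects for selection
  test     <- dat[i, , drop = FALSE]    # 1 object for external validation

  inner  <- vapply(candidates, loo_mse, numeric(1), data = training)
  chosen <- names(which.min(inner))     # lowest internal CV MSE

  fit  <- lm(candidates[[chosen]], data = training)
  pred <- as.numeric(predict(fit, newdata = test))

  results$chosen[i]   <- chosen
  results$sq_error[i] <- (test$Y - pred)^2
}

mean(results$sq_error)   # external (leave-one-out) MSE of the whole procedure
table(results$chosen)    # how often each candidate was selected
```

Because the chosen model can differ from one pass to the next, the final MSE describes the selection procedure as a whole rather than any single model.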


I hope that you could help me with this. 

If you have any suggestions on how to select the best model and perform
validation more efficiently, I would be happy to hear them.

Thank you!



--
View this message in context: http://r.789695.n4.nabble.com/How-to-write-a-loop-in-R-to-select-multiple-regression-model-and-validate-it-tp4668669.html
Sent from the R help mailing list archive at Nabble.com.
#
This doesn't look like a task you have acquired through a real-life problem... it looks like homework. There is a stated no-homework policy in the Posting Guide (please read it), since you should be using the resources provided along with your educational environment (teaching assistants, tutors, office hours...), and we don't know whether the help we provide would be considered "cheating".
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
--------------------------------------------------------------------------- 
Sent from my phone. Please excuse my brevity.