Logistic regression problem
On Sat, 2008-09-27 at 10:51 -0700, milicic.marko wrote:
I have a huge data set with thousands of variables and one binary outcome variable. I know that most of the variables are correlated and are not good predictors, but it is very hard to start modeling with such a huge dataset. What would be your suggestion? How do I make a first cut, eliminating most of the variables without ignoring potential interactions? For example, maybe variable A is not a good predictor and variable B is not a good predictor either, but A and B together are a good predictor. Any suggestion is welcome.
milicic.marko
I think you could start with rpart("binary variable"~.)
This shows you a set of variables to start a model with, and a starting set of
cutoffs for continuous variables.
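
A minimal sketch of that idea in R, run on simulated data because the original data set is not available. The data frame `dat`, the predictor names `x1` to `x20`, the outcome `y` and the helper name `top_vars` are all hypothetical stand-ins; only `rpart()` itself comes from the suggestion above. Treat the result as a first screen rather than a final variable selection.

```r
library(rpart)

set.seed(1)
n <- 500

# Simulated stand-in for the real data: 20 numeric predictors (hypothetical names)
dat <- as.data.frame(matrix(rnorm(n * 20), nrow = n))
names(dat) <- paste0("x", 1:20)

# Binary outcome that depends on x1 and x2 jointly; x3..x20 are pure noise
dat$y <- factor(ifelse(dat$x1 > 0 & dat$x2 > 0, "yes", "no"))

# Classification tree of the binary outcome on all other columns
fit <- rpart(y ~ ., data = dat, method = "class")

# The printed tree shows which variables were used and the cutoffs chosen
print(fit)

# Variables ranked by importance (sorted in decreasing order by rpart)
fit$variable.importance

# Candidate variables to carry into a logistic regression
top_vars <- head(names(fit$variable.importance), 5)
top_vars
```

Because the tree splits greedily, one variable at a time, it can pick up a joint effect like the one simulated here, but it may miss an interaction in which neither variable shows any effect on its own.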
Bernardo Rangel Tura, M.D., MPH, Ph.D., National Institute of Cardiology, Brazil