subset selection for glm
On Sat, 15 Oct 2005, Dhiren DSouza wrote:
I posted a message earlier about subset selection. I have a data set with 50 variables x1, x2, ..., x50. x50 is a binary response variable that I would like to predict. Is there a library I could use to do an exhaustive search for a subset (forward/backward subset selection) of variables to include in the regression model? Any help would be greatly appreciated.
?step (as surely help.search() would have shown you), and btw, that is not an `exhaustive search' procedure. Frank Harrell has posted repeatedly on the dangers of unthinking use of such a procedure -- if he does not chime in now, please do look at his posts (and if you have access to it, his book).
You have not told us *why* you want to do variable selection (which is a more accurate name for what you are calling `subset' selection), and for most purposes it is not a good idea.
Let me second Roger Bivand's comment earlier today:
I would, though, appeal to posters to give those who try to reply to questions at least a little help, by including an informative signature block.
I know that several helpers are quite unlikely to offer help to someone sending an unsigned letter, for that is what not using a real user name and affiliation amounts to. So, PLEASE give your credentials -- this forum is a free (to the recipients) technical support forum, and that is a privilege that should be respected.
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
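For readers following the ?step pointer above, here is a minimal sketch of stepwise selection on a binomial glm. It is an illustration only, not code from the thread: the data frame `dat`, its simulated columns x1..x10 and the response `y` are hypothetical stand-ins for the poster's 50-variable data set, and AIC (the default criterion in step()) is assumed.

```r
# Simulated stand-in data (hypothetical): 10 predictors and a binary response y.
set.seed(1)
n <- 200
dat <- as.data.frame(matrix(rnorm(n * 10), n, 10))
names(dat) <- paste0("x", 1:10)
dat$y <- rbinom(n, 1, plogis(0.8 * dat$x1 - 0.6 * dat$x2))

# Full and intercept-only logistic regressions.
full <- glm(y ~ ., data = dat, family = binomial)
null <- glm(y ~ 1, data = dat, family = binomial)

# Backward elimination starting from the full model (AIC criterion).
back <- step(full, direction = "backward", trace = 0)

# Forward selection starting from the null model; the scope caps the
# search at the terms of the full model.
fwd <- step(null,
            scope = list(lower = ~ 1, upper = formula(full)),
            direction = "forward", trace = 0)

# Inspect which terms each search retained.
formula(back)
formula(fwd)
```

As noted in the reply, this is a greedy stepwise search, not an exhaustive one, and the caution about unthinking use of variable selection applies to the sketch just as much as to the original question.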