all subsets for glm

Of all the dangerous ways of doing this and getting confusing results,
gl1ce in lasso2 should be the least risky.
Thanks Dieter. In case an exhaustive search (all subsets) remains
infeasible, I'll include a shrinkage method for sure. Looks like
glmpath could be useful here.

Best,
Harald
If you actually want to find the best subsets, you can get a good 
approximation by using leaps on the weighted least squares fit that is the 
last iteration of the IWLS algorithm for fitting the glm.

Running regsubsets with a reasonably large value of nbest and then 
refitting the top models as glms afterwards will fairly reliably give the 
best glms.
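
A minimal sketch of the approach described above, not a definitive implementation. It assumes the leaps package is installed; the simulated data, the variable names (x1 to x6), the binomial family, and the choice of AIC for ranking the refitted models are all illustrative assumptions that do not come from this thread.

```r
library(leaps)  # provides regsubsets()

# Simulated data, purely for illustration (assumption, not from the thread).
set.seed(1)
n <- 200
X <- matrix(rnorm(n * 6), n, 6, dimnames = list(NULL, paste0("x", 1:6)))
y <- rbinom(n, 1, plogis(0.5 + X[, 1] - 0.8 * X[, 2]))
dat <- data.frame(y = y, X)

# Fit the full glm. At convergence, IWLS amounts to a weighted least
# squares regression of a "working response" on the predictors.
full <- glm(y ~ ., data = dat, family = binomial)
w <- full$weights                             # working weights
z <- full$linear.predictors + full$residuals  # working response (no offset here)

# All-subsets search on that weighted least squares problem,
# keeping several models per subset size.
rs <- regsubsets(x = X, y = z, weights = w, nbest = 5, nvmax = 6)
which_mat <- summary(rs)$which[, -1, drop = FALSE]  # drop the intercept column
vars <- colnames(which_mat)

# Refit every candidate subset as a real glm and rank by AIC
# (AIC is an assumed criterion; substitute whatever suits the problem).
candidates <- do.call(rbind, lapply(seq_len(nrow(which_mat)), function(i) {
  keep <- vars[which_mat[i, ]]
  fit <- glm(reformulate(keep, response = "y"), data = dat, family = binomial)
  data.frame(model = paste(keep, collapse = " + "),
             size = length(keep),
             AIC = AIC(fit))
}))
candidates <- candidates[order(candidates$AIC), ]
head(candidates, 10)
```

The search itself only sees the least squares approximation, so the ranking from regsubsets is approximate; the refitting step is what puts the candidates on the glm scale.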

Whether this is better than lasso depends on what you are trying to do - 
IMO the only point of all-subsets regression is to get many best models 
rather than a single one, and lasso doesn't do at all well at that.

 	-thomas

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle
If you actually want to find the best subsets, you can get a good 
approximation by using leaps on the weighted least squares fit that
is the last iteration of the IWLS algorithm for fitting the glm.

Running regsubsets with a reasonably large value of nbest and then 
refitting the top models as glms afterwards will fairly reliably
give the best glms.
Thanks, that sounds interesting. I am as yet clueless about the workings
of IWLS, so maybe this is nonsense: the result of running glm on the
full model (all variables) is a stark example of overfitting, i.e.
zero residuals, all R_i^2 close to 1, large coefficients. Would the
"weighted least squares fit of the last iteration of IWLS" then not be
pretty meaningless?
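
For readers unfamiliar with IWLS, here is a small base R sketch of what "the weighted least squares fit of the last iteration" refers to. The data, the variable names, and the logistic model are assumptions made for illustration only. The working weights and working response are built from the family functions of the fitted glm, and an ordinary weighted lm on them should reproduce the glm coefficients up to the convergence tolerance.

```r
# Simulated data, purely for illustration (assumption, not from the thread).
set.seed(42)
n <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.3 + 0.9 * x1))

fit <- glm(y ~ x1 + x2, family = binomial)

# Quantities of one IWLS step, evaluated at the converged fit.
fam <- fit$family
eta <- fit$linear.predictors           # linear predictor
mu  <- fitted(fit)                     # fitted means
dmu <- fam$mu.eta(eta)                 # d mu / d eta
w <- dmu^2 / fam$variance(mu)          # working weights
z <- eta + (y - mu) / dmu              # working response

# Weighted least squares on the working response.
wls <- lm(z ~ x1 + x2, weights = w)

# The two sets of coefficients agree up to the glm convergence tolerance.
all.equal(coef(fit), coef(wls), tolerance = 1e-4)
max(abs(coef(fit) - coef(wls)))
```

For a logistic model the working weights equal mu * (1 - mu), so they shrink toward zero as fitted probabilities approach 0 or 1. A full model that fits the data perfectly would therefore give a poorly determined weighted least squares problem.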
Whether this is better than lasso depends on what you are trying to
do - IMO the only point of all-subsets regression is to get many best
models rather than a single one, and lasso doesn't do at all well at
that.
Yes, I am trying to get a number of best models, since the final model
selection will be based on interpretability and expert knowledge. So
far I have bootstrapped the lasso (using glmpath) to generate such a
set, but the resulting models are very similar and I suspect there
is a larger variety of "best models".

Harald