Stepwise GLM selection by LRT?
On Thu, 12 Jul 2007, Lutz Ph. Breitling wrote:
Thank you very much for the prompt reply. It seems I had not fully understood what the k parameter to stepAIC does. Your suggested approach does indeed look fine to me; actually, I do not quite understand why you say that it is only an approximation to the LRT.
So this is computing AIC_k = -2L + kp, where L is the maximized log-likelihood and p the number of parameters. If you compare models with p and p+q parameters, this is equivalent to comparing 2 log LR with kq, and so for q = 1 Wilks' LRT is found for k = qchisq(1-p, df=1), where p now denotes the p-value threshold (the reference distribution is just a squared Normal). However, no one said q would always be one, and stepAIC steps in terms, not individual coefficients. Therein lies one of the approximations (another is in the asymptotic distribution theory of the test).
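To make the q > 1 point concrete, here is a small R sketch (not part of the original message) that tabulates the threshold kq implied by the penalty against the chi-squared critical value an LRT on q degrees of freedom would use. The choice p = 0.05 and the names p_sel and tab are illustrative assumptions.

```r
# Nominal p-value threshold for selection (illustrative choice).
p_sel <- 0.05

# Penalty per parameter that matches a 1-df LRT at level p_sel.
k <- qchisq(1 - p_sel, df = 1)   # about 3.84

# For a term with q parameters, stepAIC compares 2 log LR with k * q,
# whereas an LRT at level p_sel would compare it with qchisq(1 - p_sel, df = q).
q <- 1:4
tab <- data.frame(
  q             = q,
  aic_threshold = k * q,
  lrt_critical  = qchisq(1 - p_sel, df = q),
  # p-value level actually implied by the penalty for a q-df term
  implied_level = pchisq(k * q, df = q, lower.tail = FALSE)
)
print(tab)
```

The two thresholds agree at q = 1. For q from 2 to 4 at p = 0.05 the penalty threshold is the larger of the two, so multi-parameter terms (for example factors with several levels) face a stricter rule than a nominal 5% LRT.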
Best wishes- Lutz

On 7/11/07, Ravi Varadhan <rvaradhan at jhmi.edu> wrote:
Check out the stepAIC function in the MASS package. This is a nice tool with which you can implement any penalty, even though the function's name has "AIC" in it because that is the default. Although this doesn't do LRT-based variable selection, you can sort of approximate it by using a penalty of k = qchisq(1-p, df=1), where p is the p-value for variable selection. This penalty means that a variable enters/exits an existing model when twice the absolute change in log-likelihood is greater than qchisq(1-p, df=1). For p = 0.1, k = 2.71, and for p = 0.05, k = 3.84.

Is this what you'd like to do?

Ravi.

----------------------------------------------------------------------------
Ravi Varadhan, Ph.D.
Assistant Professor, The Center on Aging and Health
Division of Geriatric Medicine and Gerontology
Johns Hopkins University
Ph: (410) 502-2619
Fax: (410) 614-9625
Email: rvaradhan at jhmi.edu
Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html
----------------------------------------------------------------------------

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Lutz Ph. Breitling
Sent: Wednesday, July 11, 2007 3:06 PM
To: r-help at stat.math.ethz.ch
Subject: [R] Stepwise GLM selection by LRT?

Dear List,

having searched the help and archives, I have the impression that there is no automatic model selection procedure implemented in R that includes/excludes predictors in logistic regression models based on LRT P-values. Is that true, or is someone aware of an appropriate function somewhere in a custom package?

Even if automatic model selection and LRT might not be the most appropriate methods, I actually would like to use these in order to simulate someone else's modeling approach...

Many thanks for all comments-
Lutz

-----
Lutz Ph. Breitling
German Cancer Research Center
Heidelberg/Germany
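The approach suggested in this thread might look roughly like the following in R. This is only a sketch: the birthwt data shipped with MASS, the recoding of its columns, and the names bwt, p_sel, full and sel are assumptions chosen for illustration, not anything from the original posts.

```r
library(MASS)  # provides stepAIC and the birthwt example data

# Illustrative data only: a recoded subset of MASS::birthwt.
bwt <- with(birthwt, data.frame(
  low   = low,                                   # binary outcome
  age   = age,
  lwt   = lwt,
  race  = factor(race, labels = c("white", "black", "other")),
  smoke = smoke > 0,
  ht    = ht > 0,
  ui    = ui > 0
))

p_sel <- 0.05                        # nominal p-value threshold
k <- qchisq(1 - p_sel, df = 1)       # about 3.84

# Full logistic regression model.
full <- glm(low ~ ., family = binomial, data = bwt)

# Backward selection with a penalty of k per parameter instead of AIC's 2.
sel <- stepAIC(full, direction = "backward", k = k, trace = FALSE)

# LRT p-values for dropping each remaining term, for comparison
# with the penalty-based decisions.
drop1(sel, test = "Chisq")
```

Because race is a three-level factor, dropping it is a 2-df step, so the penalty-based decision for that term need not coincide with a 5% LRT. The drop1 table gives the term-wise LRT p-values if an exact comparison is wanted.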
______________________________________________ R-help at stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595