Message-ID: <CAHA9McMOpD-Tbu9P5=QFO2Vu1xpzto+x0yVH_Wpep93cW5-DaA@mail.gmail.com>
Date: 2012-11-16T22:32:32Z
From: Steve Lianoglou
Subject: Stepwise regression scope: all interacting terms (.^2)
In-Reply-To: <B34ECAC0-D224-4D84-98CE-547567E6E39E@comcast.net>

Hi Mark,

To add some context to David's response below, you can search the list
archives for times when people ask about stepwise regression. You can
get started here:

http://search.gmane.org/search.php?group=gmane.comp.lang.r.general&query=stepwise+penalized

The long and short of it is that you are almost always encouraged to
use a regularized (penalized) model instead of this stepwise
approach. Frank Harrell, in particular, is generally quite vocal
against stepwise regression -- I'm actually surprised he hasn't chimed
in by now, but maybe he's getting a bit tired of fighting the good
fight -- or, it's close to the holiday and he's taking a break ;-)
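
For what it's worth, here is a minimal sketch of what the penalized route might look like. It assumes the glmnet package from CRAN (not part of base R) and uses simulated data with made-up names (d, y, x1 to x10), since the original post had no reproducible example. The lasso penalty (alpha = 1) is just one choice; treat this as an illustration, not a recommendation for any particular dataset.

```r
set.seed(1)
n <- 300

# Simulated stand-in for the poster's data: ten predictors, binary response.
d <- as.data.frame(matrix(rnorm(n * 10), n, 10,
                          dimnames = list(NULL, paste0("x", 1:10))))
y <- rbinom(n, 1, plogis(d$x1 - d$x2 + d$x1 * d$x2))

# Design matrix with all main effects and two-way interactions;
# drop the intercept column because glmnet adds its own.
X <- model.matrix(~ .^2, data = d)[, -1]

# Only fit if glmnet is installed.
if (requireNamespace("glmnet", quietly = TRUE)) {
  # Cross-validated lasso-penalized logistic regression.
  cvfit <- glmnet::cv.glmnet(X, y, family = "binomial", alpha = 1)
  # Coefficients at the lambda with the lowest cross-validated deviance;
  # terms shrunk to exactly zero are effectively dropped from the model.
  print(coef(cvfit, s = "lambda.min"))
}
```

All 55 candidate terms (10 main effects plus 45 two-way interactions) go into a single fit, and the penalty does the shrinking, rather than a sequence of add/drop decisions.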

Anyway ... HTH,

-steve

On Fri, Nov 16, 2012 at 4:13 PM, David Winsemius <dwinsemius at comcast.net> wrote:
>
> On Nov 16, 2012, at 12:16 PM, Mark Ebbert wrote:
>
>> I haven't heard anything on this question. Is there something fundamentally wrong with my question? Any feedback is appreciated.
>>
>
> Perhaps failure to read this sig at the bottom of every posted message to rhelp?
>
> "PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code."
>
>
>> Mark
>> On Nov 15, 2012, at 8:13 AM, Mark T. W. Ebbert wrote:
>>
>>> Dear Gurus,
>>>
>>> Thank you in advance for your assistance. I'm trying to understand scope better when performing stepwise regression using "step."
>
> From the help page of step:
> "If scope is a single formula, it specifies the upper component, and the lower model is empty."
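
A small sketch of that help-page sentence in action, on simulated data with only four predictors so it runs quickly. The names d, additive and sel are made up for illustration. Note that the scope has to be given as a formula, i.e. ~ .^2.

```r
set.seed(1)
n <- 200

# Simulated stand-in data: four predictors and a binary response.
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
d$y <- rbinom(n, 1, plogis(0.5 * d$x1 - 0.5 * d$x2 + 0.8 * d$x1 * d$x2))

# The search starts from this additive (main-effects) model.
additive <- glm(y ~ x1 + x2 + x3 + x4, family = binomial, data = d)

# A single formula as scope is the upper bound of the search; the lower
# bound is the empty model, so main effects may be dropped as well.
sel <- step(additive, scope = ~ .^2, direction = "both", trace = 0)

# Order of each selected term: 1 = main effect, 2 = two-way interaction.
ord <- attr(terms(sel), "order")
print(formula(sel))
print(table(ord))
```

The point is that the search starts at the model you hand to step and adds or drops one term at a time, with ~ .^2 as the ceiling. It does not begin by fitting the model with every interaction in it.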
>
>>> I have a model with a binary response variable and 10 predictor variables. When I perform stepwise regression I define scope=.^2 to allow interactions between all terms.
>
> I generally avoid answering questions about stepwise regression, because most of them do not include sufficient background material to justify that strategy. Yours certainly did not.
>
>
>>> But I am missing something. When I perform stepwise regression (both directions) on the main model (y~x1+x2+...+x10) the method returns quickly with an answer; however, when I define all interactions in the main model (y~x1+x2+...+x10+x1:x2+x1:x3+...) and then perform stepwise regression (backward only) it runs so long I have to kill it.
>>>
>>> So here's my question: what is the difference between scope=.^2 on the additive (proper term?) model and defining all interactions and doing backward regression? My understanding is that .^2 is supposed to allow all interactions!
>
> Well, I would have guessed that all two-way interactions (all 45 of them in your case) would be included and then successively reduced until you got to your specified (arbitrary and most likely incorrectly set) endpoint. I think the help page Details section is unclear on this point. I do not think that the 120 potential three-way interactions are part of the scope in that instance, but it should be easy enough for you to test that possibility.
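
The counts above can be checked symbolically in base R, with no data needed. This is a sketch using ten placeholder names x1 to x10; it expands the formulas with update() and counts terms by order. It says nothing about which terms step would actually keep.

```r
# Ten placeholder predictor names standing in for the real variables.
vars <- paste0("x", 1:10)
additive <- as.formula(paste("y ~", paste(vars, collapse = " + ")))

# Expand "." against the additive model, as a scope of ~ .^2 (or ~ .^3) would.
upper2 <- update(additive, ~ .^2)
upper3 <- update(additive, ~ .^3)

# Term order: 1 = main effect, 2 = two-way, 3 = three-way interaction.
ord2 <- attr(terms(upper2), "order")
ord3 <- attr(terms(upper3), "order")

print(table(ord2))
print(table(ord3))

# The same counts from the binomial coefficients.
print(c(choose(10, 2), choose(10, 3)))
```

With ~ .^2 the expansion contains the 10 main effects and 45 two-way interactions and nothing of higher order; the 120 three-way terms only appear with ~ .^3.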
>
> --
> David Winsemius, MD
> Alameda, CA, USA
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help



-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact