-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Wensui Liu
Sent: Saturday, February 18, 2006 2:03 PM
To: Liaw, Andy
Cc: r-help at stat.math.ethz.ch
Subject: Re: [R] Question about variable selection
Thank you so much for your reply, Andy.
But what if I am only interested in main effects instead of
interactions?
On 2/18/06, Liaw, Andy <andy_liaw at merck.com> wrote:
That depends on whether the IV could have some significant
interactions with other IVs not considered in the bivariate
analysis. For example:
iv <- expand.grid(-2:2, -2:2)
y <- 3 + iv[,1] * iv[,2] + rnorm(nrow(iv), sd=0.1)
summary(lm(y ~ iv[,1]))
Call:
lm(formula = y ~ iv[, 1])
Residuals:
     Min       1Q   Median       3Q      Max
-4.06259 -1.06048 -0.02377  1.05901  4.04315
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  3.01908    0.41482   7.278 2.09e-07 ***
iv[, 1]      0.01417    0.29332   0.048    0.962
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.074 on 23 degrees of freedom
Multiple R-Squared: 0.0001014, Adjusted R-squared: -0.04337
F-statistic: 0.002333 on 1 and 23 DF, p-value: 0.9619
summary(lm(y ~ iv[,1] * iv[,2]))
Call:
lm(formula = y ~ iv[, 1] * iv[, 2])
Residuals:
     Min       1Q   Median       3Q      Max
-0.22390 -0.08894 -0.01279  0.13525  0.17608
Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
(Intercept)      3.019083   0.026330 114.665   <2e-16 ***
iv[, 1]          0.014167   0.018618   0.761    0.455
iv[, 2]         -0.005486   0.018618  -0.295    0.771
iv[, 1]:iv[, 2]  0.992865   0.013165  75.418   <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.1316 on 21 degrees of freedom
Multiple R-Squared: 0.9963, Adjusted R-squared: 0.9958
F-statistic: 1896 on 3 and 21 DF, p-value: < 2.2e-16
Andy
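The comparison in the example above can also be set up as a sequence of
nested models, which may make it easier to see what each added term
contributes. This is only a sketch under assumptions: the column names
x1 and x2, the data frame d, the fit_* object names, and the seed are
illustrative and not from the original message, and the numbers will
differ from the output quoted above because the noise is random.

```r
set.seed(1)  # arbitrary seed for reproducibility (assumption, not in the original)

# Same 5 x 5 design as in the message, with named columns (names are assumptions)
d <- expand.grid(x1 = -2:2, x2 = -2:2)
d$y <- 3 + d$x1 * d$x2 + rnorm(nrow(d), sd = 0.1)

fit_biv  <- lm(y ~ x1, data = d)       # bivariate fit: x1 alone
fit_main <- lm(y ~ x1 + x2, data = d)  # main effects only
fit_int  <- lm(y ~ x1 * x2, data = d)  # main effects plus their interaction

# p-value for x1 in the bivariate fit
coef(summary(fit_biv))["x1", "Pr(>|t|)"]

# Sequential F-tests: does each added term improve on the previous model?
anova(fit_biv, fit_main, fit_int)
```

Because y here is generated from the product of the two IVs only, almost
all of the explained variance should appear at the step that adds the
interaction, not at the step that adds the second main effect.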
From: Wensui Liu
Dear Lister,
I have a question about variable selection for regression.
If the IV is not significantly related to the DV in the bivariate
analysis, does it make sense to include this IV in the model
with multiple IVs?
Thank you so much!