step.lm() fails to drop {many empty 2-way factor cells} (PR#3491)
Your example works on 2 of the 3 systems I tried (and in fact the warnings you got on stepAIC are spurious). It's all down to rounding errors. I've added a fuzz in the termination test. B
On Wed, 16 Jul 2003 maechler@stat.math.ethz.ch wrote:
Exec. Summary:
step() basically ``fails'' whereas MASS' stepAIC() does work
This may not be a bug in the strictest sense, but at least
something for the wish list. Unfortunately I have no time
currently to investigate further myself but want to be sure this
won't be forgotten:
The example is using a real data set with 216 observations on 9
variables -- where we have anonymized variable and factor level names.
The 8 explanatory variables are 4 factors (with 3 / 2 / 4 / 4 levels)
and 4 continuous covariates. The factor combinations observed
are very far from balanced, and have actually quite a few 0 cells.
The following is a reproducible R script
(you need internet connection and our FTP server working) :
## step.lm() "fails" when 2-way factor interactions have (many) empty cells
## whereas stepAIC() works fine
Dat <- read.table("ftp://stat.ethz.ch/U/maechler/R/step-ex.tab", header=TRUE)
Dat$f3 <- ordered(Dat$f3)
## Full Model : y ~ (f1+f2+f3+f4 + C1+C2+C3+C4)
## ---------- where f1..f4 are FACTORS and C1..C4 are continuous covariates
## "Look at data"
str(Dat)
summary(Dat)
## several 2-way interactions have empty cells, e.g.
with(Dat, table(f1,f4)) #!
with(Dat, table(f1,f3))
## see the full design:
with(Dat, ftable(table(f1,f2,f3,f4))) ##--> many many 0
pairs(Dat, col = 1+as.integer(Dat$f1), pch = 1+as.integer(Dat$sex))
## Fit a "full model" with all 2-way interactions :
lmAll <- lm(formula = y ~ .^2, data = Dat)
summary(lmAll) # many insignificant ones {BTW: why is it bad to have "*" here?}
drop1(lmAll)
## clearly shows interaction terms that should be dropped (would decrease AIC)
## BUT:
sr <- step(lmAll)
## nothing eliminated ?step claims it would work when drop1() does
library(MASS)
sAr <- stepAIC(lmAll) ## <<< stepAIC() *does* work
## but with instructive
## There were 19 warnings (use warnings() to see them)
warnings()
##- Warning messages:
##- 1: 0 df terms are changing AIC in: stepAIC(lmAll)
##- 2: 0 df terms are changing AIC in: stepAIC(lmAll)
##- ...
##- ...
summary(sAr)
______________________________________________ R-devel@stat.math.ethz.ch mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-devel
Brian D. Ripley, ripley@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595