Hi, all.
(R-1.3.0 on linux, alpha and intel; also tested on R-1.1.1 on irix.)
Below is a program that creates some random data (n, x, and y), cuts x
and y each into a factor, and then creates a factor z from their
interaction (with the default nf = 2, the levels of z correspond to
quadrants, which is how I came upon this). It then runs an analysis of
variance of n on z.
f.test.problem <-
  function(n = 100, nf = 2){
    t1 <- data.frame(n = rnorm(n), x = rnorm(n), y = rnorm(n))
    t1$x <- cut(t1$x, nf, labels = 1:nf)
    t1$y <- cut(t1$y, nf, labels = 1:nf)
    t1$z <- interaction(t1$x, t1$y, drop = F)
    print(table(t1$x))
    print(table(t1$y))
    print(table(t1$z))
    summary(aov(n ~ z, data = t1))
  }
Here's the problem: if none of the nf * nf levels of z is empty -- that
is, if there is at least one trial taking on each value -- I get the error
"Error in model.matrix(t, data) : invalid variable type".
traceback() gives:
8: model.matrix.default(mt, mf, contrasts)
7: model.matrix(mt, mf, contrasts)
6: lm(formula = n ~ z, data = t1, singular.ok = TRUE)
5: eval(expr, envir, enclos)
4: eval(lmcall, parent.frame())
3: aov(n ~ z, data = t1)
2: summary(aov(n ~ z, data = t1))
1: f.test.problem(nf = 3)
However, if one of the levels of z is empty (which I'm checking using
table), then the analysis of variance runs! (Easy to see if you use nf =
4 or even 3; it won't take long to get some examples that run and some
that don't.)
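A minimal sketch of the check described above, assuming a current R installation: it builds x, y and z the same way f.test.problem() does and uses table() to list which levels of z, if any, are empty. The seed and the names counts, empty_levels and has_empty are illustrative only and are not part of the original function.

```r
set.seed(1)  # arbitrary seed, only so that a run can be repeated
n  <- 100
nf <- 3

# Same construction as in f.test.problem(): cut two normal samples into
# nf intervals each, then cross them without dropping unused levels.
x <- cut(rnorm(n), nf, labels = 1:nf)
y <- cut(rnorm(n), nf, labels = 1:nf)
z <- interaction(x, y, drop = FALSE)

# Count observations per level of z and pick out the empty ones.
counts       <- table(z)
empty_levels <- names(counts)[counts == 0]
has_empty    <- length(empty_levels) > 0

print(counts)
print(empty_levels)
print(has_empty)
```

Checking has_empty before calling aov() shows which of the two cases a given draw falls into.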
Creating n and z outside of a dataframe and then running aov(n~z) doesn't
help.
The problem does not arise if I do aov(n ~ x, data = t1) or
aov(n ~ y, data = t1) -- the analysis of variance runs whether there are
empty categories or not.
Finally: if I specify drop = T in the interaction call:
t1$z <- interaction(t1$x, t1$y, drop = T)
then the analysis works whether a level actually gets dropped or
not. That is, even when no level is empty (and so I got an error with
drop = F), everything works.
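A small sketch, on hand-built factors rather than the random data above, of what the drop argument changes in the result of interaction(). The factors a, b, p and q and the other names are illustrative assumptions; the behaviour shown is that of a current R and may not match R 1.3.0.

```r
# Case 1: every combination of a and b occurs, so no level of the
# interaction is empty and drop has nothing to remove.
a <- gl(2, 5, 10)
b <- gl(5, 1, 10)
full_keep <- interaction(a, b, drop = FALSE)
full_drop <- interaction(a, b, drop = TRUE)
same_when_full <- identical(levels(full_keep), levels(full_drop))

# Case 2: the combination (2, 2) never occurs, so one level is empty.
p <- factor(c(1, 1, 2), levels = 1:2)
q <- factor(c(1, 2, 1), levels = 1:2)
sparse_keep <- interaction(p, q, drop = FALSE)
sparse_drop <- interaction(p, q, drop = TRUE)

print(same_when_full)
print(nlevels(sparse_keep))
print(nlevels(sparse_drop))
```

When no level is empty, the two settings should describe the same factor, which is what makes the difference in behaviour reported here surprising.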
So the problem seems to arise from something going on in interaction. I'm
not sure what, and I'm sure someone else will see the problem faster than
I will.
Apologies in advance if I'm being dense and this is really how things
ought to work. If not, I'll submit a formal bug report.
Matt Wiener
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
interaction() -- problem with drop?
2 messages · Matthew Wiener, Peter Dalgaard
Matthew Wiener <mcw at ln.nimh.nih.gov> writes:
...
Apologies in advance if I'm being dense and this is really how things ought to work. If not, I'll submit a formal bug report.
You're not and please do... This happens already with

a <- gl(2,5,10)
b <- gl(5,1,10)
zz <- interaction(a,b)
model.matrix.default(~zz)

whereas

zz <- a:b
model.matrix.default(~zz)

works fine, and zz is *apparently* identical between the two, save for
the level names (which is another bug...). And watch this:
> zz1 <- interaction(a,b)
> dput(zz1)
structure(c(1, 3, 5, 7, 9, 2, 4, 6, 8, 10), .Label = c("1.1",
"2.1", "1.2", "2.2", "1.3", "2.3", "1.4", "2.4", "1.5", "2.5"
), class = "factor")
> model.matrix.default(~zz1)
Error in model.matrix(t, data) : invalid variable type
> zz2 <- structure(c(1, 3, 5, 7, 9, 2, 4, 6, 8, 10), .Label = c("1.1",
+ "2.1", "1.2", "2.2", "1.3", "2.3", "1.4", "2.4", "1.5", "2.5"
+ ), class = "factor")
> all.equal(zz1,zz2)
[1] TRUE
> model.matrix.default(~zz2)
   (Intercept) zz22.1 zz21.2 zz22.2 zz21.3 zz22.3 zz21.4 zz22.4 zz21.5 zz22.5
1            1      0      0      0      0      0      0      0      0      0
2            1      0      1      0      0      0      0      0      0      0

<stuff like this usually happens if the internal bit that says that an
object has a class doesn't get turned on for some reason>
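One way to probe that hypothesis on a current R is to compare what the class attribute says with what is.object() reports, since is.object() reads the internal flag that marks an object as having a class. This is only a sketch: that is.object() is available and behaves the same way in R 1.3.0 is an assumption, and the name checks is illustrative.

```r
a <- gl(2, 5, 10)
b <- gl(5, 1, 10)
zz1 <- interaction(a, b)

# Three views of "is this a factor?": the visible class attribute,
# the internal object flag, and the usual predicate.
checks <- c(
  class_attr_is_factor = identical(attr(zz1, "class"), "factor"),
  object_flag_set      = is.object(zz1),
  is_factor            = is.factor(zz1)
)
print(checks)

# For contrast: unclass() removes the class attribute, and the flag
# is cleared along with it.
print(is.object(unclass(zz1)))
```

A result with the first entry TRUE and the second FALSE would fit the symptom above, because all.equal() compares values and attributes and would not notice a difference that lives only in the internal flag.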
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907