interaction() -- problem with drop?

2 messages · Matthew Wiener, Peter Dalgaard

#
Hi, all. 

(R-1.3.0 on linux, alpha and intel; also tested on R-1.1.1 on irix.)

Below is a program that creates some random data (n, x, and y), converts x
and y to factors, and then creates a factor z out of their interaction
(corresponding, with the default nf = 2, to quadrants, which is how I came
upon this).  It then runs an analysis of variance.

f.test.problem <- 
function(n = 100, nf = 2){

  t1 <- data.frame(n = rnorm(n), x = rnorm(n), y = rnorm(n))

  t1$x <- cut(t1$x, nf, labels = 1:nf)
  t1$y <- cut(t1$y, nf, labels = 1:nf)
  t1$z <- interaction(t1$x, t1$y, drop = F)
  
  print(table(t1$x))
  print(table(t1$y))
  print(table(t1$z))
  
  summary(aov(n ~ z, data = t1))
}

Here's the problem:  if none of the nf * nf levels of z is empty -- that
is, if there is at least one trial taking on each value -- I get the error
"Error in model.matrix(t, data) : invalid variable type".

traceback() gives:

8: model.matrix.default(mt, mf, contrasts)
7: model.matrix(mt, mf, contrasts)
6: lm(formula = n ~ z, data = t1, singular.ok = TRUE)
5: eval(expr, envir, enclos)
4: eval(lmcall, parent.frame())
3: aov(n ~ z, data = t1)
2: summary(aov(n ~ z, data = t1))
1: f.test.problem(nf = 3)


However, if one of the levels of z is empty (which I'm checking using
table), then the analysis of variance runs!  (Easy to see if you use nf =
4 or even 3; it won't take long to get some examples that run and some
that don't.)

Creating n and z outside of a dataframe and then running aov(n~z) doesn't
help.

The problem does not arise if I do aov(n ~ x, data = t1) or 
aov(n ~ y, data = t1) -- the analysis of variance runs whether there are
empty categories or not.


Finally:  if I specify drop = T in the interaction call:
t1$z <- interaction(t1$x, t1$y, drop = T)
then the analysis works whether a level actually gets dropped or
not.  That is, even when no level is empty (and so I got an error with
drop = F), everything works.
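
A minimal sketch of the two calls side by side, assuming a current R
release (where the error described above may well not reproduce).  The seed
and the names z_keep and z_drop are illustrative and not part of the
original program.  It only shows what the drop argument changes in the
result: the set of levels, not the label attached to each observation.

```r
set.seed(1)                      # illustrative seed, not from the post
nf <- 3
x <- cut(rnorm(100), nf, labels = 1:nf)
y <- cut(rnorm(100), nf, labels = 1:nf)

z_keep <- interaction(x, y, drop = FALSE)  # keeps all nf * nf levels
z_drop <- interaction(x, y, drop = TRUE)   # keeps only levels that occur

n_keep  <- nlevels(z_keep)                 # nf * nf, empty cells included
n_drop  <- nlevels(z_drop)                 # number of non-empty cells
n_empty <- sum(table(z_keep) == 0)         # how many cells are empty

# The per-observation labels agree; only the level sets can differ.
same_labels <- all(as.character(z_keep) == as.character(z_drop))
print(c(keep = n_keep, drop = n_drop, empty = n_empty))
```

Checking n_empty before calling aov() makes it easy to tell which of the
two cases (some level empty, or none) a given random draw falls into.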

So the problem seems to arise from something going on in interaction.  I'm
not sure what, and I'm sure someone else will see the problem faster than
I will.

Apologies in advance if I'm being dense and this is really how things
ought to work.  If not, I'll submit a formal bug report.

Matt Wiener

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
#
Matthew Wiener <mcw at ln.nimh.nih.gov> writes:
...
You're not and please do...

This happens already with

a <- gl(2,5,10)
b <- gl(5,1,10)
zz <- interaction(a,b)
model.matrix.default(~zz)

whereas

zz <- a:b             
model.matrix.default(~zz)

works fine, and zz is *apparently* identical between the two, save for
the level names (which is another bug...).
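
A small sketch for comparing the two constructions, assuming a current R
release; the names zz_int and zz_colon are illustrative.  It checks that
each observation lands in the same cell under both, apart from the
separator in the label.  The ordering of the levels may differ between the
two, so the comparison goes through the labels rather than the integer
codes.

```r
a <- gl(2, 5, 10)
b <- gl(5, 1, 10)

zz_int   <- interaction(a, b)  # labels such as "1.2"
zz_colon <- a:b                # labels such as "1:2" in current R

# Compare cell membership observation by observation, ignoring the separator.
same_cells <- all(
  sub(":", ".", as.character(zz_colon), fixed = TRUE) == as.character(zz_int)
)

# Dimensions of the design matrix built from the interaction() factor.
mm_dim <- dim(model.matrix(~ zz_int))
print(same_cells)
print(mm_dim)
```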

And watch this:
structure(c(1, 3, 5, 7, 9, 2, 4, 6, 8, 10), .Label = c("1.1", 
"2.1", "1.2", "2.2", "1.3", "2.3", "1.4", "2.4", "1.5", "2.5"
), class = "factor")
Error in model.matrix(t, data) : invalid variable type
+ "2.1", "1.2", "2.2", "1.3", "2.3", "1.4", "2.4", "1.5", "2.5"
+ ), class = "factor")
[1] TRUE
(Intercept) zz22.1 zz21.2 zz22.2 zz21.3 zz22.3 zz21.4 zz22.4 zz21.5 zz22.5
1            1      0      0      0      0      0      0      0      0      0
2            1      0      1      0      0      0      0      0      0      0

<stuff like this usually happens if the internal bit that says that an
object has a class doesn't get turned on for some reason>