Thank you very much Terry. I'm still puzzled at why this worked a year ago. What changed? I'd very much like to reverse the change by setting an argument somewhere or manipulating the terms object. I echo your sentiments about the general approach. Frank
On 06/15/2015 09:05 AM, Therneau, Terry M., Ph.D. wrote:
Frank,
I don't think there is any way to "fix" your problem except the way
that I did it.
library(survival)
tdata <- data.frame(y=c(1,3,3,5, 5,7, 7,9, 9,13),
x1=factor(letters[c(1,1,1,1,1,2,2,2,2,2)]),
x2= c(1,2,1,2,1,2,1,2,1,2))
fit1 <- lm( y ~ x1 * strata(x2) - strata(x2), tdata)
coef(fit1)
(Intercept) x1b x1a:strata(x2)x2=2
x1b:strata(x2)x2=2
3.000000 5.000000 1.000000 1.666667
Your code is calling model.matrix with the same model frame and terms
structure as the lm call above (I checked). In your case you know
that the underlying model has 2 intercepts (strata), one for the group
with x2=1 and another for the group with x2=2, but how is the
model.matrix routine supposed to guess that? It can't, so
model.matrix returns the proper result for the lm call. As seen above
the result is not singular, while for the Cox model it is singular due
to the extra intercept.
This is simply an extension of leaving the "intercept" term in the
model and then removing that column from the returned X matrix, which
is necessary to have the correct coding for ordinary factor variables,
something we've both done since day 1. In order for model.matrix to
do the right thing with interactions, it has to know how many
intercepts there actually are.
I've come to the conclusion that the entire thrust of 'contrasts' in S
was wrong headed, i.e., the "remove redundant columns from the X
matrix ahead of time" logic. It is simply not possible for the
model.matrix routine to guess correctly for all y and x combinations,
something that been acknowledged in R by changing the default for
"singular.ok" to TRUE. Dealing with this after the fact via a good
contrast function (a la SAS -- heresy!) would have been a much better
design choice. But as long as I'm in R the coxph routine tries to be
a good citizen.
Terry T.
------------------------------------------------------------------------ Frank E Harrell Jr Professor and Chairman School of Medicine Department of *Biostatistics* *Vanderbilt University* [[alternative HTML version deleted]]