Skip to content
Prev 131921 / 398506 Next

lm/model.matrix confusion (? bug)

On Wed, 2007-12-12 at 06:46 -0800, Mark Difford wrote:
This is the problem - it doesn't! 

What is says is model.matrix() don't add an intercept to the returned
model matrix. R then tries to work out what to do with the returned
matrix that is now on the rhs of the formula. But note that at no point
have you gotten round the fact that it will add an intercept *after* the
formula is parsed and a -1 is not found. This is because model.matrix is
called again, internally in lm but not with your formula as an argument
but with a terms object and nowhere in this object is it specified not
to include an intercept.

To do this, lm would have to parse the formula for model.matrix(....)
and if it finds something work accordingly.
In this instance, you don't want to be working with lm. You can use
lm.fit which returns an object with $coefficients, so I would guess you
need something like this (not tested)

    if (ols) {
        obj <- x[[1]]
        mt <- terms(obj)
        mf <- model.frame(obj)
        y <- model.response(mf)
        X <- model.matrix(mt, mf, contrasts = obj$contrasts)
	fit <- lm.fit(X, y) ## store fit in case you need something else
	olscf <- coef(fit)  ## get coefficients using an extractor function
        ##if (attr(mt, "intercept") == 1)  ## This is my hack to overcome the double-intercept problem
        ##    { olscf <- summary(lm(y ~ X))$coefficients }
        ##else {
        ##    olscf <- summary(lm(y ~ X - 1))$coefficients
        ##}
        ##rownames(olscf) <- rownames(coef(obj))
}

Note also, that even if you are using your hack, you don't need to do

summary(lm(y ~ X))$coefficients

as

coef(lm(y ~ X))

will do.

lm is mainly there for top-level work. If you need to do things like you
have above, use lm.fit or do the qr decomposition yourself.

Or try a different way to build the formula you need and work with lm;
Bill Venables has a nice piece in R News Vol 2 Issue 2 in the
Programmer's Niche section on something that might be of use if you want
to build a correct formula for your needs from the colnames of X and y?

All the best,

G