Checking for orthogonal contrasts

Fri, Dec 3, 2010 8:17 AM

David,

Thanks for the comments.

I think, though, that I have found the answer to my own post.

?lm illustrates the use of crossprod() for probing the orthogonality of
a model matrix. If I understand correctly, the condition for orthogonal
contrasts is essentially that all between-term off-diagonal elements of
crossprod(m) are zero, where 'term' refers to the collection of columns
related to a single term in the model formula.

Example:

y <- rnorm(27)
g <- gl(3, 9)
h <- gl(3, 3, 27)

m1 <- model.matrix(y ~ g * h, contrasts.arg = list(g = "contr.sum",
                                                    h = "contr.sum"))
crossprod(m1)

# Compare with
m2 <- model.matrix(y ~ g * h, contrasts.arg = list(g = "contr.treatment",
                                                    h = "contr.treatment"))
crossprod(m2)
# Note the nonzero off-diagonal elements between, say, g and h, or
# between g, h and the various gi:hj elements


That presumably implies that one could test a linear model explicitly
for contrast orthogonality (and, implicitly, balanced design?) using
something like

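model.orthogonal.lm <- function(l, tol = 1e-8) {
	# l is a fitted linear model; tol guards against rounding error
	m <- model.matrix(l)
	a <- attr(m, "assign")	# term index of each column (0 = intercept)
	a.outer <- outer(a, a, FUN = "!=")	# TRUE where columns belong to different terms
	m.xprod <- crossprod(m)
	# orthogonal if every between-term cross-product is (numerically) zero
	all(abs(m.xprod[a.outer]) < tol)
}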

l1 <- lm(y ~ g * h, contrasts = list(g = "contr.sum", h = "contr.sum"))

l2 <- lm(y ~ g * h, contrasts = list(g = "contr.treatment",
                                     h = "contr.treatment"))

model.orthogonal.lm(l1) 
	#TRUE

model.orthogonal.lm(l2)
	#FALSE
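
On the "implicitly, balanced design?" point, here is a minimal sketch of
how the same check might behave once balance is lost. The helper name
between.term.orthogonal, the tolerance argument and the choice of
dropping the first two rows are my own assumptions for illustration;
the logic is the between-term crossprod() test described above.

```r
# Hypothetical helper: the same between-term check as in the post,
# with a small tolerance added (an assumption, not part of the original)
between.term.orthogonal <- function(fit, tol = 1e-8) {
  m <- model.matrix(fit)
  a <- attr(m, "assign")            # term index of each column
  xp <- crossprod(m)
  all(abs(xp[outer(a, a, "!=")]) < tol)
}

set.seed(1)
d <- data.frame(y = rnorm(27), g = gl(3, 9), h = gl(3, 3, 27))
sum.contr <- list(g = "contr.sum", h = "contr.sum")

# Balanced: 3 observations in every g x h cell
balanced <- lm(y ~ g * h, data = d, contrasts = sum.contr)

# Unbalanced: drop two observations from the first cell
unbalanced <- lm(y ~ g * h, data = d[-(1:2), ], contrasts = sum.contr)

res.balanced   <- between.term.orthogonal(balanced)
res.unbalanced <- between.term.orthogonal(unbalanced)
print(c(balanced = res.balanced, unbalanced = res.unbalanced))
```

With sum-to-zero contrasts the check should pass only for the balanced
fit, since dropping rows makes the g columns no longer sum to zero
against the intercept.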

Not sure how it would work on balanced incomplete block designs,
though. I'll have to try it.

Before I do, though, a) do I have the stats right? and b) this now
seems so obvious that someone must already have done it somewhere... ?




Peter Dalgaard

Sun, Dec 5, 2010 10:42 AM

On Dec 3, 2010, at 17:17, S Ellison wrote:

For balanced incomplete block designs, you'll find that the block and treatment terms are NOT orthogonal. That's where all the stuff about "efficiency factors" and "recovery of interblock information" comes from.
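
A small sketch of that non-orthogonality, assuming the smallest balanced incomplete block design I can think of (3 treatments in 3 blocks of size 2, each pair of treatments sharing a block once). The data frame bibd and the variable names are made up for illustration; only model.matrix() and crossprod() come from the discussion above.

```r
# Smallest BIBD: 3 treatments, 3 blocks of size 2,
# every pair of treatments occurs together in exactly one block
bibd <- data.frame(
  block = factor(c(1, 1, 2, 2, 3, 3)),
  trt   = factor(c(1, 2, 1, 3, 2, 3))
)

m <- model.matrix(~ block + trt, data = bibd,
                  contrasts.arg = list(block = "contr.sum",
                                       trt   = "contr.sum"))
a  <- attr(m, "assign")           # 0 = intercept, 1 = block, 2 = trt
xp <- crossprod(m)

# Cross-products of the block columns with the treatment columns
block.by.trt <- xp[a == 1, a == 2]
print(block.by.trt)

# The same between-term test as before
bibd.orthogonal <- all(xp[outer(a, a, "!=")] == 0)
print(bibd.orthogonal)
```

Both factors are individually balanced here, so their columns are orthogonal to the intercept, but the block-by-treatment cross-products are not zero, even with sum-to-zero contrasts.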

a) basically, yes, I think you do

b) yes, many, but there is an amazing amount of sloppily thought-out "folklore" going around, including the common misconception that sum-to-zero contrasts are somehow inherently better than the other types. What does seem to be the case is just that they have computational advantages in completely balanced designs, because they then imply orthogonality of COLUMNS of the design matrix. That in turn means that you can construct the sum of squares for each model term based on its own columns only. In unbalanced designs, they just tend to give incorrect results...
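
To illustrate the "own columns only" remark, a minimal sketch for the balanced g*h example from earlier in the thread: each term's sum of squares is computed by projecting y onto that term's columns alone and compared with the sequential table from anova(). The helper term.ss and the seed are assumptions of mine, not anything from the original posts.

```r
set.seed(1)
y <- rnorm(27)
g <- gl(3, 9)
h <- gl(3, 3, 27)                 # balanced: 3 observations per g x h cell

fit <- lm(y ~ g * h, contrasts = list(g = "contr.sum", h = "contr.sum"))
X <- model.matrix(fit)
a <- attr(X, "assign")            # 1 = g, 2 = h, 3 = g:h

# Hypothetical helper: sum of squares of the projection of y onto
# the columns of term k only, ignoring every other column of X
term.ss <- function(k) {
  Xk <- X[, a == k, drop = FALSE]
  sum(qr.fitted(qr(Xk), y)^2)
}

ss.own   <- sapply(1:3, term.ss)
ss.anova <- anova(fit)[["Sum Sq"]][1:3]   # rows g, h, g:h

print(cbind(ss.own, ss.anova))
print(all.equal(ss.own, ss.anova))
```

The agreement relies on the between-term columns being orthogonal; if the same calculation is run on an unbalanced subset of the data, the two columns should generally no longer match.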

Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com