contr.sum() and contrast names

4 messages · John Fox, Milan Bouchet-Valat

#
Hi!

I would like to suggest making it possible, one way or another, to
get meaningful contrast names when using contr.sum(). Currently,
contr.treatment() uses factor levels as contrast names, but contr.sum()
merely numbers its contrasts, which is impractical and can lead to
mistakes (see the output at the end of this message).

This issue was briefly discussed in 2005 by Brian Ripley in a reply to a
message on R-help [1]. He rightly stressed that treatment and sum
contrasts are not equivalent to levels of a factor, because one needs to
know the reference (here, level or sum) to interpret them. But when one
knows the type of contrasts that are being used, useful labels are still
of high value. I don't think anybody does serious work with sum
contrasts named myfactor1, myfactor2, myfactor3. (This reasoning does
not so much apply to contr.helmert() since ordered factors can quite
naturally be reported using numbers.)

Thus, would it be possible to add an option to contr.sum() so that it
returns a matrix whose column names are the levels of the input factor?
Such an option could also be added to the other contrast functions,
defaulting to FALSE. Another solution, which could be even more practical,
would be to
add a new function, called for example contr.sum2(), which would do the
same thing - after all, we already have contr.SAS() to implement a
slightly different behavior while being essentially the same as
contr.treatment().
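
To make the proposal concrete, here is a minimal sketch of what such a
function might look like. The name contr.sum2() is the hypothetical one
suggested above and does not exist in base R; the sketch simply wraps
stats::contr.sum() and labels each column with the level it codes,
assuming a vector of level names is passed rather than a bare count.

```r
# Hypothetical helper, NOT part of base R: sum-to-zero contrasts whose
# columns carry factor level names instead of being left unnamed.
contr.sum2 <- function(levels, ...) {
  # contr.sum() accepts either a level count or a vector of levels;
  # labels can only be attached in the second case.
  m <- stats::contr.sum(levels, ...)
  if (length(levels) > 1L && ncol(m) == length(levels) - 1L) {
    # The last level is coded -1 in every column and has no column
    # of its own, so it is dropped from the labels.
    colnames(m) <- as.character(levels)[-length(levels)]
  }
  m
}

m <- contr.sum2(c("A", "B", "C"))
print(m)
```

The values are unchanged from contr.sum(); only the column names differ,
so the reader still has to know that sum contrasts are in use to
interpret the labels.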

This contr.sum() issue really sounds like a detail, but it's a sad one
given that factors work really well in R in all other situations. The
only reason I can think of to explain this behavior is that people
rarely use contr.sum(). When fitting log-linear models with glm(), for
example, this contrast is the most natural one, but it currently gives
poorly named coefficients when everything could be so easy to interpret
if factor levels were used. This means people have to implement a
replacement for contr.sum() by hand, which is not the end of the world
but is definitely not optimal given how simple the solution is.
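
As a small illustration of the naming problem in a log-linear model, the
sketch below fits the same Poisson glm() twice: once with the built-in
contr.sum(), and once with a contrast matrix labelled by hand. The data
frame d and its counts are made up for the example.

```r
# Toy counts for a one-factor log-linear model (made-up data).
d <- data.frame(f = factor(c("A", "B", "C")), n = c(10, 20, 30))

# Built-in sum contrasts: the coefficients are merely numbered.
fit1 <- glm(n ~ f, family = poisson, data = d,
            contrasts = list(f = "contr.sum"))
print(names(coef(fit1)))   # "(Intercept)" "f1" "f2"

# Same contrasts, with column names added by hand from the levels
# (the last level has no column of its own).
m <- contr.sum(levels(d$f))
colnames(m) <- levels(d$f)[-nlevels(d$f)]
fit2 <- glm(n ~ f, family = poisson, data = d,
            contrasts = list(f = m))
print(names(coef(fit2)))   # "(Intercept)" "fA" "fB"
```

Both fits give the same estimates; only the coefficient names change.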

Thanks for your attention!


Illustration of the current difference between contr.treatment()
(first matrix, columns named after the levels) and contr.sum()
(second matrix, columns unnamed):

  B C
A 0 0
B 1 0
C 0 1

  [,1] [,2]
A    1    0
B    0    1
C   -1   -1

1: https://stat.ethz.ch/pipermail/r-help/2005-July/075430.html
#
Hi Milan,

Take a look at the contr.Sum() and contr.Treatment() functions in the car package.

(I recall, BTW, the sometimes acrimonious previous discussion of this issue.)

Best,
 John

------------------------------------------------
John Fox
Sen. William McMaster Prof. of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/
	
On Sat, 27 Oct 2012 13:39:06 +0200
Milan Bouchet-Valat <nalimilan at club.fr> wrote:
#
On Saturday, 27 October 2012 at 10:44 -0400, John Fox wrote:
Yeah, this is the kind of function I had in mind. It's just that I think
we should have an equivalent in the base packages.

I could not find this discussion. Do you have a pointer?

Anyway, going deeper and deeper into the archives, I've found that you
made the same proposal almost exactly 10 years ago[1]! Oddly enough, it
only prompted one reply at the time - has the rest of the discussion
been removed from the archives because it was too rude? ;-) Should we
reopen the debate now that some time has passed without any action being
taken on that front?


Regards

1: http://tolstoy.newcastle.edu.au/R/devel/02b/0878.html
#
Hi Milan,

On Tue, 30 Oct 2012 10:25:56 +0100
Milan Bouchet-Valat <nalimilan at club.fr> wrote:
Given the history of the issue, I think that it's unlikely that this will happen. Your message hasn't generated a cascade of discussion. And, of course, even if you had provoked a discussion, R Core would have to agree that this kind of change is desirable.

Sorry. I don't recall when the discussion took place and failed to find it just now when I tried to locate it in the R email list archives.

I think that you know that the answer to that is "no." And to be clear, the nasty comments that I recall weren't directed at me, and the rudeness was bidirectional (at least insofar as I remember the exchange accurately, which I may not).

You *have* reopened the issue but haven't elicited a response.

A nice characteristic of R is that if you don't like how something is implemented, you can offer your own version, which is what I did with contr.Treatment(), contr.Sum(), and contr.Helmert(), and similarly with Anova(), all in the car package. Sometimes these kinds of changes get into the standard R distribution, but more commonly, people who agree with you are free to use the alternatives that you provide.

Best,
 John