problems with glm (PR#771)

1 message · Brian Ripley

[We do have a TooMuchAtOnce category, but it is easier to cope with
short separate reports with informative subject lines for, e.g. the BUGS
list.]
On Mon, 18 Dec 2000 james.lindsey@luc.ac.be wrote:

            
[...]
Why not just wt <- Reg66 != Reg71  ?
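To illustrate the suggestion: a comparison of two factors gives a logical vector directly, with no need for an explicit loop or ifelse(). A minimal sketch, assuming Reg66 and Reg71 are factors sharing the four region levels that appear later in this message (the grid below is made up, not the poster's data):

```r
# Hypothetical 4 x 4 layout of regions; not the data from the original report
regions <- c("CC", "GL", "LY", "WM")
tab <- expand.grid(Reg66 = regions, Reg71 = regions)

# TRUE off the diagonal, FALSE on it (both factors have the same level set)
wt <- tab$Reg66 != tab$Reg71

sum(wt)    # number of off-diagonal cells
sum(!wt)   # number of diagonal cells
```

A logical vector is coerced to 0/1 when used numerically, so as.numeric(wt) gives the corresponding zero/one weights.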

[...]
I think this is the same as

[Rd] glm gives incorrect results for zero-weight cases (PR#780)

in which case it is already fixed in R-patched.  That does look to have
solved this one too.  Note that glm has *not* `estimated the diagonals'
at all.  These are predictions (the cases are not in the model), and
predict.glm did get them right.

predict(z, data)
      1       2       3       4       5       6       7       8 
0.61337 3.30334 3.45098 4.51559 3.33786 6.02783 6.17548 7.24009 
      9      10      11      12      13      14      15      16 
3.50109 6.19106 6.33871 7.40332 4.57309 7.26306 7.41071 8.47532 
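The distinction between fitted cases and predictions can be sketched as follows. This is only an illustration on simulated counts, assuming a Poisson main-effects model with zero weights on the diagonal; the data, the model formula and the names d, z2 and p are assumptions, not taken from the original report:

```r
set.seed(1)
regions <- c("CC", "GL", "LY", "WM")
d <- expand.grid(Reg66 = regions, Reg71 = regions)
d$count <- rpois(nrow(d), lambda = 50)           # simulated counts
d$wt <- as.numeric(d$Reg66 != d$Reg71)           # zero weight on the diagonal

# Zero-weight cases take no part in the fit ...
z  <- glm(count ~ Reg66 + Reg71, family = poisson, data = d, weights = wt)
# ... so dropping them altogether should give the same coefficients
z2 <- glm(count ~ Reg66 + Reg71, family = poisson, data = d, subset = wt > 0)
all.equal(coef(z), coef(z2))

# Predictions on the link (log) scale for all 16 cells, diagonal included
p <- predict(z, d)
```

The values in p for the diagonal cells are predictions from the model fitted to the off-diagonal cells, not estimates of those cells.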

[...]
There is a help page for factor(), not for factors, and I don't believe
that does state so. It says that *factor* sorts them by default, which is
true.  Where is the `man page for factors'?

You can do that, of course, by specifying the order of the levels.  
However, we teach students that the details of the coding of linear models
are details, and that they should regard coefficients as secondary and
predictions as primary.  aov() has the right idea in suppressing the
individual coefficients.  If you want to interpret coefficients you need to
understand codings. Period.
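For instance, the level order can be set explicitly when the factor is created, or the baseline changed afterwards. A small sketch using the region labels from this thread (the vector x is made up):

```r
x <- c("WM", "CC", "LY", "GL", "CC")

levels(factor(x))    # default: sorted, "CC" "GL" "LY" "WM"

# Levels in a chosen order
f <- factor(x, levels = c("WM", "LY", "GL", "CC"))
levels(f)            # "WM" "LY" "GL" "CC"

# Or keep the sorted order but move one level to the front as baseline
g <- relevel(factor(x), ref = "WM")
levels(g)            # "WM" "CC" "GL" "LY"
```

With the default treatment coding, the first level is the baseline, so either approach changes which coefficients are reported but not the predictions.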

I don't see the value of changing the default for factor: it would not be
backwards-compatible, and some people have thought about what the default
would be and accepted it.

Are we supposed to guess that `mean value contrasts' means contr.sum?  In
which case the numbers refer to the number of the *contrast*: they do not
refer to levels, nor are single levels relevant. As in
   [,1] [,2] [,3]
CC    1    0    0
GL    0    1    0
LY    0    0    1
WM   -1   -1   -1

   GL LY WM
CC  0  0  0
GL  1  0  0
LY  0  1  0
WM  0  0  1

Now, what should the column labels be in the first case?  And in what sense
are these `mean value contrasts'?

Do you really want labels like "CCvWM"?  They could get very cumbersome.

(One could argue that contr.treatment is already wrong, but as they
are not even contrasts ....)
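Both coding matrices can be generated for the four levels used here, and the column sums show why contr.treatment does not give contrasts in the strict sense. A short sketch (the names lev, cs and ct are just for illustration):

```r
lev <- c("CC", "GL", "LY", "WM")

cs <- contr.sum(lev)         # columns are unlabelled
ct <- contr.treatment(lev)   # columns labelled by the non-baseline levels

colnames(cs)   # NULL
colnames(ct)   # "GL" "LY" "WM"

colSums(cs)    # each column sums to zero: genuine contrasts
colSums(ct)    # each column sums to one: not contrasts
```

Each column of the contr.sum matrix involves two levels (one level and the last), which is why a single level name would not be an accurate column label.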

[...]
Good idea, but could you supply examples so we can put them in the
regression tests?