glm gives incorrect results for zero-weight cases (PR#780) - R-devel

Wed, Dec 20, 2000 2:59 AM #

Giving observations zero weight in glm returns incorrect fitted values and
linear predictors for those observations: see the ninth value in each of
the outputs below.

data=d.AD, weights=c(rep(1,8), 0))

       1        2        3        4        5        6        7        8 
2.989646 2.535391 2.862201 2.989646 2.535391 2.862201 3.145992 2.691737 
       9 
2.493205

       1        2        3        4        5        6        7        8 
2.989646 2.535391 2.862201 2.989646 2.535391 2.862201 3.145992 2.691737 
       9 
3.018547

       1        2        3        4        5        6        7        8 
19.87864 12.62136 17.50000 19.87864 12.62136 17.50000 23.24272 14.75728 
       9 
12.10000

[1] 19.87864 12.62136 17.50000 19.87864 12.62136 17.50000 23.24272 14.75728
[9] 20.46154

The reason is obvious: glm.fit only ever updates eta[good], and 
zero-weight values are not `good'.  So eta[weights == 0] is stuck at the
initial values.
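
The mechanism can be illustrated with a toy IRLS loop. This is only a
sketch, not the glm.fit source: it assumes the Dobson data from example(glm),
a Poisson log link, and starting values mu = y + 0.1 (which would account
for the 12.10000 above). Only eta[good] is updated, so the ninth element
keeps its starting value, which should correspond to the 2.493205 shown
above, while the value implied by the final coefficients is different.

```r
# Dobson data as used in example(glm)
counts    <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome   <- gl(3, 1, 9)
treatment <- gl(3, 3)
X    <- model.matrix(~ outcome + treatment)
w    <- c(rep(1, 8), 0)
good <- w > 0

mu  <- counts + 0.1   # assumed starting values
eta <- log(mu)

for (iter in 1:25) {
  # working response and working weights for the Poisson log link
  z    <- eta[good] + (counts[good] - mu[good]) / mu[good]
  wz   <- w[good] * mu[good]
  beta <- lm.wfit(X[good, , drop = FALSE], z, wz)$coefficients
  # only the `good' elements are ever updated
  eta[good] <- drop(X[good, , drop = FALSE] %*% beta)
  mu[good]  <- exp(eta[good])
}

stuck   <- eta[9]             # still log(12.1), the starting value
updated <- sum(X[9, ] * beta) # linear predictor implied by the final fit
print(c(stuck = stuck, updated = updated))
```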

There are two possible fixes:

1) Update eta after the final fit, and then mu.  Out of range values
could then be NA (although it looks like predict.glm does not check).

2) Update all eta and hence mu values during the iterations.  This will
apply the constraints on eta/mu at zero-weight points too, and so might
be different.
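
Fix 1) could look roughly like the following. This is a hedged sketch done
outside glm.fit, not a patch: the names fit8, eta.all and mu.all are
made up for illustration, and the data are assumed to be the Dobson data
from example(glm). The model is fitted to the eight positive-weight cases,
then eta and mu are recomputed for every row from the final coefficients,
with out-of-range mu set to NA.

```r
counts    <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome   <- gl(3, 1, 9)
treatment <- gl(3, 3)
d.AD <- data.frame(treatment, outcome, counts)

# Fit using only the positive-weight cases
fit8 <- glm(counts ~ outcome + treatment, family = poisson(),
            data = d.AD, subset = 1:8)

# Recompute eta, and then mu, for all nine rows after the final fit
X       <- model.matrix(~ outcome + treatment, data = d.AD)
eta.all <- drop(X %*% coef(fit8))
mu.all  <- poisson()$linkinv(eta.all)

# Mark out-of-range means as NA; with the log link this never triggers,
# it is shown for the general case
mu.all[!(is.finite(mu.all) & mu.all > 0)] <- NA

print(rbind(eta = eta.all, mu = mu.all))
```

Under fix 2) the same recomputation would happen inside each iteration, so
the validity checks on eta and mu would also see the zero-weight rows.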

I am inclined to think that 2) is right, and that adding points with zero 
weight to the fit is not the same as omitting them.

Opinions?


--please do not edit the information below--

Version:
 platform = sparc-sun-solaris2.6
 arch = sparc
 os = solaris2.6
 system = sparc, solaris2.6
 status = 
 major = 1
 minor = 2.0
 year = 2000
 month = 12
 day = 15
 language = R

Search Path:
 .GlobalEnv, package:ctest, Autoloads, package:base

Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

Brian Ripley

Wed, Dec 20, 2000 4:37 AM #

On 20 Dec 2000, Peter Dalgaard BSA wrote:

Constraints can be added by the user, of course, but in the standard cases
(canonical links) they never bite.  Poisson with a linear (identity) link
is one case where they might.  This checking is something that R has but S
and GLIM (AFAIR) do not.



Peter Dalgaard

Wed, Dec 20, 2000 4:37 AM #

ripley@stats.ox.ac.uk writes:

Just for clarification: This applies only to cases where the
parametrization is non-canonical, e.g. additive models with Poisson
response, right? And essentially the issue is that if you have a model
like lambda = a + b x and you put in a zero-weight observation with x
= 0, then that should effectively constrain a to be positive. That
does make quite good sense, yes.
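
A small sketch of that point, with made-up numbers (x, w, a and b are
hypothetical and not from the thread): with an identity-link Poisson
family, a negative intercept gives a negative mean at x = 0. The family's
validmu check passes if only the positive-weight points are examined, but
fails once the zero-weight point is included.

```r
fam <- poisson(link = "identity")

x <- c(0, 1, 2, 3, 4)
w <- c(0, 1, 1, 1, 1)   # the observation at x = 0 has zero weight
a <- -0.5               # hypothetical intercept, negative on purpose
b <- 2                  # hypothetical slope

mu <- fam$linkinv(a + b * x)   # lambda = a + b x

ok.good <- fam$validmu(mu[w > 0])  # constraint checked at `good' points only
ok.all  <- fam$validmu(mu)         # constraint checked at every point
print(c(ok.good = ok.good, ok.all = ok.all))
```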

   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)             FAX: (+45) 35327907

Thomas Lumley

Wed, Dec 20, 2000 8:18 AM #

On 20 Dec 2000, Peter Dalgaard BSA wrote:

Not just non-canonical. There are boundary problems with gamma/reciprocal
glms.  I would also go for the second solution.
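
The same kind of check can be sketched for the gamma/reciprocal case, again
with hypothetical numbers: with the inverse link, mu = 1/eta, so a
zero-weight point where eta is negative has a negative mean even though
the link is canonical.

```r
fam <- Gamma(link = "inverse")

x <- c(0, 1, 2, 6)
w <- c(1, 1, 1, 0)     # the observation at x = 6 has zero weight
a <- 2                 # hypothetical coefficients
b <- -0.5

eta <- a + b * x       # 2, 1.5, 1, -1
mu  <- fam$linkinv(eta)

gamma.ok.good <- fam$validmu(mu[w > 0])  # positive-weight points only
gamma.ok.all  <- fam$validmu(mu)         # including the zero-weight point
print(c(gamma.ok.good = gamma.ok.good, gamma.ok.all = gamma.ok.all))
```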


	-thomas

