
(PR#8877) predict.lm does not have a weights argument for

4 messages · Brian Ripley, Peter Dalgaard

#
I am more than 'a little disappointed' that you expect a detailed 
explanation of the problems with your 'bug' report, especially as you did 
not provide any explanation yourself as to your reasoning (nor any 
credentials or references).

Note that

1) Your report did not make clear that this was only relevant to 
prediction intervals, which are not commonly used.

2) Only in some rather special circumstances do weights enter into 
prediction intervals, and definitely not necessarily the weights used for 
fitting.  Indeed, it seems that to label the variances that do enter as 
inverse weights would be rather misleading.

3) In a later message you referenced Brown's book, which is dealing with a 
different model.

The model fitted by lm is

 	y = x\beta + \epsilon, \epsilon \sim N(0, \sigma^2)

(Row vector x, column vector \beta.)

If the observations are from the model, OLS is appropriate, but weighting 
is used in several scenarios, including:

(a) case weights:  w_i = 3 means `I have three observations like (y, x)'

(b) inverse-variance weights, most often an indication that w_i = 3 
means that y_i is actually the average of 3 observations at x_i.

(c) multiple imputation, where a case with missing values in x is split 
into say 5 parts, with case weights each less than one and summing to one.

(d) heteroscedasticity, where the model is rather

         y = x\beta + \epsilon, \epsilon \sim N(0, \sigma^2(x))

And there may well be other scenarios, but those are the most common (in 
decreasing order) in my experience.
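The distinction between (a) and (b) can be made concrete. A minimal 
sketch in R (toy data of my own, not from the report): a case weight of 
3 in lm reproduces the point estimates obtained by entering the 
observation three times, although the residual degrees of freedom, and 
hence the standard errors, differ between the two fits.

```r
## Sketch: case weight w_i = 3 vs. literally replicating the row.
set.seed(1)
x <- 1:10
y <- 2 + 3 * x + rnorm(10)

w <- rep(1, 10)
w[4] <- 3                       # scenario (a): "I have three like this"

fit_w   <- lm(y ~ x, weights = w)
fit_rep <- lm(y ~ x, data = data.frame(x = c(x, x[4], x[4]),
                                       y = c(y, y[4], y[4])))

all.equal(coef(fit_w), coef(fit_rep))   # same point estimates
## but df.residual(fit_w) is 8 while df.residual(fit_rep) is 10
```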


Now, consider prediction intervals.  It would be perverse to consider 
these to be for anything other than a single future observation at x.  In scenarios 
(a) to (c), R's current behaviour is what is commonly accepted to be 
correct (and you provide no arguments otherwise). If a future observation 
has missing values, predict.lm would only be a starting point for multiple 
imputation.

Even if 'newdata' is not supplied, prediction intervals must apply to new 
observations, not the existing ones (or the formula used is wrong: perhaps 
to avoid your confusion they should not be allowed in that case).

Only in case (d), which is a different model, is it appropriate to supply 
error variances (not weights) for prediction intervals.  This is why I 
marked it for the wishlist.  Equally, one might want to specify \sigma^2 
for all future observations as being different from that used in model 
fitting, as the training data may include other components of variance in 
their error variances.
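As it happens, later versions of R expose exactly this: predict.lm 
gained a pred.var argument (and a prediction weights argument) for 
supplying the variance assumed for future observations (see ?predict.lm 
in a reasonably recent R). A sketch with the built-in cars data:

```r
## Sketch: prediction intervals with a specified future-observation
## variance, via pred.var (see ?predict.lm in recent R versions).
fit <- lm(dist ~ speed, data = cars)
new <- data.frame(speed = 15)

## default: the future observation has the fitted residual variance
p1 <- predict(fit, new, interval = "prediction")

## scenario (d): future observation with twice that error variance
p2 <- predict(fit, new, interval = "prediction",
              pred.var = 2 * summary(fit)$sigma^2)

p2[, "upr"] - p2[, "lwr"] > p1[, "upr"] - p1[, "lwr"]   # wider interval
```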
On Sat, 20 May 2006, jranke at uni-bremen.de wrote:

Where are the references and arguments?
Not found.
That example is not a valid use of WLS, as you have the weights depending 
on the data you are fitting.

#
ripley at stats.ox.ac.uk writes:
I'd have (d) higher on the list, but never mind. There's also

(e) Inverse probability weights: Knowing that part of the population
is undersampled and wanting results that are compatible with what you
would have gotten in a balanced sample. Prototypically: You sample X,
taking only a third of those with X > c; then find the population mean 
of X (or a univariate regression on some other variable which is only 
recorded in the subsample).
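That prototype can be simulated in a few lines (a toy illustration of my 
own, not from the thread): sample X, keep only a third of the cases 
above the cut-off, and compare the raw mean with the 
inverse-probability-weighted one.

```r
## Sketch: inverse probability weights for an undersampled stratum.
set.seed(42)
X <- rnorm(3000)                                  # population mean is 0
keep <- ifelse(X > 0, runif(3000) < 1/3, TRUE)    # keep 1/3 above c = 0
xs <- X[keep]

w <- ifelse(xs > 0, 3, 1)         # w = 1/p for each sampled stratum
mean(xs)                          # biased downwards
weighted.mean(xs, w)              # approximately unbiased
```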

(R-bugs stripped from recipients since this doesn't really have
anything to do with the purported bug.)
#
On Wed, 24 May 2006, Peter Dalgaard wrote:

I find that if you detect heteroscedasticity, then one of the following 
applies:

- a transformation of y would be beneficial

- a non-normal model, e.g. a Poisson regression, is more appropriate

- the error variance really depends on y or E(y), not x, as in most
   measurement-error scenarios (and the example in ?nls and the one in
   the addendum to the bug report).

- in analytical chemistry, as in the example in the addendum to the bug
   report, there are errors in both y and x to consider, and a
   functional-relationship model is better.

So I very rarely actually get as far as predicting from a heteroscedastic 
regression model.
I would call this an example of case weights: you are just weighting 
cases and saying `I have 1/p like this'.  (In rlm there is a difference 
between (a) and (b), and you would want wt.method = "case" for (e).)
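The rlm distinction mentioned here is explicit in MASS, which, unlike 
lm, asks which interpretation of the weights you intend. A sketch with 
an arbitrary weight vector of my own (see ?rlm for the two meanings):

```r
## Sketch: MASS::rlm makes the interpretation of weights explicit.
library(MASS)

w <- rep(1, nrow(stackloss))
w[1] <- 3

fit_case <- rlm(stack.loss ~ ., stackloss, weights = w,
                wt.method = "case")      # "I have 3 like this"
fit_iv   <- rlm(stack.loss ~ ., stackloss, weights = w,
                wt.method = "inv.var")   # y_1 has variance sigma^2/3
```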
2 days later
#
Prof Brian Ripley <ripley at stats.ox.ac.uk> writes:
No, it's not quite the same. One is "I have 3 of these"; the other is
"I have looked at one case, but it comes from a population that I know
is undersampled by a factor of 3". Standard errors of the estimates will
be considerably different.