
strange behavior of loess() & predict()

1 message · Hong Ooi



The problem appears to be that your original data contain several tied x-values (the second row below is the count for each value):
x
1.8   2 2.2 2.4 2.6 2.8   3 3.2 3.4 3.6   4 
  1   2   2   2   5   7   2   3   1   2   1

IIRC the maths and programming behind loess assume unique values for the predictor.

One way to get around this is to jitter your data before fitting, which gives predictions like these:
 [1] 3.156192 3.141705 3.126918 3.112996 3.101108 3.087696 3.063471 3.038609 3.024639 3.032585 3.059480 3.091774
[13] 3.115763 3.117743 3.092979 3.040798 2.988283 2.957976 2.950648 3.008358 3.070065 3.127379 3.193501 3.149428
[25] 3.082843 3.010998 2.939407 2.888213 2.841487 2.812815 2.801583 2.807181 2.837887 2.899130 2.978165 3.062088
[37] 3.137995 3.204628 3.271813 3.339450 3.407396 3.475510 3.543843 3.612450 3.681267 3.750227 3.819267 3.888321
[49] 3.957324 4.026212
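
The jittering step can be sketched roughly as follows. This is only an illustration: x is rebuilt from the counts in the table above, but the original y values are not shown in this message, so y here is simulated and purely hypothetical, as are the names xj, fit_j, newx and pred_j. The numbers it produces will therefore not match the output above.

```r
# x rebuilt from the table of tied values; y is SIMULATED (assumption),
# because the original response values are not given in this thread.
x <- rep(c(1.8, 2, 2.2, 2.4, 2.6, 2.8, 3, 3.2, 3.4, 3.6, 4),
         times = c(1, 2, 2, 2, 5, 7, 2, 3, 1, 2, 1))
set.seed(1)
y <- 3 + 0.3 * (x - 2.8)^2 + rnorm(length(x), sd = 0.2)

# jitter() adds a small amount of uniform noise, which breaks the ties
xj <- jitter(x)

# fit loess on the jittered predictor and predict on a grid of 50 points
# kept inside the observed range (loess does not extrapolate by default)
fit_j  <- loess(y ~ xj)
newx   <- seq(2, 3.8, length.out = 50)
pred_j <- predict(fit_j, newdata = data.frame(xj = newx))
pred_j
```

Because jitter() is random, the fit changes slightly from run to run unless the seed is fixed.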

Another way is to summarise your data using table() and aggregate(), and fit a weighted model where the weights are the counts for each unique x-value:
 [1] 3.186959 3.163133 3.136244 3.110822 3.091396 3.076705 3.047705 3.018362 3.007143 3.032246 3.069599 3.092369
[13] 3.098049 3.084134 3.053633 3.027429 3.012429 3.013908 3.036517 3.060372 3.076116 3.086870 3.095758 3.097287
[25] 3.073824 3.031238 2.976659 2.917402 2.863489 2.821469 2.796398 2.793336 2.823850 2.892363 2.980322 3.068725
[37] 3.140843 3.208920 3.279124 3.351965 3.427952 3.504330 3.577149 3.647119 3.714984 3.781486 3.847369 3.913375
[49] 3.980249 4.048733
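
A rough sketch of the weighted approach, under the same assumptions as before: x is rebuilt from the table of counts, y is simulated because the original values are not given here, and the names d, agg, fit_w, newx and pred_w are made up for the example. It collapses the data to one row per unique x (mean of y), attaches the counts from table(), and passes them to loess() as weights.

```r
# x rebuilt from the table of tied values; y is SIMULATED (assumption),
# because the original response values are not given in this thread.
x <- rep(c(1.8, 2, 2.2, 2.4, 2.6, 2.8, 3, 3.2, 3.4, 3.6, 4),
         times = c(1, 2, 2, 2, 5, 7, 2, 3, 1, 2, 1))
set.seed(1)
y <- 3 + 0.3 * (x - 2.8)^2 + rnorm(length(x), sd = 0.2)
d <- data.frame(x = x, y = y)

# one row per unique x, with the mean of y at that x
agg <- aggregate(y ~ x, data = d, FUN = mean)

# number of observations at each unique x, matched to agg by value
agg$n <- as.vector(table(d$x)[as.character(agg$x)])

# weighted loess on the de-duplicated data, weights = counts
fit_w  <- loess(y ~ x, data = agg, weights = n)
newx   <- seq(2, 3.8, length.out = 50)
pred_w <- predict(fit_w, newdata = data.frame(x = newx))
pred_w
```

Averaging y within each x is one reasonable way to summarise the ties; it is an assumption here, not necessarily what was done to produce the output above.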

There's probably a way to make the aggregate and table calls neater.


-- 
Hong Ooi
Senior Research Analyst, IAG Limited
388 George St, Sydney NSW 2000
+61 (2) 9292 1566

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Leo Gürtler
Sent: Wednesday, 7 December 2005 8:10 AM
To: r-help at stat.math.ethz.ch
Cc: gavin.simpson at ucl.ac.uk
Subject: Re: [R] strange behavior of loess() & predict()
Gavin Simpson wrote:
Dear list,

I am very sorry for being inaccurate in my question, but re-reading the 
predict.loess help page does not provide a solution. As long as predict 
is used on a new dataset based on this dataset, the strange values 
remain and can be reproduced.
Adding a new element to both vectors (at the beginning, e.g. "1" for 
each vector) results in plausible values, but not in every case.
Even switching x and y is sufficient (i.e. x as predictor and y as 
dependent variable). So my question is:

Is it normal (or under which conditions does it happen) that 
predict.loess predicts values that are almost 20000/max(y) ~ 5000 times 
higher than expected?

best,

leo gürtler