anomalies with the loess() function
6 messages · Federico Bonofiglio, Jonathan P Daily, Peter Ehlers +1 more
On 2010-10-26 11:48, Jonathan P Daily wrote:
?loess
use this instead:
fit <- loess(b~a)
lines(a, predict(fit))
I don't think that will work when there are incomplete cases, in which case 'a' and predict(fit) may not correspond. I think that it's always best to define a set of predictor values and use predict() to get the corresponding fits and plot according to taste:
fm <- loess(b ~ a)
aa <- seq(0, 1000, length=101)
bb <- predict(fm, aa)
lines(aa, bb, col="blue", lwd=2)
@Federico: see further comments below.
--------------------------------------
Jonathan P. Daily
Technician - USGS Leetown Science Center
11649 Leetown Road
Kearneysville WV, 25430
(304) 724-4480
"Is the room still a room when its empty? Does the room,
the thing itself have purpose? Or do we, what's the word... imbue it."
- Jubal Early, Firefly
From: Federico Bonofiglio <bonoricus at gmail.com>
To: r-help at r-project.org
Date: 10/26/2010 02:38 PM
Subject: [R] anomalies with the loess() function
Sent by: r-help-bounces at r-project.org
Hello Masters,
I ran the loess() function to obtain locally weighted regressions, given that lowess() can't handle NAs, but this did not improve my situation significantly. In fact, loess()'s behaviour leaves me quite puzzled.
I attach my simple experiment below.
#------SCRIPT----------------------------------------------
#I explore the functionality of lowess() & loess()
#because I have encountered problems in running locally weighted regressions
#with lowess() (in the presence of NAs) & with loess() (always!!!)
#I generate 2 fictitious vectors
a<-sample(c(sample(1:1000,100),rep(NA,50)))
b<-sample(c(sample(1:1000,100),rep(NA,50)))
#lm() has no problems..can handle the missing values
plot(a,b)
abline(lm(b~a),col="red",lwd=2)
#loess returns a plain mess, as if it were getting confused by the ordering or something.
Yes, the 'mess' is due to the unordered nature of your data. lines() will plot in the order in which the points occur in your data. You could order before calling loess: ord <- order(a) a1 <- a[ord] b1 <- b[ord] fm <- loess(b1 ~ a1)
#Of course lowess() is useless in the presence of NAs, so I don't even try it.
lines(loess(b~a))
#I get rid of the NAs and compare lowess() & loess() performance, expecting to
#obtain the same result, as both functions implement locally weighted regressions
a<-na.omit(a)
b<-na.omit(b)
This is a bad idea. The values of 'a' and 'b' will no longer be paired. Another reason to prefer dataframes.
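A minimal sketch of the data-frame approach, assuming the vectors are generated as in the script above. The names `dat`, `cc` and `kept` are mine, and set.seed() is added only so the sketch is reproducible. Calling na.omit() on the data frame drops whole rows, so every remaining 'a' stays with its own 'b':

```r
set.seed(1)  # assumption: seed added only for reproducibility
a <- sample(c(sample(1:1000, 100), rep(NA, 50)))
b <- sample(c(sample(1:1000, 100), rep(NA, 50)))

# Keep the pairs together: na.omit() on a data frame removes every row
# in which either 'a' or 'b' is missing.
dat <- data.frame(a = a, b = b)
cc  <- na.omit(dat)

# The row names of 'cc' record which original observations survived.
kept <- as.integer(rownames(cc))

nrow(cc)             # number of complete (a, b) pairs
length(na.omit(a))   # what separate na.omit() calls would leave in 'a'
```

By contrast, na.omit(a) and na.omit(b) applied separately each keep all of their own non-missing values, so position i in one vector no longer refers to the same observation as position i in the other.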
#check out the evidence.....something's wrong with loess()???
There's nothing wrong with loess; it just needs more than a single intercept and slope to plot its predictions. -Peter Ehlers
par(mfrow=c(1,2))
plot(a,b)
lines(lowess(a,b),col="red") #if NAs are excluded lowess() runs regularly
plot(a,b)
lines(loess(b~a),col="red") #.....but loess() keeps messing all over...!!???
On Tue, 2010-10-26 at 13:29 -0700, Peter Ehlers wrote:
On 2010-10-26 11:48, Jonathan P Daily wrote:
?loess use this instead:
If you change this:
fit<- loess(b~a)
to be this:
fit <- loess(b~a, na.action = na.exclude)
then this:
lines(a, predict(fit))
will work. G
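To see what na.action = na.exclude changes, here is a small sketch. The object names `fit_omit` and `fit_excl` are mine, the seed is added for reproducibility, and the default na.action option (na.omit) is assumed. With the default handling the fitted values cover only the complete cases, whereas with na.exclude they are padded with NA back to the length of 'a':

```r
set.seed(1)  # assumption: seed added only for reproducibility
a <- sample(c(sample(1:1000, 100), rep(NA, 50)))
b <- sample(c(sample(1:1000, 100), rep(NA, 50)))

# Default handling: incomplete cases are dropped and stay dropped.
fit_omit <- loess(b ~ a)

# na.exclude: incomplete cases are dropped for fitting, but fitted()
# and predict() pad the result with NA in their positions.
fit_excl <- loess(b ~ a, na.action = na.exclude)

length(fitted(fit_omit))   # number of complete cases
length(fitted(fit_excl))   # same length as 'a'
length(a)
```

The padding only fixes the alignment between 'a' and the fitted values; the points still need to be ordered along 'a' before they are joined with lines().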
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
On Tue, 2010-10-26 at 20:04 +0200, Federico Bonofiglio wrote:
Hello Masters,
<snip />
#I generate 2 fictitious vectors
a<-sample(c(sample(1:1000,100),rep(NA,50)))
b<-sample(c(sample(1:1000,100),rep(NA,50)))
<snip />
a<-na.omit(a)
b<-na.omit(b)
#check out the evidence.....something's wrong with loess()???
No, something is wrong with your assumptions. lowess() and loess() do not return anything like the same object. lowess() returns ordered $x and $y components. The key here is "ordered". loess() returns a complex list:
str(fit)
List of 18
 $ n        : int 66
 $ fitted   : num [1:66] 390 447 617 494 283 ...
 $ residuals: Named num [1:66] 270 188 161 -288 104 ...
  ..- attr(*, "names")= chr [1:66] "1" "2" "4" "5" ...
 $ enp      : num 7.87
....
etc. The fitted values etc. are in the order of the data you supplied in a and b. Hence when you plot these values you will get a cat's cradle of lines because you are asking R to join points in a sequence that is *not* ordered across the plot.
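The effect of ordering can be seen on a toy example (the values are invented purely for illustration). lines() joins points in the sequence they are given, so sorting by x first is what turns a zigzag into a curve:

```r
# Toy data, deliberately not sorted by x (hypothetical values).
x <- c(30, 10, 40, 20)
y <- x^2 / 100

# order() returns the indices that put x in increasing order.
ord <- order(x)
x[ord]
y[ord]

# Draw only in an interactive session so the sketch also runs in batch mode.
if (interactive()) {
  plot(x, y)
  lines(x, y, col = "grey")           # joined in data order: zigzag
  lines(x[ord], y[ord], col = "red")  # joined left to right: smooth path
}
```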
par(mfrow=c(1,2))
plot(a,b)
lines(lowess(a,b),col="red") #if NAs are excluded lowess() runs regularly
plot(a,b)
lines(loess(b~a),col="red") #.....but loess() keeps messing all over...!!???
Not sure what you hoped that to do. You are assuming far too much here.
You need to extract the information you want from a loess() object.
lowess() works here because it returns components $x and $y, and the
underlying plot infrastructure in R looks for these in the object it is
supplied. loess() doesn't supply the smooth as $x and $y, so what you are
seeing is just gibberish because you supplied garbage.
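A quick way to check what each function hands to lines() is to inspect the returned objects yourself. The sketch below only looks at component names and ordering and makes no claim beyond that. The seed and the names `cc`, `lo` and `mod` are my additions:

```r
set.seed(1)  # assumption: seed added only for reproducibility
a <- sample(c(sample(1:1000, 100), rep(NA, 50)))
b <- sample(c(sample(1:1000, 100), rep(NA, 50)))

# lowess() cannot take NAs, so pass it the complete pairs only.
cc <- complete.cases(a, b)
lo <- lowess(a[cc], b[cc])
mod <- loess(b ~ a)

names(lo)            # components of the lowess() result
is.unsorted(lo$x)    # FALSE when the x values are already sorted

names(mod)           # components of the loess() fit
length(fitted(mod))  # one fitted value per complete case, in data order
```

Whatever the component names turn out to be in your version of R, the smooth itself is best taken from fitted() or predict(), as in the alternatives that follow.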
Two alternatives:
a<-sample(c(sample(1:1000,100),rep(NA,50)))
b<-sample(c(sample(1:1000,100),rep(NA,50)))
mod <- loess(b~a, na.action = na.exclude)
fit <- fitted(mod)
ord <- order(a)
plot(a, b)
lines(a[ord], fit[ord], col = "red")
The above respects your fitted values and the locations of missing
values. The next solution will plot the fitted smooth for a set of
regularly spaced values over the range of a
pdata <- data.frame(a = seq(min(a, na.rm = TRUE),
max(a, na.rm = TRUE), length = 100))
pred <- predict(mod, pdata)
plot(a, b)
lines(pdata$a, pred, col = "red")
HTH
G
On Wed, 2010-10-27 at 08:26 +0100, Gavin Simpson wrote:
On Tue, 2010-10-26 at 13:29 -0700, Peter Ehlers wrote:
On 2010-10-26 11:48, Jonathan P Daily wrote:
?loess use this instead:
If you change this:
fit<- loess(b~a)
to be this:
fit <- loess(b~a, na.action = na.exclude)
then this:
lines(a, predict(fit))
will work.
I meant "will work in general". In this particular case, the data are not ordered in 'a' so it still gives a cradle of lines. The above was meant to address the "incomplete cases" issue.
G