-----Original Message-----
From: Liaw, Andy [mailto:andy_liaw at merck.com]
Sent: Monday, September 16, 2002 4:17 PM
To: 'John Fox'; jdeke2 at comcast.net
Cc: r-help at stat.math.ethz.ch
Subject: RE: [R] loess crash
I agree with John mostly. For a model as complicated as the one you're
trying to fit with loess, you might as well try things like ppr (in
the `modreg' package), MARS (in the `mda' package), or neural nets (in
the `nnet' package), or even randomForest... Actually MARS might offer
a bit more interpretability than the others, because of its
hierarchical construction.
If you do care about `marginal effects' of the predictors,
then aren't you
sort of assuming additivity? In which case the additive model is more
appropriate. If not, the `marginal effects' can be misleading.
In terms of comparing a loess fit with 5 terms to a less complicated
model, I think it needs to be pointed out that (AFAIK) this can only
be done at a more or less qualitative level, as the models are not
nested.
Cheers,
Andy
-----Original Message-----
From: John Fox [mailto:jfox at mcmaster.ca]
Sent: Monday, September 16, 2002 1:59 PM
To: jdeke2 at comcast.net
Cc: r-help at stat.math.ethz.ch
Subject: RE: [R] loess crash
Dear John,
It's true that the gam function in mgcv fits with splines while loess
uses local regression, but an even more fundamental difference is that
gam fits additive models (though, with some care, you can include
higher-dimensional terms). Given your description of what you plan to
do with the model, an additive model might be what you want.
More generally, a model that fits five-way interactions may be useful
as a point of comparison for simpler models, but I doubt that it will
provide a digestible description of the data.
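[Editor's note: not part of Fox's message. A minimal sketch of the
additive-model alternative he describes, assuming the data2 data frame
constructed in John's example further down.]

```r
library(mgcv)
# One smooth term per predictor: an additive model, no interactions.
fit.gam <- gam(y ~ s(x1) + s(x2) + s(x3) + s(x4) + s(x5), data = data2)
summary(fit.gam)
plot(fit.gam, pages = 1)   # one estimated smooth per predictor
```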
I hope that this helps,
John
At 10:45 AM 9/16/2002 -0400, you wrote:
Thanks for the suggestion. I've only used splines for interpolation
before -- I've never used them for regression (although I'm aware
people do). I'll look into it...
-----Original Message-----
From: Rafael A. Irizarry [mailto:ririzarr at jhsph.edu]
Sent: Monday, September 16, 2002 10:17 AM
To: jdeke2 at comcast.net
Cc: 'r-help at stat.math.ethz.ch'
Subject: RE: [R] loess crash
i would suggest looking at the package mgcv.
you can fit generalized additive models which are useful for what
you describe below.
On Mon, 16 Sep 2002, John Deke wrote:
Ah... I hadn't noticed that option! Thanks... that's a big help -- I'm
happy to use local linear regression.
To answer your question -- perhaps I'm off base, but my reason for
wanting to do this is that I have a set of explanatory variables that
influence my dependent variable in ways that are difficult to specify
parametrically. That is, I suspect that there are all sorts of
relationships between these variables, and it's not at all clear that
there is a satisfying theoretical model that would suggest a
functional form for the relationship. So, rather than using parametric
regression, I'd like to try something non-parametric.
My plan for summarizing the results is to find the `marginal effect'
of each explanatory variable of interest, holding all of the others
constant. I would calculate predicted outcomes for combinations of
variables that are most likely to occur in "the real world".
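[Editor's note: not part of John's message. A sketch of the summary he
describes, assuming the four-predictor loess fit (result1) from his
example further down: predict over a grid of one variable, holding the
others at their medians.]

```r
# Vary x1 over its range; fix x2..x4 at their sample medians.
grid <- data.frame(x1 = seq(0, 1, length = 50),
                   x2 = median(data2$x2),
                   x3 = median(data2$x3),
                   x4 = median(data2$x4))
grid$yhat <- predict(result1, newdata = grid)   # result1: loess(y~x1+x2+x3+x4)
plot(grid$x1, grid$yhat, type = "l",
     xlab = "x1", ylab = "predicted y (others at medians)")
```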
John
-----Original Message-----
From: John Fox [mailto:jfox at mcmaster.ca]
Sent: Monday, September 16, 2002 9:31 AM
To: John Deke
Cc: r-help at stat.math.ethz.ch
Subject: Re: [R] loess crash
Dear John,
For curiosity, I tried your example under R 1.5.1 on a machine running
Windows 2000. The results were just as you describe: the
four-predictor problem ran essentially instantly, and the
five-predictor problem crashed R, again instantly.
I also tried making the problem less computationally demanding by
specifying locally linear, rather than quadratic, fits:
> loess(y~x1+x2+x3+x4+x5, data2, degree=1)
Call:
loess(formula = y ~ x1 + x2 + x3 + x4 + x5, data = data2, degree = 1)
Number of Observations: 500
Equivalent Number of Parameters: 13.5
Residual Standard Error: 1.012
Although something is obviously wrong here, I wonder whether you
really want to fit a local regression with so many predictors (unless
the object is to compare the general nonparametric fit with some more
restrictive model): how would you describe the five-dimensional
surface?
John
At 07:36 AM 9/16/2002 -0400, John Deke wrote:
Here's a simple example that yields the crash:
library(modreg)
data1 <- array(runif(500*5),c(500,5))
colnames(data1) <- c("x1","x2","x3","x4","x5")
y <- 3+2*data1[,"x1"]+15*data1[,"x2"]+13*data1[,"x3"]-8*data1[,"x4"]
data2 <- cbind(y,data1)
data2 <- as.data.frame(data2)
result1 <- loess(y~x1+x2+x3+x4,data2)
To get the crash, I just add x5--
result1 <- loess(y~x1+x2+x3+x4+x5,data2)
And bammo -- I'm dead. It doesn't even pause -- Rgui really crashes --
the program is terminated, I get the dialogue saying that a log file
is being generated -- the whole death scene.
I know it's a computationally intensive thing, but the version that
doesn't crash (with four explanatory variables) runs almost instantly,
so I don't see how adding a fifth could be so catastrophic. But I
don't know the details of this particular methodology....
John
At 03:38 AM 9/16/2002, Peter Dalgaard BSA wrote:
John Deke <jdeke2 at comcast.net> writes:
Hmm... if I reduce the number of observations to 500, I still get
the error.
I don't think it's an issue of collinearity, because I've tried
different combinations of variables, all of which work fine in
OLS or logistic regression.
I'm probably doing something stupid, but I'm not sure what.
At 02:00 PM 9/15/2002, John Deke wrote:
Hi,
I have a data frame with 6563 observations. I can run a regression
with loess using four explanatory variables. If I add a fifth, R
crashes. There are no missings in the data, and the regression with
any four of the five explanatory variables works. It's only when I go
from four to five that R crashes.
Hmm... I wouldn't try loess with more than one or two variables. I
mean, it's a smoothing method, and representing a smooth function of
many variables can be computationally demanding.
The Fortran source code for loess is one of the more impenetrable
parts of R, but I can see that some structures inside of it have a
fixed size, which might explain it (BTW: Does R really crash, or do
you just get `memory exhausted'?).
Do you have a simple example that reproduces the crash?
-----------------------------------------------------
John Fox