Dear All, I'm sure this is not the first time this question comes up but I couldn't find the keywords that would point me out to it - so apologies if this is a re-post. Basically I've got thousands of points, each depending on three variables: x, y, and z. if I do a plot(x,y, col=z), I get something very messy. So I would like to smooth the values of z according to the values of their neighbors (given by x and y). Could you please point me out to the function(s) I should look into for the smothing, and possibly for the plotting? Many thanks for your help, Emmanuel
Smoothing z-values according to their x, y positions
7 messages · Bert Gunter, Emmanuel Levy, David Winsemius
There are dozens of functions within R and contributed packages that do this. The spatial statistics packages (see the "spatial" task view on CRAN) is certainly where most are concentrated, but thin plate splines (in splines packages, I believe), tensor product splines (in mgcv package), local likelihood (in locfit package) etc. are also available for spatial smoothing. Perhaps the most basic place to look is the ?loess function in the base distribution, which will do exactly what you requested. Bert Gunter Genentech Nonclinical Statistics -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Emmanuel Levy Sent: Wednesday, March 19, 2008 12:42 PM To: r-help at r-project.org Subject: [R] Smoothing z-values according to their x, y positions Dear All, I'm sure this is not the first time this question comes up but I couldn't find the keywords that would point me out to it - so apologies if this is a re-post. Basically I've got thousands of points, each depending on three variables: x, y, and z. if I do a plot(x,y, col=z), I get something very messy. So I would like to smooth the values of z according to the values of their neighbors (given by x and y). Could you please point me out to the function(s) I should look into for the smothing, and possibly for the plotting? Many thanks for your help, Emmanuel ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Dear Bert,
Thanks for your reply - I indeed saw a lot of functions using:
help.search("smooth")
The problem is that most seem to not be very appropriate to what I'd like, or
they seem extremely complicated (e.g. gma). I am probably missing
something as I don't
see how to use Loess. From my poor understanding, it seems to be for 2D data.
Here I want to smooth the third "z" component.
In the meantime I adapted a function from Greg Warnes, which sort of
does the job (although the smoothing is not very nice).
So I'd be very happy to learn how do do something similar with loess!
x = runif(1000)
y = runif(1000)
z = x+y
hist2d.el(x,y,z, nbins=10)
## -- This function was adapted from Greg Warnes
hist2d.el <- function( x,y=NULL, z=NULL, nbins=200, my.cuts = NULL,
same.scale=FALSE, na.rm=TRUE, show=TRUE, col=c("black",
heat.colors(12)), ... )
{
if(is.null(y))
{
if(ncol(x) != 2) stop("If y is ommitted, x must be a 2 column matirx")
y <- x[,2]
x <- x[,1]
}
if(length(nbins)==1)
nbins <- rep(nbins,2)
nas <- is.na(x) | is.na(y)
if(na.rm)
{
x <- x[!nas]
y <- y[!nas]
}
else
stop("missinig values not permitted if na.rm=FALSE")
if(!is.null(my.cuts)){
nbins = rep(length(my.cuts)-1,2)
x.cuts=my.cuts
y.cuts=my.cuts
} else {
if(same.scale)
{
x.cuts <- seq( from=min(x,y), to=max(x,y), length=nbins[1]+1)
y.cuts <- seq( from=min(x,y), to=max(x,y), length=nbins[2]+1)
}
else
{
x.cuts <- seq( from=min(x), to=max(x), length=nbins[1]+1)
y.cuts <- seq( from=min(y), to=max(y), length=nbins[2]+1)
}
}
print(nbins)
index.x <- cut( x, x.cuts, include.lowest=TRUE)
index.y <- cut( y, y.cuts, include.lowest=TRUE)
m <- matrix( 0, nrow=nbins[1], ncol=nbins[2],
dimnames=list( levels(index.x),
levels(index.y) ) )
m.sum <- matrix( 0, nrow=nbins[1], ncol=nbins[2],
dimnames=list( levels(index.x),
levels(index.y) ) )
for( i in 1:length(index.x) )
m[ index.x[i], index.y[i] ] <- m[ index.x[i], index.y[i] ] + 1
for( i in 1:length(index.x) )
m.sum[ index.x[i], index.y[i] ] <- m.sum[ index.x[i], index.y[i] ] + z[i]
m[which(m==0)]=1
m = m.sum/(m)
xvals <- x.cuts[1:nbins[1]]
yvals <- y.cuts[1:nbins[2]]
if(show)
image( xvals,yvals, m, col=col,...)
#
invisible(list(counts=m,x=xvals,y=yvals))
invisible(m)
}
On 19/03/2008, Bert Gunter <gunter.berton at gene.com> wrote:
There are dozens of functions within R and contributed packages that do this. The spatial statistics packages (see the "spatial" task view on CRAN) is certainly where most are concentrated, but thin plate splines (in splines packages, I believe), tensor product splines (in mgcv package), local likelihood (in locfit package) etc. are also available for spatial smoothing. Perhaps the most basic place to look is the ?loess function in the base distribution, which will do exactly what you requested. Bert Gunter Genentech Nonclinical Statistics -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Emmanuel Levy Sent: Wednesday, March 19, 2008 12:42 PM To: r-help at r-project.org Subject: [R] Smoothing z-values according to their x, y positions Dear All, I'm sure this is not the first time this question comes up but I couldn't find the keywords that would point me out to it - so apologies if this is a re-post. Basically I've got thousands of points, each depending on three variables: x, y, and z. if I do a plot(x,y, col=z), I get something very messy. So I would like to smooth the values of z according to the values of their neighbors (given by x and y). Could you please point me out to the function(s) I should look into for the smothing, and possibly for the plotting? Many thanks for your help, Emmanuel
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
"Emmanuel Levy" <emmanuel.levy at gmail.com> wrote in news:e4654710803191554k4dfc2728g70bde35f41627754 at mail.gmail.com:
Dear Bert,
Thanks for your reply - I indeed saw a lot of functions using:
help.search("smooth")
The problem is that most seem to not be very appropriate to what I'd
like, or they seem extremely complicated (e.g. gma). I am probably
missing something as I don't
see how to use Loess. From my poor understanding, it seems to be for
2D data. Here I want to smooth the third "z" component.
In the meantime I adapted a function from Greg Warnes, which sort of
does the job (although the smoothing is not very nice).
So I'd be very happy to learn how do do something similar with
loess!
You could consider at combining the kde2d() function in the MASS package
within the VR bundle in combination with contour(). The example I
followed came from MASS 2ed in chapter 5's section on density estimation.
It is probably easiest to just copy the help example:
library(MASS)
attach(geyser)
f1 <- kde2d(duration[-272], duration[-1],
h = rep(1.5, 2), n = 50, lims = c(0.5, 6, 0.5, 6))
contour(f1, xlab = "previous duration",
ylab = "duration", levels = c(0.05, 0.1, 0.2, 0.4) )
I have also had success with Loader's locfit package.
David Winsemius
"Emmanuel Levy" <emmanuel.levy at gmail.com> wrote in news:e4654710803191554k4dfc2728g70bde35f41627754 at mail.gmail.com:
Dear Bert,
Thanks for your reply - I indeed saw a lot of functions using:
help.search("smooth")
The problem is that most seem to not be very appropriate to what I'd
like, or they seem extremely complicated (e.g. gma). I am probably
missing something as I don't see how to use Loess. From my poor
understanding, it seems to be for 2D data. Here I want to smooth the
third "z" component.
In the meantime I adapted a function from Greg Warnes, which sort of
does the job (although the smoothing is not very nice).
So I'd be very happy to learn how do do something similar with
loess!
I am realizing that my memories about the kde2d function were faulty. My suggestion would only let you look at the 2-d "marginals". You probably want a true 3d plotting function. The densities would then be closed surfaces. Take a look at Feng and Tierney's figure 10 in this poster at: <http://user2007.org/program/posters/feng.pdf> The code cannot be cut-and-pasted, but if it does what you want, it would be worth it.
David Winsemius
Dear David, Thanks a lot for pointing out kde2d, just tried it out but the problem is that it indeed takes the density of points into account, which I dont want. For example, if in an region of surface S I've got 10,000 points, and that their average height is 0.5, and in an other region I've got only ten points and their average value if also 0.5, I'd like all these points to be transformed to the same 0.5 value. At the moment, it seems that it's not the case. For example, the range of the values I give is: 0.2 - 3.7, but the range of the values that are outputed is 0 - 0.17. I haven't looked yet at the locfit package as it is not installed, but I will check it out! Thanks for helping! Emmanuel
On 20/03/2008, David Winsemius <dwinsemius at comcast.net> wrote:
"Emmanuel Levy" <emmanuel.levy at gmail.com> wrote in news:e4654710803191554k4dfc2728g70bde35f41627754 at mail.gmail.com:
> Dear Bert,
>
> Thanks for your reply - I indeed saw a lot of functions using:
> help.search("smooth")
>
> The problem is that most seem to not be very appropriate to what I'd
> like, or they seem extremely complicated (e.g. gma). I am probably
> missing something as I don't
> see how to use Loess. From my poor understanding, it seems to be for
> 2D data. Here I want to smooth the third "z" component.
>
> In the meantime I adapted a function from Greg Warnes, which sort of
> does the job (although the smoothing is not very nice).
> So I'd be very happy to learn how do do something similar with
> loess!
>
You could consider at combining the kde2d() function in the MASS package
within the VR bundle in combination with contour(). The example I
followed came from MASS 2ed in chapter 5's section on density estimation.
It is probably easiest to just copy the help example:
library(MASS)
attach(geyser)
f1 <- kde2d(duration[-272], duration[-1],
h = rep(1.5, 2), n = 50, lims = c(0.5, 6, 0.5, 6))
contour(f1, xlab = "previous duration",
ylab = "duration", levels = c(0.05, 0.1, 0.2, 0.4) )
I have also had success with Loader's locfit package.
--
David Winsemius
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
"Emmanuel Levy" <emmanuel.levy at gmail.com> wrote in news:e4654710803191929j554c211bla7088bc8bc69c445 at mail.gmail.com:
Dear David, Thanks a lot for pointing out kde2d, just tried it out but the problem is that it indeed takes the density of points into account, which I dont want. For example, if in an region of surface S I've got 10,000 points, and that their average height is 0.5, and in an other region I've got only ten points and their average value if also 0.5, I'd like all these points to be transformed to the same 0.5 value. At the moment, it seems that it's not the case. For example, the range of the values I give is: 0.2 - 3.7, but the range of the values that are outputed is 0 - 0.17.
"Emmanuel Levy" <emmanuel.levy at gmail.com> wrote in news:e4654710803191554k4dfc2728g70bde35f41627754 at mail.gmail.com:
> Dear Bert,
>
> Thanks for your reply - I indeed saw a lot of functions using:
> help.search("smooth")
>
> The problem is that most seem to not be very appropriate to what
> I'd like, or they seem extremely complicated (e.g. gma). I am
> probably missing something as I don't see how to use Loess. From
> my poor understanding, it seems to be for 2D data. Here I want
> to smooth the third "z" component.
Your description of what is desired leads me to believe you were misreading the "loess" documentation. For proof I suggest you visit Deepayan Sarkar's webpage and in particular see Figure 6.8 where 3d plots of loess fits for more complex data arrangements are exemplified: Choose Figure 6.8 (code on right side of window): <http://lmdvr.r-forge.r-project.org/figures/figures.html> Code: <http://dsarkar.fhcrc.org/lattice/book/Chapter06-Trivariate/all.R> PNG image: <http://dsarkar.fhcrc.org/lattice/book/images/Figure_06_08_stdClassic.png>
David Winsemius