weighted cumulative distribution with ggplot2

6 messages · Francesco, David Winsemius

#
Dear all,

I am trying to draw a weighted cumulative distribution (as defined
here http://rss.acs.unt.edu/Rdoc/library/spatstat/html/ewcdf.html)
with ggplot2

however the syntax

temp<-qplot(X,weight=weight,data=data,stat = "ecdf", geom =
"step",colour=factor(year))

does not seem to produce exactly the right figure (the values seem higher
at some points)... Am I wrong in the weight definition?

The data is like the following

X     Weight Year
0      2         2001
0      1         2001
1      5         2001
2      1         2001
2      3         2001
2      2         2002
3.. etc

Any ideas ?
Many thanks in advance
#
I think I have my answer... ggplot2 uses ecdf, which does NOT allow
weightings...
so there is no warning or error, but the resulting plot still does not
take the weight=weight argument into account

Hope that helps someone, just in case ;-)
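
For anyone who wants to check the difference by hand, here is a minimal base-R sketch using the 2001 rows of the sample data above. The helper name weighted_ecdf is made up for illustration; it is not part of ggplot2 or spatstat, and it only evaluates the CDF at the distinct values of X.

```r
# 2001 rows from the sample data in the original post
x <- c(0, 0, 1, 2, 2)
w <- c(2, 1, 5, 1, 3)

# Hypothetical helper (not from any package): weighted empirical CDF
# evaluated at the distinct values of x.
weighted_ecdf <- function(x, w) {
  ux   <- sort(unique(x))                                    # distinct x values, ascending
  mass <- vapply(ux, function(v) sum(w[x == v]), numeric(1)) # total weight at each value
  data.frame(x = ux, cdf = cumsum(mass) / sum(mass))         # cumulative share of weight
}

wcdf       <- weighted_ecdf(x, w)
unweighted <- ecdf(x)(wcdf$x)   # what an unweighted ecdf gives at the same points

print(cbind(wcdf, unweighted))
```

On these rows the two curves disagree at X = 0 and X = 1, which is the kind of discrepancy described above.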
On 8 October 2012 15:40, Francesco <cariboupad at gmx.fr> wrote:
#
On Oct 8, 2012, at 8:01 AM, Francesco wrote:

            
It was completely unclear why you expected ggplot to use 'ewcdf' when you gave a command to use 'ecdf'.
David Winsemius, MD
Alameda, CA, USA
#
On Oct 8, 2012, at 9:18 AM, David Winsemius wrote:

            
You might want to look at stat_function. It appears designed to provide a mechanism for running data through functions that do not currently have support in ggplot2. I've never really grokked how one is supposed to pass arguments into ggplot constructs, and I find the help pages not so helpful in figuring this out, so this is a big fat untested guess.
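
To make the stat_function idea concrete, here is a rough sketch that does not depend on spatstat. It assumes the weighted CDF is built by hand with base R's stepfun (the name Fw is arbitrary) and then handed to stat_function as an ordinary function of x. Because stat_function evaluates the function on a grid, a large n is used so the jumps look close to vertical.

```r
library(ggplot2)

# 2001 rows from the sample data
x <- c(0, 0, 1, 2, 2)
w <- c(2, 1, 5, 1, 3)

# Weight mass at each distinct x, then a right-continuous step function:
# 0 below the smallest x, cumulative weight share from each x onwards.
ux   <- sort(unique(x))
mass <- vapply(ux, function(v) sum(w[x == v]), numeric(1))
Fw   <- stepfun(ux, c(0, cumsum(mass) / sum(mass)))

# stat_function only needs an x range and a function to evaluate on it.
p <- ggplot(data.frame(x = c(-0.5, 2.5)), aes(x)) +
  stat_function(fun = Fw, n = 1001)
```

Printing p draws the curve. This handles one group only; separate curves per year would need one function (and one layer) per year.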
#
On Oct 8, 2012, at 10:12 AM, David Winsemius wrote:

            
Here's a further stab at implementing my guess:

library(spatstat)  # provides ewcdf()
library(ggplot2)

dat <- read.table(text="X     Weight Year
0      2         2001
0      1         2001
1      5         2001
2      1         2001
2      3         2001
2      2         2002", header=TRUE)

# Notice that ewcdf returns a function rather than a vector:

 with(dat, ewcdf(X, weights=Weight) )
Empirical CDF 
Call: ewcdf(X, weights = Weight)
 x[1:3] =      0,      1,      2

# Column names are case-sensitive: the data has Weight and Year, not weight and year
temp<-qplot(X,weight=Weight,data=dat,stat = "ecdf", geom =
                        "step",colour=factor(Year))

temp + stat_function(fun = with(dat, ewcdf(X, weights=Weight) ), 
                      mapping=aes(x=X, weights=Weight), colour = "red", 
                      data=dat )

I'm not sure that is what was intended. I think there may still be residual points from the first qplot call, but the data does seem to be getting through to the ewcdf function. Maybe you can fix it.
David Winsemius, MD
Alameda, CA, USA
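
An alternative that sidesteps stat = "ecdf" altogether is to compute the cumulative weight share per year before plotting and draw it with geom_step. This is only a sketch under that assumption; the column name wcdf is invented here, and the data frame repeats the six sample rows from the thread.

```r
library(ggplot2)

dat <- data.frame(
  X      = c(0, 0, 1, 2, 2, 2),
  Weight = c(2, 1, 5, 1, 3, 2),
  Year   = c(2001, 2001, 2001, 2001, 2001, 2002)
)

# Sort by year and X, then take the running share of weight within each year.
dat <- dat[order(dat$Year, dat$X), ]
dat$wcdf <- ave(dat$Weight, dat$Year, FUN = function(w) cumsum(w) / sum(w))

# One step curve per year; tied X values simply stack into one vertical jump.
p <- ggplot(dat, aes(x = X, y = wcdf, colour = factor(Year))) +
  geom_step()
```

With only one 2002 row in the sample, that year has nothing to draw as a step; with the full data each year should get its own curve.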
#
thank you very much David, I will have a look at your idea
Best,
On 8 October 2012 19:24, David Winsemius <dwinsemius at comcast.net> wrote: