
R-alpha: frame tools

8 messages · Ross Ihaka, Kurt Hornik, Peter Dalgaard +1 more

#
Kurt Hornik writes:

	> What's also needed is something to produce and plot empirical
	> distribution functions.  I thought I could write something for my
	> Biostat course last semester, but it never happened ...

Have you had a look at the type="S" and type="s" options to plot?  They
produce two variants of step functions.  I didn't know about them until
I started implementing our version of graphics.  They are documented in "par".
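
As a rough illustration of the two step variants (a sketch only: the sample values and axis labels are made up, and it assumes a current R session):

```r
# Hypothetical sample, sorted so the steps run left to right.
xs <- sort(c(2.1, 3.4, 1.7, 4.2, 5.0))
n  <- length(xs)
ys <- seq_len(n) / n          # heights 1/n, 2/n, ..., 1

op <- par(mfrow = c(1, 2))
# type = "s": each segment moves horizontally first, then vertically.
plot(xs, ys, type = "s", main = 'type = "s"', xlab = "x", ylab = "Fn(x)")
# type = "S": each segment moves vertically first, then horizontally.
plot(xs, ys, type = "S", main = 'type = "S"', xlab = "x", ylab = "Fn(x)")
par(op)
```

With the heights chosen this way, the type = "s" panel is the one that matches the usual right-continuous ECDF between the data points.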
	Ross
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
5 days later
#
I hadn't known about them either, but I think Martin's plot.step is
needed for plotting e.g. ECDFs.

Here's something that is needed for teaching elementary statistics:
Given a sample vector x, compute the corresponding ECDF, plot it, and
perhaps evaluate it at points other than the data points.

One solution might be to have a function "ecdf" which returns an object
of class "step": e.g., simply a list with the points where the jumps
occur and the corresponding values and, if we want to be more general
than that, info on whether the function is left- or right-continuous at
each point (or perhaps attains a different value there).  (I.e.,
something of the kind that Martin's plot.step uses ...)

This would more or less automagically take care of the plotting part.
We could also easily add print and summary methods.

The question is, how can we `easily' evaluate a step function at some
point?  Using `predict' does not seem to be the right thing ... any
suggestions?
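
One possible answer, sketched under assumptions: if the "step" object is just a list of jump points and values, evaluation can be a table lookup with findInterval(). The names make_step and eval_step are hypothetical helpers invented for this illustration, and the function is taken to be right-continuous and 0 to the left of the first jump.

```r
# Build a list holding the jump points and the ECDF value at each jump.
make_step <- function(samp) {
    counts <- table(samp)                      # one entry per distinct value
    list(x = sort(unique(samp)),
         y = as.vector(cumsum(counts)) / length(samp))
}

# Evaluate at arbitrary points: findInterval() counts how many jump
# points are <= t, which indexes into c(0, y).
eval_step <- function(s, t) c(0, s$y)[findInterval(t, s$x) + 1]

s <- make_step(c(3, 1, 2, 2))
eval_step(s, c(0, 2, 2.5, 10))
```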

-k
#
On Tue, 19 Aug 1997, Kurt Hornik wrote:
The approx() function is supposed to do this (it currently only does
linear interpolation).  If the right and left continuous step functions
were added to approx then we could easily use approxfun to define the ECDF
as a function closure, making it automatically available at any point you
want. After all, the ECDF *is* a function.
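
A minimal sketch of that closure idea, assuming an approxfun() that accepts method = "constant" (as current R does; it did not when this was written). The name ecdf_sketch is hypothetical.

```r
# Return the ECDF of x as a function, via a piecewise-constant approxfun().
ecdf_sketch <- function(x) {
    x    <- sort(x)
    vals <- unique(x)
    # Cumulative proportion of observations at or below each distinct value.
    cum  <- cumsum(tabulate(match(x, vals))) / length(x)
    # f = 0 keeps the left endpoint's value on each interval, so the
    # result is right-continuous; yleft/yright set the tails to 0 and 1.
    approxfun(vals, cum, method = "constant", yleft = 0, yright = 1, f = 0)
}

Fn <- ecdf_sketch(c(3, 1, 2, 2))
Fn(c(0, 2, 2.5, 10))
```

Because the result is an ordinary function, it can be evaluated at any point without a separate predict-style method.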


Thomas Lumley
------------------------------------------------------+------
Biostatistics		: "Never attribute to malice what  :
Uni of Washington	:  can be adequately explained by  :
Box 357232		:  incompetence" - Hanlon's Razor  :
Seattle WA 98195-7232	:				   :
------------------------------------------------------------

#
tapply() has been broken for a long time and is still wrong in 50-a3.  I
think the following version works. 

	-thomas

"tapply" <-function (x, INDEX, FUN, ...) 
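{
        # Resolve FUN only if it was supplied, so that the
        # missing(FUN) branch further down stays reachable.
        if (!missing(FUN)) {
                if (is.character(FUN))
                        FUN <- get(FUN, mode = "function")
                if (mode(FUN) != "function")
                        stop(paste("\"", FUN, "\" is not a function"))
        }
        if (!is.list(INDEX))
                INDEX <- list(INDEX)
        namelist <- vector("list", length(INDEX))
        extent <- integer(length(INDEX))
        nx <- length(x)
        group <- rep(1, nx)
        ngroup <- 1
        # Map each combination of factor levels to one cell number,
        # in the column-major order used by array().
        for (i in seq(INDEX)) {
                index <- as.factor(INDEX[[i]])
                if (length(index) != nx)
                        stop("arguments must have same length")
                namelist[[i]] <- levels(index)
                extent[[i]] <- nlevels(index)
                # as.integer() gives the integer level codes (the post
                # used codes(), which current R no longer provides).
                group <- group + ngroup * (as.integer(index) - 1)
                ngroup <- ngroup * nlevels(index)
        }
        if (missing(FUN))
                return(group)
        # Start from an all-NA array so that empty cells stay NA.
        ansmat <- array(NA, dim = extent, dimnames = namelist)
        ans <- lapply(split(x, group), FUN, ...)
        if (all(unlist(lapply(ans, length)) == 1)) {
                ans <- unlist(ans, recursive = FALSE)
        } else {
                mode(ansmat) <- "list"
        }
        # names(ans) are the cell numbers of the non-empty groups.
        ansmat[as.numeric(names(ans))] <- ans
        ans <- ansmat
        return(ans)
}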


2 days later
#
Ah, very nice idea!

So, we'd have

	ecdf(x)

return a FUNCTION of class (e.g.) "step" which would on the one hand
have the right methods for it and on the other hand would allow us to
evaluate it at new points right away?
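
Something along these lines, as a sketch only: ecdf_step and the body of print.step are hypothetical, and the class name "step" is simply the one floated in this thread.

```r
# Return the ECDF as a function that also carries class "step".
ecdf_step <- function(x) {
    vals <- sort(unique(x))
    cum  <- cumsum(tabulate(match(x, vals))) / length(x)
    f <- function(t) c(0, cum)[findInterval(t, vals) + 1]
    class(f) <- c("step", "function")
    f
}

# A print method that summarises instead of dumping the function body.
print.step <- function(x, ...) {
    jumps <- environment(x)$vals     # jump points live in the closure
    cat("Step function with", length(jumps), "jumps\n")
    invisible(x)
}

Fn <- ecdf_step(c(3, 1, 2, 2))
Fn(c(0, 2, 10))    # still callable like any other function
print(Fn)
```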

Martin, Peter?

-k
#
Kurt Hornik <hornik@ci.tuwien.ac.at> writes:
Neat idea. And - hey! - someone has made it possible to assign classes
to a function now. I looked into that a while ago with a view to killing
off the antipedagogical habit that R and S have of dumping screenfuls
of gibberish on the unsuspecting newbie. At that time, it wasn't
possible so I dropped it, meaning to take it up on the list later on.

That would be like

print.function<-function(f)cat("Use 'show' to see content of function\n")
show<-unclass

Hmm. Do we want something like this? It clashes a little with other
uses of function classes, but not badly as far as I can see.
#
Another minor incompatibility:
R> attr(terms(y~x),"response")
[1] TRUE
S> attr(terms(y~x),"response")
[1] 1

In S the attribute indicates which column of the model frame will contain
the response. In R this is always column 1 because model frames are only
useful when their columns are in the right order (model.matrix doesn't
check).
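
For anyone checking the attribute in a present-day R session (an assumption about the reader's setup, not something from the transcript above), a small sketch with a made-up data frame d:

```r
# Hypothetical data, only for illustration.
d <- data.frame(x = c(1, 2, 3, 4), y = c(2.0, 4.1, 5.9, 8.2))

resp    <- attr(terms(y ~ x), "response")   # position of the response
no_resp <- attr(terms(~ x), "response")     # 0 when there is no response

# The response is the first column of the model frame.
mf <- model.frame(y ~ x, data = d)
names(mf)[resp]
```

Current R reports the position as a number, as S does, so code that indexes the model frame with the attribute works in both.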


Thomas Lumley
------------------------------------------------------+------
Biostatistics		: "Never attribute to malice what  :
Uni of Washington	:  can be adequately explained by  :
Box 357232		:  incompetence" - Hanlon's Razor  :
Seattle WA 98195-7232	:				   :
------------------------------------------------------------

1 day later
#
You can always define a dummy graphics driver for batch mode by

dummy.driver<-function() postscript(file="/dev/null")

which will actually do all the calculations needed (and so catch errors)
but won't produce anything.  Then you can edit the Rprofile so that
dummy.driver() is started if you aren't running interactively.  
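
A sketch of what the Rprofile entry might look like, assuming a Unix-like system where "/dev/null" exists; the grDevices:: prefix is an added precaution in case the graphics packages are not yet attached when the profile runs:

```r
# Hypothetical Rprofile fragment.
dummy.driver <- function() grDevices::postscript(file = "/dev/null")

# Open the throwaway device only for batch (non-interactive) runs.
if (!interactive()) {
    dummy.driver()   # plots are still computed, the output is discarded
}
```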

I don't know if something similar can be done for Mac/Win.

	-thomas
