hist.default documentation
Thanks, I've committed the change.
Duncan Murdoch
On 6/17/2005 10:30 AM, Deepayan Sarkar wrote:
On 6/17/05, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
On 6/17/2005 8:58 AM, Deepayan Sarkar wrote:
I think there are a couple of things in ?hist that are not quite as
clear as they could be.
(1)
freq: logical; if 'TRUE', the histogram graphic is a representation
of frequencies, the 'counts' component of the result; if
'FALSE', _relative_ frequencies ("probabilities"), component
'density', are plotted. Defaults to 'TRUE' _iff_ 'breaks'
are equidistant (and 'probability' is not specified).
Unless I'm missing something, the 'density' component is NOT relative
frequency or 'probability' in any reasonable sense, country-specific
biases notwithstanding, except in the very special case where
all(diff(breaks) == 1). Thus, the above description is confusing and
probably even wrong.
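To make the point concrete, here is a minimal sketch with unequal-width bins. The data vector `x` and the break points `brk` are made up for illustration and do not come from the original report.

```r
# Made-up data and unequal-width breaks (illustrative assumptions only)
x   <- c(0.5, 1.2, 1.7, 2.4, 3.1, 3.3, 4.8, 6.0, 7.5, 9.9)
brk <- c(0, 2, 4, 10)

h <- hist(x, breaks = brk, plot = FALSE)

rel_freq <- h$counts / sum(h$counts)  # relative frequencies; these sum to 1
widths   <- diff(h$breaks)            # bin widths

# 'density' is the relative frequency divided by the bin width ...
stopifnot(isTRUE(all.equal(h$density, rel_freq / widths)))
# ... so it is the total area, not the sum of the bar heights, that equals 1
stopifnot(isTRUE(all.equal(sum(h$density * widths), 1)))

print(rbind(rel_freq = rel_freq, density = h$density))
```

The two rows agree only when every bin has width 1, which is the special case all(diff(breaks) == 1) mentioned above.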
I agree.
Also, it seems to me that hist cannot draw a relative frequency histogram at all. That is not a bad thing in itself, but it matters a great deal to the undergrads we're teaching intro stats and R to, so it should be mentioned explicitly.
I'm not sure about this. Is it really worth mentioning something if you can't do it? Are you thinking of just giving a reference to barplot?
Not mentioning it is fine.
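For readers who do want relative frequencies, one possible workaround along the lines of the barplot pointer is sketched below. It is not part of the proposed patch; the data vector `x` is made up, and the approach is only meaningful when the bins have equal width.

```r
# Made-up data (illustrative assumption only)
x <- c(0.5, 1.2, 1.7, 2.4, 3.1, 3.3, 4.8, 6.0, 7.5, 9.9)

# Compute the bins without plotting; the default breaks are equal-width
h <- hist(x, plot = FALSE)

# Relative frequency of each bin
rel_freq <- h$counts / sum(h$counts)

# Draw adjacent bars labelled by bin midpoint
barplot(rel_freq, names.arg = h$mids, space = 0,
        xlab = "x (bin midpoints)", ylab = "Relative frequency")
```

With unequal bin widths the bar heights would no longer be comparable, which is presumably why hist plots densities in that case.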
(2)
breaks: one of:
...
* a single number giving the number of cells for the
histogram,
...
This is not quite true. 'breaks' is used in 'pretty', so it's more a
suggestion than an exact specification. I'm not sure whether or not
the behaviour should be changed (what's the point of having ``pretty''
breakpoints anyway?), but if not, the documentation should be
clarified.
I like the pretty breakpoints. It is good to label the breakpoints, and ugly to have labels anywhere other than at pretty points. I'd clarify by changing "giving" to "suggesting".
Actually, I missed the remark just below this:
In the last three cases the number is a suggestion only.
so this is fine as it is.
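A quick way to see the "suggestion only" behaviour is to compare the requested number of cells with what hist actually returns. This is a sketch under assumptions: the data vector `x` and the requested count are made up, and the resulting break points depend on what pretty chooses for the data range.

```r
# Made-up data and an arbitrary requested cell count (illustrative only)
x <- c(0.5, 1.2, 1.7, 2.4, 3.1, 3.3, 4.8, 6.0, 7.5, 9.9)
requested <- 7

h <- hist(x, breaks = requested, plot = FALSE)

# The number of cells actually used may differ from the request,
# because the break points are chosen to fall on "pretty" values
cat("requested cells:", requested, "\n")
cat("actual cells:   ", length(h$counts), "\n")
cat("break points:   ", h$breaks, "\n")
```

To force exact break points, pass a vector of breaks rather than a single number.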
I'll be happy to provide a patch if these changes are considered reasonable.
Please do.
Here's the output of svn diff. Is this a reasonable way of providing a patch?
Index: hist.Rd
===================================================================
--- hist.Rd (revision 34748)
+++ hist.Rd (working copy)
@@ -28,9 +28,9 @@
}
\item{freq}{logical; if \code{TRUE}, the histogram graphic is a
representation of frequencies, the \code{counts} component of
- the result; if \code{FALSE}, \emph{relative} frequencies
- (\dQuote{probabilities}), component \code{density},
- are plotted. Defaults to \code{TRUE} \emph{iff} \code{breaks} are
+ the result; if \code{FALSE}, probability densities, component
+ \code{density}, are plotted (so that the histogram has a total area
+ of one). Defaults to \code{TRUE} \emph{iff} \code{breaks} are
equidistant (and \code{probability} is not specified).}
\item{probability}{an \emph{alias} for \code{!freq}, for S compatibility.}
\item{include.lowest}{logical; if \code{TRUE}, an \code{x[i]} equal to
Deepayan