hist.default documentation

4 messages · Deepayan Sarkar, Duncan Murdoch

#
I think there are a couple of things in ?hist that are not quite as
clear as they could be.

(1)    

  freq: logical; if 'TRUE', the histogram graphic is a representation
          of frequencies, the 'counts' component of the result; if
          'FALSE', _relative_ frequencies ("probabilities"), component
          'density', are plotted.   Defaults to 'TRUE' _iff_ 'breaks'
          are equidistant (and 'probability' is not specified).
 
Unless I'm missing something, the 'density' component is NOT a relative
frequency or 'probability' in any reasonable sense, country-specific
biases notwithstanding, except in the very special case where
all(diff(breaks) == 1). In general it is the relative frequency divided
by the cell width. Thus, the above description is confusing and
probably even wrong.
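
To illustrate the distinction, here is a minimal sketch with made-up data
(the vector x and the breakpoints are assumptions chosen for the example,
not taken from ?hist). Every cell has width 2, so the densities and the
relative frequencies differ by a factor of 2:

```r
# Hypothetical data; cells are [0,2], (2,4], (4,6], (6,8], each of width 2.
x <- c(1, 2, 2, 3, 3, 3, 4, 4, 5, 7)
h <- hist(x, breaks = seq(0, 8, by = 2), plot = FALSE)

widths   <- diff(h$breaks)
rel_freq <- h$counts / sum(h$counts)   # proportion of observations per cell

# 'density' is the proportion per unit of x, i.e. rel_freq / width
all.equal(h$density, rel_freq / widths)

sum(rel_freq)             # relative frequencies: the heights sum to one
sum(h$density * widths)   # densities: the area sums to one
sum(h$density)            # the heights alone do not, unless all widths are 1
```

The two quantities coincide only when every cell has width 1.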

Also, it seems to me that hist cannot draw a relative frequency
histogram at all (which is not a bad thing in itself, but it matters a
great deal to the undergrads we're teaching intro stats and R to). This
should be mentioned explicitly.
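
For what it's worth, a relative frequency plot can be assembled by hand
from the object that hist returns. This is only a sketch: the data and the
label format are made up for the example, and it is sensible only when all
cells have the same width, since barplot draws bars of equal width:

```r
# Hypothetical data with equal-width cells.
x <- c(1, 2, 2, 3, 3, 3, 4, 4, 5, 7)
h <- hist(x, breaks = seq(0, 8, by = 2), plot = FALSE)

rel_freq <- h$counts / sum(h$counts)

# One label per cell, built from consecutive breakpoints (e.g. "0-2").
labels <- paste(head(h$breaks, -1), tail(h$breaks, -1), sep = "-")

# space = 0 makes the bars touch, as they would in a histogram.
barplot(rel_freq, names.arg = labels, space = 0,
        xlab = "x", ylab = "Relative frequency")
```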

(2) 

  breaks: one of:

             ...
             *  a single number giving the number of cells for the
                histogram,
             ...

This is not quite true. 'breaks' is passed on to 'pretty', so it's more
a suggestion than an exact specification. I'm not sure whether or not
the behaviour should be changed (what's the point of having ``pretty''
breakpoints anyway?), but if not, the documentation should be
clarified.
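
A small sketch of the difference, using made-up data (x and the requested
count of 7 are assumptions for the example). A single number is only a
hint, whereas a vector of breakpoints is used as given:

```r
# Hypothetical data.
x <- c(0.5, 1.2, 2.7, 3.3, 4.8, 9.9)

# A single number: the cell count actually used is whatever the
# 'pretty' breakpoints produce, which need not be the 7 requested.
h1 <- hist(x, breaks = 7, plot = FALSE)
h1$breaks
length(h1$counts)

# An explicit vector of 8 breakpoints always gives exactly 7 cells.
exact <- seq(0, 10, length.out = 8)
h2 <- hist(x, breaks = exact, plot = FALSE)
length(h2$counts)
```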

I'll be happy to provide a patch if these changes are considered reasonable.

Deepayan
#
On 6/17/2005 8:58 AM, Deepayan Sarkar wrote:
I agree.

I'm not sure about this.  Is it really worth mentioning something if you
can't do it?  Are you thinking of just giving a reference to barplot?

I like the pretty breakpoints. It is good to label the breakpoints, and
ugly to have labels at other than pretty points.  I'd clarify by
changing "giving" to "suggesting".

Please do.

Duncan Murdoch
#
On 6/17/05, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
Not mentioning it is fine.

Actually, I missed the remark just below this:

          In the last three cases the number is a suggestion only.

so this is fine as it is.

Here's the output of svn diff. Is this a reasonable way of providing a patch?

Index: hist.Rd
===================================================================
--- hist.Rd     (revision 34748)
+++ hist.Rd     (working copy)
@@ -28,9 +28,9 @@
   }
   \item{freq}{logical; if \code{TRUE}, the histogram graphic is a
     representation of frequencies, the \code{counts} component of
-    the result; if \code{FALSE}, \emph{relative} frequencies
-    (\dQuote{probabilities}), component \code{density},
-    are plotted.   Defaults to \code{TRUE} \emph{iff} \code{breaks} are
+    the result; if \code{FALSE}, probability densities, component
+    \code{density}, are plotted (so that the histogram has a total area
+    of one).  Defaults to \code{TRUE} \emph{iff} \code{breaks} are
     equidistant (and \code{probability} is not specified).}
   \item{probability}{an \emph{alias} for \code{!freq}, for S compatibility.}
   \item{include.lowest}{logical; if \code{TRUE}, an \code{x[i]} equal to

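As a quick check of the revised wording, here is a sketch with made-up
data and breakpoints (both are assumptions for the example). The
'equidist' component of the returned object records whether the breaks
are evenly spaced, which is what the default for 'freq' depends on, and
the densities give a total area of one even when they are not:

```r
# Hypothetical data.
x <- c(1, 2, 2, 3, 3, 3, 4, 4, 5, 7)

h_equal   <- hist(x, breaks = seq(0, 8, by = 2), plot = FALSE)
h_unequal <- hist(x, breaks = c(0, 2, 3, 8), plot = FALSE)

h_equal$equidist     # evenly spaced breaks
h_unequal$equidist   # unevenly spaced breaks

# Total area under the density histogram, with unequal cell widths.
sum(h_unequal$density * diff(h_unequal$breaks))
```
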
Deepayan
#
Thanks, I've committed the change.

Duncan Murdoch
On 6/17/2005 10:30 AM, Deepayan Sarkar wrote: