Unexpected behavior from hist()

4 messages · Mohamed Badawy, Sarah Goslee, David L Carlson +1 more

#
Hi,
On Thu, Jun 13, 2013 at 11:13 AM, Mohamed Badawy <mbadawy at pm-engr.com> wrote:
You don't provide a reproducible example, so here's some fake data:

somedata <- runif(1000)
Because you misread the help. Using freq=FALSE (equivalent to
prob=TRUE, which is a legacy option), you are getting:

freq: logical; if 'TRUE', the histogram graphic is a representation
          of frequencies, the 'counts' component of the result; if
          'FALSE', probability densities, component 'density', are
          plotted (so that the histogram has a total area of one).
          Defaults to 'TRUE' _if and only if_ 'breaks' are equidistant
          (and 'probability' is not specified).


It sounds like what you actually want is:

somehist <- hist(somedata, plot=FALSE)
somehist$counts <- somehist$counts/sum(somehist$counts)
plot(somehist)
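
The difference between the two scalings can be checked numerically. The following is only a sketch on the same kind of fake data; the seed and the names h, rel and widths are illustrative assumptions, not part of the original post:

```r
set.seed(42)                       # assumption: added only for repeatability
somedata <- runif(1000)

h <- hist(somedata, plot = FALSE)

# Relative frequencies: share of observations per bin; the HEIGHTS sum to 1.
rel <- h$counts / sum(h$counts)
sum(rel)

# Densities: relative frequency divided by bin width; the AREAS sum to 1.
widths <- diff(h$breaks)
all.equal(h$density, rel / widths)
sum(h$density * widths)
```

If the rescaled counts are plotted as above, the y-axis will still be labelled "Frequency"; something like plot(somehist, ylab = "Relative frequency") should give a more accurate label.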
Nabble is not the R-help mailing list. Posting via email is the
correct thing to do.

Sarah
#
Density means that the AREAS of the bars add to 1, not the HEIGHTS
of the bars. You probably have intervals that are less than 1. Eg:
$breaks
 [1] 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 0.11 0.12 0.13

$counts
 [1]  42  88 151 177 178 131  97  70  43  14   6   2   1

$density
 [1]  4.2  8.8 15.1 17.7 17.8 13.1  9.7  7.0  4.3  1.4  0.6  0.2  0.1

$mids
 [1] 0.005 0.015 0.025 0.035 0.045 0.055 0.065 0.075 0.085 0.095 0.105 0.115
[13] 0.125

$xname
[1] "x"

$equidist
[1] TRUE

attr(,"class")
[1] "histogram"
 [1] 0.042 0.088 0.151 0.177 0.178 0.131 0.097 0.070 0.043 0.014 0.006 0.002
[13] 0.001
[1] 1
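
The commands behind the output above did not survive in the archive. As a rough sketch, the printed numbers can be re-derived from the $breaks and $counts shown; the variable names below are illustrative, and this is not necessarily how the output was originally produced:

```r
# Values copied from the $breaks and $counts output above.
breaks <- seq(0, 0.13, by = 0.01)
counts <- c(42, 88, 151, 177, 178, 131, 97, 70, 43, 14, 6, 2, 1)

widths <- diff(breaks)             # every bin is 0.01 wide

# Density = count / (n * bin width); compare with $density above.
density <- counts / (sum(counts) * widths)
density

# Bar areas (height * width) are the relative frequencies, and they sum to 1.
areas <- density * widths
areas
sum(areas)
```

Because each bin is only 0.01 wide, a bar holding 17.8% of the data needs a height of 17.8 for its area to be 0.178, which is why the heights can exceed 1.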

-------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352


-----Original Message-----
From: r-help-bounces at r-project.org
[mailto:r-help-bounces at r-project.org] On Behalf Of Sarah Goslee
Sent: Thursday, June 13, 2013 10:36 AM
To: Mohamed Badawy
Cc: r-help at r-project.org
Subject: Re: [R] Unexpected behavior from hist()

Hi,

On Thu, Jun 13, 2013 at 11:13 AM, Mohamed Badawy
<mbadawy at pm-engr.com> wrote:
with a raw data set of length 22,000, here is what I had:
near 5000.
You don't provide a reproducible example, so here's some fake data:

somedata <- runif(1000)
rectangles, but the highest rectangle is obviously higher than 1,
how can this be?!!!

Because you misread the help. Using freq=FALSE (equivalent to
prob=TRUE, which is a legacy option), you are getting:

freq: logical; if 'TRUE', the histogram graphic is a representation
          of frequencies, the 'counts' component of the result; if
          'FALSE', probability densities, component 'density', are
          plotted (so that the histogram has a total area of one).
          Defaults to 'TRUE' _if and only if_ 'breaks' are equidistant
          (and 'probability' is not specified).


It sounds like what you actually want is:

somehist <- hist(somedata, plot=FALSE)
somehist$counts <- somehist$counts/sum(somehist$counts)
plot(somehist)
posted it from Nabble, reason was "Message rejected by filter rule
match"

Nabble is not the R-help mailing list. Posting via email is the
correct thing to do.

Sarah