Gregor GORJANC <gregor.gorjanc at bfro.uni-lj.si> writes:
Hello!
Up to now I have been using hist() to display the distributions.
However, I noticed strange numbers on the y (vertical) axis if I used
the probability = T or freq = F option. I thought it was a bug, searched
the R-bugs system, and found some posts on the matter. Brian Ripley
responded to one, saying that one should look at truehist() for that. OK, I
can use truehist() if I want to see the ratios or probabilities, but
what then is the "density or probability" in hist()?
...
truehist(mydata) # looks OK
And truehist(mydata, h=.5)?
It is a density estimate. The sum of the bar _areas_ should be 1.
sum(x$intensities * .5)
[1] 0.9999998
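To spell the arithmetic out, here is a minimal sketch using the data from the original post. It assumes the default breaks shown there (bins of width 0.5) and reads the $density component, which in the output above is identical to $intensities. The names `h`, `widths` and `dens` are only for illustration.

```r
# Same data as in the original post: 22 ones, then 2, 3, 4, 5
mydata <- c(rep(1, 22), 2, 3, 4, 5)

h <- hist(mydata, plot = FALSE)   # default breaks here: 1.0, 1.5, ..., 5.0
widths <- diff(h$breaks)          # bin widths, all 0.5 for these breaks

# Each bar height is count / (n * bin width), so that
# height * width is the proportion of observations in the bin
dens <- h$counts / (sum(h$counts) * widths)

all.equal(dens, h$density)        # hand-computed heights match hist()
sum(h$density * widths)           # the bar areas add up to 1
```

So the first bar is 22 / (26 * 0.5), about 1.69. A height above 1 is not an error: a bar narrower than 1 has to be taller than its proportion for the areas to sum to 1.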
O__ ---- Peter Dalgaard Blegdamsvej 3
c/ /'_ --- Dept. of Biostatistics 2200 Cph. N
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907
Hello!
Up to now I have been using hist() to display the distributions.
However, I noticed strange numbers on the y (vertical) axis if I used
the probability = T or freq = F option. I thought it was a bug, searched
the R-bugs system, and found some posts on the matter. Brian Ripley
responded to one, saying that one should look at truehist() for that. OK, I can
use truehist() if I want to see the ratios or probabilities, but what
then is the "density or probability" in hist()?
For example:
# some data
mydata <- c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,3,4,5)
# histogram with frequencies
hist(mydata)
# histogram with ratios or probabilities
hist(mydata, freq = F) # what are those values on the vertical axis?
# let's take a look at the values behind the plot
x <- hist(mydata, freq = F, plot = F); x
$breaks
[1] 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0
$counts
[1] 22 1 0 1 0 1 0 1
$intensities
[1] 1.69230735 0.07692308 0.00000000 0.07692308 0.00000000 0.07692308 0.00000000
[8] 0.07692308
$density
[1] 1.69230735 0.07692308 0.00000000 0.07692308 0.00000000 0.07692308 0.00000000
[8] 0.07692308
$mids
[1] 1.25 1.75 2.25 2.75 3.25 3.75 4.25 4.75
$xname
[1] "mydata"
$equidist
[1] TRUE
attr(,"class")
[1] "histogram"
# HOW are these intensities and density values calculated? What do they
# actually represent?
# MASS package
library(MASS)
# again a histogram, with prob = T by default
truehist(mydata) # looks OK
Lep pozdrav / With regards / Con respeto,
Gregor GORJANC
---------------------------------------------------------------
University of Ljubljana
Biotechnical Faculty URI: http://www.bfro.uni-lj.si
Zootechnical Department mail: gregor.gorjanc <at> bfro.uni-lj.si
Groblje 3 tel: +386 (0)1 72 17 861
SI-1230 Domzale fax: +386 (0)1 72 41 005
Slovenia