histogram first bar wrong position
Looking at the return value of hist will show you what is happening:
x <- rep(1:6,10*(6:1))
z <- hist(x, freq=TRUE)
z
$breaks
 [1] 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0

$counts
 [1] 60 50  0 40  0 30  0 20  0 10
...

The first bin is [1, 1.5], including both endpoints, while the other bins include only the upper endpoint. As a result the values 1 and 2 land in adjacent bins, while every later value is preceded by an empty bin. I recommend defining your own breakpoints, ones that do not coincide with possible data values, as in
print(hist(x, breaks=seq(min(x)-0.5, max(x)+0.5, by=1), freq=TRUE))
$breaks
[1] 0.5 1.5 2.5 3.5 4.5 5.5 6.5

$counts
[1] 60 50 40 30 20 10
...

S+ had a 'factor' method for hist() that did this sort of thing, but R does not.

Bill Dunlap
TIBCO Software
wdunlap tibco.com
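For readers who want something close to a 'factor'-style histogram in R, here is a minimal sketch using only base R. It first checks the integer-centred breaks without drawing anything, then tabulates the values as categories and draws a bar chart. Treating the data as categorical with table() and barplot() is a substitute technique, not the S+ method mentioned above, and the names h and counts are illustrative.

```r
# Same data as in the reply: 60 ones, 50 twos, ..., 10 sixes.
x <- rep(1:6, 10 * (6:1))

# Integer-centred bins, computed without plotting.
h <- hist(x, breaks = seq(min(x) - 0.5, max(x) + 0.5, by = 1), plot = FALSE)
h$counts  # 60 50 40 30 20 10
h$mids    # bar centres: 1 2 3 4 5 6

# Alternative (an assumption, not the S+ 'factor' method):
# treat the values as categories and draw one bar per level.
counts <- table(factor(x, levels = 1:6))
barplot(counts, xlab = "x", ylab = "Frequency")
```

With factor(x, levels = 1:6), a face that never occurs in a small sample still gets a bar of height zero, so the bars stay evenly spaced.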
On Thu, Dec 22, 2016 at 5:17 AM, itpro <itpro1 at yandex.ru> wrote:
Hi, everyone.

I stumbled upon weird histogram behaviour. Consider this "dice emulator":

Step 1: Generate a uniform random array x of size N.
Step 2: Multiply each item by six and round up to the next integer to get numbers 1 to 6.
Step 3: Plot a histogram.
x<-runif(N)
y<-ceiling(x*6)
hist(y,freq=TRUE, col='orange')
Here is what I get with N=100000:
x<-runif(100000)
y<-ceiling(x*6)
hist(y,freq=TRUE, col='green')
At first glance it looks OK. Now try N=100:
x<-runif(100)
y<-ceiling(x*6)
hist(y,freq=TRUE, col='red')
Now the first bar is not where it should be. Hmm. Look again at the N=100000 histogram... The first bar is not where I want it there either; it is only less striking because the bars are narrow. So the first bar is always in the wrong position. How do I fix this to get evenly spaced bars?