what does cut(data, breaks=n) actually do?

cut(data, breaks=n) splits the range of the data into n bins of
(approximately) the same width, not bins with the same number of points.

The bin width is obtained by:
 (max(data) - min(data)) / n

 > x=rnorm(x)
 > cut(x,breaks=3)
 [1] (1.79,9.97]  (-6.39,1.79] (9.97,18.2]  (9.97,18.2]  (-6.39,1.79]
 [6] (1.79,9.97]  (-6.39,1.79] (1.79,9.97]  (-6.39,1.79] (-6.39,1.79]
Levels: (-6.39,1.79] (1.79,9.97] (9.97,18.2]

The widths of the three labelled intervals are then:
 > 18.2-9.97
[1] 8.23
 > 9.97-1.79
[1] 8.18
 > 1.79+6.39
[1] 8.18

 > (max(x)-min(x))/3
[1] 8.164187

I don't know the reason for these small differences (I am wondering about them myself).
I hope this is useful.
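A likely explanation, going by the documentation of cut(): when breaks is a single number, the outer limits are moved away by 0.1% of the range so that the extreme values fall inside the bins, and the labels are printed with only dig.lab = 3 significant digits. Below is a minimal sketch of how one might check this. The seed and the rnorm() parameters are hypothetical (not those of the session above), and bin_edges() is a helper made up for illustration:

```r
set.seed(1)                        # arbitrary seed, for reproducibility only
x <- rnorm(10, mean = 5, sd = 8)   # hypothetical parameters, not from the post
n <- 3

nominal <- diff(range(x)) / n      # (max(x) - min(x)) / n

# Hypothetical helper: recover numeric bin edges from labels like "(a,b]".
bin_edges <- function(f) {
  lv    <- levels(f)
  lower <- as.numeric(sub("^\\((.*),.*$", "\\1", lv))
  upper <- as.numeric(sub("^.*,(.*)\\]$", "\\1", lv))
  c(lower[1], upper)
}

coarse <- bin_edges(cut(x, breaks = n))               # default dig.lab = 3
fine   <- bin_edges(cut(x, breaks = n, dig.lab = 8))  # labels with more digits

print(nominal)        # nominal width
print(diff(coarse))   # widths as read from the rounded labels
print(diff(fine))     # widths with less rounding
# How far the outer edges extend beyond the data:
print(c(min(x) - fine[1], fine[n + 1] - max(x)))
```

If this reading is right, the outer edges should sit slightly outside min(x) and max(x), and the widths read from the default labels should differ from the nominal width mostly through rounding. In the output above, an upper edge near 18.15 would print as 18.2 at three significant digits, which would account for the 8.23.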
domenico
melissa cline wrote: