what does cut(data, breaks=n) actually do?

Peter Dalgaard wrote:
It can be a bit dangerous to use quantile() to provide breaks for cut(),
because quantiles can be non-unique, which cut() doesn't like:
Error in cut.default(x1, breaks = quantile(x1, (0:2)/2)) :
   'breaks' are not unique
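
For anyone who wants to reproduce that error, here is a minimal sketch. The data vector is an assumption on my part (nine 1s and a single 2, chosen so that the median ties with the minimum); it is not necessarily the x1 used in the session above.

```r
# Hypothetical data: nine 1s and one 2, so the 50% quantile equals the minimum.
x1 <- c(rep(1, 9), 2)

# Quantiles at 0%, 50% and 100%; the first two coincide for this data.
br <- quantile(x1, (0:2)/2)
print(br)

# cut() stops on duplicated breaks; capture the message rather than aborting.
msg <- tryCatch(
  {
    cut(x1, breaks = br)
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

Any data with enough ties to make two requested quantiles coincide should trigger the same complaint.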
However, cut2() in Hmisc handles this situation gracefully:
Attaching package: 'Hmisc'
        The following object(s) are masked from package:base :
          format.pval,
          round.POSIXt,
          trunc.POSIXt,
          units
[1] 1 1 1 1 1 1 1 1 1 2
Levels: 1 2
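
If Hmisc is not available, one possible base-R workaround is to drop the duplicated breaks before calling cut(). This is only a sketch under assumed data (the same hypothetical nine 1s and one 2), and it is not what cut2() does internally: collapsing the tied breaks merges intervals, so you can end up with fewer groups than you asked for, whereas the cut2() result above still has two levels.

```r
# Hypothetical data with heavy ties.
x1 <- c(rep(1, 9), 2)
br <- quantile(x1, (0:2)/2)

# Remove duplicated break points; include.lowest = TRUE keeps the minimum
# value inside the first interval instead of turning it into NA.
f <- cut(x1, breaks = unique(br), include.lowest = TRUE)

# All ten values fall into the single remaining interval.
print(table(f))
```

Whether merging groups like this is acceptable depends on what the groups are for, so check the number of levels you get back.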
(Another potentially dangerous peculiarity of quantile() for this kind of
purpose is that its return values can be out of order at rounding-error
level, i.e., diff(quantile(...)) < 0. This doesn't actually upset cut() in
R, though, because cut() sorts the breaks before using them.)
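
The sorting behaviour is easy to check with deliberately shuffled breaks. The sketch below uses made-up values and does not try to reproduce the rounding-error case from quantile() itself, which is hard to trigger on demand.

```r
# Hypothetical data and breaks, with the breaks supplied out of order.
x <- 1:10
shuffled <- cut(x, breaks = c(10, 0, 5))
ordered  <- cut(x, breaks = c(0, 5, 10))

# Both calls give the same factor, because cut() sorts a breaks vector
# of length > 1 before forming the intervals.
print(identical(shuffled, ordered))
print(levels(shuffled))
```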

-- Tony Plate