quantile function
On Fri, 6 Feb 2004 09:30:31 -0600 (CST)
Giovanni Petris <GPetris at uark.edu> wrote:
I am trying to `cut' a continuous variable into contiguous classes containing approximately an equal number of observations. I thought quantile() was the appropriate function to use in order to find the breakpoints, but I end up with classes of different sizes - see example below. Does anybody have an explanation for that? And what is the `recommended' way of computing what I am looking for?

Example:
ca$age
 [1] 28 42 46 45 34 44 48 45 38 45 49 45 41 46 49 46 44 48 52 48 45 50 53 57 46
[26] 52 54 57 47 52 55 59 50 54 57 60 51 55 46 63 51 59 48 35 53 59 57 37 55 32
[51] 60 43 59 37 30 47 60 38 34 48 32 38 36 49 33 42 38 58 35 43 39 59 39 43 42
[76] 60 40 44
table(cut(ca$age,breaks=c(-Inf,quantile(ca$age, seq(0,1,length=11)[-1]))))
(-Inf,35] (35,38.4] (38.4,43]   (43,45] (45,46.5] (46.5,49]   (49,52]   (52,55]
        9         7        10         8         5        10         7         7
  (55,59]   (59,63]
       10         5
Thanks in advance,
Giovanni
--
The cut2 function in the Hmisc package tries to do this the best it can.
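
A minimal sketch of how this might look on the posted data, not a definitive recipe. The uneven counts are most likely due to ties: ages are whole numbers and many repeat, and cut() always puts equal values in the same class, so no set of breakpoints can give exactly equal classes. The call cut2(x, g = 10) asks Hmisc for 10 quantile groups and is skipped if Hmisc is not installed. The rank-based grouping at the end is an assumption of this sketch, not something proposed in the thread; it evens out the counts only by letting tied ages fall into adjacent classes, which may or may not be acceptable.

```r
# Ages copied from the original post (78 observations).
age <- c(28, 42, 46, 45, 34, 44, 48, 45, 38, 45, 49, 45, 41, 46, 49, 46, 44,
         48, 52, 48, 45, 50, 53, 57, 46, 52, 54, 57, 47, 52, 55, 59, 50, 54,
         57, 60, 51, 55, 46, 63, 51, 59, 48, 35, 53, 59, 57, 37, 55, 32, 60,
         43, 59, 37, 30, 47, 60, 38, 34, 48, 32, 38, 36, 49, 33, 42, 38, 58,
         35, 43, 39, 59, 39, 43, 42, 60, 40, 44)

# How often each age occurs; repeated values are what prevent equal classes.
ties <- sort(table(age), decreasing = TRUE)
print(head(ties))

# Hmisc::cut2 with g = 10 requests 10 quantile groups (needs the Hmisc package).
if (requireNamespace("Hmisc", quietly = TRUE)) {
  print(table(Hmisc::cut2(age, g = 10)))
}

# Hypothetical base-R alternative: group by rank, breaking ties by position,
# so that tied ages can be split across neighbouring classes.
g <- 10
grp <- ceiling(rank(age, ties.method = "first") * g / length(age))
print(table(grp))
```

With ties broken this way the class sizes differ by at most one, but the classes are no longer defined purely by age intervals.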
Frank
---
Frank E Harrell Jr Professor and Chair School of Medicine
Department of Biostatistics Vanderbilt University