Binning continuous data

2 messages · Faryabi, Robert (NIH/NCI) [F], David Winsemius

#
Hi there,

Here is the scenario:

I have measurements of some sort for two variables, and I would like to figure out a rough pattern between them. Let's say the values of the first variable are low, middle, high, and extremely high; what would be the corresponding pattern of the second variable? The idea is not to find the 2d distribution, but to plot the conditional distribution of the second variable based on the binning of the first variable and then present it in a boxplot.

I got the breakpoints for binning the first variable from a bi-modal density estimate. Now I need to bin the first variable accordingly and map the bins to categorical values.

Is there an R command that does the binning?

Thanks,
Robert
#
On Feb 29, 2012, at 5:01 PM, Faryabi, Robert (NIH/NCI) [F] wrote:

            
It sounds as though you want `cut` and `table`. Whether that is the
best use of the data is more questionable. Generally the
categorization process removes quite a bit of the information content
and may either introduce significant biases when the cuts are chosen
after looking at the data or lower power when any inferential test
is used. You _should_ also look at 2d density estimation as a method
that is less susceptible to these distortions.

help( kde2d, package=MASS)

help( bkde2D , package=KernSmooth)

help( s.kde2d , package=ade4)
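
If you do go the `cut` route, a minimal sketch might look like the following. The data here are simulated, and the breakpoints in `brks` and the bin labels are made up for illustration; substitute the breakpoints from your own density estimate. The last three lines show the 2d density alternative with `kde2d`, using an arbitrary grid size.

```r
library(MASS)   # for kde2d()

# Simulated stand-in data: x is bimodal, y depends loosely on x
set.seed(1)
x <- c(rnorm(200, mean = 2), rnorm(200, mean = 8))
y <- 0.5 * x + rnorm(400)

# Hypothetical breakpoints (replace with the ones from your density estimate);
# -Inf and Inf make sure every value lands in some bin
brks <- c(-Inf, 2, 5, 8, Inf)

# cut() returns a factor; intervals are right-closed, (a, b], by default
xbin <- cut(x, breaks = brks,
            labels = c("low", "middle", "high", "extremely high"))

table(xbin)                                         # counts per bin
boxplot(y ~ xbin, xlab = "x (binned)", ylab = "y")  # y conditional on bin

# Alternative without binning: 2d kernel density estimate on a 50 x 50 grid
dens <- kde2d(x, y, n = 50)
contour(dens, xlab = "x", ylab = "y")
```

Use `right = FALSE` in `cut` if you want intervals closed on the left instead, and check `table(xbin)` before plotting, since a nearly empty bin gives a boxplot that is not worth much.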