Message-ID: <5100595c-d2f7-4b48-9139-45e65a47c6ca@email.android.com>
Date: 2012-12-06T22:21:47Z
From: Jeff Newmiller
Subject: Best way to coerce numerical data to a predetermined histogram bin?
In-Reply-To: <CABG0rftbt-1osM41QK3cZG2o02XVbiWAcYky_ZCscn4qmQ8X+A@mail.gmail.com>

?cut
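
A minimal sketch of what ?cut points to, using the breaks and data from the
question quoted below. The set.seed() call and the table() check are
additions for illustration only; they are not part of the original question.

```r
breaks <- c(1:10, 15)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
data <- runif(n = 100, min = 1, max = 20)

# labels = FALSE makes cut() return integer bin codes 1..(length(breaks) - 1)
# instead of a factor. With the default right = TRUE the bins are of the
# form (lower, upper], and values outside (1, 15] come back as NA.
bin_ids <- cut(data, breaks = breaks, labels = FALSE)

# Quick check: counts per bin, with the out-of-range values shown as NA.
table(bin_ids, useNA = "ifany")
```

Note that a value exactly equal to the first break (1 here) is also NA
unless include.lowest = TRUE is passed, which matches the "greater than one
break" rule in the question.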
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
--------------------------------------------------------------------------- 
Sent from my phone. Please excuse my brevity.

Jonathan Greenberg <jgrn at illinois.edu> wrote:

>Folks:
>
>Say I have a set of histogram breaks:
>
>breaks=c(1:10,15)
>
># With bin ids:
>
>bin_ids=1:(length(breaks)-1)
>
># and some data (note that some of it falls outside the breaks):
>
>data=runif(min=1,max=20,n=100)
>
>***
>
>What is the MOST EFFICIENT way to "classify" data into the histogram bins
>(return the bin_ids) and, say, return NA if the value falls outside of the
>bins.
>
>By classify, I mean if the data value is greater than one break, and less
>than or equal to the next break, it gets assigned that bin's ID (note that
>length(breaks) = length(bin_ids)+1)
>
>Also note that, as per this example, the bins are not necessarily equal
>widths.
>
>I can, of course, cycle through each element of data, and then move through
>breaks, stopping when it finds the correct bin, but I feel like there is
>probably a faster (and more elegant) approach to this.  Thoughts?
>
>--j