
grouping and counting in dataframe

5 messages · zem, David Winsemius, jim holtman

zem
#
Hi all,

I have a little problem: I have written some code, but it is too slow.

I have a data frame with a column of time series values and a grouping column.
It does not really matter what kind of data is in the first column; it can
be random numbers, like this:
x<-rnorm(10)
gr<-c(1,3,1,2,2,4,2,3,3,3)
x<-cbind(x,gr)

Now, for every row i, I have to count how many values of x[,1] within the
same group fall in the range x[i,1] +/- k (k is another number).

thanks in advance
#
On Feb 25, 2011, at 8:28 PM, zem wrote:

            
That is not a data frame. It is a matrix. And not all time series
objects are the same, so you should not assume that any old two-column
object will respond the same way to R functions.
You may find that the function findInterval is useful. I cannot
determine what your goal is from the description, and there is no
complete example with a specification of what the correct output would
be, as you should have seen requested in the Posting Guide.
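
A minimal sketch of how findInterval might be applied to this kind of windowed count, assuming the goal is to count, within each group, the values that lie in the closed interval [x - k, x + k]. The helper name count_in_window and the sample values are hypothetical and do not come from the thread; the left.open argument requires R 3.3.0 or later.

```r
# Hypothetical helper (not from the thread): for each element of x, count how
# many elements of x lie in the closed interval [x - k, x + k].
count_in_window <- function(x, k) {
  s <- sort(x)
  # number of values that are <= x + k
  upper <- findInterval(x + k, s)
  # number of values strictly below x - k (left.open needs R >= 3.3.0)
  lower <- findInterval(x - k, s, left.open = TRUE)
  upper - lower
}

# Illustrative data only; these values are made up for the sketch.
vals <- c(10, 12, 30, 11, 50, 13)
grp  <- c(1, 1, 1, 2, 2, 2)

# ave() applies the helper within each group and keeps the original row order.
ct <- ave(vals, grp, FUN = function(v) count_in_window(v, k = 2))
print(ct)
```

Because the values are sorted once per group, this approach should scale better than comparing every pair of values when the groups are large.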

  
    
zem
#
sry, 
new try: 

tm<-c(12345,42352,12435,67546,24234,76543,31243,13334,64562,64123) 
gr<-c(1,3,1,2,2,4,2,3,3,3) 
d<-data.frame(time=tm,gr)

where tm holds Unix times and gr is the grouping factor.
I have a scalar, for example k=500.
Now, for every row, I need to count how many observations in the same group
are in the interval [i-500;i+500], where i is the tm value of the current row, like this:
    time gr ct
1  12345  1  2
2  42352  3  0
3  12435  1  2
4  67546  2  0
5  24234  2  0
6  76543  4  0
7  31243  2  0
8  13334  3  0
9  64562  3  2
10 64123  3  2

i hope that was a better illustration of my problem
1 day later
zem
#
Does nobody have any idea?
I have already tried tapply(d,gr, ... ), but I have trouble choosing
the function. I am also not really sure whether tapply is the right
direction.
It would be really great if somebody came up with a new suggestion.

Thanks
#
Here is one solution. My counts differ from yours because each range always
contains at least one item, namely the value itself:

      tm gr
1  12345  1
2  42352  3
3  12435  1
4  67546  2
5  24234  2
6  76543  4
7  31243  2
8  13334  3
9  64562  3
10 64123  3
+     # determine count in the range
+     sapply(x, function(a) sum((x >= a - 500) & (x <= a + 500)))
+ })
      tm gr ct
1  12345  1  2
2  42352  3  1
3  12435  1  2
4  67546  2  1
5  24234  2  1
6  76543  4  1
7  31243  2  1
8  13334  3  1
9  64562  3  2
10 64123  3  2
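
For readers who want to reproduce the table above, here is a self-contained sketch. It assumes the per-group step is done with ave(), which is a guess: the call that wrapped the sapply() line did not survive in the post, so the wrapper is a stand-in and not necessarily what was originally written. The variable k is introduced here for the window half-width of 500.

```r
tm <- c(12345, 42352, 12435, 67546, 24234, 76543, 31243, 13334, 64562, 64123)
gr <- c(1, 3, 1, 2, 2, 4, 2, 3, 3, 3)
d  <- data.frame(tm, gr)
k  <- 500  # half-width of the window around each time value

# Assumption: ave() stands in for the wrapper call missing from the post.
# It runs the function once per group and returns results in row order.
d$ct <- ave(d$tm, d$gr, FUN = function(x) {
  # determine count in the range (each value also counts itself)
  sapply(x, function(a) sum((x >= a - k) & (x <= a + k)))
})
print(d)
```

If the row itself should not be counted, subtracting 1 from ct would be one way to get the number of other observations in the window.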
On Sat, Feb 26, 2011 at 5:10 PM, zem <zmanolova at gmail.com> wrote: