
sliding window over a large vector

8 messages · Chris Oldmeadow, Dimitris Rizopoulos, Veslot Jacques +4 more

#
Hi all,

I have a very large binary vector, and I wish to count the number of
1's over sliding windows.

This is my very slow function:

slide<-function(seq,window){
   n<-length(seq)-window
   tot<-c()
   tot[1]<-sum(seq[1:window])   
   for (i in 2:n) {
      tot[i]<- tot[i-1]-seq[i-1]+seq[i]
   }
   return(tot)
}
 
This works well for reasonably sized vectors. Does anybody know a
way for large vectors (length = 12 million)? I'm trying to avoid using C.

Thanks,
Chris
#
you can have a look at the rollapply() function in the zoo package, e.g.,

library(zoo)

x <- rbinom(100, 1, 0.5)
z <- zoo(x)
rollapply(z, 3, sum)


I hope it helps.

Best,
Dimitris
#
   user  system elapsed
  36.86    0.45   37.32
   user  system elapsed
   0.01    0.00    0.02
[1] TRUE

Jacques VESLOT

CEMAGREF - UR Hydrobiologie

Route de Cézanne - CS 40061
13182 AIX-EN-PROVENCE Cedex 5, France

Tél.   + 0033   04 42 66 99 76
fax    + 0033   04 42 66 99 34
email   jacques.veslot at cemagref.fr
#
For this particular problem (counting), doesn't cumsum solve it
effectively and efficiently?

    # n is the window width; the leading 0 makes vv[k+1] the sum of v[1:k]
    vv <- c(0, cumsum(v))
    vv[(n+1):length(vv)] - vv[1:(length(vv)-n)]

Of course, this doesn't work for the general case of an arbitrary
sliding window function.
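
The cumsum idea can be wrapped in a small function. This is only a minimal sketch: the name slide_cumsum and the test vector are made up for illustration and are not from the thread. Prepending a zero to the cumulative sums keeps the first window in the result.

```r
# Sliding-window sum via prefix sums.
# slide_cumsum is a hypothetical helper name, not from the original post.
slide_cumsum <- function(v, window) {
  stopifnot(window >= 1, window <= length(v))
  cs <- c(0, cumsum(v))          # cs[k + 1] holds sum(v[1:k]); cs[1] is 0
  k <- length(v) - window + 1    # number of complete windows
  # Sum of v[i:(i + window - 1)] is cs[i + window] - cs[i]
  cs[(window + 1):(window + k)] - cs[1:k]
}

# Illustrative input (an assumption, not data from the thread)
v <- c(0, 1, 1, 0, 1, 1, 1, 0)
slide_cumsum(v, 3)
# [1] 2 2 2 2 3 2
```

The work is done by one vectorised cumsum and one vectorised subtraction, so there is no R-level loop over the 12 million elements.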

     -s
On 12/15/08, Chris Oldmeadow <c.oldmeadow at student.qut.edu.au> wrote:

  
    
#
There seems to be something wrong:
[1] 2 2

but the output should be c(2, 1, 2)

At any rate try this:

library(zoo)
3 * rollmean(x, 3)


On Mon, Dec 15, 2008 at 11:19 PM, Chris Oldmeadow
<c.oldmeadow at student.qut.edu.au> wrote:
#
On Tue, Dec 16, 2008 at 8:23 AM, Gabor Grothendieck
<ggrothendieck at gmail.com> wrote:
That should be c(2, 1, 1)
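
The input used in this exchange was not preserved. As a sketch only, assume x <- c(1, 1, 0, 0, 1), a binary vector that is consistent with the outputs quoted above. The posted loop appears to differ from a correct sliding sum in two places: it stops one window early (length(seq) - window instead of length(seq) - window + 1), and it adds seq[i] rather than the element entering the window, seq[i + window - 1]. A corrected version, under the hypothetical name slide_fixed, could look like this:

```r
# Corrected sliding-window count. slide_fixed is a hypothetical name;
# the input x below is an assumption, not taken from the thread.
slide_fixed <- function(seq, window) {
  n <- length(seq) - window + 1      # number of complete windows
  tot <- numeric(n)                  # preallocate instead of growing tot
  tot[1] <- sum(seq[1:window])
  if (n > 1) {
    for (i in 2:n) {
      # drop the element leaving the window, add the one entering it
      tot[i] <- tot[i - 1] - seq[i - 1] + seq[i + window - 1]
    }
  }
  tot
}

x <- c(1, 1, 0, 0, 1)
slide_fixed(x, 3)
# [1] 2 1 1
```

Preallocating tot also avoids the repeated reallocation that growing a vector inside the loop can cause, though the loop itself remains at R level.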
#
if you want the speed, you can simply build an fts time series from
it, then apply the moving.sum function and throw away the dates.

this will probably be the fastest implementation of rolling applies
out there unless you do a cumsum difference function.

I got a sample timing of 2 seconds on a vector of length 12 million (see bottom of email).

library(fts)

your.data <- c(0,1,1,0,1,1,1,0,0,0,0,1,1,1,1)

## dates generated automatically
fake.fts <- fts(data=your.data)

answer.fts <- moving.sum(fake.fts,10)

## throw away dates
answer.as.vector <- as.numeric(answer.fts)


my timing:
user  system elapsed
  1.970   0.081   2.051
[1] 12000000
[1] 11999981
-Whit


On Tue, Dec 16, 2008 at 9:12 AM, Gabor Grothendieck
<ggrothendieck at gmail.com> wrote: