Skip to content

simulate binary markov chain

2 messages · Chris Oldmeadow, Charles C. Berry

#
Hi all, I was hoping somebody may know of a function for simulating a 
large binary sequence (length >10 million) using a (1st order) markov 
model with known (2x2) transition matrix. It needs to be reasonably 
fast. I have tried the following;

mc<-function(sq,P){
  s<-c()
  x<-row.names(P)
  n<-length(sq)
  p1<-sum(sq)/n
  s[1] <- rbinom(1,1,p1);
  for ( i in 2:n){
     s[i] <- rbinom( 1, 1, P[s[i-1]+1] )
  }
  return(s)
}


P<-c(0.63,0.27)
x<-rbinom(500,1,0.5)
new<-mc(x,P)

thanks in advance!
Chris
#
On Wed, 17 Dec 2008, Chris Oldmeadow wrote:

            
Chris,

The trick is to recognize that the length of each run is a sample from
the geometric distribution (with 1 added to it). rgeom() is vectorized,
so using it provides fast results.

Suppose that your transition matrix is

    |   | 0     | 1     |
    |---+-------+-------|
    | 0 | pi.11 | pi.12 |
    | 1 | pi.21 | pi.22 |
    |---+-------+-------|

where pi.11+p.12 == 1 and pi.21+pi.22 == 1

This function

  foo <- function(n,pi.12,pi.21) inverse.rle( list(values=rep(0:1,n) ,
 		lengths=1+rgeom( 2*n, rep( c( pi.12, pi.21 ), n) )))

will generate a sequence of 0/1's according to that matrix with length approximately n/pi.12+n/pi.21

On my macbook I get this timing:
user  system elapsed
   1.088   0.204   1.291
0         1
   0 0.6999024 0.3000976
   1 0.1997453 0.8002547
[1] 10048040
So, if this is fast enough, you just choose 'n' to be a bit larger
than desired length divided by (1/pi.12+1/pi.21) and then discard the 
excess.

Chuck


I have tried
Charles C. Berry                            (858) 534-2098
                                             Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu	            UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901