Skip to content

Head or Tails game

10 messages · David Arnold, David Winsemius, R. Michael Weylandt +4 more

#
Hi,

Reading about a "Heads and Tails" game in 
http://www.dartmouth.edu/~chance/teaching_aids/books_articles/probability_book/amsbook.mac.pdf
Introduction to Probability  (Example 1.4, pp. 5-8).

You toss a coin 40 times. If heads, Peter wins $1, tails, he loses $1. I
think I can do that ok with:

winnings <- sum(sample(c(-1,1), 40, replace=TRUE))

But I have to do it 10,000 times and I have to record and collect the
winnings. In other languages, I would probably use a for loop and "push"
each winnings into some sort of collective array or vector. However, for
loops seem to be discouraged in R and it seems the same can be said for
"pushing" a calculation onto a vector. So, can someone give me some guidance
on how to collect 10,000 winnings?

The second part of the game asks us to keep track of how often Peter is in
the lead during each game. Obviously, he is in the lead at any moment his
cumulative winnings are positive. But the game requires that we also do
something at the moment the cumulative winnings are zero. (1) if the
previous cumulative sum was nonnegative, then the zero counts a "staying in
the lead." So, for example, during a single game, Peter might be in the lead
for say 34 out of the 40 tosses. I must record the 34 and perform the game
9,999 more times, each time recording the number of times that Peter is in
the lead. So again, any thoughts on how to do  this without for loops and
"pushing?"

Thanks for the help. Great list.

David Arnold
College of the Redwoods
Eureka, CA
http://msemac.redwoods.edu/~darnold/index.php



--
View this message in context: http://r.789695.n4.nabble.com/Head-or-Tails-game-tp4639142.html
Sent from the R help mailing list archive at Nabble.com.
#
On Aug 3, 2012, at 5:55 PM, darnold wrote:

            
What's wrong with:

set.seed(123)  # always good to make reproducible
winnings <- sum(sample(c(-1,1), 10000, replace=TRUE))
?cumsum
?which
? replicate

  
    
#
David,

set.seed(123)  # always good to make reproducible 
winnings <- sum(sample(c(-1,1), 10000, replace=TRUE)) 

Unfortunately, that's not the game. The game requires 40 flips of a coin.

Then you have to play the game 10,000 times.

D.



--
View this message in context: http://r.789695.n4.nabble.com/Head-or-Tails-game-tp4639142p4639145.html
Sent from the R help mailing list archive at Nabble.com.
#
On Aug 3, 2012, at 9:14 PM, darnold <dwarnold45 at suddenlink.net> wrote:

            
colSums(matrix(sample(c(-1, 1), 40*10000, TRUE), ncol = 10000))

or some such

Michael
#
HI,

You could try this:
set.seed(112)
list1<-vector("list",1000)
for(i in 1:1000){
?list1[[i]]<-sample(c(-1,1),40,replace=TRUE)}
?dat1<-do.call(rbind,lapply(list1,function(x) sum(x)))
dat2<-matrix(dat1,ncol=20,byrow=TRUE)
head(dat2)
?# ?? [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14]
#[1,]?? -2?? 12??? 0??? 8??? 0??? 4?? -8??? 2?? -8??? -4?? -14???? 8??? -4?? -12
#[2,]??? 8??? 4?? -4??? 4?? 12?? -2?? -6??? 0?? -8???? 8??? -8??? 12???? 2???? 2
#[3,]?? -2??? 8??? 2?? -6? -12?? -6??? 8??? 6?? -4???? 4???? 4???? 8???? 6??? 10
#[4,]??? 6??? 6??? 0??? 4?? 10?? -8?? -4?? -2??? 4?? -10??? -6??? -2???? 4???? 2
#[5,]?? -6?? -4?? -6?? -6??? 4??? 0? -14?? -2??? 0??? -8??? -6???? 2???? 4??? -8
#[6,]??? 0?? 10??? 0??? 0? -14?? -2??? 2??? 0??? 6?? -10???? 4???? 0??? -4???? 4
? # ? [,15] [,16] [,17] [,18] [,19] [,20]
#[1,]???? 4??? 12??? -6???? 4??? 10???? 8
#[2,]??? -6???? 0???? 4???? 8???? 0??? 12
#[3,]??? -2??? -4??? 10?? -10???? 6?? -10
#[4,]???? 2???? 6???? 4???? 6???? 8???? 6
#[5,]??? 10???? 0??? -6??? -2??? 10??? -8
#[6,]???? 0?? -12??? 16??? -2???? 2???? 4


#system time for 10,000 times
system.time({
?set.seed(112)
?list1<-vector("list",10000)
?for(i in 1:10000){
? list1[[i]]<-sample(c(-1,1),40,replace=TRUE)}
? dat1<-do.call(rbind,lapply(list1,function(x) sum(x)))
?dat2<-matrix(dat1,ncol=200,byrow=TRUE)
?})
?? user? system elapsed 
? 0.112?? 0.000?? 0.111 


A.K.




----- Original Message -----
From: darnold <dwarnold45 at suddenlink.net>
To: r-help at r-project.org
Cc: 
Sent: Friday, August 3, 2012 10:14 PM
Subject: Re: [R] Head or Tails game

David,

set.seed(123)? # always good to make reproducible 
winnings <- sum(sample(c(-1,1), 10000, replace=TRUE)) 

Unfortunately, that's not the game. The game requires 40 flips of a coin.

Then you have to play the game 10,000 times.

D.



--
View this message in context: http://r.789695.n4.nabble.com/Head-or-Tails-game-tp4639142p4639145.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
#
Use a loop to run over the 10,000 reps and in each rep use
rle(x>=0) to look at the runs of nonnegative numbers, postprocessing
its output a bit to count them.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
#
Two possible solutions below.
On Fri, 3 Aug 2012, darnold wrote:

            
In general, vectorizing means doing a bunch of calculations in each
step. Keep in mind that apply and friends are basically convenience
notation for loops (they are convenient, but are best applied after
a bunch of vectorization has already been applied).

In some cases multiple levels of loops can be vectorized together.
I was only able to manage one level, but choosing the RIGHT level
to vectorize is important... get as many calculations done at each
step of code as possible to get the fastest speedup.

The ideas:
  1) Have "sample" do as many coin flips at once as possible. Fold
     the results into a matrix to make it look like multiple
     sets of results.
  2) Use vector logical tests to perform the game logic. Use
     vector indexing if appropriate to look at "previous" results.
  3) Use automatic conversion of logical results to integer for
     accumulating counts of successes.
  4) Consider using cumsum or similar vectorized operations.
  5) Since there are more games than moves in each game, consider
     simulating each step of all games vectorially to minimize
     the number of looping steps.

The code:

#------------------------------------
# Head or Tails game

# Approach by vectorizing each game evaluation and looping over games
loopbygame <- function( results ) {
   gameleadcount <- function(v) {
    pot <- cumsum( v )
    sum( pot > 0 | c( FALSE, 0==pot[ -1 ] & 0 != pot[ -nrow( results ) ] ) )
   }
   apply(results,2,gameleadcount)
}

# Approach by vectorizing one step from all games and looping over steps
loopbymove <- function( results ) {
   pot <- rep( 0, ncol( results ) )
   leadcount <- rep( 0, ncol( results ) )
   for ( moveno in 1:nrow( results ) ) {
     oldpot <- pot
     pot <- pot + results[ moveno, ]
     leadcount <- leadcount + ( pot > 0 | 0 == pot & 0 != oldpot )
   }
   leadcount
}

gamelen <- 40
gamecount <- 10000
set.seed( 123 )
results <- matrix( sample( c( -1, 1 ), gamelen*gamecount, replace=TRUE  )
                  , ncol=gamecount )

system.time( ans1 <- loopbymove( results ) )
system.time( ans2 <-loopbygame( results ) )
all.equal( ans1, ans2 )

My results:
user  system elapsed
   0.040   0.000   0.037
user  system elapsed
   0.224   0.000   0.219
[1] TRUE
You are welcome.  Note that on this "great list" normally requests for 
"ideas" on optimization without a concrete, reproducible code example of 
some type are groused at for not showing sufficient research on your part 
and for essentially asking us to do your work for you. In this case I
found the problem interesting enough to respond in spite of your lack
of reproducible (even if inefficient) code.

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                       Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
#
Wow! Some great responses!

I am getting some great responses. I've only read David, Michael, and Dennis
thus far, leading me to develop this result before reading further.

lead <- function(x) {
  n <- length(x)
  count <- 0
  if (x[1] >= 0) count <- count + 1
  for (i in 2:n) {
    if (x[i] > 0 || (x[i] == 0 && x[i-1] >= 0 )) {
      count  <- count + 1
    }
  }
  count
}

games <- replicate(10000,sample(c(-1,1),40,replace=TRUE))

games_sum <- apply(games,2,sum)
plot(table(games_sum))

games_lead <- apply(games,2,cumsum)
games_lead <- apply(games_lead,2,lead)
plot(table(games_lead))

Now I am going to read Arun, William, and Jeff's responses and see what
other ideas are being proposed.

Thanks everyone.

D.



--
View this message in context: http://r.789695.n4.nabble.com/Head-or-Tails-game-tp4639142p4639150.html
Sent from the R help mailing list archive at Nabble.com.
#
Hi,

system.time({
set.seed(111)
colSums(matrix(sample(c(-1, 1), 40*10000, TRUE), ncol = 10000))
})
user? system elapsed 
? 0.032?? 0.012?? 0.041?

system.time({
?set.seed(112)
?list1<-vector("list",10000)
?for(i in 1:10000){
? list1[[i]]<-sample(c(-1,1),40,replace=TRUE)}
? dat1<-do.call(rbind,lapply(list1,function(x) sum(x)))
?dat2<-matrix(dat1,ncol=200,byrow=TRUE)
?})
?? user? system elapsed 
? 0.112?? 0.000?? 0.111?
#modified version

system.time({
?set.seed(112)
?list1<-vector("list",10000)
?for(i in 1:10000){
? list1[[i]]<-sample(c(-1,1),40,replace=TRUE)}
? dat1<-unlist(lapply(list1,function(x) sum(x)))
? })
?user? system elapsed 
?0.092?? 0.000?? 0.092 


It seems like Michael's solution is better in terms of the CPU utilization.? I guess, the loop created the difference.

A.K.



----- Original Message -----
From: Michael Weylandt <michael.weylandt at gmail.com>
To: darnold <dwarnold45 at suddenlink.net>
Cc: "r-help at r-project.org" <r-help at r-project.org>
Sent: Friday, August 3, 2012 10:20 PM
Subject: Re: [R] Head or Tails game
On Aug 3, 2012, at 9:14 PM, darnold <dwarnold45 at suddenlink.net> wrote:

            
colSums(matrix(sample(c(-1, 1), 40*10000, TRUE), ncol = 10000))

or some such

Michael
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
2 days later
#
Here is another solution that doesn't need to define an additional function with an explicit loop. It seems to be considerably faster than the approach presented above.

system.time({
  set.seed(123)
  games <- matrix(sample(c(-1, 1), 40*10000, TRUE), ncol = 10000)
  games_sum <- apply(games,2,cumsum)
  games_lead <- colSums((games_sum > 0) | (games_sum==0 & games==-1))
})
   user  system elapsed 
   0.08    0.00    0.08 

plot(table(games_sum[40,]))
plot(table(games_lead))


Compare this with your solution

system.time({
set.seed(123)
games <- replicate(10000,sample(c(-1,1),40,replace=TRUE))

games_sum <- apply(games,2,sum)

games_lead <- apply(games,2,cumsum)
games_lead <- apply(games_lead,2,lead)
})
   user  system elapsed 
   0.95    0.02    0.98 


Hope this is helpful,

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204