Generating a binomial random variable correlated with a

On 17-Apr-05 Ted Harding wrote:
Here is a follow-up (somewhat inspired by a remark in Bill's
suggestions). Consider the function

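  test<-function(C,N){
    # R: number of ones per replicate; corrs: cor(X1,Y) per replicate
    R<-numeric(N); corrs<-numeric(N)
    for(i in (1:N)){
      # r ~ Binomial(100,0.5): how many ones X1 will contain
      X0 <- rbinom(100,1,0.5) ; r<-sum(X0)
      Y <- rnorm(100) ; Y<-sort(Y)
      # selection weights: logistic of C times the logit of the
      # plotting positions, so larger Y values get larger weights
      p0<-((1:100)-0.5)/100 ; L<-log(p0/(1-p0))
      p<-exp(C*L) ; p<-p/(1+p)
      # place the r ones at positions drawn with those weights
      ix<-sample((1:100),r,replace=FALSE,prob=p)
      X1<-numeric(100); X1[ix]<-1
      R[i]<-r ; corrs[i]<-cor(X1,Y)
    }
    list(R=R,corrs=corrs)
  }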
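
To see how C acts, it may help to look at the selection weights on
their own. The sketch below recomputes the same p as in test(), using
plogis() and qlogis() in place of the explicit formulas; the helper
name weights_for is only for illustration and is not part of the
function above. Note that sample() applies prob as weights for
sequential draws without replacement, so these are not exact
inclusion probabilities.

```r
# Plotting positions of the sorted Y values, as in test()
p0 <- ((1:100) - 0.5) / 100

# Hypothetical helper: p = exp(C*L)/(1+exp(C*L)) with L = logit(p0)
weights_for <- function(C) plogis(C * qlogis(p0))

w0 <- weights_for(0)    # C = 0: every position has the same weight
w1 <- weights_for(1)    # C = 1: the weights are just p0
w5 <- weights_for(5.2)  # large C: weight piles up on the largest Y

print(range(w0))
print(round(w1[c(1, 50, 51, 100)], 3))
print(signif(w5[c(1, 50, 51, 100)], 3))
```

With C = 0 the ones are placed without regard to Y, so the expected
correlation is zero; as C grows the ones are pushed more and more
firmly onto the upper end of the sorted Y.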

By experimenting with different values of C, you can see what
you get. For instance:

  mean(test(0.1,1000)$corrs)
  ## [1] 0.05958913

  mean(test(0.5,1000)$corrs)
  ## [1] 0.2722242

  mean(test(1.0,1000)$corrs)
  ## [1] 0.4347375

and finally (in my trials):

  mean(test(5.2,1000)$corrs)
  ## [1] 0.702623

  mean(test(5.22,10000)$corrs)
  ## [1] 0.6998803

and that might just be close enough to your desired 0.7 ...!
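
The trial-and-error search for C can also be run as a coarse grid
scan. What follows is only a sketch under some assumptions: the names
mean_corr, n, seed, target and grid are made up for illustration, the
count r is drawn directly with rbinom(1,n,0.5) rather than by summing
100 Bernoulli draws (same distribution, different random stream), and
a fixed seed is reset for each C so that the values are comparable.

```r
# Mean of cor(X1, Y) over N replicates for a given C.
# Same construction as test(); hypothetical wrapper, not the original.
mean_corr <- function(C, N = 500, n = 100, seed = 1) {
  set.seed(seed)                        # same random stream for each C
  p0 <- ((1:n) - 0.5) / n
  p  <- plogis(C * qlogis(p0))          # selection weights, rising with rank
  corrs <- numeric(N)
  for (i in seq_len(N)) {
    r  <- rbinom(1, n, 0.5)             # binomial count of ones
    Y  <- sort(rnorm(n))
    X1 <- numeric(n)
    X1[sample.int(n, r, replace = FALSE, prob = p)] <- 1
    corrs[i] <- cor(X1, Y)
  }
  mean(corrs)
}

target <- 0.7
grid   <- seq(4, 6.5, by = 0.5)         # assumed search range for C
est    <- sapply(grid, mean_corr)
best_C <- grid[which.min(abs(est - target))]

print(data.frame(C = grid, mean_corr = round(est, 3)))
print(best_C)
```

A finer grid around best_C, or a larger N, would narrow the value
down further; the simulation noise in each mean sets the limit on how
precisely C can be pinned.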

So now you have a true Binomial sample, an associated true
Normal sample, and a Pearson correlation between their values
that averages about 0.7 over repeated samples.

Now of course comes the question which has been intriguing
us all: What is the purpose of achieving a given Pearson
correlation between a Normal variate and a binary variate?

Best wishes,
Ted.


--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 17-Apr-05                                       Time: 14:10:43
------------------------------ XFMail ------------------------------