Dear list

As a relative newbie to R I am after some basic help. I am wanting to simulate a data set consisting of a Y variable and several X variables, all either binary or discrete. I am wondering how to go about doing this and have failed to find anything about this in the R -help.

Thanks in advance
Laura

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
simulating binary variables
4 messages · laura@bayesian-bay.freeserve.co.uk, Ulrich Flenker, Jonathan Baron +1 more
On Fri, 9 Aug 2002 laura at bayesian-bay.freeserve.co.uk wrote:
Dear list As a relative newbie to R I am after some basic help. I am wanting to simulate a data set consisting of a Y variable and several X variables, all either binary or discrete. I am wondering how to go about doing this and have failed to find anything about this in the R -help. Thanks in advance Laura
Laura, have a look at help(rbinom) and help(rpois).
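As a rough illustration of what those two help pages cover, here is a minimal sketch that draws a few binary and discrete X variables. The sample size, the probabilities, the Poisson mean, and the variable names x1, x2, x3 are all assumptions made up for this example, not values from the thread.

```r
set.seed(1)                            # arbitrary seed, only so the draws are reproducible
n  <- 1000                             # assumed number of observations

x1 <- rbinom(n, size = 1, prob = 0.3)  # binary 0/1 variable with P(x1 = 1) = 0.3
x2 <- rbinom(n, size = 4, prob = 0.5)  # discrete variable taking values 0..4
x3 <- rpois(n, lambda = 2)             # non-negative integer counts with mean 2

dat <- data.frame(x1, x2, x3)          # collect the simulated columns in one data set
table(dat$x1)                          # quick check of the 0/1 split
```

With size = 1, rbinom() returns plain 0/1 values, which is usually what is wanted for a binary variable.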
Uli Flenker Institute of Biochemistry German Sport University Cologne Carl-Diem-Weg 6 50933 Cologne Phone 0049/221/4982-506
On 08/09/02 13:28, laura at bayesian-bay.freeserve.co.uk wrote:
I am wanting to simulate a data set consisting of a Y variable and several X variables, all either binary or discrete. I am wondering how to go about doing this and have failed to find anything about this in the R -help.

Try runif() in the base package. It generates random numbers from a uniform distribution. So, for example, if you want a binary variable with an expectation of .75, and 1000 observations, say:

  runif(1000)<=.75

Or, if you want to see the numbers right away:

  (runif(1000)<=.75)+0

To generate a factor with several levels, you can apply cut() to runif(). That may be sufficient, but note that factors are "categorical variables."

Of course, you assign these to variables, e.g.,

  x1 <- runif(1000)<=.75

and then use these in your model.

Jon
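The cut() suggestion can be sketched as follows. The break points and the level names "low", "mid", "high" are assumptions chosen for illustration; the widths of the intervals between the breaks are the probabilities of the levels (0.2, 0.3 and 0.5 here).

```r
set.seed(2)                           # arbitrary seed, only for reproducibility
n <- 1000                             # assumed number of observations

# Binary 0/1 variable with expectation 0.75, as in the message above
x1 <- (runif(n) <= 0.75) + 0

# Three-level factor: the interval widths 0.2, 0.3, 0.5 are the level probabilities
x2 <- cut(runif(n),
          breaks = c(0, 0.2, 0.5, 1),
          labels = c("low", "mid", "high"),
          include.lowest = TRUE)

mean(x1)                              # should be close to 0.75
table(x2)                             # counts per level
```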
Jonathan Baron, Professor of Psychology, University of Pennsylvania Home page: http://www.sas.upenn.edu/~baron Questionnaires: http://www.psych.upenn.edu/~baron/qs.html Psychology webmaster: http://www.psych.upenn.edu/ R page: http://finzi.psych.upenn.edu/
As the previous mail said, rbinom and rpois are your best bets for binomial and discrete variables. If you use runif or rnorm you will get continuous variables, which you can convert to discrete ones with something like

  round(a*runif(100) + b) %% m

[but you run the risk of getting a cycle of numbers if you're not careful].

If you have a non-standard distribution you could use sample. For example,

  sample(0:1, 5000, replace=TRUE)

produces a result similar to a binomial with probability 0.5. More interestingly, say you want to simulate n observations from a distribution that places equal mass on the first 100 prime numbers; then

  x <- c(2, 3, 5, 7, 11, 13, 17, 19, ..., 521, 523, 541)
  y <- sample(x, n, replace=TRUE)

and you can set replace=FALSE if you want sampling without replacement.
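A small runnable sketch of the sample() approach, under some stated assumptions: the support is shortened to the first ten primes as a stand-in for the longer list, and the weights in w and all variable names are invented for this example. The prob argument of sample() covers the case where the masses are not equal.

```r
set.seed(3)                                         # arbitrary seed, only for reproducibility
n <- 500                                            # assumed number of observations

# Shortened stand-in for the support: the first ten primes
support <- c(2, 3, 5, 7, 11, 13, 17, 19, 23, 29)

# Equal mass on every support point, sampling with replacement
y_equal <- sample(support, n, replace = TRUE)

# Unequal masses: pass hypothetical weights through the prob argument
w <- c(0.6, 0.3, 0.1)
y_weighted <- sample(c(0, 1, 2), n, replace = TRUE, prob = w)

# Without replacement, at most length(support) values can be drawn;
# drawing all of them gives a random permutation of the support
perm <- sample(support, length(support), replace = FALSE)

table(y_weighted)                                   # counts roughly in proportion to w
```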
On 08/09/02 13:28, laura at bayesian-bay.freeserve.co.uk wrote:
I am wanting to simulate a data set consisting of a Y variable and several X variables, all either binary or discrete. I am wondering how to go about doing this and have failed to find anything about this in the R -help.