
random number generator in batch jobs

2 messages · Jiqiu Cheng, Brian Ripley

#
Dear sir,
   I want to submit several R batch jobs (e.g. 5) to a Linux cluster using
the script file "do_mul".
The script file "do_mul":
"
#!/bin/bash
export var
for var in $(seq 1 5)
do
   qsub -v var do_test
done
exit 0
"
Through "do_mul", 5 "do_test" script files are submitted to the cluster.
The script file "do_test":
"
#!/bin/bash -l
#PBS -l ncpus=1
#PBS -l walltime=0:05:00
cd $PBS_O_WORKDIR
mkdir test$var
cd test$var
module load R/2.5.0
R --vanilla< test
exit 0
"
The content of the R file "test" is:
"rm(list=ls(all=TRUE))
sample(10)
"
I expected to get a different sample from each job. However, across these 5
replications, the first 3 jobs gave the same sample and the last
2 gave the same sample as each other. I'm confused because I already use
"R --vanilla" to avoid loading the same workspace each time and
"rm(list=ls(all=TRUE))" to remove any saved random seed. Why do identical
samples still occur among the 5 replications? Does anybody have an idea how
to solve this problem? Looking forward to your reply, thanks.

Regards,
Jiqiu

#
Have you read the help page?

      Initially, there is no seed;  a new one is created from the
      current time when one is required.  Hence, different sessions will
      give different simulation results, by default.

Thus if you choose to launch processes on different machines at the 
same time you will get the same random number stream.
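
One simple way around coincident time-based seeds is to seed each job explicitly from its job index. The following is only a minimal sketch: it assumes the `var` variable passed by `qsub -v var` is visible to R as an environment variable, and the helper name `job_seed` and the base value 20070730 are hypothetical. Distinct seeds give different streams, but they do not by themselves guarantee that the streams are statistically independent.

```r
# Hypothetical helper: derive an integer seed from the job index
# that do_mul passes to each job via `qsub -v var`.
job_seed <- function(index, base = 20070730L) {
  index <- as.integer(index)
  if (is.na(index)) stop("job index 'var' is missing or not an integer")
  base + index
}

# Fall back to "1" when run outside the cluster so the sketch runs standalone.
idx <- Sys.getenv("var", unset = "1")
set.seed(job_seed(idx))
print(sample(10))
```

Placed at the top of the R file "test", this would make each of the 5 jobs start from its own seed, and rerunning a job with the same index would reproduce its sample.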

Running random number streams for parallel computation is a (very) 
specialized topic and you need to be aware of the literature.  I will 
point out packages rsprng and accuracy (function runifS).
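
As an illustration of what a parallel-stream generator looks like in practice, the sketch below uses the "L'Ecuyer-CMRG" generator and `nextRNGStream()` from the `parallel` package. This is an assumption on my part, not something named in the thread: `parallel` ships with later R releases than the R/2.5.0 loaded in do_test. The helper name `stream_seed` and the master seed 12345 are hypothetical.

```r
library(parallel)

# Hypothetical helper: return the .Random.seed vector of the k-th stream
# (k = 1, 2, ...), starting from a master seed shared by all jobs.
stream_seed <- function(k, master = 12345L) {
  RNGkind("L'Ecuyer-CMRG")   # generator designed for multiple streams
  set.seed(master)           # every job starts from the same master seed
  s <- .Random.seed
  for (i in seq_len(k - 1L)) s <- nextRNGStream(s)  # advance to stream k
  s
}

# Fall back to stream 1 when `var` is not set, so the sketch runs standalone.
k <- as.integer(Sys.getenv("var", unset = "1"))
assign(".Random.seed", stream_seed(k), envir = globalenv())
print(sample(10))
```

Each job walks forward from the same master seed to its own stream, so the result for a given job index is reproducible and does not depend on when the job happens to start.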
On Mon, 30 Jul 2007, Jiqiu Cheng wrote: