I'm trying to convert a loop in a simulation program to a parallel
process using multicore. It looks like this:

simus=1000
set.seed(12345)
for(iter in 1:simus){

#bunch of code setting up random draws from various
#distributions for X matrix

#Computation of Y

(fitmodel <- lmer(modelformula,data,family=binomial(link=logit),nAGQ=2))

estbeta<-fixef(fitmodel)
sdebeta<-sqrt(diag(vcov(fitmodel)))
for(l in 1:betasize)
{
  cibeta<-estbeta[l]-sgnbeta[l]*z1score*sdebeta[l]
  if(beta[l]*cibeta>0) powaprox[[l]]<-powaprox[[l]]+1
  sdepower[l,iter]<-as.numeric(sdebeta[l])
}
##------------------------------------------------------------------------##
} ## iteration ends here
The variable fitmodel is defined elsewhere and consists of
fixed and random parts. Betasize is the length of the vector of
fixed effects. Looks like near the bottom there's a
counter that increments if the slope estimate doesn't trap 0.
The next line records the standard deviation. It's these two items
that I need to recover.

So I'm thinking that I'm going to recode this way to run on
a machine that has 4 Xeon CPUs:
library(multicore)
options(cores=4)
simus=1000
sim.base.fun<-function(iter){  #replaces for(iter in 1:simus){

#bunch of code setting up random draws from various
#distributions for X matrix

#Computation of Y

(fitmodel <- lmer(modelformula,data,family=binomial(link=logit),nAGQ=2))

estbeta<-fixef(fitmodel)
sdebeta<-sqrt(diag(vcov(fitmodel)))
for(l in 1:betasize)
{
  cibeta<-estbeta[l]-sgnbeta[l]*z1score*sdebeta[l]
  if(beta[l]*cibeta>0)
    return(powaprox[[1]]<-powaprox[[1]]+1) #replaces powaprox[[l]]<-powaprox[[l]]+1
  return(sdepower[1,iter]<-as.numeric(sdebeta[1])) #replaces sdepower[l,iter]<-as.numeric(sdebeta[l])
}
} #end of sim.base.fun
sim.fun <- lapply(1:simus, sim.base.fun)
Do I need a collect statement? How do I kill the worker
processes? One thing I'm not accounting for is the random
number generating process. I've seen a couple of ways of
doing this, one using mclapply and another using rlecuyer,
but I'm not sure how to make them work. Suggestions?
Is there a way to replicate the same random numbers used
in the conventional loop in the parallel loop?

My machine has 4 Intel Xeon Processor X5650 CPUs.
When I type:

library(multicore)
multicore:::detectCores()

it returns 4. However, the Intel cut sheet says each of these processors
has 6 cores. How can I access them? I know there's a point of diminishing
returns but I can't figure out what it is unless I test it.
Parallel loop
10 messages · Scott Raynaud, rsparapa at mcw.edu, Simon Urbanek +4 more
Scott,
On Mar 6, 2012, at 4:19 PM, Scott Raynaud wrote:
[original post quoted above]
No, the above won't work (all typos aside) since you are still using assignments. Inherently the above is not parallelizable as-is because of
powaprox[[l]]<-powaprox[[l]]+1
but that is trivially removed. Also the inner for loop is entirely superfluous.
Since your code was incomplete this is just a suggestion as we can't test anything (I'm not even checking if what you're doing makes any sense), but it should give you an idea of what to do:
sim.base <- function(iter) {
  ## your unstated code goes here ...
  fitmodel <- lmer(modelformula,data,family=binomial(link=logit),nAGQ=2)
  estbeta <- fixef(fitmodel)
  sdebeta <- sqrt(diag(vcov(fitmodel)))
  list(powf = estbeta-sgnbeta*z1score*sdebeta*beta > 0,
       sdepower = sdebeta)
}
res <- lapply(seq.int(simus), sim.base)
powf <- sapply(res, function(x) x$powf)
sdepower <- sapply(res, function(x) x$sdepower)
powapprox <- apply(powf, 1, sum)
To parallelize, replace lapply above with mclapply.
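Concretely, the parallel version of Simon's sketch might look like this (untested, and just an illustration: it assumes the sim.base function and simus from above, multicore installed, and 4 workers as an arbitrary choice):

```r
library(multicore)

## same shape as the serial lapply call, but the iterations are
## farmed out to 4 forked workers; mclapply returns the combined
## list in order, so the post-processing is unchanged
res <- mclapply(seq.int(simus), sim.base, mc.cores = 4)

powf      <- sapply(res, function(x) x$powf)
sdepower  <- sapply(res, function(x) x$sdepower)
powapprox <- apply(powf, 1, sum)
```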
Do I need a collect statement?
No, if you use mclapply it's all done for you.
How do I kill the worker processes?
You don't need to - they go away once they're done computing.
One thing I'm not accounting for is the random number generating process. I've seen a couple of ways of doing this, one using mclapply and another using rlecuyer, but I'm not sure how to make them work. Suggestions? Is there a way to replicate the same random numbers used in the conventional loop in the parallel loop?
There is, but you may need to read up on that - you have to use a generator that produces a streaming sequence such that you can skip forward (see also the recent discussion here and on R-devel). For the above you may get away without it by simply setting a different seed in each iteration.
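A minimal sketch of the per-iteration seeding idea (my illustration, not Simon's exact recipe: the seed-offset scheme is hypothetical, the streams are not guaranteed independent, and it will NOT reproduce the single-stream draws of the original serial loop - it only makes each iteration repeatable run-to-run):

```r
sim.base.seeded <- function(iter) {
  ## derive a distinct, fixed seed from the iteration number, so
  ## iteration i sees the same draws on every run, serial or parallel
  set.seed(12345 + iter)
  sim.base(iter)
}

res <- mclapply(seq.int(simus), sim.base.seeded, mc.cores = 4)
```

For streams with proper independence guarantees, the L'Ecuyer generator (e.g. via rlecuyer, or RNGkind("L'Ecuyer-CMRG") in newer versions of R) is the safer route.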
My machine has 4 Intel Xeon Processor X5650 CPUs. When I type: library(multicore) multicore:::detectCores() it returns 4. However, the Intel cut sheet says each of these processors has 6 cores. How can I access them?
You can set any number of cores you want using the mc.cores=... argument - the detected cores are just a default if you don't specify anything. What OS/distro is this? It is very unusual to see the number of cores reported wrongly by the system ...
Cheers,
Simon
I know there's a point of diminishing returns but I can't figure out what it is unless I test it.
_______________________________________________ R-sig-hpc mailing list R-sig-hpc at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-hpc
1 day later
Thanks for the feedback. Part of my problem is that I need the most recent copy of R. My IS team told me they can only get an earlier copy using apt-get, but I think there must be a way, so I've tasked them with figuring it out. My OS is Kubuntu. I just thought it odd that it could only detect the number of CPUs rather than the number of cores. I'm completely new to parallel processing, but it seems that something is not right in the core detection. One question I still have regards child processes. I understand they finish on their own, but what if I need to kill those processes because of an obvious problem? How can I do that?
If you kill the parent, then all of the children should die. On Ubuntu, r-base and r-base-dev are the packages; see http://cran.r-project.org/bin/linux/ubuntu/README You don't need the most recent. Just 2.14.0 or higher.
Rodney Sparapani, PhD, Sr. Biostatistician
Center for Patient Care and Outcomes Research, Medical College of Wisconsin (MCW)
Milwaukee, WI, USA - http://www.mcw.edu/pcor
"4 wheels good, 2 wheels better!" - WWLD?: What Would Lombardi Do?
Scott,
On Mar 8, 2012, at 11:37 AM, Scott Raynaud wrote:
Thanks for the feedback. Part of my problem is that I need the most recent copy of R.
I don't quite understand how this is related to your question ... The version of R plays no role here ...
My IS team told me they can only get an earlier copy using apt-get, but I think there must be a way, so I've tasked them with figuring it out. My OS is Kubuntu. I just thought it odd that it could only detect the number of CPUs rather than the number of cores. I'm completely new to parallel processing, but it seems that something is not right in the core detection.
The core detection in multicore on Linux simply looks at /proc/cpuinfo so if it's not right, then your OS is reporting something odd. Note that the detection is just a fall-back if you don't specify anything, so it's really up to you how many parallel processes you want to use.
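A quick way to check what the kernel itself reports (Linux only; this reads the same file multicore consults, so the two numbers should agree):

```r
## each "processor :" stanza in /proc/cpuinfo is one logical CPU;
## on a multi-socket Xeon box expect sockets x cores-per-socket
## (x 2 if Hyper-Threading is enabled)
cpuinfo <- readLines("/proc/cpuinfo")
length(grep("^processor", cpuinfo))
```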
One question I still have regards child processes. I understand they finish on their own, but what if I need to kill those processes because of an obvious problem? How can I do that?
If you interrupt the master process it automatically kills all child processes and cleans up - at least for all high-level functions like mclapply. If you use low-level functions, you can always use kill(children()); collect()
Cheers,
Simon
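For illustration, the low-level route Simon mentions might be used like this (a sketch, not tested; parallel(), children(), kill() and collect() are multicore's low-level API, and sim.base is the function from above):

```r
library(multicore)

## fork a few jobs explicitly instead of going through mclapply;
## parallel() returns immediately with a handle for each child
jobs <- lapply(1:4, function(i) parallel(sim.base(i)))

## if something goes obviously wrong, kill all child processes
## and then collect() to reap them so nothing is left behind
kill(children())
collect()
```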
6 days later
So what I meant by the latest version was that I think what I want to do would be easier in package parallel, but I'm a noob at this so I don't know for sure. Anyway, I need 2.14.1 to get to that package. My IS people insist that the latest version of R available via apt-get is 2.13.1. Anything later, they claim, will have to be compiled. True?
On 15 March 2012 at 09:43, Scott Raynaud wrote:
| So what I meant by the latest version was that I think what I want to do would be
| easier in package parallel, but I'm a noob at this so I don't know for sure. Anyway,
| I need 2.14.1 to get to that package.
|
| My IS people insist that the latest version of R available via apt-get is
| 2.13.1. Anything later they claim will have to be compiled. True?

False. See the R FAQ.

Dirk
"Outside of a dog, a book is a man's best friend. Inside of a dog, it is too dark to read." -- Groucho Marx
(Apologies if this has already been discussed.) I've found it unexpectedly painless to compile R from source on a Linux machine, if you'd rather not wait for your IS people. From memory, I believe the necessary commands are:

./configure --without-X
make
On Thu, Mar 15, 2012 at 05:27:23PM +0000, Michael Spiegel wrote:
(Apologies if this has already been discussed) I've found it unexpectedly painless to compile R from source on a linux machine, if you'd rather not wait for your IS people. From memory, I believe the necessary commands are: ./configure --without-X make
One must also have some sort of FORTRAN compiler available. Many systems don't.
Norm Matloff
On 12-03-15 02:02 PM, Norm Matloff wrote:
One must also have some sort of FORTRAN compiler available. Many systems don't. Norm Matloff
This strikes me as a pretty severe constraint on a system that is meant for HPC. Don't you find that limits the local tuning you might do, the availability of packages, etc.? Is there a reason for not installing a Fortran compiler? Do you have a C compiler? (I am still trying to understand the various pieces of parallel computing, and the sorts of environments people use, but this caught me by surprise.)
Paul