My surprising experience in trying out REvolution's R
7 messages · Bert Gunter, David Smith, Dieter Menne +1 more
Jason: Thanks for posting this ... interesting. A guess: it depends on the problem, the hardware, the matrix libraries, and so on. In relatively small problems, REvolution's overhead may consume more time and resources than the problem warrants; in others, you may see many-fold improvements. It is very dangerous to generalize from an example or two (as I recently experienced to my own chagrin). Cheers, Bert

Bert Gunter
Genentech Nonclinical Biostatistics
650-467-7374

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Jason Liao
Sent: Tuesday, April 21, 2009 7:26 AM
To: r-help at r-project.org
Subject: [R] My surprising experience in trying out REvolution's R

I care a lot about R's speed, so I decided to give REvolution's R (http://revolution-computing.com/) a try; it bills itself as an optimized R. Note that I used the free version. My machine is an Intel Core 2 Duo under Windows XP Professional. The code I ran is at the end of this post.

First, the regular R 1.9 [presumably 2.9; see David Smith's reply]: it takes 2 minutes and 6 seconds, CPU usage 50%. Next, REvolution's R: it takes 2 minutes and 10 seconds, CPU usage 100%. In other words, REvolution's R consumes double the CPU at slightly lower speed. The above has been replicated a few times (as a statistician, of course). Does anyone have any insight into this? Anyway, my high hopes were dashed.
rm(list = ls(all = TRUE))
library(MASS)

### small and common functions ##################################
glmm.sample.y <- function(b, D.u, x, z) {
  pp <- matrix(0, nrow = n, ncol = m)
  zero <- numeric(n.random)
  ran <- t(mvrnorm(m, zero, D.u))
  for (j in 1:m)
    pp[, j] <- as.vector(x[, , j] %*% b + z[, , j] %*% ran[, j])
  pp <- exp(pp)/(1 + exp(pp))
  y <- runif(m*n)
  ifelse(y < pp, 1, 0)
}

#################
quadratic.form <- function(A, x) {
  return(as.vector(x %*% A %*% x))
}

quadratic.form2 <- function(A, x) {
  x <- chol(A) %*% x
  colSums(x*x)
}

#######################################################################
REML.b.D.u <- function(b, D.u, x, y, z, n.iter) {
  u.mean.initial <- array(0, c(n.random, m))
  TT <- matrix(0, n.random, n.random)
  for (i in 1:n.iter) {
    TT <- TT*(1 - 1/i) + bias(b, D.u, x, z)/i
    obj <- sample.u(b, D.u, x, y, z, u.mean.initial)
    b <- b - solve(obj$Hessian, obj$score)
    D.u <- obj$uu + TT
    u.mean.initial <- obj$u.mean.initial
    print(i); print(date()); print("inside REML"); print(TT)
    if (i == n.iter) write(TT, file = "c:/liao/reml/simu50_8.dat", append = T)
    print(D.u)
  }
  list(b = b, D.u = D.u)
}

bias <- function(b, D.u, x, z) {
  yy <- glmm.sample.y(b, D.u, x, z)  # this is the sampling stage
  obj1 <- mle(b, D.u, x, yy, z, F, F, n.iter = 50)
  obj2 <- mle(b, D.u, x, yy, z, T, F, n.iter = 50)
  obj1$uu - obj2$uu
}

##################################################
mle <- function(b, D.u, x, y, z, indx1, indx2, n.iter) {
  u.mean.initial <- array(0, c(n.random, m))
  for (i in 1:n.iter) {
    obj <- sample.u(b, D.u, x, y, z, u.mean.initial)
    if (indx1) b <- b - solve(obj$Hessian, obj$score)
    if (indx2) D.u <- obj$uu
    u.mean.initial <- obj$u.mean.initial
  }
  list(b = b, D.u = D.u, uu = obj$uu, u.mean.initial = u.mean.initial)
}

##############################################
sample.u <- function(b, D.u, x, y, z, u.mean.initial) {
  D.u.inv <- solve(D.u)
  uu <- matrix(0, n.random, n.random)
  score <- numeric(n.fixed)
  Hessian <- matrix(0, n.fixed, n.fixed)
  for (i in 1:m) {
    ada.part <- as.vector(x[, , i] %*% b)
    obj <- sample.u.one(D.u.inv, ada.part, x[, , i], y[, i], z[, , i],
                        u.mean.initial[, i])
    uu <- uu + obj$uu
    score <- score + obj$score
    Hessian <- Hessian + obj$Hessian
    u.mean.initial[, i] <- obj$u.mean.initial
  }
  uu <- uu/m
  list(uu = uu, score = score, Hessian = Hessian, u.mean.initial = u.mean.initial)
}

##########################################
sample.u.one <- function(D.u.inv, ada.part, x, y, z, u.mean.initial)
# ada.part, x, y, z for one subject only
{
  fn <- log.like(y, z, D.u.inv, ada.part)
  obj <- optim(u.mean.initial, fn, control = list(fnscale = -1), hessian = T)
  u.mean.initial <- obj$par
  var.root <- solve(-obj$hessian)
  var.root <- t(chol(var.root))
  u <- matrix(rnorm(n.random*n.simu), nrow = n.random, ncol = n.simu)
  log.prob1 <- -colSums(u*u)/2
  u <- u.mean.initial + var.root %*% u
  ada <- exp(ada.part + z %*% u)
  ada <- ada/(1 + ada)  # it is probability now
  log.prob2 <- colSums(y*log(ada) + (1 - y)*log(1 - ada))
  log.prob2 <- -quadratic.form2(D.u.inv, u)/2 + log.prob2
  weight <- exp(log.prob2 - log.prob1)
  weight <- weight/sum(weight)
  ada <- t(ada)
  pi <- colSums(ada*weight)
  product <- colSums(ada*(1 - ada)*weight)
  score <- as.vector((y - pi) %*% x)
  Hessian <- -t(x) %*% (rep(product, n.fixed)*x)
  obj <- cov.wt(t(u), weight, center = T)
  uu <- obj$cov + obj$center %*% t(obj$center)
  list(uu = uu, score = score, Hessian = Hessian, u.mean.initial = obj$center)
}

#########################################
log.like <- function(y, z, D.u.inv, ada.part) {
  function(u) {
    ada <- exp(ada.part + z %*% u)
    ada <- ada/(1 + ada)
    sum(y*log(ada) + (1 - y)*log(1 - ada)) - quadratic.form(D.u.inv, u)/2
  }
}

main <- function() {
  b <- c(.5, .5, -1, -.5)
  b <- c(.5, .5, -1, -.5, 0, 0, 0, 0)
  D.u <- diag(c(.5, .5))
  x <- array(0, c(n, n.fixed, m))
  x[, 1, ] <- 1
  for (i in 1:n) x[i, 2, ] <- (i - 5)/4
  x[, 3, 1:trunc(m/2)] <- 0
  x[, 3, trunc(m/2 + 1):m] <- 1
  x[, 4, ] <- x[, 2, ]*x[, 3, ]
  x[1, 5:n.fixed, ] <- rnorm(4*m)
  for (i in 2:n) x[i, 5:n.fixed, ] <- x[1, 5:n.fixed, ]
  z <- x[, 1:2, ]
  for (i in 1:200) {
    print(i); print(date())
    y <- glmm.sample.y(b, D.u, x, z)
    obj <- mle(b, D.u, x, y, z, F, T, n.iter = 50)
    print("estimate with b known"); print(obj$D.u)
    obj <- mle(b, D.u, x, y, z, T, T, n.iter = 50)
    print("profile MLE"); print(obj$D.u)
    obj <- REML.b.D.u(b, D.u, x, y, z, n.iter = 50)
    print("REML type estimate"); print(obj$D.u)
  }
}

###################################################
n <- 10        # number of observations for each cluster
m <- 50        # number of clusters
n.fixed <- 8
n.random <- 2
n.simu <- 2000
main()

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
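The 2-minute figures above were read off a wall clock and the Task Manager CPU gauge. A more reproducible way to take such measurements is system.time(); here is a minimal sketch of that approach, where the 400x400 matrix is an arbitrary stand-in for the workload above, not a size taken from the post:

```r
# Sketch: time BLAS-bound operations with system.time() instead of a
# wall clock. The 400x400 size is an illustrative assumption.
set.seed(1)
a <- matrix(rnorm(400 * 400), 400, 400)
timing <- system.time({
  b <- a %*% a                # matrix multiply: one threaded operation
  r <- chol(crossprod(a))     # Cholesky factorization: the other
})
print(timing["elapsed"])      # elapsed seconds, comparable across builds
```

The "elapsed" component is wall-clock time; comparing it with "user.self" also hints at how many cores were actually busy.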
That's our guess too. We're running some tests now on the code to see what's going on, and it's entirely possible the performance gains are a function of the problem size, so we're testing that too. Changes in R between 2.7.2 (upon which REvolution R is currently based) and 2.9.0 are also a confounding factor. (I'm assuming the timings reported were on 2.9, not 1.9 as stated.) I'll report back here when I have more details. # David Smith
On Tue, Apr 21, 2009 at 10:32 AM, Bert Gunter <gunter.berton at gene.com> wrote:
A guess: Depends on the problem, the hardware, the matrix libraries,... e.g. in relatively small problems, Revolution's overhead may consume more time and resources than the problem warrants. In others, you may see many fold improvements. Very dangerous to generalize from an example or two (as I recently experienced to my own chagrin).
-- David M Smith <david at revolution-computing.com> Director of Community, REvolution Computing www.revolution-computing.com Tel: +1 (206) 577-4778 x3203 (San Francisco, USA) Check out our upcoming events schedule at www.revolution-computing.com/events
Jason Liao <JLIAO <at> hes.hmc.psu.edu> writes:
First, the regular R 1.9. It takes 2 minutes and 6 seconds, CPU usage 50%. Next, REvolution's R. It takes 2 minutes and 10 seconds, CPU usage 100%. In other words, REvolution's R consumes double the CPU with slightly less speed. [...]
The fact that the times are nearly identical, with only the 50%/100% difference, makes me think this is an artifact of the CPU indicator. Try assigning only one CPU to the REvolution task. Wild guess: also 2 minutes, plus or minus. Dieter
1 day later
We've taken a look at this in a bit more detail; it's a very interesting example. Although the code uses several functions that exploit the parallel processing in REvolution R (notably %*% and chol), this was one of those situations where the overheads of threading pretty much balanced any performance gains: the individual matrices for the operations were too small. For some examples where the performance gains do show, see: http://www.revolution-computing.com/products/r-performance.php A more promising avenue for speeding up this code lies in parallelizing the outer for(i in 1:200) loop... # David Smith
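David's suggestion above — parallelize the outer for(i in 1:200) loop rather than rely on threaded BLAS — can be sketched as follows. This sketch uses the parallel package that ships with modern R (at the time of this thread, the snow package's parLapply() had the same shape); the stand-in loop body is an illustrative assumption, not the real simulation code:

```r
library(parallel)

# Stand-in for one iteration of the for(i in 1:200) loop in main();
# the real body would call glmm.sample.y(), mle() and REML.b.D.u().
one_rep <- function(i) {
  set.seed(i)
  sum(rnorm(1000))  # illustrative workload only
}

cl <- makeCluster(2)                    # two workers for a dual-core box
results <- parLapply(cl, 1:200, one_rep)
stopCluster(cl)
length(results)                         # one result per replicate
```

Because the 200 replicates are independent, this kind of coarse-grained parallelism avoids the per-operation threading overhead that dominated on the small matrices here.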
David, thanks for following this through. I do not know how big a matrix needs to be before multi-core multi-threading starts to save time, but it seems useful to build this protection into your distribution so that it does not go multi-core when multi-threading is more likely to do harm. Jason
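The "protection" Jason asks for would amount to a size-based dispatch: use one thread for small matrices, several for large ones. A hedged sketch of the idea — the MIN_DIM threshold and the set_threads() helper are invented for illustration and are not part of any product mentioned in this thread:

```r
# Sketch of size-gated threading: request multiple BLAS threads only
# when the matrix is large enough for threading overhead to pay off.
# MIN_DIM and set_threads() are hypothetical placeholders.
MIN_DIM <- 100

set_threads <- function(k) {
  # placeholder: in REvolution R this would call its MKL thread control
  invisible(k)
}

threaded_multiply <- function(a, b) {
  if (min(dim(a), dim(b)) < MIN_DIM) set_threads(1) else set_threads(2)
  a %*% b
}

m1 <- matrix(1, 10, 10)
threaded_multiply(m1, m1)  # small case: stays single-threaded
```

Where the crossover threshold actually lies depends on hardware and BLAS implementation, which is presumably why it is hard to hard-code a safe default.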
You can always turn the multi-threading off with: setMKLthreads(1) We tested your code with this setting, and it ran in exactly the same time as CRAN R. # David Smith
On Fri, Apr 24, 2009 at 7:34 AM, Jason Liao <JLIAO at hes.hmc.psu.edu> wrote:
David, thanks for following this through. [...]
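David's setMKLthreads(1) tip suggests a simple experiment: time the same operation at one thread and at two, and compare. A hedged sketch — setMKLthreads() exists only in REvolution R's MKL build, so this version guards the call; on a stock CRAN build the guard is skipped and both timings use the default BLAS:

```r
# Sketch: compare a BLAS-bound operation at 1 vs. 2 threads.
# setMKLthreads() is REvolution-specific; elsewhere the guarded call
# is a no-op and the two timings should come out roughly equal.
time_mm <- function(n) {
  a <- matrix(rnorm(n * n), n, n)
  system.time(a %*% a)["elapsed"]
}

set_mkl <- function(k) {
  if (exists("setMKLthreads")) setMKLthreads(k)
}

set_mkl(1); t1 <- time_mm(300)
set_mkl(2); t2 <- time_mm(300)
c(one_thread = t1, two_threads = t2)
```

Repeating this over a range of matrix sizes would locate the crossover point where the threaded build starts to win on a given machine.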