Skip to content

Possible Improvement of the R code

4 messages · Li Li, Berend Hasselman, William Dunlap

#
On 17-09-2012, at 00:51, li li wrote:

            
You can use the compiler package.
It also helps if you don't repeat certain calculations. For example (param[(i-1),3]+param[(i-1),4]/2) is computed three times.
Once is enough.

See this example where your code has been put in function f1. The simplified code is in function f3.
Functions f2 and f4 are the compiled versions of f1 and f3.

library(compiler)
library(rbenchmark)
param <- matrix(0, 11, 5)
colnames(param) <- c("p", "q", "r", "2s", "t")
param[1,] <- c(0.5, 0.5, 0.4, 0.5, 0.1)

# your calculation
f1 <- function(param) {
    for (i in 2:11){
        param[i,1] <- param[(i-1),3]+param[(i-1),4]/2
        param[i,2] <- param[(i-1),4]/2+param[(i-1),5]
        param[i,3] <- param[(i-1),1]*(param[(i-1),3]+param[(i-1),4]/2)
        param[i,4] <- param[(i-1),1]*(param[(i-1),4]/2+param[(i-1),5])+param[(i-1),2]*(param[(i-1),3]+param[(i-1),4]/2)
        param[i,5] <- param[(i-1),2]*(param[(i-1),4]/2+param[(i-1),5])
    }

    param
}

f2 <- cmpfun(f1)

# modified by replacing identical sub-expressions with result
f3 <- function(param) {
    for (i in 2:11){
        param[i,1] <- param[(i-1),3]+param[(i-1),4]/2
        param[i,2] <- param[(i-1),4]/2+param[(i-1),5]
        param[i,3] <- param[(i-1),1]*param[i,1]
        param[i,4] <- param[(i-1),1]*param[i,2]+param[(i-1),2]*param[i,1]
        param[i,5] <- param[(i-1),2]*param[i,2]
    }

    param
}

f4 <- cmpfun(f3)

z1 <- f1(param)
z2 <- f2(param)
z3 <- f3(param)
z4 <- f4(param)

Running in R
[1] TRUE
[1] TRUE
[1] TRUE
test replications elapsed relative
1 f1(param)         5000   3.748    2.502
2 f2(param)         5000   2.104    1.405
3 f3(param)         5000   2.745    1.832
4 f4(param)         5000   1.498    1.000

f4 is quite an improvement over f1.
It's quite possible that more can be gained but I'm too lazy to investigate further.

Berend
#
Unlike C or C++, subscripting vector or matrix is not almost free.
Also, subscripting a matrix tends to make more time than subscripting a
vector so it can help to pull out the previous row of params once
per iteration, use vector subscripting on the that row to build the new
row, and then insert the new row back into params with one matrix
subscripting call:
f5 <- function(param) {
    for(i in 2:11) {
        # do matrix subscripting only twice per iteration
        prev <- param[i-1, ]
        param[i,] <- c(
            prev[3] + prev[4]/2,
            prev[4]/2 + prev[5],
            prev[1] * (prev[3] + prev[4]/2),
            prev[1] * (prev[4]/2 + prev[5]) + prev[2] * (prev[3] + prev[4]/2),
            prev[2] * (prev[4]/2 + prev[5]))
    }
    param
}
You can also do as much as possible to not repeat any subscripting or
sums:
f6 <- function(param) {
    for(i in 2:11) {
        # do any subscripting only twice/iteration and
        # repeated sums only once/iteration.
        prev <- param[i-1, ]
        p1 <- prev[1]
        p2 <- prev[2]
        p42 <- prev[4]/2
        p342 <- prev[3] + p42
        p425 <- p42 + prev[5]
        param[i,] <- c(
            p342,
            p425,
            p1 * p342,
            p1 * p425 + p2 * p342,
            p2 * p425)
    }
    param
}

I was curious about how much the compiler package could replace hand-optimization
so I timed these with and without compiling.  (f1 and f3 are Berend's f1 and f3; I renamed
his f2 and f4 to f1c and f3c to indicate they were the compiled versions.)
f1  f1c   f3  f3c   f5  f5c   f6 f6c
user.self 5.03 2.82 3.74 1.89 2.73 1.78 1.54 0.9
sys.self  0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.0
elapsed   5.03 2.83 3.75 1.90 2.76 1.77 1.54 0.9
raw compiled
f1 5.03     2.82
f3 3.74     1.89
f5 2.73     1.78
f6 1.54     0.90
It looks like both hand- and machine-optimization help quite a bit here.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
#
Getting rid of one more matrix subscripting call trims the
time by another 10-20%.  With the following function the times
for 10^4 reps for a a 11-row input matrix become
      raw compiled
  f1 5.57     3.04
  f3 3.73     2.14
  f5 3.45     1.95
  f6 1.63     0.90
  f7 1.32     0.81

f7 <- function(param) {
    prev <- param[1,]
    for(i in 2:nrow(param)) {
        # do vector and matrix subscripting only once/iteration
        # and repeated sums only once/iteration.
        p1 <- prev[1]
        p2 <- prev[2]
        p42 <- prev[4]/2
        p342 <- prev[3] + p42
        p425 <- p42 + prev[5]
        param[i,] <- prev <- c(
            p342,
            p425,
            p1 * p342,
            p1 * p425 + p2 * p342,
            p2 * p425)
    }
    param
}
f7c <- cmpfun(f7)




Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com