Possible Improvement of the R code
On 17-09-2012, at 00:51, li li wrote:
Dear all,
In the following code, I am trying to compute each row of "param"
iteratively, starting from the first row.
This is probably not the best way. Can anyone suggest a simpler way to
write the code or otherwise improve it?
Thanks a lot!
Hannah
param <- matrix(0, 11, 5)
colnames(param) <- c("p", "q", "r", "2s", "t")
param[1,] <- c(0.5, 0.5, 0.4, 0.5, 0.1)
for (i in 2:11){
  param[i,1] <- param[(i-1),3]+param[(i-1),4]/2
  param[i,2] <- param[(i-1),4]/2+param[(i-1),5]
  param[i,3] <- param[(i-1),1]*(param[(i-1),3]+param[(i-1),4]/2)
  param[i,4] <- param[(i-1),1]*(param[(i-1),4]/2+param[(i-1),5])+param[(i-1),2]*(param[(i-1),3]+param[(i-1),4]/2)
  param[i,5] <- param[(i-1),2]*(param[(i-1),4]/2+param[(i-1),5])
}
You can use the compiler package.
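As a minimal sketch of what that means in practice: cmpfun() from the compiler package takes a function and returns a byte-compiled copy that computes the same result. The toy function slow_sum below is hypothetical and only there for illustration. Note that R 3.4.0 and later byte-compile functions just-in-time by default, so on a current R an explicit cmpfun() call may make little difference.

```r
library(compiler)

# Hypothetical toy function (not from the original post): a plain R loop
slow_sum <- function(n) {
  s <- 0
  for (i in seq_len(n)) s <- s + i
  s
}

# cmpfun() returns a byte-compiled version of the same function
fast_sum <- cmpfun(slow_sum)

# Both versions return the same value; only the execution speed may differ
print(slow_sum(100))
print(fast_sum(100))
```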
It also helps if you don't repeat certain calculations. For example (param[(i-1),3]+param[(i-1),4]/2) is computed three times.
Once is enough.
See this example where your code has been put in function f1. The simplified code is in function f3.
Functions f2 and f4 are the compiled versions of f1 and f3.
library(compiler)
library(rbenchmark)
param <- matrix(0, 11, 5)
colnames(param) <- c("p", "q", "r", "2s", "t")
param[1,] <- c(0.5, 0.5, 0.4, 0.5, 0.1)
# your calculation
f1 <- function(param) {
  for (i in 2:11){
    param[i,1] <- param[(i-1),3]+param[(i-1),4]/2
    param[i,2] <- param[(i-1),4]/2+param[(i-1),5]
    param[i,3] <- param[(i-1),1]*(param[(i-1),3]+param[(i-1),4]/2)
    param[i,4] <- param[(i-1),1]*(param[(i-1),4]/2+param[(i-1),5])+param[(i-1),2]*(param[(i-1),3]+param[(i-1),4]/2)
    param[i,5] <- param[(i-1),2]*(param[(i-1),4]/2+param[(i-1),5])
  }
  param
}
f2 <- cmpfun(f1)
# modified by replacing identical sub-expressions with result
f3 <- function(param) {
  for (i in 2:11){
    param[i,1] <- param[(i-1),3]+param[(i-1),4]/2
    param[i,2] <- param[(i-1),4]/2+param[(i-1),5]
    param[i,3] <- param[(i-1),1]*param[i,1]
    param[i,4] <- param[(i-1),1]*param[i,2]+param[(i-1),2]*param[i,1]
    param[i,5] <- param[(i-1),2]*param[i,2]
  }
  param
}
f4 <- cmpfun(f3)
z1 <- f1(param)
z2 <- f2(param)
z3 <- f3(param)
z4 <- f4(param)
Running this in R gives:
all.equal(z2,z1)
[1] TRUE
all.equal(z3,z1)
[1] TRUE
all.equal(z4,z1)
[1] TRUE
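A note on the check itself: all.equal() compares numeric values up to a small tolerance, whereas identical() requires an exact match. For floating-point results that are computed along slightly different paths, the tolerant comparison is usually the safer one. A small sketch with made-up numbers, unrelated to param:

```r
# Two values that are mathematically equal but differ in floating point
x <- 0.1 + 0.2
y <- 0.3

exact <- identical(x, y)          # exact comparison
close <- isTRUE(all.equal(x, y))  # comparison within a numeric tolerance

print(c(exact = exact, close = close))
```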
benchmark(f1(param), f2(param), f3(param), f4(param),replications=5000, columns=c("test", "replications", "elapsed", "relative"))
       test replications elapsed relative
1 f1(param)         5000   3.748    2.502
2 f2(param)         5000   2.104    1.405
3 f3(param)         5000   2.745    1.832
4 f4(param)         5000   1.498    1.000
f4 is quite an improvement over f1.
It's quite possible that more can be gained but I'm too lazy to investigate further.
Berend
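One direction that might be worth trying, sketched here under assumptions and not taken from the thread: keep the previous row in a plain vector so that each iteration reads from the matrix only once and writes one whole row. The function name f5 and the helper names prev, a and b are hypothetical. No timing is claimed for it; it would need to be added to the benchmark() call to see whether it actually helps.

```r
param <- matrix(0, 11, 5)
colnames(param) <- c("p", "q", "r", "2s", "t")
param[1,] <- c(0.5, 0.5, 0.4, 0.5, 0.1)

# Hypothetical variant: same recurrence as f3, but working on a local vector
f5 <- function(param) {
  n <- nrow(param)               # use the actual row count instead of a fixed 11
  prev <- unname(param[1, ])     # previous row as an unnamed numeric vector
  for (i in seq_len(n - 1) + 1) {
    a <- prev[3] + prev[4] / 2   # becomes column 1 of row i
    b <- prev[4] / 2 + prev[5]   # becomes column 2 of row i
    prev <- c(a, b, prev[1] * a, prev[1] * b + prev[2] * a, prev[2] * b)
    param[i, ] <- prev           # write the whole row at once
  }
  param
}

z5 <- f5(param)
print(z5[1:3, ])
```

The result can be checked against the original with all.equal(z5, z1), in the same way as z2, z3 and z4.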