very fast OLS regression?
On Wed, 2009-03-25 at 22:11 +0100, Dimitris Rizopoulos wrote:
check the following options:
ols1 <- function (y, x) {
coef(lm(y ~ x - 1))
}
ols2 <- function (y, x) {
xy <- t(x)%*%y
xxi <- solve(t(x)%*%x)
b <- as.vector(xxi%*%xy)
b
}
ols3 <- function (y, x) {
XtX <- crossprod(x)
Xty <- crossprod(x, y)
solve(XtX, Xty)
}
ols4 <- function (y, x) {
lm.fit(x, y)$coefficients
}
# check timings
MC <- 500
N <- 10000
set.seed(0)
x <- matrix(rnorm(N*MC), N, MC)
y <- matrix(rnorm(N*MC), N, MC)
invisible({gc(); gc(); gc()})
system.time(for (mc in 1:MC) ols1(y[, mc], x[, mc]))
invisible({gc(); gc(); gc()})
system.time(for (mc in 1:MC) ols2(y[, mc], x[, mc]))
invisible({gc(); gc(); gc()})
system.time(for (mc in 1:MC) ols3(y[, mc], x[, mc]))
invisible({gc(); gc(); gc()})
system.time(for (mc in 1:MC) ols4(y[, mc], x[, mc, drop = FALSE]))
Hi Dimitris and Ivo I think this is not a fair comparison, look this x[8,100]<-NA system.time(for (mc in 1:MC) ols1(y[, mc], x[, mc])) user system elapsed 8.765 0.000 8.762 system.time(for (mc in 1:MC) ols2(y[, mc], x[, mc])) Error in solve.default(t(x) %*% x) : system is computationally singular: reciprocal condition number = 0 Timing stopped at: 0 0 0.002 system.time(for (mc in 1:MC) ols3(y[, mc], x[, mc])) Error in solve.default(XtX, Xty) : system is computationally singular: reciprocal condition number = 0 Timing stopped at: 0 0 0.001 system.time(for (mc in 1:MC) ols4(y[, mc], x[, mc, drop = FALSE])) Error in lm.fit(x, y) : NA/NaN/Inf in foreign function call (arg 1) Timing stopped at: 0 0 0.001 So routines ols2, ols3 and ols4 only functional in fully matrix if have one NA this functions don't run.
Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil