Dear R experts:
I just ran a simple test which told me that hand-computing the OLS
coefficients is about 3-10 times as fast as using the built-in lm()
function (code included below). Most of the time, I do not care,
because I like the convenience, and I presume some of the time goes
into saving a lot of stuff that I may or may not need. But when I do
want to learn the properties of an estimator whose input contains a
regression, I do care about speed.
What is the recommended fastest way to get regression coefficients in
R? (Is Gentleman's weighted-least-squares algorithm implemented in a
low-level C form somewhere? That one was always lightning fast for
me.)
regards,
/ivo
bybuiltin <- function(y, x) coef(lm(y ~ x - 1))

byhand <- function(y, x) {
  xy <- t(x) %*% y
  xxi <- solve(t(x) %*% x)
  b <- as.vector(xxi %*% xy)
  ## I will need these later, too:
  ## res <- y - as.vector(x %*% b)
  ## soa[i] <- b[2]
  ## sigmas[i] <- sd(res)
  b
}
MC <- 500
N <- 10000
set.seed(0)
x <- matrix(rnorm(N * MC), nrow = N, ncol = MC)
y <- matrix(rnorm(N * MC), nrow = N, ncol = MC)

ptm <- proc.time()
for (mc in 1:MC) byhand(y[, mc], x[, mc])
cat("By hand took ", proc.time() - ptm, "\n")

ptm <- proc.time()
for (mc in 1:MC) bybuiltin(y[, mc], x[, mc])
cat("By built-in took ", proc.time() - ptm, "\n")
very fast OLS regression?
12 messages · Dimitris Rizopoulos, Ravi Varadhan, Bert Gunter +5 more
check the following options:
ols1 <- function (y, x) {
coef(lm(y ~ x - 1))
}
ols2 <- function (y, x) {
xy <- t(x)%*%y
xxi <- solve(t(x)%*%x)
b <- as.vector(xxi%*%xy)
b
}
ols3 <- function (y, x) {
XtX <- crossprod(x)
Xty <- crossprod(x, y)
solve(XtX, Xty)
}
ols4 <- function (y, x) {
lm.fit(x, y)$coefficients
}
# check timings
MC <- 500
N <- 10000
set.seed(0)
x <- matrix(rnorm(N*MC), N, MC)
y <- matrix(rnorm(N*MC), N, MC)
invisible({gc(); gc(); gc()})
system.time(for (mc in 1:MC) ols1(y[, mc], x[, mc]))
invisible({gc(); gc(); gc()})
system.time(for (mc in 1:MC) ols2(y[, mc], x[, mc]))
invisible({gc(); gc(); gc()})
system.time(for (mc in 1:MC) ols3(y[, mc], x[, mc]))
invisible({gc(); gc(); gc()})
system.time(for (mc in 1:MC) ols4(y[, mc], x[, mc, drop = FALSE]))
I hope it helps.
Best,
Dimitris
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center
Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
thanks, dimitris. I also added Bill Dunlap's "solve(qr(x), y)" function
as ols5. here is what I get in terms of speed on a Mac Pro
(user, system, elapsed, user.child, sys.child):

ols1   6.779   3.591   10.370   0   0
ols2   0.515   0.210    0.725   0   0
ols3   0.576   0.403    0.971   0   0
ols4   1.143   1.251    2.395   0   0
ols5   0.683   0.565    1.248   0   0

so the naive matrix operations are fastest. I would have thought that
alternatives to the naive stuff I learned in my linear algebra course
would be quicker. still, ols3 and ols5 are competitive. the built-in
lm() is really problematic. is ols3 (or perhaps even ols5) preferable
in terms of accuracy? I think I can deal with a 20% slow-down (but not
with a factor-of-10 slow-down).

regards, /iaw
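[Editorial note for the reader: ivo refers to Bill Dunlap's "solve(qr(x), y)" as ols5 without writing it out. Presumably it is just the following; this is a sketch added for completeness, not code from the thread.]

```r
## ols5: least squares via the QR decomposition.  solve() on a "qr"
## object dispatches to solve.qr(), which in turn uses qr.coef().
ols5 <- function(y, x) solve(qr(x), y)

set.seed(0)
x <- matrix(rnorm(200), 100, 2)
y <- rnorm(100)
b5 <- ols5(y, x)
```

This agrees with lm.fit(x, y)$coefficients up to round-off.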
Ivo,

I would be careful about dealing with situations where x may be
numerically ill-conditioned. Your code will do quite badly in such
cases. Here is an extreme illustration:

set.seed(123)
x <- matrix(rnorm(20), 10, 2)
x <- cbind(x, x[, 1])  # x is clearly rank-deficient
y <- c(x %*% c(1, 2, 3))
byhand(y, x)
Error in solve.default(t(x) %*% x) : Lapack routine dgesv: system is exactly singular
A better approach would be to explicitly use "QR" decomposition in your byhand() function:
byhand.qr <- function(y, x, tol = 1e-07) {
  qr.x <- qr(x, tol = tol, LAPACK = TRUE)
  b <- qr.coef(qr.x, y)
  ## I will need these later, too:
  ## res <- qr.resid(qr.x, y)
  ## soa[i] <- b[2]
  ## sigmas[i] <- sd(res)
  b
}
ans <- byhand.qr(y, x)
ans
[1] 1.795725 2.000000 2.204275
You can verify that this is a "valid" solution:
y - c(x %*% ans)
[1] 9.436896e-16 2.498002e-16 -8.881784e-16 0.000000e+00 -4.440892e-16 1.776357e-15 4.440892e-16 0.000000e+00 4.440892e-16 -4.440892e-16
However, this is not the unique solution, since there are infinitely
many solutions. We can get the solution with minimum length (L2-norm)
by using the Moore-Penrose inverse:
require(MASS)
ans.min <- c(ginv(x) %*% y)
ans.min
[1] 2 2 2
You can verify that ans.min has a smaller L2-norm than ans, and it is
unique.

Hope this helps,
Ravi.

Ravi Varadhan, Ph.D.
Assistant Professor, Division of Geriatric Medicine and Gerontology
School of Medicine, Johns Hopkins University
Ph. (410) 502-2619
email: rvaradhan at jhmi.edu
lm is slow because it has to set up the design matrix (X) each time.
See ?model.matrix and ?model.matrix.lm for how to do this once,
separately from lm (and then use lm.fit).

I am far from an expert on numerical algebra, but I'm pretty sure your
comments below are incorrect in the sense that "naive" methods are
**not** universally better than efficient numerical linear algebra
methods using, say, QR decompositions. It will depend very strongly on
the size and specific nature of the problem. Big-O complexity
statements (O(n) or O(n^2), etc.) are, after all, asymptotic. There is
also typically a tradeoff between computational accuracy (especially
for ill-conditioned matrices) and speed, which your remarks seem to
neglect.

-- Bert

Bert Gunter
Genentech Nonclinical Biostatistics
650-467-7374
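[Editorial note: Bert's suggestion of building the design matrix once with model.matrix() and then calling lm.fit() directly might be sketched like this. The data, sizes, and single-regressor setup below are made up for illustration.]

```r
## Build the design matrix once, outside any simulation loop, then hand
## it directly to lm.fit(), skipping lm()'s per-call formula machinery.
set.seed(1)
N <- 1000
x <- rnorm(N)
X <- model.matrix(~ x - 1)     # design matrix built exactly once

y <- 2 * x + rnorm(N)          # one simulated response
b <- lm.fit(X, y)$coefficients # fast fit reusing the prebuilt X
```

With N = 1000, the estimated slope should be close to the true value 2.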
Yes, Bert. Any least-squares solution that forms X'X and then inverts
it is not to be recommended. If X is nearly rank-deficient, then X'X
will be even more strongly so. The QR decomposition approach in my
byhand.qr() function is reliable and fast.

Ravi.
No one has yet mentioned Doug Bates' article in R News on this topic,
which compares timings of various methods for least-squares
computations:

  Douglas Bates. Least squares calculations in R.
  R News, 4(1):17-20, June 2004.

A Cholesky decomposition solution proved fastest in base R code, with
an even faster version developed using sparse matrices and the Matrix
package. You can find Doug's article here:
http://cran.r-project.org/doc/Rnews/Rnews_2004-1.pdf

HTH
G
Dr. Gavin Simpson            [t] +44 (0)20 7679 0522
ECRC, UCL Geography,         [f] +44 (0)20 7679 0565
Pearson Building,            [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London         [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT.                [w] http://www.freshwaters.org.uk
ols5 is more accurate in terms of round-off error, which is why the
internal computations in lm and lsfit are done that way, but it is
relatively unusual for this to matter given double precision.
If you want to repeat the regression with the same x and many y you can do much better by taking the QR decomposition just once and applying to a matrix of all the ys.
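[Editorial note: Thomas's point about reusing the decomposition can be sketched as follows. qr.coef() accepts a matrix right-hand side, so one factorization serves every column of a response matrix Y; the sizes and names here are illustrative.]

```r
## Factor x once, then solve for all response columns in a single call.
set.seed(0)
N <- 100; MC <- 50
x <- matrix(rnorm(N), N, 1)          # same regressor for every replication
Y <- matrix(rnorm(N * MC), N, MC)    # one response vector per column

qr.x  <- qr(x)                       # QR decomposition computed once
coefs <- qr.coef(qr.x, Y)            # 1 x MC matrix of coefficients
```

Each column of coefs matches a separate lm.fit(x, Y[, mc]) call.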
It's also possible that using chol() and backsolve() rather than solve() would speed up ols2 and ols3, at least for large enough matrices.
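[Editorial note: a Cholesky-based variant along the lines Thomas suggests might look like this; it is a sketch, not code from the thread.]

```r
## OLS via Cholesky: factor X'X = R'R (R upper triangular), then solve
## R'R b = X'y with one forwardsolve and one backsolve -- no explicit
## matrix inverse is ever formed.
ols.chol <- function(y, x) {
  XtX <- crossprod(x)              # X'X
  Xty <- crossprod(x, y)           # X'y
  R   <- chol(XtX)                 # upper-triangular Cholesky factor
  z   <- forwardsolve(t(R), Xty)   # solve R'z = X'y
  backsolve(R, z)                  # solve R b = z
}

set.seed(0)
x <- matrix(rnorm(300), 100, 3)
y <- rnorm(100)
b <- ols.chol(y, x)
```

The result agrees with solve(crossprod(x), crossprod(x, y)) up to round-off, but like any normal-equations method it inherits the squared condition number of x.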
-thomas
Thomas Lumley
Assoc. Professor, Biostatistics
tlumley at u.washington.edu
University of Washington, Seattle
On Wed, 25 Mar 2009, Ravi Varadhan wrote:
> Any least-squares solution that forms X'X and then inverts it is not
> to be recommended. [...] The QR decomposition approach in my
> byhand.qr() function is reliable and fast.
Forming the matrix of crossproducts and using a Cholesky decomposition
is faster, so it does depend on the intended use.
In a simulation, the OP's situation, you may well know that X is not nearly rank deficient, in which case the speed advantage may be worthwhile. After all, even if the condition number of X is 10^5 you will still have five or six accurate digits in the result.
If you are writing code that will be used in ways you have no control over, then of course it makes sense to use the more stable QR decomposition.
-thomas
Hi Dimitris and Ivo,

I think this is not a fair comparison; look at this:

x[8, 100] <- NA

system.time(for (mc in 1:MC) ols1(y[, mc], x[, mc]))
   user  system elapsed
  8.765   0.000   8.762

system.time(for (mc in 1:MC) ols2(y[, mc], x[, mc]))
Error in solve.default(t(x) %*% x) :
  system is computationally singular: reciprocal condition number = 0
Timing stopped at: 0 0 0.002

system.time(for (mc in 1:MC) ols3(y[, mc], x[, mc]))
Error in solve.default(XtX, Xty) :
  system is computationally singular: reciprocal condition number = 0
Timing stopped at: 0 0 0.001

system.time(for (mc in 1:MC) ols4(y[, mc], x[, mc, drop = FALSE]))
Error in lm.fit(x, y) : NA/NaN/Inf in foreign function call (arg 1)
Timing stopped at: 0 0 0.001

So ols2, ols3, and ols4 only work on a fully observed matrix: if there
is even one NA, these functions don't run at all, while ols1 still does.
Bernardo Rangel Tura, M.D., MPH, Ph.D.
National Institute of Cardiology
Brazil
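[Editorial note: one way to keep the fast fitters usable with missing data, not discussed in the thread, is to drop incomplete rows first with complete.cases(), mimicking lm()'s default na.omit behaviour. The wrapper name below is made up.]

```r
## Hypothetical wrapper around lm.fit() that drops rows containing NA
## in either x or y before fitting, so a single NA no longer aborts it.
ols4.na <- function(y, x) {
  ok <- complete.cases(x, y)               # TRUE for rows with no NA
  lm.fit(x[ok, , drop = FALSE], y[ok])$coefficients
}

set.seed(0)
x <- matrix(rnorm(100), 100, 1)
y <- 2 * x[, 1] + rnorm(100)
x[8, 1] <- NA                              # one missing predictor value
b <- ols4.na(y, x)
```

This matches coef(lm(y ~ x - 1)), which drops the same incomplete row.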
2 days later
An updated version of Doug Bates' R News article is available as the
"Comparisons" vignette in the Matrix package.