lm is slow because it has to set up the design matrix (X) each time. See
?model.matrix and ?model.matrix.lm for how to do this once
separately from
lm (and then use lm.fit).
I am far from an expert on numerical algebra, but I'm pretty sure your
comments below are incorrect in the sense that "naive" methods are **not**
universally better then efficient numerical algebra methods
using,say, QR
decompositions. It will depend very strongly on the size and specific
nature
of the problem. Big Oh Complexity statements ( O(n) or O(n^2), etc.)
are,
after all, asymptotic. There is also typically a tradeoff between
computational accuracy (especially for ill-conditioned matrices) and
speed
which your remarks seem to neglect.
-- Bert
Bert Gunter
Genentech Nonclinical Biostatistics
650-467-7374
-----Original Message-----
From: r-help-bounces at r-project.org [ On
Behalf Of ivo welch
Sent: Wednesday, March 25, 2009 2:30 PM
To: Dimitris Rizopoulos
Cc: r-help
Subject: Re: [R] very fast OLS regression?
thanks, dimitris. I also added Bill Dunlap's "solve(qr(x),y)"
function as ols5. here is what I get in terms of speed on a Mac Pro:
ols1 6.779 3.591 10.37 0 0
ols2 0.515 0.21 0.725 0 0
ols3 0.576 0.403 0.971 0 0
ols4 1.143 1.251 2.395 0 0
ols5 0.683 0.565 1.248 0 0
so the naive matrix operations are fastest. I would have thought that
alternatives to the naive stuff I learned in my linear algebra course
would be quicker. still, ols3 and ols5 are competitive. the
built-in lm() is really problematic. is ols3 (or perhaps even ols5)
preferable in terms of accuracy? I think I can deal with 20% speed
slow-down (but not with a factor 10 speed slow-down).
regards,
/iaw
On Wed, Mar 25, 2009 at 5:11 PM, Dimitris Rizopoulos
<d.rizopoulos at erasmusmc.nl> wrote:
> check the following options:
>
> ols1 <- function (y, x) {
> ? ?coef(lm(y ~ x - 1))
> }
>
> ols2 <- function (y, x) {
> ? ?xy <- t(x)%*%y
> ? ?xxi <- solve(t(x)%*%x)
> ? ?b <- as.vector(xxi%*%xy)
> ? ?b
> }
>
> ols3 <- function (y, x) {
> ? ?XtX <- crossprod(x)
> ? ?Xty <- crossprod(x, y)
> ? ?solve(XtX, Xty)
> }
>
> ols4 <- function (y, x) {
> ? ?lm.fit(x, y)$coefficients
> }
>