Is it possible to avoid the loop in the following function (or make the
function otherwise more efficient) and can someone point me to a
possible solution? (It would be great if hours could be reduced to
seconds :-).
# ---------------------------------------------
RanEigen=function(items=x,cases=y,sample=z)
{
X=matrix(rnorm(cases*items),nrow=cases,byrow=F)
S=crossprod(X-rep(1,cases) %*% t(colMeans(X)))
EV=eigen((1/sqrt(diag(S))*diag(items))%*%S%*%(1/sqrt(diag(S))*diag(items)),only.values=T)$values
for (i in 2:sample)
{
X=matrix(rnorm(cases*items),nrow=cases,byrow=F)
S=crossprod(X-rep(1,cases) %*% t(colMeans(X)))
EV=rbind(EV,eigen((1/sqrt(diag(S))*diag(items))%*%S%*%(1/sqrt(diag(S))*diag(items)),only.values=T)$values)
}
REigV=(cbind(1:items,colMeans(EV)))
REigV[,2]=as.numeric(formatC(REigV[,2],format="f",digits=7,flag="
",width=10))
colnames(REigV)=c(' ','Eigenvalue')
rownames(REigV)=rep('',items)
return(REigV)
}
# ---------------------------------------------
Thanks in advance,
Dirk
*************************************************
Dr. Dirk Enzmann
Criminological Research Institute of Lower Saxony
Luetzerodestr. 9
D-30161 Hannover
Germany
phone: +49-511-348.36.32
fax: +49-511-348.36.10
email: ENZMANN at KFN.uni-hannover.de
http://www.kfn.de
Avoiding loops to spare time and memory
3 messages · Dirk Enzmann, Brian Ripley, David Brahm
Have you profiled your code (see `Writing R Extensions')? My guess is that most of the time is being spent in eigen(); if so you would have to use a different approach to gain much speed. BTW, please make more use of the space bar: your code is nigh unreadable (and so I've not tried to read it in detail). `Writing R Extensions' also shows you how to format code well.
On Thu, 8 May 2003, Dirk Enzmann wrote:
Is it possible to avoid the loop in the following function (or make the
function otherwise more efficient) and can someone point me to a
possible solution? (It would be great if hours could be reduced to
seconds :-).
# ---------------------------------------------
RanEigen=function(items=x,cases=y,sample=z)
{
X=matrix(rnorm(cases*items),nrow=cases,byrow=F)
S=crossprod(X-rep(1,cases) %*% t(colMeans(X)))
EV=eigen((1/sqrt(diag(S))*diag(items))%*%S%*%(1/sqrt(diag(S))*diag(items)),only.values=T)$values
for (i in 2:sample)
{
X=matrix(rnorm(cases*items),nrow=cases,byrow=F)
S=crossprod(X-rep(1,cases) %*% t(colMeans(X)))
EV=rbind(EV,eigen((1/sqrt(diag(S))*diag(items))%*%S%*%(1/sqrt(diag(S))*diag(items)),only.values=T)$values)
}
REigV=(cbind(1:items,colMeans(EV)))
REigV[,2]=as.numeric(formatC(REigV[,2],format="f",digits=7,flag="
",width=10))
colnames(REigV)=c(' ','Eigenvalue')
rownames(REigV)=rep('',items)
return(REigV)
}
# ---------------------------------------------
Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
Dirk Enzmann <enzmann at kfn.uni-hannover.de> wrote:
Is it possible to avoid the loop in the following function (or make the function otherwise more efficient)...
Prof Brian Ripley <ripley at stats.ox.ac.uk> replied:
My guess is that most of the time is being spent in eigen(); if so you would have to use a different approach to gain much speed.
If M is an (p x q) matrix with q>>p, then at most the first p eigenvalues of
M'M are nonzero, and they can be found more efficiently using the left side of:
La.svd(M)$d^2 = eigen(crossprod(M))$values[1:p]
I've re-written your function for the situation items >> cases (I leave the
opposite situation as an exercise). I also threw in some sweep's, a zapsmall,
and a few other tricks. A timing test shows it runs 57x faster for this
example (with items=500 and cases=10):
R> set.seed(1)
R> system.time(res1 <- RanEigen.orig(500,10,30))
[1] 229.80 0.07 230.38 0.00 0.00
R> set.seed(1)
R> system.time(res2 <- RanEigen(500,10,30))
[1] 4.00 0.02 4.07 0.00 0.00
R> all.equal(res1, res2)
[1] TRUE
and here's my function:
RanEigen <- function(items, cases, sample) {
if (items < cases) stop("I assume items >= cases")
EV <- matrix(0, sample, items)
for (i in 1:sample) {
X <- matrix(rnorm(cases*items), nrow=cases)
Xb <- sweep(X, 2, colMeans(X))
S <- crossprod(Xb)
Xc <- sweep(Xb, 2, 1/sqrt(diag(S)), "*")
EV[i, 1:cases] <- zapsmall(La.svd(Xc)$d^2) # The rest are zero
}
REigV <- cbind(1:items, colMeans(EV))
dimnames(REigV) <- list(rep('',items), c(' ','Eigenvalue'))
REigV
}
-- David Brahm (brahm at alum.mit.edu)