
using mclapply (multi core apply) to do matrix multiplication

10 messages · Alaios, Rainer M Krug, Ernest Adrogué +1 more

#
7-02-2012, 00:29 (-0800); Alaios writes:
If I understand correctly, R uses a specialized library called BLAS to
do matrix multiplications. I doubt re-implementing the matrix
multiplication code at R-level would be any faster. What you can try
is replacing BLAS with a multicore version of BLAS, although it's not
easy if you have to compile it yourself.

Also, you may try to re-think the problem you're trying to solve.
Maybe there's a different approach that is less computation-intensive.
#
On 07/02/12 11:31, Alaios wrote:
You definitely can go this way, but I would STRONGLY recommend searching 
for "parallel BLAS"; check the "Linear Algebra" section of the R-admin 
manual, which deals with BLAS et al., and e.g. 
http://www.r-bloggers.com/compiling-64-bit-r-2-10-1-with-mkl-in-linux/
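As a quick way to see which linear-algebra build is actually in use, a sketch (not from the thread; the BLAS and LAPACK fields of sessionInfo() exist only in newer versions of R, 3.4 and later):

```r
# Show which BLAS/LAPACK shared libraries this R session is linked
# against (fields available in R >= 3.4).
si <- sessionInfo()
print(si$BLAS)    # path of the BLAS library in use
print(si$LAPACK)  # path of the LAPACK library in use
```

If the paths point at a reference BLAS, swapping in a tuned or multi-threaded one is where the speedup would come from.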

My guess is that parallelization at the C level in the BLAS et al. 
library will be MUCH faster than parallelization at the R level.

Also, there is an R-sig-hpc mailing list for this kind of question.

Cheers,

Rainer

  
    
#
On 07/02/12 12:02, Alaios wrote:
Me neither, on our cluster - but that won't stop you from compiling and 
installing R in your home directory. By doing this, you have even more 
control.

Cheers and good luck,

Rainer

#
7-02-2012, 02:31 (-0800); Alaios writes:
I have never used mclapply, but anyway here's a matrix multiplication
function that uses lapply. Because the two lapply calls are nested, I
don't think you can parallelize both... I would only make the second
one work with multiple cores

mmult <- function(a, b) {
  a <- as.matrix(a)
  b <- as.matrix(b)
  if (ncol(a) != nrow(b))
    stop('non-conforming matrices')
  # For each column j of b, take the dot product with every row of a;
  # unlist() then fills the result matrix column by column.
  out <- lapply(1:ncol(b), function(j)
                lapply(1:nrow(a), function(i) sum(a[i, ] * b[, j])))
  array(unlist(out), c(nrow(a), ncol(b)))
}
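A sketch of the variant suggested above, handing the outer loop to parallel::mclapply. The parallel package ships with R (2.14+); note that mc.cores > 1 is not supported on Windows, and the function name mmult_mc and the default of 2 cores are my own choices, not from the thread:

```r
library(parallel)

# Same algorithm as mmult, but the loop over columns of b is run on
# forked workers via mclapply; the inner loop uses sapply so each
# worker returns a plain numeric column.
mmult_mc <- function(a, b, cores = 2) {
  a <- as.matrix(a)
  b <- as.matrix(b)
  if (ncol(a) != nrow(b))
    stop('non-conforming matrices')
  out <- mclapply(1:ncol(b), function(j)
                  sapply(1:nrow(a), function(i) sum(a[i, ] * b[, j])),
                  mc.cores = cores)
  array(unlist(out), c(nrow(a), ncol(b)))
}
```

Since the columns of the result are independent, no communication between workers is needed, which is the easy case for mclapply.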

Also, I'm pretty sure that there are better algorithms.

If you do this, it would be interesting if you measured the execution
time of the different alternatives and posted the results :)
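The timing comparison suggested above can be sketched with system.time; the 200x200 size is an arbitrary choice, and mmult is repeated here from the post so the snippet runs on its own:

```r
# mmult as posted in the thread, repeated so this snippet is standalone.
mmult <- function(a, b) {
  a <- as.matrix(a)
  b <- as.matrix(b)
  if (ncol(a) != nrow(b))
    stop('non-conforming matrices')
  out <- lapply(1:ncol(b), function(j)
                lapply(1:nrow(a), function(i) sum(a[i, ] * b[, j])))
  array(unlist(out), c(nrow(a), ncol(b)))
}

n <- 200
a <- matrix(rnorm(n * n), n)
b <- matrix(rnorm(n * n), n)
print(system.time(r1 <- mmult(a, b)))  # R-level lapply version
print(system.time(r2 <- a %*% b))      # BLAS-backed %*%
stopifnot(isTRUE(all.equal(r1, r2, check.attributes = FALSE)))
```

Even with a reference BLAS, expect %*% to win by a large margin; the point of the exercise is to see by how much on your own build.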
#
7-02-2012, 03:32 (-0800); Alaios escriu:
This article includes an overview of different BLAS libraries along
with benchmarks:

http://cran.r-project.org/web/packages/gcbd/vignettes/gcbd.pdf

It looks like using single-threaded ATLAS is already an improvement
over LAPACK in most cases. I use Debian and it's straightforward to
replace one with the other: you only have to install the
libatlas3gf-base package and remove liblapack3gf and libblas3gf.

Unfortunately, Debian does not include a multi-threaded version of
ATLAS although they provide instructions on how to recompile the
package yourself with multi-threading enabled.
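The Debian swap described above, sketched as shell commands. The package names are the ones from the post and date from 2012; on current Debian/Ubuntu the ATLAS package is libatlas3-base, and BLAS implementations are registered side by side and switched with update-alternatives rather than removed:

```shell
# Package names as given in the post (Debian, circa 2012):
apt-get install libatlas3gf-base
apt-get remove liblapack3gf libblas3gf

# On more recent Debian/Ubuntu, installed BLAS builds coexist and can
# be switched without removing anything:
update-alternatives --config libblas.so.3
```

Restart R after switching so it picks up the newly selected shared library.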

I don't know about SUSE, sorry.
#
What is the nature of the matrices?  Are they sparse or derived 
from sparse matrices?  If they are sparse, have you looked at the 
packages available in R for sparse matrices?


             library(sos)
             summary(sp <- findFn('sparse', 999))


will identify help pages in contributed packages containing "sparse". 
The primary one is "Matrix", but there are others.


       If they are not sparse but are derived from sparse matrices, you 
might be able to do some theoretical work.  Of course, this only makes 
sense if you have a specific class of problems that generates the 
matrices, which seems plausible since you said you had square matrices 
of dimension 2^14.
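A minimal sketch of sparse multiplication with the Matrix package mentioned above. The 2^14 dimension matches the poster's matrices, but the 0.1% density is an invented figure purely for illustration:

```r
library(Matrix)

n   <- 2^14                     # same dimension as the poster's matrices
nnz <- round(0.001 * n * n)     # assume ~0.1% of entries are nonzero

# Build a random sparse matrix in triplet form; duplicate (i, j)
# positions have their values summed by sparseMatrix().
i <- sample(n, nnz, replace = TRUE)
j <- sample(n, nnz, replace = TRUE)
A <- sparseMatrix(i = i, j = j, x = rnorm(nnz), dims = c(n, n))

# %*% dispatches to sparse methods, touching only nonzero entries;
# a dense n x n product at this size would be infeasible.
B <- A %*% A
```

Whether this pays off depends entirely on the density of the poster's matrices, which is why the question about their nature matters.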


       Hope this helps.
       Spencer
On 2/7/2012 4:36 AM, Ernest Adrogué wrote: