
Problem about config Rmpi with LAM on 64 bits

3 messages · Marce, Stefan Theussl, Hao Yu

#
Hi all, I'm trying to configure Rmpi on a 64-bit cluster with LAM.
I have installed R 2.6.2 and LAM 7.1.4. I configured LAM with the
--enable-shared option, and the build went fine. After that, I set
LD_LIBRARY_PATH to the LAM lib directory that I want to use.
The problem comes when I try to install Rmpi (0.5.5 and 0.5.6 both
give the same error):
Error in dyn.load(file, ...) :
  unable to load shared library
'/home/aplicaciones/R/lib64/R/library/Rmpi/libs/Rmpi.so':
  /home/aplicaciones/R/lib64/R/library/Rmpi/libs/Rmpi.so: undefined
symbol: lam_mpi_double
Error in library(Rmpi) : .First.lib failed for 'Rmpi'
Error in dyn.unload(file.path(libpath, "libs", paste("Rmpi",
.Platform$dynlib.ext,  :
  dynamic/shared library
'/home/aplicaciones/R/lib64/R/library/Rmpi/libs/Rmpi.so' was not
loaded

Running ldd on Rmpi.so gives:

[marce@cluster ~]$ ldd /home/aplicaciones/R/lib64/R/library/Rmpi/libs/Rmpi.so
        libmpi.so.0 => /home/aplicaciones/LAM/lib/libmpi.so.0 (0x0000002a95663000)
        liblam.so.0 => /home/aplicaciones/LAM/lib/liblam.so.0 (0x0000002a9580c000)
        libutil.so.1 => /lib64/libutil.so.1 (0x0000002a95978000)
        libpthread.so.0 => /lib64/tls/libpthread.so.0 (0x0000002a95a7c000)
        libc.so.6 => /lib64/tls/libc.so.6 (0x0000002a95b91000)
        libdl.so.2 => /lib64/libdl.so.2 (0x0000002a95dc6000)
        /lib64/ld-linux-x86-64.so.2 (0x000000552aaaa000)
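
Since ldd resolves both LAM libraries, one possible next step is to check whether the library that gets loaded actually exports the missing symbol. The following is only a sketch: it assumes GNU nm is available, assumes libmpi.so.0 is the library expected to define lam_mpi_double, and uses a hypothetical helper name (symbol_in_listing) that is not part of LAM or R.

```shell
#!/bin/sh
# Succeeds if the nm-style listing on stdin has the given symbol name
# in its last column. symbol_in_listing is a hypothetical helper.
symbol_in_listing() {
    awk -v sym="$1" '$NF == sym { found = 1 } END { exit !found }'
}

# Path taken from the ldd output above.
lib=/home/aplicaciones/LAM/lib/libmpi.so.0

if [ -r "$lib" ] && command -v nm >/dev/null 2>&1; then
    # -D lists dynamic symbols; --defined-only hides undefined references.
    if nm -D --defined-only "$lib" | symbol_in_listing lam_mpi_double; then
        echo "lam_mpi_double is defined in $lib"
    else
        echo "lam_mpi_double is NOT defined in $lib"
    fi
else
    echo "cannot read $lib or nm is missing; nothing checked" >&2
fi
```

If the symbol turns out to be missing, that would point at how LAM was built rather than at the library search path.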

I have installed R and LAM/MPI on a shared filesystem (Lustre) that
is visible to all nodes. Would installing LAM locally on every node be
a better option? Has anybody configured a cluster with R, LAM and Rmpi?
Right now I'm stuck and don't know which way to go :s.

Thanks for everything

PS: The output of laminfo is:

[marce@cluster ~]$ /home/aplicaciones/LAM/bin/laminfo
         LAM/MPI: 7.1.4
         Prefix: /home/aplicaciones/LAM/
         Architecture: x86_64-unknown-linux-gnu
         Configured by: root
         Configured on: Wed Dec 10 11:45:49 CET 2008
         Configure host: cluster
         Memory manager: ptmalloc2
         C bindings: yes
         C++ bindings: yes
         Fortran bindings: yes
         C compiler: gcc
         C++ compiler: g++
         Fortran compiler: g77
         Fortran symbols: double_underscore
         C profiling: yes
         C++ profiling: yes
         Fortran profiling: yes
         C++ exceptions: no
         Thread support: yes
         ROMIO support: yes
         IMPI support: no
         Debug support: no
         Purify clean: no
            SSI boot: globus (API v1.1, Module v0.6)
            SSI boot: rsh (API v1.1, Module v1.1)
            SSI boot: slurm (API v1.1, Module v1.0)
            SSI coll: lam_basic (API v1.1, Module v7.1)
            SSI coll: shmem (API v1.1, Module v1.0)
            SSI coll: smp (API v1.1, Module v1.2)
            SSI rpi: crtcp (API v1.1, Module v1.1)
            SSI rpi: lamd (API v1.0, Module v7.1)
            SSI rpi: sysv (API v1.0, Module v7.1)
            SSI rpi: tcp (API v1.0, Module v7.1)
            SSI rpi: usysv (API v1.0, Module v7.1)
            SSI cr: self (API v1.0, Module v1.0)
#
Hello Marce,

I can load version 0.5.6 as well as 0.5.5 of Rmpi without any problems
on our cluster. My LAM installation is also on a shared filesystem
(NFS in my case), so this should really not be a problem. The only
difference is that I use R 2.8.0 patched and LAM version 7.1.3. I
haven't upgraded to the new LAM yet.

Anyway, have you tried to compile and run a simple MPI program? Is
there a reason why you use R 2.6.2? If not, I would recommend using
the latest version of R.
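
A simple MPI test of the kind suggested above could look roughly like this. It is a sketch, not a tested recipe for this cluster: it assumes LAM's mpicc and mpirun wrappers are on PATH and that lamboot has already been run, and the file name hello_mpi.c is made up for the example.

```shell
#!/bin/sh
# Write a minimal MPI program into a temporary directory.
workdir=$(mktemp -d)
cat > "$workdir/hello_mpi.c" <<'EOF'
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF

# Build and run only if the MPI wrappers are available.
# Assumption: under LAM, lamboot must have been run before mpirun.
if command -v mpicc >/dev/null 2>&1 && command -v mpirun >/dev/null 2>&1; then
    { mpicc "$workdir/hello_mpi.c" -o "$workdir/hello_mpi" &&
        mpirun -np 2 "$workdir/hello_mpi"; } ||
        echo "build or run failed; see messages above" >&2
else
    echo "mpicc/mpirun not found; source left in $workdir/hello_mpi.c" >&2
fi
```

If this fails outside R, the problem is in the LAM installation itself and Rmpi is not the place to look.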

Furthermore, I strongly recommend using gfortran instead of g77 to
compile LAM. I see from your laminfo output that you have

Fortran compiler: g77
Fortran symbols: double_underscore

whereas I have

Fortran compiler: gfortran
Fortran symbols: underscore

I typically call configure with these parameters:

./configure --enable-shared CC=gcc CXX=g++ FC=gfortran \
    --prefix=/some/path/on/NFS_storage
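
After rebuilding, the two Fortran fields compared above can be pulled out of laminfo for a quick check. This is a small sketch; laminfo_field is a hypothetical helper written for this example, and it only assumes the "Name: value" layout visible in the laminfo output quoted earlier in the thread.

```shell
#!/bin/sh
# Print the value of one "Name: value" field from laminfo-style text on stdin.
# laminfo_field is a hypothetical helper, not part of LAM.
laminfo_field() {
    sed -n "s/^[[:space:]]*$1:[[:space:]]*//p"
}

if command -v laminfo >/dev/null 2>&1; then
    info=$(laminfo)
    echo "Fortran compiler: $(printf '%s\n' "$info" | laminfo_field 'Fortran compiler')"
    echo "Fortran symbols:  $(printf '%s\n' "$info" | laminfo_field 'Fortran symbols')"
else
    echo "laminfo not found in PATH" >&2
fi
```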

Hope this helps.

Best,
Stefan
#
Hi Marce,

R and LAM on NFS should work. Your errors seem to indicate that R cannot
find the path to the LAM libraries. Can you make sure LAM is working? Try
lamboot -v
lamnodes
lamexec C hostname

Any error means LAM is not configured correctly, and Rmpi will not work.
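
The three checks above can also be run in one go, stopping at the first failure. This is just a sketch; check_lam is a made-up function name, and the script only wraps the three commands already listed.

```shell
#!/bin/sh
# Run the three LAM checks in order and stop at the first one that fails.
# check_lam is a hypothetical wrapper, not a LAM command.
check_lam() {
    for cmd in "lamboot -v" "lamnodes" "lamexec C hostname"; do
        echo "== $cmd"
        # $cmd is deliberately unquoted so it splits into command and arguments.
        if ! $cmd; then
            echo "FAILED: $cmd" >&2
            return 1
        fi
    done
    echo "all three LAM checks passed"
}

if command -v lamboot >/dev/null 2>&1; then
    check_lam
else
    echo "lamboot not found in PATH; is LAM's bin directory on PATH?" >&2
fi
```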

If LAM is working but R still gives the same errors, can you modify the R
script in R's bin directory to add something like this (as the third line):
LD_LIBRARY_PATH=/usr/local/lam/lib
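
A slightly more defensive variant of that line keeps whatever LD_LIBRARY_PATH already contains and exports the result. This is a sketch only: prepend_ld_library_path is a hypothetical helper, and /usr/local/lam/lib is simply the example path from the line above, which would need to be replaced by the real LAM lib directory.

```shell
#!/bin/sh
# Prepend a directory to LD_LIBRARY_PATH unless it is already listed,
# then export the variable so child processes (such as R) see it.
# prepend_ld_library_path is a hypothetical helper name.
prepend_ld_library_path() {
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$1:"*) ;;  # already present, leave the value unchanged
        *) LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
    export LD_LIBRARY_PATH
}

# Example path from the message above; adjust to the actual LAM prefix.
prepend_ld_library_path /usr/local/lam/lib
```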

This is how I configure R and LAM to work on my cluster. I have LAM and
Open MPI coexisting, but I keep them in separate paths (neither is in
PATH). I installed two copies of R and modified each R script with a
different LD_LIBRARY_PATH, so each R works with its own MPI.

Hao