Rmpi loads 2 versions of the same library [SOLVED, BUT..]
I'm happy to report that Rmpi now loads only my personal MPI libraries.
I believe the critical change was to the dlopen code in Rmpi.c to be
//Ross Boylan changes order to search for mpi.so before mpi.so.1
// 2014-03-13
if (!dlopen("libmpi.so", RTLD_GLOBAL | RTLD_LAZY)
&& !dlopen("libmpi.so.0", RTLD_GLOBAL | RTLD_LAZY)){
but I changed a lot of other things too: rebuilt MPI with special
options, rebuilt local copy of R set for local MPI; rebuilt Rmpi against
both. I followed the advice from Bennet Fauber here:
http://www.open-mpi.org/community/lists/users/2014/03/23823.php
(though I didn't do precisely what he said).
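The search-order change described above can be sketched as a small standalone helper. This is only an illustration, not the actual Rmpi.c code: the function name load_first_available and its signature are hypothetical; only dlopen and the RTLD_GLOBAL | RTLD_LAZY flags come from the snippet quoted above.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Try each candidate library name in order and return the first one that
 * dlopen() accepts, or NULL if none can be loaded. The handle, if any, is
 * stored in *handle_out. RTLD_GLOBAL makes the library's symbols visible
 * to objects loaded later, as in the Rmpi.c snippet. */
static const char *load_first_available(const char *const names[], size_t n,
                                        void **handle_out)
{
    for (size_t i = 0; i < n; i++) {
        void *h = dlopen(names[i], RTLD_GLOBAL | RTLD_LAZY);
        if (h != NULL) {
            if (handle_out != NULL)
                *handle_out = h;
            return names[i];
        }
    }
    if (handle_out != NULL)
        *handle_out = NULL;
    return NULL;
}
```

Calling it with the array {"libmpi.so", "libmpi.so.0"} would mirror the modified order; returning the name that succeeded makes it easy to log which candidate was actually picked up.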
I'm not so happy to report that the original problem that motivated the
whole exercise remains; in fact it's gotten slightly worse.
mpi.isend.Robj does not seem to be working properly. I am sending to a
fake receiver (at rank 1) that does nothing but print a message when it
gets a message. r is a list with
> length(serialize(r, NULL))
[1] 599499
> mpi.send.Robj(1, 1, 4)
Fake Assembler: 0 4 numeric
> mpi.send.Robj(r, 1, 4) # send of r works
NULL
Fake Assembler: 0 4 list
> mpi.isend.Robj(1, 1, 4) # isend of number works
Fake Assembler: 0 4 numeric
> mpi.isend.Robj(r, 1, 4) # sometimes this used to work the first time
> mpi.isend.Robj(r, 1, 4)
> mpi.send.Robj(r, 1, 4) # sometimes used to get previous message unstuck
# never get the command prompt back
Ross
On Thu, 2014-03-13 at 12:16 -0700, Ross Boylan wrote:
I've been trying to get Rmpi to work with my personal copy of MPI, which
is newer than the system's. Even when I set LD_LIBRARY_PATH
appropriately, and build Rmpi with
export LD_LIBRARY_PATH=/home/ross/install/lib:$LD_LIBRARY_PATH
export PATH=/home/ross/install/bin:$PATH
# Not sure what I should use for --with-mpi
R CMD INSTALL Rmpi \
  --configure-args='--with-Rmpi-include=/home/ross/install/include --with-Rmpi-libpath=/home/ross/install/lib --with-mpi=/home/ross/install --with-Rmpi-type=OPENMPI'
I find that the R process opens both the system and personal copies of
mpi-related libs (according to lsof and /proc/nnn/maps). ldd on my
Rmpi.so shows only references to my local copies. I think the paths
shown by ldd are simply advisory.
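The check against the process's memory map can also be done from code. The sketch below scans a maps-style file for lines that mention a given substring; count_mappings is a hypothetical helper name, and it assumes a Linux /proc filesystem.

```c
#include <stdio.h>
#include <string.h>

/* Print every line of a /proc/<pid>/maps-style file that contains `needle`
 * and return how many such lines were found, or -1 if the file cannot be
 * opened. A single library normally appears on several lines (one per
 * mapped segment), so the count is mappings, not libraries. */
static int count_mappings(const char *maps_path, const char *needle)
{
    FILE *f = fopen(maps_path, "r");
    if (f == NULL)
        return -1;

    char line[4096];
    int count = 0;
    while (fgets(line, sizeof line, f) != NULL) {
        if (strstr(line, needle) != NULL) {
            fputs(line, stdout);
            count++;
        }
    }
    fclose(f);
    return count;
}
```

For example, count_mappings("/proc/self/maps", "libmpi") run inside the process of interest would list every mapped libmpi file with its full path, which shows whether the system copy, the personal copy, or both are present.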
I think the cause is this code in Rmpi.c:
if (!dlopen("libmpi.so.0", RTLD_GLOBAL | RTLD_LAZY)
&& !dlopen("libmpi.so", RTLD_GLOBAL | RTLD_LAZY)){
http://www.stats.uwo.ca/faculty/yu/Rmpi/changelogs.htm notes
----------------------------------
2007-10-24, version 0.5-5:
dlopen has been used to load libmpi.so explicitly. This is mainly useful for Rmpi under OpenMPI where one might see many error messages:
mca: base: component_find: unable to open osc pt2pt: file not found (ignored)
if libmpi.so is not loaded with RTLD_GLOBAL flag.
-------------------------------------
I'm not sure which version of mpi ends up getting used.
I also don't know why libmpi.so.0 is preferred to libmpi.so.1 in the
explicit load above.
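One way to see which copy actually supplies a given symbol is to look the symbol up in the global scope and ask the dynamic linker which file it came from. The following is a sketch under the assumption of glibc (dladdr and RTLD_DEFAULT are GNU extensions); object_providing is a hypothetical helper name, and it would have to run inside the R process after Rmpi is loaded.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>

/* Look up `symbol` in the global symbol scope and copy the path of the
 * shared object that defines it into buf. Returns 0 on success and -1 if
 * the symbol is not found or its origin cannot be determined. */
static int object_providing(const char *symbol, char *buf, size_t buflen)
{
    void *addr = dlsym(RTLD_DEFAULT, symbol);
    Dl_info info;

    if (addr == NULL || dladdr(addr, &info) == 0 || info.dli_fname == NULL)
        return -1;

    snprintf(buf, buflen, "%s", info.dli_fname);
    return 0;
}
```

Calling object_providing("MPI_Init", buf, sizeof buf) would then report the path of the library whose MPI_Init wins symbol resolution, which is the copy that effectively gets used when two versions are loaded.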
Using LD_DEBUG shows
24312: file=libmpi.so.1 [0]; needed by /home/ross/Rlib-3.0.1/Rmpi/libs/Rmpi.so [0]
24312: find library=libmpi.so.1 [0]; searching
24312: search path=/usr/lib64/R/lib:/home/ross/install/lib (LD_LIBRARY_PATH)
24312: trying file=/usr/lib64/R/lib/libmpi.so.1
24312: trying file=/home/ross/install/lib/libmpi.so.1
and, later,
24312: file=libmpi.so.0 [0]; needed by /home/ross/Rlib-3.0.1/Rmpi/libs/Rmpi.so [0]
24312: find library=libmpi.so.0 [0]; searching
24312: search path=/usr/lib64/R/lib:/home/ross/install/lib (LD_LIBRARY_PATH)
24312: trying file=/usr/lib64/R/lib/libmpi.so.0
24312: trying file=/home/ross/install/lib/libmpi.so.0
24312: search cache=/etc/ld.so.cache
24312: trying file=/usr/lib/libmpi.so.0
Does anyone know what's going on?
Ross Boylan
P.S. This might be relevant:
24300: calling init: /home/ross/Rlib-3.0.1/Rmpi/libs/Rmpi.so
24300:
24300: opening file=/home/ross/Rlib-3.0.1/Rmpi/libs/Rmpi.so [0]; direct_opencount=1
24300:
24300: /home/ross/Rlib-3.0.1/Rmpi/libs/Rmpi.so: error: symbol lookup error: undefined symbol: R_init_Rmpi (fatal)
24300: /home/ross/Rlib-3.0.1/Rmpi/libs/Rmpi.so: error: symbol lookup error: undefined symbol: R_init_Rmpi (fatal)