Rmpi loads 2 versions of the same library [possible cause]
On Thu, 2014-03-13 at 12:57 -0700, Ross Boylan wrote:
I'm not so happy to report that the original problem that motivated the whole exercise remains; in fact it's gotten slightly worse. mpi.isend.Robj does not seem to be working properly. I am sending to a fake receiver (at rank 1) that does nothing but print a message when it gets a message. r is a list with
Switching to mpi.send.Robj allowed everything to work. I speculate that R was garbage collecting the bytes to be sent before the non-blocking send started by MPI_Isend had finished transmitting them.

1. Messages from mpi.isend were arriving at the MPI level. The problem was that they were corrupt, and when the receiver (in Rmpi's R code) tried to unserialize them it threw an error and stopped the process.

2. I tried to compare the bytes sent to the bytes received, and they don't entirely fit the garbage collection theory: the first difference was at the 10th byte, though most differences were later. I would expect at least the first part of the buffer to go out correctly. However, I'm not sure I had the correct "before" bytes. (When I tried to save the object being sent, everything worked, so I had to compare bytes from different runs.)

It is also possible that fuller use will disprove the theory that mpi.send solves the problem.

Ross
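For anyone wanting to repeat the byte comparison in point 2, here is a minimal sketch of the idea. It is written in Python purely for illustration and is not the code I actually ran. The helper name compare_buffers and the sample data are made up. It assumes the sent and received buffers have already been dumped as raw bytes. Offsets are reported 1-based so they line up with "the 10th byte".

```python
def compare_buffers(sent: bytes, received: bytes) -> dict:
    """Report where two byte buffers differ, using 1-based offsets."""
    n = min(len(sent), len(received))
    # Collect every position (1-based) in the common prefix where the bytes differ.
    mismatches = [i + 1 for i in range(n) if sent[i] != received[i]]
    return {
        "first_diff": mismatches[0] if mismatches else None,
        "n_diff": len(mismatches),
        "len_sent": len(sent),
        "len_received": len(received),
    }


# Made-up data: a 32-byte buffer with two bytes flipped to simulate corruption.
sent = bytes(range(32))
corrupted = bytearray(sent)
corrupted[9] ^= 0xFF   # 10th byte (1-based)
corrupted[20] ^= 0xFF  # 21st byte (1-based)

report = compare_buffers(sent, bytes(corrupted))
print(report)
```

The same comparison could be done on R raw vectors. The lengths are reported separately because a truncated message and a corrupted one would point to different causes.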