
using openbabel plugins in R

6 messages · Kevin Horan, Dirk Eddelbuettel, Simon Urbanek

#
I posted this in openbabel-devel but didn't get much help, so hopefully 
someone here can help. I don't think it's too openbabel specific.

I would like to make use of open babel from within the R language.
Initially I just need to do some format conversions, but may expand the
usage to other parts of OpenBabel as well. I am familiar with embedding
C/C++ code in R, but I'm having some trouble with the plugin mechanism
of OpenBabel in this case. The  problem is that the formats are not
available when I run the OpenBabel code from within R. So, for example,
if I search for the SDF format like so:
      OBFormat *format = conv.FindFormat("SDF");
I always get back a 0 value. The same chunk of code executed outside of
R, as a normal stand-alone program, works fine. So does anyone know how
I can ensure that the formats get loaded? Thanks.
      One other thing to mention, someone might suggest linking against a
static version of openbabel which includes all the plugins. I would like
to avoid that if possible since this needs to work in an R package that
will be distributed across platforms, so it would be hard to ask people
to compile a special, static, version of openbabel just to compile this
R package. Since it needs to work on Windows, Mac and Linux, it would be
nice if I could make use of any existing installed shared openbabel
libraries. If it turns out it can't be done, then I'll go down that
path. Thanks.

      Here is an example of the problem:

test program (obtest2.cc):

     #include <iostream>
     #include <openbabel/obconversion.h>
     #include <R.h>
     #include <Rinternals.h>

     extern "C" {   SEXP test(); }

     int main(){
         test();
     }
     SEXP  test()
     {
         OpenBabel::OBConversion conv;
         // search for SDF format
         OpenBabel::OBFormat *format = conv.FindFormat("SDF");
         // print out search result, either 0 or an address
         std::cout<<"format: "<<format<<std::endl;
         return R_NilValue;
     }


compile:
      g++  -I/usr/include/openbabel-2.0 -I/usr/share/R/include -fpic -c obtest2.cc -o obtest2.o
      g++ -o obtest2 obtest2.o -fpic  -lopenbabel -lR                  # Executable
      g++ -shared -o libobtest2.so obtest2.o -fpic  -lopenbabel -lR    # R library

Run executable:
      $ ./obtest2
      format: 0x7f1858275d20  #found some result, this is what I expect

Run in R:
      R>dyn.load("libobtest2.so")
      R>.Call("test")
          format: 0              # the format was not found, so 0 was returned
          NULL


After some more experimentation, I have discovered I can get it to work 
in the following way, but I think it is a bit impractical. If I compile 
the shared library as:
     g++ -shared -o libobtest2.so obtest2.o -fpic /usr/lib/openbabel/2.2.3/mdlformat.so -lopenbabel -lR
so the name of one of the plugins is specified. Then, in R I run:

     R>dyn.load("/usr/lib/openbabel/2.2.3/mdlformat.so")
     R>dyn.load("libobtest2.so")
     R>.Call("test")
         format: 0x7fe114c96d20
         NULL
So then it works. But this requires that I know the full path to every 
plugin both when the code is compiled and when the library is loaded. Is 
there a practical way to do this, say, if this were part of an R 
package? I have also tried compiling the shared library as before, 
without the plugin, and then just loading the plugin with dyn.load, but 
this does not work. It seems like it should, though. Does anyone know why 
it doesn't? Conversely, if you compile with the plugin specified but 
don't load it with dyn.load, it segfaults.
     The way it works normally in OpenBabel is that each plugin is its 
own shared library and then they get loaded at run time with the dlopen 
function (on linux at least). I have verified that this code is still 
being executed when called from within R, but it doesn't work for some 
reason.
     Also, swig does not help.
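
For readers unfamiliar with this kind of plugin mechanism, the following is a simplified model of self-registering plugins. It is not OpenBabel's actual code: the names Format, registry, find_format and SdfFormat are all hypothetical. It only illustrates why a lookup returns 0 when the plugin object was never constructed in (or never made visible to) the process doing the lookup.

```cpp
// Simplified model of self-registering plugins; NOT OpenBabel's actual code.
#include <map>
#include <string>

// Hypothetical stand-in for a format plugin base class.
struct Format {
    virtual ~Format() = default;
    virtual const char* description() const = 0;
};

// Global registry, created on first use so that it already exists when
// a plugin's static constructor runs.
std::map<std::string, Format*>& registry() {
    static std::map<std::string, Format*> formats;
    return formats;
}

// Returns the registered format, or nullptr when nothing has registered
// under that id (the "format: 0" case from the post).
Format* find_format(const std::string& id) {
    auto it = registry().find(id);
    return it == registry().end() ? nullptr : it->second;
}

// A "plugin": constructing the object registers it under its id.
struct SdfFormat : Format {
    SdfFormat() { registry()["SDF"] = this; }
    const char* description() const override { return "MDL SD file"; }
};

// In a real plugin this would be a file-scope object inside the plugin's
// shared library, constructed when that library is loaded.
static SdfFormat the_sdf_format;
```

In this model, find_format("SDF") succeeds only because the_sdf_format was constructed before the lookup. If the plugin library is never loaded, or registers into a different copy of the registry, the lookup returns a null pointer.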

Thanks.

Kevin
#
On 25 March 2013 at 12:50, Kevin Horan wrote:
| I posted this in openbabel-devel but didn't get much help, so hopefully 
| someone here can help. I don't think its too openbabel specific.
| 
| I would like to make use of open babel from within the R language.
| Initially I just need to do some format conversions, but may expand the
| usage to other parts of OpenBabel as well. I am familiar with embedding
| C/C++ code in R, but I'm having some trouble with the plugin mechanism
| of OpenBabel in this case. The  problem is that the formats are not
| available when I run the OpenBabel code from within R. So, for example,
| if I search for the SDF format like so:
|       OBFormat *format = conv.FindFormat("SDF");

[...]

|      The way it works normally in OpenBabel is that each plugin is its 
| own shared library and then they get loaded at run time with the dlopen 
| function (on linux at least). I have verified that this code is still 
| being executed when called from within R, but it doesn't work for some 
| reason.

I would try to start from the smallest possible working examples.  R itself
uses dlopen (see eg $RHOME/modules/ for the files it loads), and so does
OpenBabel. Maybe some wires get crossed. You may well have to dig down with
the debugger and see what assumptions / environment variables / ... are valid
or invalid between the working and non-working case.

Dirk
1 day later
#
After some more testing I have found that it actually does work if I 
compile without the plugin library but load it with dyn.load. I'm not 
sure why this wasn't working before. It only works though if the plugin 
library is loaded before libobtest2.so (the open babel main lib basically).
     So, to clarify, the following works now:

g++ -shared -o libobtest2.so obtest2.o -fpic -lopenbabel -lR

R>dyn.load("/usr/lib/openbabel/2.2.3/mdlformat.so")
R>dyn.load("libobtest2.so")
R>.Call("test")
   format: 0x7fe114c96d20  #this is the correct result
   NULL


     But now I have a chicken and egg problem. The plugin libraries are 
not stored in a standard directory, but open babel provides a function 
to list their paths. So I need to load the open babel library to fetch 
the plugin paths, then I can load the plugins, but, oops, too late, the 
open babel library is already loaded so loading the plugins now doesn't 
work. I tried using dyn.unload("libobtest2.so") but it didn't work. It 
seems like I'd have to compile a small executable program that uses 
openbabel to fetch the plugin paths, then run it as an external program 
from within R, then load the plugins, then load the open babel lib.
     Does it make any sense that the order in which these are loaded 
affects things? Is there a way to load the plugin lib later and still 
have  it work? If the order does have to be maintained, any better ideas 
how to accomplish this? Thanks.
     Also, here is the dlopen command that openbabel uses:
         dlopen(lib_name.c_str(), RTLD_LAZY | RTLD_GLOBAL)
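
When debugging a case like this, it can help to reproduce that call in isolation and print dlerror() on failure. The sketch below assumes a POSIX system; load_plugin is a hypothetical helper name, not an OpenBabel function. RTLD_GLOBAL makes the library's symbols available for resolving references in libraries loaded afterwards.

```cpp
#include <dlfcn.h>
#include <iostream>
#include <string>

// Hypothetical helper: open a shared library with the same flags quoted
// above and report why loading failed, if it did.
bool load_plugin(const std::string& path) {
    dlerror();  // clear any stale error message
    void* handle = dlopen(path.c_str(), RTLD_LAZY | RTLD_GLOBAL);
    if (handle == nullptr) {
        const char* msg = dlerror();
        std::cerr << "dlopen failed for " << path << ": "
                  << (msg ? msg : "unknown error") << std::endl;
        return false;
    }
    // The handle is deliberately never closed: the plugin should stay
    // loaded for the lifetime of the process.
    return true;
}
```

On older glibc versions this needs -ldl at link time.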


Kevin
On 03/26/2013 06:54 AM, Dirk Eddelbuettel wrote:
#
On 27 March 2013 at 10:02, Kevin Horan wrote:
| After some more testing I have found that it actually does work if I 
| compile without the plugin library but load it with dyn.load. I'm not 
| sure why this wasn't working before. It only works though if the plugin 
| library is loaded before libobtest2.so (the open babel main lib basically).
|      So, to clarify, the following works now:
| 
| g++ -shared -o libobtest2.so obtest2.o -fpic -lopenbabel -lR
| 
| R>dyn.load("/usr/lib/openbabel/2.2.3/mdlformat.so")
| R>dyn.load("libobtest2.so")
| R>.Call("test")
|    format: 0x7fe114c96d20  #this is the correct result
|    NULL
| 
| 
|      But now I have a chicken and egg problem. The plugin libraries are 
| not stored in a standard directory, but open babel provides a function 
| to list their paths. So I need to load the open babel library to fetch 
| the plugin paths, then I can load the plugins, but, oops, too late, the 
| open babel library is already loaded so loading the plugins now doesn't 

Can you use something like pkg-config to query the path?  E.g., R offers this
(beyond its own "R CMD config ..." interface):

    edd at max:~$ pkg-config --libs-only-L libR
    -L/usr/lib/R/lib  
    edd at max:~$ 

| work. I tried using dyn.unload("libobtest2.so") but it didn't work. It 
| seems like I'd have to compile a small executable program that uses 
| openbabel to fetch the plugin paths, then run it as an external program 
| from within R, then load the plugins, then load the open babel lib.

Yup. And you could do the test / probing (if that is the last resort) at
configure time.

|      Does it make any sense that the order in which these are loaded 
| affects things? Is there a way to load the plugin lib later and still 
| have  it work? If the order does have to be maintained, any better ideas 
| how to accomplish this? Thanks.
|      Also, here is the dlopen command that openbabel uses:
|          dlopen(lib_name.c_str(), RTLD_LAZY | RTLD_GLOBAL)

That rings a bell. We once had what I think was that very same issue with
Rmpi as the OpenMPI libraries have their symbols split over several shared
libraries.  But that was many many years ago and I have forgotten what we did
then ...

Dirk

 
| ______________________________________________
| R-devel at r-project.org mailing list
| https://stat.ethz.ch/mailman/listinfo/r-devel
#
On Mar 27, 2013, at 1:02 PM, Kevin Horan wrote:

            
Run the egg in a separate R process (i.e. use system() to call another R process which loads libobtest2.so and calls the API to get the path). Then load the modules followed by the .so. 
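
If the probing is done from compiled code rather than from R's system(), the same idea can be sketched with POSIX popen. This is only an illustration: run_and_capture is a hypothetical helper, and the command that would print the plugin directory is left to the caller.

```cpp
#include <cstdio>
#include <stdexcept>
#include <string>

// Run a command in a child process and return everything it wrote to
// stdout. The child's loaded libraries do not affect the parent.
std::string run_and_capture(const std::string& command) {
    FILE* pipe = popen(command.c_str(), "r");
    if (pipe == nullptr) {
        throw std::runtime_error("popen failed for: " + command);
    }
    std::string output;
    char buffer[256];
    while (fgets(buffer, sizeof buffer, pipe) != nullptr) {
        output += buffer;
    }
    pclose(pipe);
    return output;
}
```

Because the query runs in a separate process, the parent can still load the plugins first and the main library afterwards. popen is not available under that name on Windows, so this is no more portable than the approach it illustrates.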

Note that all this is inherently fragile and probably not portable (e.g. it seems to assume flat namespaces). Another way (only slightly less fragile) is to link libobtest2.so against the modules directly.

The real issue seems to be in openbabel - there is really no reason why it shouldn't be loading the modules. I didn't look at it, but it could be that it is simply trying to detect things in the wrong namespace and thus mis-detecting something in R as its own. It's obviously a bad design in openbabel as it's polluting the global namespace, but that's another story... (Linux users won't notice as Linux doesn't support two-level namespaces AFAIK).

Cheers,
Simon
#
Thanks for all the suggestions. I have discovered that the problem is 
fixed in openbabel 2.3.x. I had actually been testing with open babel 
2.3.2 (as well as 2.2.3), but I was running it from the build directory 
since I didn't want to install it on the system as I was only testing 
with it. Because of this, 2.3.2 failed in the same way as 2.2.3 (the 
version actually installed), but for a different reason (it couldn't 
find the modules directory). Actually installing 2.3.2 fixed the problem.
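
For anyone who needs to test against an uninstalled build, one possible workaround (not something tried in this thread) is to point OpenBabel at the plugin directory through the BABEL_LIBDIR environment variable before any OpenBabel object is created. The sketch below assumes a POSIX system; set_plugin_dir is a hypothetical helper, and the directory passed to it is up to the caller.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: set BABEL_LIBDIR so that OpenBabel searches the
// given directory for its plugins. Must be called before the first
// OpenBabel object is constructed. Returns true on success.
bool set_plugin_dir(const std::string& dir) {
    // The third argument (1) overwrites any existing value.
    return setenv("BABEL_LIBDIR", dir.c_str(), 1) == 0;
}
```

From R, the equivalent would be Sys.setenv(BABEL_LIBDIR = ...) before dyn.load.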

Kevin
On 03/27/2013 11:20 AM, Simon Urbanek wrote: