Including multiple third party libraries in an extension
On Nov 13, 2011, at 9:55 PM, Tyler Pirtle wrote:
On Sun, Nov 13, 2011 at 6:25 PM, Simon Urbanek <simon.urbanek at r-project.org> wrote:
On Nov 13, 2011, at 6:48 PM, Tyler Pirtle wrote:
On Sun, Nov 13, 2011 at 7:27 AM, Uwe Ligges <ligges at statistik.tu-dortmund.de> wrote:
On 13.11.2011 05:22, Tyler Pirtle wrote:
On Sat, Nov 12, 2011 at 8:08 PM, Tyler Pirtle<rtp at google.com> wrote:
Thanks Simon, a few replies...
On Sat, Nov 12, 2011 at 6:14 AM, Simon Urbanek <simon.urbanek at r-project.org> wrote:
Tyler,
On Nov 11, 2011, at 7:55 PM, Tyler Pirtle wrote:
Hi,
I've got a C extension structured roughly like:
package/
  src/
    Makevars
    foo.c
    some-lib/...
    some-other-lib/...
where foo.c and Makevars define dependencies on some-lib and some-other-lib. Currently I'm having Makevars configure and make install some-lib and some-other-lib into a local build directory, which produces shared libraries that I ultimately reference for foo.o in PKG_LIBS.
I'm concerned about distribution. I've set up the appropriate magic with rpath for the package's .so
That is certainly non-portable and won't work for the vast majority of users.
Yea I figured, but apparently I have other, more pressing problems.. ;)
(meaning that when the final .so is produced, its dynamic library dependencies on some-lib and some-other-lib will prefer the locations built in src/some-lib/... and src/some-other-lib/...). But does this preclude me from being able to distribute a binary package?
Yes. And I doubt the package will work the way you described it at all, because the "deep" .so won't even be installed. Also, there are potential issues in multi-arch R (please consider testing that as well).
Understood. I wasn't a fan of any of the potential solutions I'd seen (one of which included source-only availability). I've seen some other folks using the inst/ or data/ dirs for purposes like this, but I agree it's ugly and has issues. You raise a great point, too, about multi-arch R. I have potential users that are definitely on heterogeneous architectures. I noticed that when I run R CMD INSTALL --build . to check my current build, I end up with a src-${ARCH} for both x86_64 and i386 - is there more explicit multi-arch testing I should be doing?
If I do want to build a binary distribution, is there a way I can package up everything needed, not just the resulting .so? Or, are there better ways to bundle extension-specific third party dependencies? ;) I'd rather not have my users install obscure libraries globally on their systems.
Typically the best solution is to compile the dependencies with --disable-shared --enable-static --with-pic (in autoconf speak - you don't need to actually use autoconf). That way your .so has all its dependencies inside and you avoid all run-time hassle. Note that it is very unlikely that you can take advantage of the dynamic nature of the dependencies (since no one else knows about them anyway), so there is no real point in building them dynamically.
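A minimal Makevars sketch of that approach could look like the following. This is only an illustration: "some-lib" and "libsome.a" are placeholder names from the layout above, not a real library, and the exact configure options depend on the dependency's own build system.

```make
# src/Makevars -- sketch only; "some-lib"/"libsome.a" are placeholders.
# Build the bundled dependency as a static, position-independent archive,
# then link the archive into the package .so via PKG_LIBS.
DEPDIR = $(CURDIR)/some-lib
PKG_CPPFLAGS = -I$(DEPDIR)/build/include
PKG_LIBS = $(DEPDIR)/build/lib/libsome.a

# Make sure the dependency is built before R links the shared object.
$(SHLIB): $(DEPDIR)/build/lib/libsome.a

$(DEPDIR)/build/lib/libsome.a:
	cd $(DEPDIR) && ./configure --disable-shared --enable-static \
	  --with-pic --prefix=$(DEPDIR)/build && $(MAKE) && $(MAKE) install
```

Because the archive is static and PIC, the linker folds the needed objects into the package's .so and nothing has to be resolved at load time.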
That is a much better solution and the one I've been looking for! I was afraid I'd have to manually specify all the dependency objects, but if I just disable shared then that makes much more sense - I can let the compiler and linker do the work for me.
Also note that typically you want to use the package-level configure to run subconfigures, and *not* Makevars. (There may be reasons for an exception to that convention, but you need to be aware of the differences in multi-arch builds, since Makevars builds all architectures at once from separate copies of the src directories, whereas the presence of configure allows you to treat your package as one architecture at a time and you can pass through parameters.)
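As a sketch, a package-level configure along these lines could drive the sub-builds, picking up the compiler and flags R was configured with and passing them through (the some-lib/some-other-lib names are the hypothetical ones from this thread):

```
#!/bin/sh
# configure (package top level) -- sketch only.
# Run each bundled library's own configure, forwarding R's compiler settings
# so the dependencies are built with the same toolchain as the package.
: "${R_HOME:=$(R RHOME)}"
CC=$("${R_HOME}/bin/R" CMD config CC)
CFLAGS=$("${R_HOME}/bin/R" CMD config CFLAGS)
export CC CFLAGS

for dep in some-lib some-other-lib; do
  (cd "src/${dep}" && \
   ./configure --disable-shared --enable-static --with-pic \
               --prefix="$(pwd)/build") || exit 1
done
```

Since configure is run once per architecture on multi-arch platforms, each run can configure the dependencies for exactly the architecture being built.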
Understood. Is src/ still the appropriate directory for my third party packages? Also, do you happen to know of any packages off-hand that I can use as a reference?
Thanks Simon! Your insights here are invaluable. I really appreciate it.
Tyler
Ah, also a few more questions...
I don't really understand the flow for developing multi-arch extensions.
Does configure run only once?
Depends on the platform. For example: If you are on Windows and have a configure.win, you can tell R to run it for each architecture: See the R Installation and Administration manual and also
R CMD INSTALL --help which has, e.g., under Windows:
--force-biarch attempt to build both architectures
even if there is a non-empty configure.win
Once per arch? What is the state of src-${ARCH} by the time src/Makevars or Makefile is executed? Is any of this actually in the manual and am I just missing it? ;)
The Makevars/Makefile is executed once for each architecture.
And why does R_ARCH start with a '/'? ;)
It is typically used as part of a path's name.
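To illustrate why the leading '/' is convenient: R_ARCH can be appended directly to a directory name with no separator, and on single-arch builds (where R_ARCH is empty) the same expression still yields a valid path. A quick sketch, with "/x86_64" as an example value:

```shell
# R_ARCH carries its own leading slash, so "libs${R_ARCH}" forms a valid
# sub-path; when R_ARCH is empty the path collapses to just "libs".
R_ARCH="/x86_64"
echo "libs${R_ARCH}"   # -> libs/x86_64
R_ARCH=""
echo "libs${R_ARCH}"   # -> libs
```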
Uwe Ligges
Thanks Uwe, very helpful stuff. I have the problem that I can't configure all my third party packages at once, since they're inter-dependent, so I have to deal with R_ARCH in my Makefile.
You should not need to, since it's irrelevant for you as a package author; it is used internally by R. (Also note that Makevars is preferred to Makefile, since it is much more fragile to re-create the R build process in the latter, and thus it is only used in very special circumstances.)
That explains a few details then - I thought I was ultimately responsible for producing binaries, but as you pointed out below that's not the case... And I misspoke - I'm using a Makevars; I saw the warning elsewhere as well.
I'm afraid I don't understand at all how portability is managed with respect to packages. I mean, I'm not sure how multi-arch and CRAN all fit together to make my package ultimately available via binary distribution to users on all sorts of platforms. How does all this work?
As long as your code is portable and you use R's facilities (instead of creating your own), it's all automatic. Packages are built on each platform separately and then distributed on CRAN. To answer your previous question: for multi-arch platforms (on CRAN that is Windows and Mac OS X) the package is built separately for each architecture if your package contains configure or Makefile. Otherwise it is built in one go (see R-admin 6.3.4).
I guess that's the interesting question - is my code portable? That's something else that I don't fully understand: why are all architectures built if configure or Makefile are missing? I guess I don't really understand the purpose of multiple sub-architectures (maybe, for example, if I were on Windows and building both natively and with Cygwin? Is that the purpose?).
No. Several OSes support multiple architectures; for example, Mac OS X 10.5 supports PowerPC and Intel, each of them 32-bit or 64-bit. That gives a total of 4 architectures: i386, x86_64, ppc and ppc64. Therefore the R binary for that platform has to support multiple architectures, and the way this is done is to keep only one set of non-binary files and several sets of binary files, one for each architecture. On Windows there are 32-bit and 64-bit binaries (i386 and x64), so there are two sets of binary files - one for 32-bit and one for 64-bit. This allows common distribution without the mess of having multiple builds for each architecture.
I'm not sure I get you when you say it's "built in one go" - what is? My package? It seems to be building just my (guessed) arch as well.
No, if you don't have configure and Makefile, R will build every architecture it supports, e.g.:

* installing *source* package 'fastmatch' ...
** package 'fastmatch' successfully unpacked and MD5 sums checked
** libs
*** arch - i386
gcc-4.2 -arch i386 -std=gnu99 -I/Library/Frameworks/R.framework/Resources/include -I/Library/Frameworks/R.framework/Resources/include/i386 -I/usr/local/include -fPIC -g -O2 -c fastmatch.c -o fastmatch.o
gcc-4.2 -arch i386 -std=gnu99 -dynamiclib -Wl,-headerpad_max_install_names -undefined dynamic_lookup -single_module -multiply_defined suppress -L/usr/local/lib -o fastmatch.so fastmatch.o -F/Library/Frameworks/R.framework/.. -framework R -Wl,-framework -Wl,CoreFoundation
installing to /private/tmp/rl/fastmatch/libs/i386
*** arch - ppc
gcc-4.2 -arch ppc -std=gnu99 -I/Library/Frameworks/R.framework/Resources/include -I/Library/Frameworks/R.framework/Resources/include/ppc -I/usr/local/include -fPIC -g -O2 -c fastmatch.c -o fastmatch.o
gcc-4.2 -arch ppc -std=gnu99 -dynamiclib -Wl,-headerpad_max_install_names -undefined dynamic_lookup -single_module -multiply_defined suppress -L/usr/local/lib -o fastmatch.so fastmatch.o -F/Library/Frameworks/R.framework/.. -framework R -Wl,-framework -Wl,CoreFoundation
installing to /private/tmp/rl/fastmatch/libs/ppc
*** arch - x86_64
gcc-4.2 -arch x86_64 -std=gnu99 -I/Library/Frameworks/R.framework/Resources/include -I/Library/Frameworks/R.framework/Resources/include/x86_64 -I/usr/local/include -fPIC -g -O2 -c fastmatch.c -o fastmatch.o
gcc-4.2 -arch x86_64 -std=gnu99 -dynamiclib -Wl,-headerpad_max_install_names -undefined dynamic_lookup -single_module -multiply_defined suppress -L/usr/local/lib -o fastmatch.so fastmatch.o -F/Library/Frameworks/R.framework/.. -framework R -Wl,-framework -Wl,CoreFoundation
installing to /private/tmp/rl/fastmatch/libs/x86_64
** R
** preparing package for lazy loading
** help
*** installing help indices
** building package indices ...
** testing if installed package can be loaded
* DONE (fastmatch)

As you can see, it compiled and installed the binaries for all three architectures (Intel 32-bit, PowerPC 32-bit and Intel 64-bit -- we don't support ppc64 anymore). That is possible since R is taking care of all the building, so it can do the right thing without me even having to specify what to do.
Cheers,
Simon
Say I can test and am willing to support certain architectures and certain OS distributions - say Mac OS X, Linux, Windows, etc. - and I can verify that my package builds in those environments (under some minimal set of conditions). What is CRAN's purpose then? Am I meant to submit a binary build for each arch/OS as separate packages?
You're not supposed to supply any binaries. CRAN builds them from the sources you provide.
Thanks for the clarification.
Cheers,
Simon
My apologies for these questions, I'm quite new to this community, and all of your
help has been amazing, I really do appreciate it. Please point me at any relevant
documentation as well, I'm happy to go read.
Hopefully I can contribute something back in a timely fashion here that will be
helpful to a wider audience ;)
Thanks,
Tyler
thanks again!
Tyler
Cheers,
Simon
______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel