Embedding R and registering routines
Simon Urbanek wrote:
Duncan, I see your point. But in that case Apache is the one managing the lifetime of the .so, not R, and in many cases unloading the module would also mean unloading R, in which case the problem doesn't arise.
Not quite. If apache loads mod_perl and mod_R, say, and "we" make routines in mod_perl available to R, when apache unloads mod_perl but not mod_R, we have to tell R that these are no longer available.
Also that general case requires that R and the embedding application agree on the dylib loading method so they can share the handle. This may not be trivial across platforms.
If by handle we are talking about the handle to the DLL, then, while I can see some potential complications in strange cases, generally it is not an issue. The registration mechanism precisely avoids sharing the handle and deals directly with pointers to routines. Indeed, it is getting away from global variables found by name lookup.
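To illustrate the point about dealing directly with pointers to routines, here is a minimal standalone sketch. It does not use R's API: `routine_fn`, `routine_entry`, `lookup_routine` and `host_double` are hypothetical stand-ins chosen for illustration. It shows a name-to-pointer table that can be searched without any library handle.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical generic routine pointer type (a stand-in, not R's own type). */
typedef void (*routine_fn)(void);

/* One registered routine: a name and the address of the code. */
typedef struct {
    const char *name;
    routine_fn  fn;
} routine_entry;

/* Search a NULL-terminated table by name. No dlopen()-style handle is
   involved: the caller supplied the pointers when it built the table. */
static routine_fn lookup_routine(const routine_entry *table, const char *name)
{
    assert(table != NULL && name != NULL);
    for (; table->name != NULL; table++) {
        if (strcmp(table->name, name) == 0)
            return table->fn;
    }
    return NULL;
}

/* A routine defined in the host application itself. */
static int host_double(int x)
{
    return 2 * x;
}

/* The table the host would hand over when registering its routines. */
static const routine_entry host_routines[] = {
    {"host_double", (routine_fn) host_double},
    {NULL, NULL}
};
```

A caller would look up `"host_double"` in `host_routines`, cast the result back to `int (*)(int)`, and call it. Nothing in that path depends on how, or whether, the code was dynamically loaded.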
So on the whole I agree, but I'm not quite convinced yet that it's worth the extra effort.. maybe at some point ;) ...
Neither am I; just cautious about making things too simple at one step which makes the entire thing more complex in subsequent steps. I think we have data on that ... :-)
Cheers, Simon
On May 1, 2007, at 5:47 PM, Duncan Temple Lang wrote:
Simon Urbanek wrote:
Duncan, are you going to take care of this? I have a quick solution for R-devel that adds a special entry if requested.
If you want to go ahead, be my guest. I'm somewhat occupied for the next few days...
I'm not quite convinced that we need as much flexibility as adding arbitrary DllInfos, because the embedding application is a really special concept (everything else is dynamically loaded except for the application). In a sense "base" does that for non-embedded R and the distinction is that it doesn't allow dynamic lookup. I don't think adding arbitrary DllInfos is wise, because we would have to expose DLL handles etc. Do we really want to do that? And as for adding NULL-handle DllInfos, there is only one legitimate use and that is the embedding application, so anything else looks more like abuse to me... (just a lazy solution to avoid having to determine the DLL). Also the embedded DllInfo cannot be unloaded by design, so it doesn't need anything complicated...
I agree that we don't necessarily want to expose the entire DllInfo structure (but we don't need to, just a constructor function to create a new instance), and also that the embedded case is special. However, Jeff's example illustrates that it is not as simple as the host application making symbols available to R. In fact, it is not apache that is making the symbols available to R, it is the code in mod_R.so. And it might be that we want to make routines available from a different module dynamically loaded into apache. Now, we can do this by shovelling them all into the "embed" DllInfo, but that is almost the same as putting them all into "base", as we have lost the provenance of the registration. And so if we want to unload an apache module and therefore unregister the routines it provided to R, our life is somewhat more complex.
I am not saying that we absolutely need this level of generality. Clearly we have lived without it for a while. However, it does arise in other embedded situations such as when we put R inside Java, Python, Perl, Postgres, ... as each of them can load other .so's. I do believe that we want to and can merge a lot of this inter-system functionality in an increasingly transparent way, and keeping things separate with reflection information is vital for this. And of course, once we put a particular feature such as "add to embed" into R, we are loath to take it out and we live with these constraints for a long time.
But in this case, it is not a big deal, so please go ahead if you have the time and want to.
Thanks, D.
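The provenance argument above can be sketched in standalone C. This is not R's API: `routine_registry`, `registry_add`, `registry_find`, `registry_remove_owner` and the two dummy routines are hypothetical names used only for illustration. Each entry records which module supplied it, so unloading one module can unregister exactly its routines.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ROUTINES 32

typedef void (*routine_fn)(void);

/* Each entry remembers which module supplied it (its provenance). */
typedef struct {
    const char *owner;
    const char *name;
    routine_fn  fn;
} owned_routine;

typedef struct {
    owned_routine entries[MAX_ROUTINES];
    size_t        count;
} routine_registry;

/* Register one routine on behalf of `owner`; returns 0 on success, -1 if full. */
static int registry_add(routine_registry *reg, const char *owner,
                        const char *name, routine_fn fn)
{
    assert(reg != NULL);
    if (reg->count >= MAX_ROUTINES)
        return -1;
    reg->entries[reg->count].owner = owner;
    reg->entries[reg->count].name  = name;
    reg->entries[reg->count].fn    = fn;
    reg->count++;
    return 0;
}

/* Look a routine up by name, whoever registered it. */
static routine_fn registry_find(const routine_registry *reg, const char *name)
{
    size_t i;
    for (i = 0; i < reg->count; i++) {
        if (strcmp(reg->entries[i].name, name) == 0)
            return reg->entries[i].fn;
    }
    return NULL;
}

/* Drop every routine supplied by `owner` and compact the table.
   Returns the number of entries removed. */
static size_t registry_remove_owner(routine_registry *reg, const char *owner)
{
    size_t i, kept = 0, removed = 0;
    for (i = 0; i < reg->count; i++) {
        if (strcmp(reg->entries[i].owner, owner) == 0)
            removed++;
        else
            reg->entries[kept++] = reg->entries[i];
    }
    reg->count = kept;
    return removed;
}

/* Dummy routines standing in for code from two different modules. */
static void perl_eval(void) {}
static void r_handler(void) {}
```

If everything were registered under a single owner such as "embed", `registry_remove_owner` could no longer tell one module's routines from another's, which is the loss of provenance described above.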
Cheers, Simon
On May 1, 2007, at 4:24 PM, Duncan Temple Lang wrote:
Simon Urbanek wrote:
Since I'm not sure I really understand Jeff's question, this is just my interpretation, but I think the point was that you may want to register symbols *not* from a DLL but from the embedding application itself (e.g. the R.app GUI, which embeds libR, registers its entry for quartz.save). I would welcome support for this, because the current dirty hack (don't do this at home, kids!) is to use R_getDllInfo("base") and append the entry instead of overwriting it. It is an ugly hack, but I don't think we have any API for this. Maybe a worthwhile endeavor would be to simply add something like R_getDllInfo("embedded") reserved specifically for such purposes (or "R" or whatever...).
I think we are all talking about the same thing, and the code that I posted does that for a DLL coming from an arbitrary package rather than base. Rather than having yet another global concept, i.e. "embedded", we could allow users to add their own R_DllInfo and so allow more than one in the same session. The only issue is removing them, freeing the memory, and so on. But this is relatively easy to do, and various implementations suggest themselves.
Thanks for the feedback. D.
Cheers, Simon
On May 1, 2007, at 1:56 PM, Duncan Temple Lang wrote:
Jeffrey Horner wrote:
Hello,
The use of .Call and the like depends on loading shared libraries and registering routines from them. Also, .Primitive and .Internal depend on routines being registered in the R binary. And applications that embed R can override routines declared in Rinterface.h, but is there a way for an application embedding R to register other routines defined in the application without loading a shared library or re-compiling R?
I think I understand the question, and if so, the answer is yes!
I have put some code near the end of the message that illustrates
(tests) this idea.
The basic idea is that after you initialize R and load your
RApache package with its .so, you can ask for the corresponding
DllInfo object for that RApache.so. (You need the full path.)
Then, you call R_registerRoutines() with that object as the first
argument and your collection of routines for .C, .Call, .Fortran,
etc.
And then those routines are available to R via the corresponding
interface function.
This is currently slightly strained in two ways.
Firstly, R_registerRoutines() just overwrites any existing registered entries. So we should have something that allows us to append to this. We could add something, if this is a worthwhile approach and others want to chime in with comments.
Also we are adding these symbols to a table to which they do not really belong, i.e. pretending they are the same as the routines in RApache.so. But it works. Ideally, we would like to be able to create and add our own special type of DllInfo. A class system from an object-oriented language would really help here. But we also would need to make this possible via the R API.
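The first limitation above, overwriting rather than appending, can be illustrated with a standalone sketch. It does not use R's API or reflect its internals: `routine_table`, `table_set`, `table_append`, `table_has` and the sample routines are hypothetical names for illustration only. `table_set` mimics the overwrite behaviour described for R_registerRoutines(), while `table_append` shows the kind of operation being asked for.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef void (*routine_fn)(void);

typedef struct {
    const char *name;
    routine_fn  fn;
} routine_entry;

/* A growable table of registered routines (heap-allocated copy). */
typedef struct {
    routine_entry *entries;
    size_t         count;
} routine_table;

/* Count the entries of a NULL-terminated definition array. */
static size_t count_entries(const routine_entry *defs)
{
    size_t n = 0;
    while (defs[n].name != NULL)
        n++;
    return n;
}

/* Overwrite semantics: whatever was registered before is discarded. */
static int table_set(routine_table *t, const routine_entry *defs)
{
    size_t n;
    routine_entry *copy = NULL;

    assert(t != NULL && defs != NULL);
    n = count_entries(defs);
    if (n > 0) {
        copy = malloc(n * sizeof *copy);
        if (copy == NULL)
            return -1;
        memcpy(copy, defs, n * sizeof *copy);
    }
    free(t->entries);
    t->entries = copy;
    t->count = n;
    return 0;
}

/* Append semantics: existing entries are kept and the new ones added. */
static int table_append(routine_table *t, const routine_entry *defs)
{
    size_t n;
    routine_entry *grown;

    assert(t != NULL && defs != NULL);
    n = count_entries(defs);
    if (n == 0)
        return 0;
    grown = realloc(t->entries, (t->count + n) * sizeof *grown);
    if (grown == NULL)
        return -1;
    memcpy(grown + t->count, defs, n * sizeof *grown);
    t->entries = grown;
    t->count += n;
    return 0;
}

/* Report whether a routine with this name is currently registered. */
static int table_has(const routine_table *t, const char *name)
{
    size_t i;
    for (i = 0; i < t->count; i++) {
        if (strcmp(t->entries[i].name, name) == 0)
            return 1;
    }
    return 0;
}

/* Sample routines: one from a package, one from the embedding application. */
static void pkg_routine(void) {}
static void app_routine(void) {}

static const routine_entry package_defs[] = {
    {"pkg_routine", pkg_routine}, {NULL, NULL}
};
static const routine_entry app_defs[] = {
    {"app_routine", app_routine}, {NULL, NULL}
};
```

Calling `table_set` with `package_defs` and then again with `app_defs` leaves only `app_routine` registered. Replacing the second call with `table_append` keeps both.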
(Another hacky, unreliable way is using global symbols. It is possible for R to resolve symbols on some platforms by looking in the application's global symbol table. So R could find symbols in the executable. Of course, you load mod_R.so and so its symbols are not likely to be in the global symbol table, as I doubt very much Apache loads modules globally. And we would also have to bend R slightly to make this work.)
main.c:
-----------------------------
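#include <stdio.h>
#include <Rinternals.h>
#include <Rembedded.h>
#include <R_ext/Rdynload.h>

/* Called via .C(): writes its result through the pointer argument. */
void
foo(int *x)
{
    fprintf(stderr, "In foo\n");
    *x = 101;
}

/* Called via .Call(): returns twice the first element of an integer vector. */
SEXP
bar(SEXP n)
{
    return(ScalarInteger(INTEGER(n)[0] * 2));
}

/* Deliberately left out of the registration tables below. */
void
unregistered(void)
{
    fprintf(stderr, "In unregistered\n");
}

static R_CallMethodDef callMethods[] = {
    {"bar", (DL_FUNC) &bar, 1},
    {NULL, NULL, 0}
};

static R_CMethodDef cmethods[] = {
    {"foo", (DL_FUNC) &foo, 1}, /* type { INTSXP } */
    {NULL, NULL, 0}
};

void
registerApplicationRoutinesWithR(void)
{
    DllInfo *dll;

    /* Look up the DllInfo of a package .so by its full path
       (the package is assumed to be loaded already). */
    dll = R_getDllInfo("/home/duncan/Rpackage/XML/libs/XML.so");
    if (dll == NULL) {
        fprintf(stderr, "DllInfo not found; routines not registered\n");
        return;
    }
    /* Note: this overwrites any routines already registered for that DLL. */
    R_registerRoutines(dll, cmethods, callMethods, NULL, NULL);
}

int
main(int argc, char *argv[])
{
    int errorOccurred = 0;
    SEXP e;

    Rf_initEmbeddedR(argc, argv);
    registerApplicationRoutinesWithR();

    /* Build and evaluate the call source("test.R"). */
    PROTECT(e = allocVector(LANGSXP, 2));
    SETCAR(e, Rf_install("source"));
    SETCAR(CDR(e), mkString("test.R"));
    R_tryEval(e, R_GlobalEnv, &errorOccurred);
    UNPROTECT(1);

    return(0);
}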
test.R:
---------------------------
print(.C("foo", x= as.integer(1))$x)
print(.Call("bar", as.integer(3)))
GNUmakefile:
-------------------------------------
CFLAGS=-g -I$(R_HOME)/include
main: main.o
	$(CC) -o $@ $^ -L$(R_HOME)/lib -lR
The only such way I've found that comes close to a solution to this is creating an RObjectTable and attaching that to the search path. Assignments to variables in that environment can call the table's get routine, which is defined in the application, and I think that might be an interesting solution for a new RApache implementation.
For the RApache Project, the mod_R.c shared library gets loaded into the apache process and its purpose is to initialize R. Next, it calls 'library(RApache)' to load RApache.so, a package that implements the RApache API. This two-library system works, but the implementation is too complex. I'd like to simplify down to just one shared library.
Any comments or suggestions are much appreciated.
Thanks, Jeff
-- http://biostat.mc.vanderbilt.edu/JeffreyHorner
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
-- Duncan Temple Lang duncan at wald.ucdavis.edu Department of Statistics work: (530) 752-4782 4210 Mathematical Sciences Bldg. fax: (530) 752-7099 One Shields Ave. University of California at Davis Davis, CA 95616, USA