On-demand importing of a package

22 messages · R. Michael Weylandt, Joshua Wiley, William Dunlap +6 more

#
Dear All,

in some functions of my package, I use the Matrix S4 class, as defined
in the Matrix package.

I don't want to depend on Matrix, however, because my package is
perfectly fine without Matrix, most of the functionality does not need
Matrix. So Matrix is included in the 'Suggests' line.

I load Matrix via require(), from the functions that really need it.
This mostly works fine, but I have an issue now that I cannot sort
out.

If I define a function like this in my package:

f <- function() {
  require(Matrix)
  res <- sparseMatrix(dims=c(5, 5), i=1:5, j=1:5, x=1:5)
  y <- rowSums(res)
  res / y
}

then calling it from the R prompt I get
Error in rowSums(res) : 'x' must be an array of at least two dimensions

which basically means that the rowSums() in the base package is
called, not the S4 generic in the Matrix package. Why is that?
Is there any way to work around this problem, without depending on Matrix?

I am doing this on R 2.14.0, x86_64-apple-darwin9.8.0.

Thank You, Best Regards,
Gabor
#
How about calling Matrix's namespace directly?

Matrix:::rowSums()

Michael
On Tue, Nov 22, 2011 at 3:16 PM, Gábor Csárdi <csardi at rmki.kfki.hu> wrote:
#
Hi Gábor,

You could import rowSums.  This will not fully attach Matrix.  I am
not sure there is a really good solution for what you want to do.  To
fully use and validate your package, Matrix appears to be required.
This is different from simply, for example, enhancing the Matrix
package.

You could just write the functions assuming Matrix is there, make sure
the examples are marked as don't-run, and tell users that if they want to use
them, they need to load Matrix first.  I do this with OpenMx in my
package.

Cheers,

Josh
On Tue, Nov 22, 2011 at 12:16 PM, Gábor Csárdi <csardi at rmki.kfki.hu> wrote:
#
Thanks, I have tried that, it does not work, because rowSums() calls
callGeneric():
Error in callGeneric() :
  'callGeneric' must be called from a generic function or method

G.

On Tue, Nov 22, 2011 at 3:20 PM, R. Michael Weylandt
<michael.weylandt at gmail.com> wrote:

  
    
#
Hi Josh,
On Tue, Nov 22, 2011 at 3:31 PM, Joshua Wiley <jwiley.psych at gmail.com> wrote:
Importing rowSums() in NAMESPACE requires having Matrix in the
'Depends:' line, I think. I would like to avoid that, if possible.

Sure, to validate my package it is required. That is fine. For "fully"
using it might be required, but most users don't fully use a package,
they just use 1-20-50% of the functionality, depending on the size of
the package. My package does not depend on Matrix, except in less than
1% of its (quite many) functions.
Hmmm, this is close to what I want to do. Actually my functions work,
even if Matrix is not there, but they work better with Matrix. The
problem is that if I do this, then I get the error I've shown in my
initial email. I.e. Matrix is there, it is loaded, but the rowSums()
generic is not called for some reason.

Thanks again,
Gabor

  
    
#
On 11/22/2011 01:16 PM, Gábor Csárdi wrote:
No need to Depend:. Use

Imports: Matrix

plus in the NAMESPACE file

  importFrom(Matrix, rowSums)

Why do you not want to do this? Matrix is available for everyone, and
Imports: doesn't influence the package search path. There is a cost
associated with loading the library in the first place, but...?
I'm more into black-and-white -- it either needs Matrix or not; 
apparently it does.
In another message you mention

 > Matrix:::rowSums(W)
Error in callGeneric() :
   'callGeneric' must be called from a generic function or method

but something else is going on -- you don't get to call methods 
directly; you're getting Matrix::rowSums (it's exported, so no need for 
a :::, see getNamespaceExports("Matrix")). Maybe traceback() after the 
error would be insightful?

Martin

  
    
#
I also wondered why it is important to not mention
Matrix in the DESCRIPTION file's Depends or Imports
lines, even though some functions in the package
require it.  If this is a hard requirement you could
split your package into two packages, pkgBasic and
pkgEnhanced.  pkgBasic would not depend on Matrix
and pkgEnhanced would depend on Matrix and pkgBasic.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
#
On Tue, Nov 22, 2011 at 4:27 PM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
[...]
Not just loading, installing a package has a cost, too. Dependencies
are bad, they might make my package fail, and I have no control over
them. It's not just 'Matrix', I have this issue with other packages as
well.

Anyway, 'Imports: Matrix' is just a workaround I think. Or is the
example in my initial mail expected to fail? Why is that? Why can I
call some functions from 'Matrix' that way and why can't I call
others?
It's a matter of opinion, I guess. I find it very annoying when I need
to install a bunch of packages from which I don't use any code, just
because some tiny bit of a package I need uses them. I would like to
spare my users from this.

[...]
Another poster suggested this, that's why I tried. It is clear that I
should not call it directly. All I want is to have a function
like this:

f <- function() {
 if (require(Matrix)) {
   res <- sparseMatrix(dims=c(5, 5), i=1:5, j=1:5, x=1:5)
 } else {
   res <- diag(1:5)
 }
 y <- rowSums(res)
 res / y
}

Setting the subjective bit, about depending or not, aside, is there
really no solution for this? The code in the manual page examples works
fine without importing the package, just loading it if needed and
available. Why doesn't the code within the package?

Thanks for the patience,
Gabor
[...]
#
If "Suggests" doesn't work for you, perhaps you need to put more effort into reinventing the wheel, and depend less on other packages.
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
--------------------------------------------------------------------------- 
Sent from my phone. Please excuse my brevity.
"Gábor Csárdi" <csardi at rmki.kfki.hu> wrote:

            
#
On 11/22/2011 03:06 PM, Gábor Csárdi wrote:
If I create a package that does not Import: Matrix (btw, Matrix is
distributed with R), with only a function f, and with export(f) in
NAMESPACE, I get

 > library(pkgA)
 > f()
Loading required package: Matrix
Loading required package: lattice

Attaching package: 'Matrix'

The following object(s) are masked from 'package:base':

     det

Error in rowSums(res) : 'x' must be an array of at least two dimensions

This is because (a) Matrix is attached to the user search path but (b)
because f is defined in the NAMESPACE of pkgA, rowSums is looked for
first in the pkgA NAMESPACE, then in the package's own search (its
imports, which include base), and only then on the user search path.
It is found in base, and the search ends there.
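
The lookup order described above can be inspected directly. A minimal sketch, using the 'stats' namespace as a stand-in for pkgA (an assumption for illustration; pkgA itself is not available here): walk the chain of enclosing environments that code defined in a namespace would search.

```r
# Collect, in order, the names of the environments searched by code that
# lives in a package namespace. 'stats' is only a stand-in for pkgA.
lookup_chain <- function(pkg) {
  env <- asNamespace(pkg)
  chain <- character()
  repeat {
    chain <- c(chain, environmentName(env))
    if (identical(env, emptyenv())) break
    env <- parent.env(env)
  }
  chain
}

chain <- lookup_chain("stats")
print(chain)

# The base namespace comes before the global environment, and therefore
# before anything attached with require() or library().
base_pos   <- match("base", chain)
global_pos <- match("R_GlobalEnv", chain)
print(base_pos < global_pos)
```

Packages attached with require() sit after the global environment in this chain, which is why base::rowSums is found before the Matrix generic.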

If I modify f to use  y <- Matrix::rowSums(res) I get

 > f()
Loading required package: Matrix
Loading required package: lattice

Attaching package: 'Matrix'

The following object(s) are masked from 'package:base':

     det

5 x 5 sparse Matrix of class "dgCMatrix"

[1,] 1 . . . .
[2,] . 1 . . .
[3,] . . 1 . .
[4,] . . . 1 .
[5,] . . . . 1

I start off with the correct rowSums, and continue from there.

Martin
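
A variant of the modified f() that never attaches Matrix to the search path. This is only a sketch: it assumes an R version that provides requireNamespace() (a function not used anywhere in this thread), and it falls back to a dense matrix, as in Gábor's second version of f(), when Matrix is not installed.

```r
f <- function() {
  if (requireNamespace("Matrix", quietly = TRUE)) {
    # Matrix is installed: load its namespace and call its functions
    # with explicit '::' so that base::rowSums cannot shadow the generic.
    res <- Matrix::sparseMatrix(i = 1:5, j = 1:5, x = 1:5, dims = c(5, 5))
    y <- Matrix::rowSums(res)
  } else {
    # Matrix is not installed: fall back to a dense base matrix.
    res <- diag(1:5)
    y <- base::rowSums(res)
  }
  res / y
}

out <- f()
print(out)
```

Either branch divides each row by its row sum, so the result is an identity matrix, sparse or dense depending on what is installed.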

  
    
#
Dear Martin,

thanks a lot, this all makes sense and looks great. I suspected some
S4 trickery and totally forgot that the base package is imported
automatically.

Unfortunately I still get

Error in callGeneric() :
  'callGeneric' must be called from a generic function or method

for my real function, but it works fine for the toy f() function, so I
think I can sort this out from here.

Best Regards,
Gabor
On Tue, Nov 22, 2011 at 6:57 PM, Martin Morgan <mtmorgan at fhcrc.org> wrote:

  
    
#
On Tue, Nov 22, 2011 at 3:16 PM, Gábor Csárdi <csardi at rmki.kfki.hu> wrote:
Try adding these three lines to the package:

rowSums <- function(x, na.rm = FALSE, dims = 1L) UseMethod("rowSums")
rowSums.dgCMatrix <- Matrix:::rowSums
rowSums.default <- base::rowSums
#
On 23.11.2011 03:18, Gabor Grothendieck wrote:
Folks, please don't; just import the relevant functionality from the
*recommended* package Matrix.
Messing around even more is certainly less helpful than importing the
relevant parts from a namespace/package that you will use anyway.

Best,
Uwe Ligges
#
2011/11/23 Uwe Ligges <ligges at statistik.tu-dortmund.de>:
The real problem is how to deal with conditional dependencies, and
importing is just as much a kludge as anything else.  In the problem
under discussion it has the undesirable property that Matrix is always
imported even though it's almost never needed.

Additional conditional dependency features may be needed in R.  All
the scenarios in which conditional dependencies are involved need to be
thought about since there may be interactions among them.

Some features might be:

- dynamically import another package.
- uncouple package installation and loading.  Right now
install.packages has a dep= argument that causes the Suggests packages
to be installed too.  There should be some way for the package
developer to specify this rather than make the user specify it.  For
example, if Matrix were not a recommended package and most users
wanted to use it in the problem above but a few wanted to use a
package that conflicts with it then it would be nice if the package in
question could force dep=TRUE without having the user do it.  For
example, perhaps there would be an
  Installs: Matrix
line in the DESCRIPTION file to tell it to install Matrix at install
time but not load it automatically at package load time -- the package
would have to require it itself.  (sqldf has this problem since most,
but not all, users want RSQLite, but putting it in Suggests would make
most users call install.packages("sqldf", dep = TRUE), which makes it
harder to install, whereas putting it in Depends means it's always
loaded and could conflict with some other database backend.)
#
Just for the record, the solution was to make the matrix 'dgCMatrix'
instead of 'dsCMatrix'; 'dgCMatrix' works and 'dsCMatrix' does not. I
suspect that this has something to do with the S4 class hierarchy in
Matrix, but I am not sure.

This is a quite ugly workaround, since it depends on some internal
Matrix features, so I might end up just importing Matrix, as many of
you suggested in the first place....

However, I agree with Gabor Grothendieck that the issue is still
there, in general.

Another example would be (optionally) using the 'snow' (or now
'parallel') package. I would like to add optional parallel processing
to some of my functions in a package, without actually requiring the
installation of snow/parallel. If snow is there and the user wants to
use it, then it is used, otherwise not.

Actually, 'snow' works similarly. It (optionally) calls functions from
the Rmpi package, but Rmpi is only suggested, there is no hard
dependency. This seems to work well for snow, but in my case the S4
features in Matrix interfere.

Gabor
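
The optional-parallel pattern described above could look roughly like this. A sketch only: the helper name maybe_parallel_lapply and its cores argument are hypothetical, and it assumes requireNamespace() is available in the R version in use.

```r
# Use the 'parallel' package only when the caller asks for more than one
# core and the package can be loaded; otherwise fall back to lapply().
maybe_parallel_lapply <- function(X, FUN, cores = 1L) {
  use_parallel <- cores > 1L && requireNamespace("parallel", quietly = TRUE)
  if (!use_parallel) {
    return(lapply(X, FUN))
  }
  cl <- parallel::makeCluster(cores)
  on.exit(parallel::stopCluster(cl), add = TRUE)  # always release workers
  parallel::parLapply(cl, X, FUN)
}

# Sequential use; pass e.g. cores = 2L to run on a local cluster instead.
squares <- maybe_parallel_lapply(1:4, function(i) i^2)
print(unlist(squares))
```

All calls into 'parallel' go through '::', so nothing has to be attached and no name clash with base can occur.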
On Tue, Nov 22, 2011 at 8:09 PM, Gábor Csárdi <csardi at rmki.kfki.hu> wrote:

  
    
#
On 23.11.2011 16:38, Gábor Csárdi wrote:
Errr, parallel as a base package is *always* installed for R >= 2.14.0.

Well, there is an important difference here: Rmpi does not produce name
clashes, and in the case of snow its functionality is found when it is
simply attached to the search path. The problem with Matrix is that you
import from base and hence base names are found before Matrix names.
Hence require()ing Matrix as a suggested package and calling
Matrix::rowSums explicitly, and so on, should work. And this is exactly
what namespaces are made for: dealing appropriately with name clashes by
importing into your own namespace.

Best,
Uwe Ligges
1 day later
#
On 23.11.2011 14:59, Gabor Grothendieck wrote:
Errr, if I understand this correctly, your arguments are now orthogonal 
to your original comments.

Before you told us it is important to be able to run stuff without 
having Matrix available or just load on demand since it may not be 
available to the users. Now you tell us you want to make it available 
without having any need to use it?

Uwe Ligges
#
2011/11/25 Uwe Ligges <ligges at statistik.tu-dortmund.de>:
I was framing this in terms of the Matrix example, but perhaps it's
easier to understand with the actual example which motivated this for
me.  That is, the feature is that whenever sqldf is installed then
RSQLite is installed too without having RSQLite automatically load
when sqldf loads.

Currently the only way to arrange that is to put RSQLite into Suggests
and then instruct the user to use install.packages(..., dep = TRUE),
say.   The problem with that is that it burdens the user with this
installation detail.

sqldf nearly always uses RSQLite so it should be installed when sqldf
is without the user having to do anything special.  We don't know at
install time whether RSQLite will be used or not but are willing to
have it unnecessarily installed even if it's not needed in order to
make it easier for the majority who do use it.

However, just because RSQLite is installed does not mean that we want
RSQLite to be loaded automatically too.  sqldf can determine whether
the user wants to use the sqlite backend or one of several other
backends and require() RSQLite or not depending on whether it's
actually to be used in that session.

Currently, if RSQLite is in Depends then it's always loaded and if it's
in Suggests then we can't be sure it's been installed so neither of
these work the way we want.  The two things are tied together (i.e.
coupled) but here we want to separate them.  We always want RSQLite to
be installed without making the user specify it on the
install.packages() call yet we want the ability to dynamically
require() it rather than have it automatically loaded when sqldf is
loaded.

One way this might be implemented would be to have an Installs: line,
say, in the DESCRIPTION file which lists packages which are to be
installed at the same time but not automatically loaded.   It would be
the same as Depends except Depends also loads the package whereas
Installs does not -- it only installs the dependency and the package
itself has to require it if it wants it loaded.

The dynamic part is currently only possible if we use Suggests but
that forces the user rather than the package developer to specify
whether to install it.

(One variation of this is that Installs might only specify that the
dependency is installed by default but the user could still override
it on the install.packages() call by specifying not to install it.)

The point here is that the loading is dynamic but installation always occurs.

The Matrix situation was also one where dynamic action is important.
It's not identical to the sqldf case, but I mentioned both because
dynamic package installation and dynamic loading fall into the same
general category, and features addressing them might interact, so they
are best considered together.
#
On Fri, Nov 25, 2011 at 1:21 PM, Gabor Grothendieck
<ggrothendieck at gmail.com> wrote:
[...]
[...]

I think that the following procedure has the result that you want:

Put in the DESCRIPTION file:

Imports: RSQLite

And in the R code write something like:

RSQLite::AnRSQLiteFunction()
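
The effect of this arrangement can be checked in a fresh session: a pkg::fun() call loads the namespace on first use but does not attach the package to the search path. A sketch using the base package 'tools' as a stand-in for RSQLite (an assumption, chosen only because it is always installed and not attached by default):

```r
before <- search()

# First use of '::' loads the namespace of 'tools' if it is not loaded yet.
ext <- tools::file_ext("survey.sqlite")

after <- search()

print(ext)
print(identical(before, after))          # search path is unchanged
print("tools" %in% loadedNamespaces())   # but the namespace is loaded
```

Listing a package in Imports: makes sure it is installed alongside the importing package; whether it ever gets loaded then depends only on whether such a call is reached.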
#
On Fri, Nov 25, 2011 at 11:52 AM, Jakson Alves de Aquino
<jalvesaq at gmail.com> wrote:
I had been thinking of using Imports in DESCRIPTION but was concerned
that that would put RSQLite objects ahead of everything else on
sqldf's search path even when not wanted but I gather you are
intending that Imports be used in DESCRIPTION but _not_ in the
NAMESPACE file.  I think that that would likely work. I will test it
out to be sure. What I would probably want to do is to require()
RSQLite in case the user wants to mix sqldf and RSQLite calls and I
will check whether the check procedure allows that if the package is
only named in Imports but, if not, it might be sufficient to put
RSQLite in both Imports and Suggests.  Thanks.
#
On Fri, Nov 25, 2011 at 2:40 PM, Gabor Grothendieck
<ggrothendieck at gmail.com> wrote:
I have done this with the 'descr' package. It wasn't necessary to put
the imported packages in two places, only in the "Imports" field. This
was enough to make R install all dependencies but not load them along
with 'descr'.
#
On Fri, Nov 25, 2011 at 2:47 PM, Jakson Alves de Aquino
<jalvesaq at gmail.com> wrote:
I just tried it, but I wanted to require() RSQLite so that the user can
access its facilities as well. Putting it just in Imports does work,
but the check complains about requiring a package that has not been
declared unless I put it in Suggests as well.  If I don't do a
require() then it is not necessary to put it in Suggests, so there
seems to be a slight difference between descr and sqldf.