What is preferable - a single large package or a few smaller packages?

3 messages · Peter Langfelder, Hervé Pagès, Brian Ripley

#
Hi all,

I maintain the WGCNA package, which at present has nearly 200
functions. In the future there will be more. I am curious whether it
would be preferable or useful to split the package into a couple of
different ones with different aims. Obviously, when one calls a function
in R, package namespaces have to be traversed to find the matching name -
does the speed of this depend on how functions are partitioned into
packages? Any other considerations? My knowledge of R internals in
this regard is pretty much non-existent - thanks for any pointers.

Best,

Peter
#
Hi Peter,
On 05/29/2013 03:38 PM, Peter Langfelder wrote:
Other important considerations are maintainability and
user-friendliness. If you think the package can keep growing and still
remain relatively easy to maintain, then maybe you don't need to split
it. But if the package becomes too hard to maintain and/or can
naturally be divided into more or less independent departments, and
if the end-user generally doesn't need all functionalities from all
departments for a typical work flow, then you might want to split.
That will benefit both the user and you. It will also make it easier
for other people to collaborate on the whole thing (if one day you
decide you need some help for that).

The impact on the speed of function name lookup would be the last thing
I would worry about.

My 2 cents.

H.
#
On 29/05/2013 23:38, Peter Langfelder wrote:
Namespace environments are hashed, so lookup time is essentially
independent of size.  And since the introduction of lazy-loading, the
memory footprint depends far more on what has been used in the session
than on the number of functions.
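
To make the point about hashed lookup concrete, here is a rough sketch that times name lookup in a small and a large hashed environment. The helpers make_env() and time_lookup() and the dummy names "f1", "f2", ... are invented for this illustration; they stand in for a namespace and are not how R builds one internally. Timings are machine-dependent, so none are quoted here.

```r
# Build a hashed environment holding n dummy functions named f1 ... fn.
# (Hypothetical helper; a stand-in for a package namespace.)
make_env <- function(n) {
  e <- new.env(hash = TRUE)
  for (i in seq_len(n)) {
    assign(paste0("f", i), function() NULL, envir = e)
  }
  e
}

small <- make_env(200)     # roughly the size of the package in question
large <- make_env(20000)   # a hundred times as many names

# Time repeated lookups of a single name, without searching enclosures.
time_lookup <- function(e, name, reps = 1e5) {
  system.time(
    for (i in seq_len(reps)) get(name, envir = e, inherits = FALSE)
  )[["elapsed"]]
}

t_small <- time_lookup(small, "f100")
t_large <- time_lookup(large, "f100")
cat("200 names:  ", t_small, "s\n")
cat("20000 names:", t_large, "s\n")
```

If lookup is effectively constant-time, the two figures should be of the same order despite the difference in size.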

In any case, 200 functions is not a 'large' package.  'stats' has nearly
1100 in its namespace ....  Performance for really large packages was
improved to the point of being a non-issue before R 2.0.0.
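
For anyone who wants to check the size of a namespace in their own installation, a minimal sketch follows. The variable names are illustrative only, and the counts will vary between R versions, so no figures are given here.

```r
# Inspect the namespace of an installed package (here 'stats').
ns <- asNamespace("stats")

# All names bound in the namespace, including those starting with a dot.
nms <- ls(ns, all.names = TRUE)
n_all <- length(nms)

# How many of those bindings are functions. Note that mget() forces the
# lazy-load promises, so this loads every object into memory.
n_fun <- sum(vapply(mget(nms, envir = ns), is.function, logical(1)))

cat("objects in namespace:", n_all, "\n")
cat("of which functions:  ", n_fun, "\n")
```

Replacing "stats" with another installed package name gives the same summary for that package.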