
Non-GPL packages for R

21 messages · Gabor Grothendieck, Spencer Graves, Duncan Murdoch +10 more

#
Subject: Non-GPL packages for R

Packages that are not licensed in a way that permits re-distribution on
CRAN are frequently a source of comment and concern on R-help and other
lists. A good example of this problem is the Rdonlp2 package that has 
caused a lot of annoyance for a number of optimization users in R. They 
are also an issue for efforts like Dirk Eddelbuettel's cran2deb.

There are, however, a number of circumstances where non-GPL equivalent
packages may be important to users. This can imply that users need to
both install an R package and one or more dependencies that must be
separately obtained and licensed. One such situation is where a new
program is still under development and the license is not clear, as in
the recent work we pursued with respect to Mike Powell's BOBYQA. We
wanted to verify if this were useful before we considered distribution,
and Powell had been offering copies of his code on request. Thus we
could experiment, but not redistribute. Recently Powell's approval to
redistribute has been obtained.

We believe that it is important that non-redistributable codes be
excluded from CRAN, but that they could be available on a repository
such as r-forge. However, we would like to see a clearer indication of
the license status on r-forge. One possibility is an inclusion of a
statement and/or icon indicating such status e.g., green for GPL or
equivalent, amber for uncertain, red for restricted. Another may be a
division of directories, so that GPL-equivalent packages are kept
separate from uncertain or restricted licensed ones.

We welcome comments and suggestions on both the concept and the
technicalities.

John Nash & Ravi Varadhan
#
The SystemRequirements: field of the DESCRIPTION file normally
lists external dependencies whether free or non-free.
On Thu, Sep 10, 2009 at 1:50 PM, Prof. John C Nash <nashjc at uottawa.ca> wrote:
#
On 10 September 2009 at 14:26, Gabor Grothendieck wrote:
| The SystemRequirements: field of the DESCRIPTION file normally
| lists external dependencies whether free or non-free.

Moreover, the (aptly named) field 'License:' in DESCRIPTION is now much more
parseable and contains pertinent information. A number of more 'challenging'
packages basically pass the buck on with an entry

	    License: file LICENSE

which refers to a file in the sources one needs to read to decide.
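
As a rough illustration of how such entries could be spotted automatically, here is a minimal Python sketch. It assumes DESCRIPTION is read as plain text with whitespace-indented continuation lines, and that any License component starting with 'file ' means a human has to read that file. The helper names and the sample package 'foo' are made up for this example and are not part of R or cran2deb.

```python
def read_license_field(description_text):
    """Return the value of the License: field of a DESCRIPTION text, or None."""
    value = None
    collecting = False
    for line in description_text.splitlines():
        if line.startswith("License:"):
            value = line[len("License:"):].strip()
            collecting = True
        elif collecting and line[:1] in (" ", "\t"):
            # Continuation lines of a field start with whitespace.
            value += " " + line.strip()
        else:
            collecting = False
    return value


def needs_manual_review(license_value):
    """True if the field is missing or any component defers to a file."""
    if not license_value:
        return True
    # "|" separates alternatives, "+" attaches extra terms (assumed syntax).
    for alternative in license_value.split("|"):
        for component in alternative.split("+"):
            if component.strip().startswith("file "):
                return True
    return False


# Hypothetical DESCRIPTION content for a made-up package.
EXAMPLE = "Package: foo\nVersion: 0.1\nLicense: file LICENSE\n"

if __name__ == "__main__":
    field = read_license_field(EXAMPLE)
    print(field, needs_manual_review(field))
```

The check is deliberately conservative: it also flags entries such as 'GPL-2 | file LICENSE', where one alternative may already be acceptable.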

This is e.g. at the basis of Charles' and my decision about what we think we
cannot build via cran2deb [1]: non-free, non-distributable, non-commercial or
otherwise nasty licenses.  There are a couple of packages we exclude for this
(or related reasons), and we have been meaning to summarise them with a
simple html summary from the database table we use for cran2deb, but have not
yet gotten around to it.

Just like John and Ravi, I would actually be in favour of somewhat stricter
enforcements.  If someone decides not to take part in the gift economy that
brought him or her R (and many other things, including at least 1880+ CRAN
packages with sane licenses) then we may as well decide not to waste our time
and resources on his project either and simply exclude it.  

So consider this as a qualified thumbs-up for John and Ravi's suggestion of a
clearer line in the sand.

Dirk

[1] cran2deb is at http://debian.cran.r-project.org and provides 1800+ Debian
'testing' binaries for amd64 and i386 that are continuously updated as new
packages appear on CRAN. With that 'apt-get install r-cran-foo' becomes a
reality for almost every value of foo out of the set of CRAN packages.


#
I will offer my opinion as a user and contributor to R packages 
via R-Forge and CRAN: 


1.  How difficult would it be to split CRAN into two parts, 
depending on whether the package carried an acceptable license allowing 
free distribution?  The second might carry a name like RANC (R Archive 
Network - Commercial), and the first would retain the CRAN name. 


2.  R-Forge allows public access to the source code, at least 
for some packages.  Moreover, users applying for R-Forge support must 
specify the license they plan to use.  Support may be denied for a 
project that does not use one of the standard public distribution 
licenses like GPL. 


      Spencer
Dirk Eddelbuettel wrote:

  
    
#
On 10/09/2009 6:57 PM, spencerg wrote:
To this I would say, try it.  Don't ask volunteers to do some work that 
suits you; do it yourself.

Duncan Murdoch
#
+1

Commit to freedom if you want the free services of CRAN, etc ...
On 09/11/2009 12:13 AM, Dirk Eddelbuettel wrote:

  
    
#
The responses to my posting yesterday seem to indicate more consensus 
than I expected:
1) CRAN should be restricted to GPL-equivalent licensed packages
2) r-forge could be left "buyer beware" using DESCRIPTION information
3) We may want a specific repository for restricted packages (RANC?)

How to proceed? A short search on Rseek did not turn up a chain of 
command for CRAN.

I'm prepared to help out with documentation etc. to move changes 
forward. They are not, in my opinion, likely to cause a lot of trouble 
for most users, and should simplify things over time.

JN
#
Prof. John C Nash wrote:
GPL-_compatible_ would be the word. However, this is not what has been
done in the past. There are packages with "non-commercial use" licences,
and the survival package was among them for quite a while. As far as I
know, the CRAN policy has been to ensure only that redistribution is
legal and that whatever license is used is visible to the user. People
who have responded on the list do not necessarily speak for CRAN. In the
final analysis, the maintainers must decide what is maintainable.

The problem with Rdonlp2 seems to have been that the interface packages
claimed to be LGPL2 without the main copyright holder's consent (and it
seems that he cannot grant consent for reasons of TU-Darmstadt
policies). It is hard to safeguard against that sort of thing. CRAN
maintainers must assume that legalities have been cleared and accept the
license in good faith.

(Even within the Free Software world there are current issues with,
e.g., incompatibilities between GPL v.2 and v.3, and also with the
Eclipse license. Don't get me started...)

  
    
#
Hi,
Peter Dalgaard wrote:
Umm, I had thought that it was well established that responders need 
not represent the population being surveyed.  I doubt that there is 
consensus at the level you are suggesting (certainly I don't agree) and 
as Peter indicates below the issue is: what is maintainable with the 
resources we have, not what is the best solution given unlimited resources.

   Personally, I would like to see something that was a bit easier to 
deal with programmatically that indicated when a package was GPL (or 
Open source actually) compatible and when it is not.  This could then be 
used to write a decent function to identify suspect packages so that 
users would know when they should be concerned.

   It is also the case that things are not so simple, as dependencies 
can make a package unusable even if it is itself GPL-compatible.  This 
also makes the notion of some simple split into free and non-free (or 
what ever split you want) less trivial than is being suggested.
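
The dependency point can be made concrete with a small Python sketch. It assumes we already have a set of packages known to be non-free and a mapping from each package to its direct dependencies. The package names pkgA to pkgD and the function name are hypothetical, chosen only for illustration.

```python
def unusable_packages(nonfree, depends):
    """Return packages that are non-free or depend, directly or
    indirectly, on a non-free package.

    nonfree: set of package names known to be non-free
    depends: dict mapping a package name to its direct dependencies
    """
    flagged = set(nonfree)
    changed = True
    # Repeat until a full pass adds nothing new (a simple fixed point).
    while changed:
        changed = False
        for pkg, deps in depends.items():
            if pkg not in flagged and any(d in flagged for d in deps):
                flagged.add(pkg)
                changed = True
    return flagged


# Hypothetical data: pkgC is affected only through pkgB.
NONFREE = {"pkgA"}
DEPENDS = {"pkgB": ["pkgA"], "pkgC": ["pkgB"], "pkgD": []}

if __name__ == "__main__":
    print(sorted(unusable_packages(NONFREE, DEPENDS)))
```

A package whose own license is fine can still end up flagged, which is why a split based on the License field alone would miss cases.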

   Robert
#
You are suggesting we create and maintain an *empty* repository?

All packages on CRAN should be freely redistributable by/within CRAN.
If you find a package which is not, pls let us know---such packages must
be removed from CRAN.

I think you are mistaking the situation about "non-free" packages: these
typically restrict usage for commercial purposes.

-k

        
#
I thought I had already explained the last time the GPL-only suggestion
came up that this will not happen for CRAN.

But again: we have invested considerable time into getting the license
specs standardized, and writing code to compute on these.  Time
permitting, R 2.10.0 will feature code that allows specifying license
filters which can be customized according to individuals' needs.  But I
see no point in physically representing one particular license profile.

Btw, there are fewer non-free packages on CRAN than packages which claim
to be free but have non-free installation dependencies: some would argue
that the latter is impossible from a license perspective.  I feel little
desire to start arguing about this, as being able to control package
installation by license filters will resolve matters anyway.

-k

        
#
On 11 September 2009 at 16:37, Peter Dalgaard wrote:
| who have responded on the list do not necessarily speak for CRAN. In the
| final analysis, the maintainers must decide what is maintainable.

Fully agreed. As 'maintainers' of cran2deb, Charles and I decided to 'shoot
first, ask questions later' as we clearly wanted to avoid creating any sort
of trouble for our generous CRAN hosts, who (currently just the Vienna master)
are effectively re-distributing our compilations (of its own content).  

So we pro-actively chose to exclude some packages.  To put some meat on this
particular bone, the current set of packages blacklisted for 'nonfree-ness' is:

  sqlite> select package, explanation from blacklist_packages where nonfree;
  package               explanation
  --------------------  ----------------------------------------
  mclust                non-commercial license
  mclust02              non-commercial license
  ConvCalendar          no modification or distribution rights
  SDDA                  non-commercial CSIRO license
  conf.design           non-commercial license
  isa2                  non-commercial creative commons license
  optmatch              non-commercial license
  rankreg               non-commercial license
  realized              non-commercial license
  rngwell19937          non-commercial license
  tnet                  non-commercial creative commons license
  spatialkernel         contains non-commercial gpc code
  Bhat                  non-commercial license
  PTAk                  non-commercial license
  PredictiveRegression  non-commercial license
  RLadyBug              contains some code under non-commercial
  mapproj               non-commercial license
  mathgraph             non-commercial license
  sqlite>

| (Even within the Free Software world there are current issues with,
| e.g., incompatibilities between GPL v.2 and v.3, and also with the
| Eclipse license. Don't get me started...)

Yes. There is a potential problem with gcc 4.4 compilation of GPL-2 code. If
that comes to a boil we all (as in: GPL 2 users) are in a spot of bother.
On 11 September 2009 at 07:48, Robert Gentleman wrote:
|    It is also the case that things are not so simple, as dependencies 
| can make a package unusable even if it is itself GPL-compatible.  This 

Yes, in the case of cran2deb / CRAN just three packages are blacklisted
because of dependencies on nonfree CRAN content; most often it is dependencies
on other stuff incl. BioC which we do not include (for mostly technical /
historical reasons; contact Charles or me offline if you'd like to work on
changing this):

  sqlite> select package,explanation from blacklist_packages where unsatisfied_dependency;
  package               explanation
  --------------------  ----------------------------------------
  ROracle               requires Oracle to build and run
  Rlsf                  requires LSF cluster/grid system librari
  Rsge                  requires SGE cluster/grid system librari
  CarbonEL              requires OS X system
  VhayuR                requires Vhayu database system
  gputools              requires Nvidia CUDA compiler and GPU-en
  klaR                  requires SVMlight which is non-free
  wgaim                 requires asreml-R
  svGUI                 requires Komodo from OpenKomodo.org whic
  RScaLAPACK            requires MPICH2 and Blacs and ScaLAPACK
  caMassClass           requires PROcess from BioConductor
  Rcplex                requires CPLEX libraries
  ADaCGH                BioC depends: tilingArray
  DAAGbio               BioC depends: limma
  GFMaps                BioC depends: affy
  GOSim                 BioC depends: GO.db
  Metabonomic           BioC depends: PROcess
  classGraph            BioC depends: Rgraphviz
  gcExplorer            BioC depends: Rgraphviz
  logilasso             BioC depends: Rgraphviz
  pcalg                 BioC depends: Rgraphviz
  celsius               BioC depends: BioBase
  multtest              BioC depends: BioBase
  hopach                BioC depends: BioBase
  GExMap                BioC depends: multtest,BioBase
  LMGene                BioC depends: multtest,BioBase
  PCS                   BioC depends: multtest,BioBase
  SubpathwayMiner       BioC depends: KEGG.db
  gene2pathway          BioC depends: KEGG.db
  PhViD                 BioC depends: LBE
  SNPMaP                BioC depends: affxparser
  qdg                   BioC depends: pcalg,Rgraphviz
  lsa                   Ohat depends: Rstem
  mpm                   BioC depends: geneplotter
  sisus                 BioC depends: annotate
  metaMA                BioC depends: limma
  clustTool             non-free depends: mclust02
  clustvarsel           non-free depends: mclust02
  SpectralGEM           non-free depends: optmatch
  bayesCGH              BioC depends: snapCGH
  crosshybDetector      missing depends: marray
  FEST                  needs MERLIN <http://www.sph.umich.edu/c
  aroma.affymetrix      BioC depends: aroma.light
  aroma.core            BioC depends: aroma.light
  aroma.apd             BioC depends: aroma.light
  sqlite>
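
For anyone who wants to play with this kind of table, here is a minimal Python sketch using the standard sqlite3 module. The table and column names are taken from the two queries above, but the column types and the fact that everything lives in one in-memory table are assumptions, not the actual cran2deb schema. Only a handful of rows from the listings are loaded.

```python
import sqlite3

# In-memory database; the schema below is an assumed layout.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE blacklist_packages ("
    " package TEXT PRIMARY KEY,"
    " nonfree INTEGER NOT NULL DEFAULT 0,"
    " unsatisfied_dependency INTEGER NOT NULL DEFAULT 0,"
    " explanation TEXT)"
)

# A few rows copied from the listings above.
rows = [
    ("mclust", 1, 0, "non-commercial license"),
    ("optmatch", 1, 0, "non-commercial license"),
    ("ROracle", 0, 1, "requires Oracle to build and run"),
    ("SpectralGEM", 0, 1, "non-free depends: optmatch"),
]
conn.executemany("INSERT INTO blacklist_packages VALUES (?, ?, ?, ?)", rows)


def blacklisted(conn, flag):
    """Return (package, explanation) pairs for one blacklist flag."""
    # The flag is spliced into the SQL text, so only accept known columns.
    if flag not in ("nonfree", "unsatisfied_dependency"):
        raise ValueError("unknown flag: " + flag)
    query = (
        "SELECT package, explanation FROM blacklist_packages "
        "WHERE " + flag + " ORDER BY package"
    )
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for package, explanation in blacklisted(conn, "nonfree"):
        print(package, "-", explanation)
```

Keeping the two reasons as separate flag columns means a package could carry both, which a single 'reason' column would not allow.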


| also makes the notion of some simple split into free and non-free (or 
| what ever split you want) less trivial than is being suggested.

That sounds like the Ostrich defense :) Nobody claimed it was easy or
non-controversial, but it seems some of us feel that it should be discussed
as the status quo may be something we can improve upon.  

E.g. I think that 'License: file LICENSE' is not good enough.  Some sort of
marker at the DESCRIPTION level would help.  How many levels we put into an
appropriate factor variable is open for discussion. But for argument's sake:
why don't we start with a binary toggle of whether or not one of the licenses
in http://www.r-project.org/Licenses/ aka share/licenses/ is met?
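
To make the binary toggle concrete, a minimal Python sketch follows. The set of names is an illustrative subset picked for the example and is not the actual content of that directory. The function name is hypothetical, and version specifications such as 'GPL (>= 2)' are deliberately not handled.

```python
# Illustrative subset only; NOT the real list of standard licenses.
STANDARD_LICENSES = {"GPL-2", "GPL-3", "LGPL-2.1"}


def has_standard_license(license_value, standard=STANDARD_LICENSES):
    """True if at least one '|'-separated alternative is a standard name.

    An alternative carrying extra terms, such as 'GPL-2 + file LICENSE',
    does not match, because the extra file may add restrictions.
    """
    alternatives = [alt.strip() for alt in license_value.split("|")]
    return any(alt in standard for alt in alternatives)


if __name__ == "__main__":
    for value in ("GPL-2", "file LICENSE", "GPL-3 | file LICENSE"):
        print(value, "->", has_standard_license(value))
```

A factor with more levels could be built on top of the same idea by returning a label instead of a boolean.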

CRAN has been a huge success (and I am sure the success puts a strain on its
maintainers).  Given that it has become the 800 pound gorilla, may it not use
some of that weight to nudge folks to a set of common licenses?

Dirk
#
One complication is that it's possible that a package can use a non-free
component but can also be used without it.  The fame package could
be used with fame or without fame for a long time but more recently the
non-fame portion was factored out into the tis package.  The VhayuR
package is similar in that it can be used without Vhayu.  In that case it
can use flat files instead of the Vhayu database.
On Fri, Sep 11, 2009 at 11:44 AM, Dirk Eddelbuettel <edd at debian.org> wrote:
#
At 08:07 11/09/2009, Romain Francois wrote:

            
It seems to me very reasonable for people to be asked to distribute 
their software via some other route if they cannot join in the spirit 
of the enterprise. So add my vote in with Romain's.
Michael Dewey
http://www.aghmed.fsnet.co.uk
#
On 11 September 2009 at 12:19, Gabor Grothendieck wrote:
| One complication is that it's possible that a package can use a non-free
| component but can also be used without it.  The fame package could
| be used with fame or without fame for a long time but more recently the
| non-fame portion was factored out into the tis package.  The VhayuR
| package is similar in that it can be used without Vhayu.  In that case it
| can use flat files instead of the Vhayu database.

So in cases where a package used to not build with 'freely available' (and
preferably available as Debian packages) tools but does so now we welcome
hints so that we can update the blacklist.  All it does, really, is to save a
few cpu cycles when we have the expectation that 'R CMD INSTALL' is almost
surely going to fail.

Dirk
#
On 11 September 2009 at 17:25, Kurt Hornik wrote:
| I thought I had already explained the last time the GPL-only suggestion
| came up that this will not happen for CRAN.
| 
| But again: we have invested considerable time into getting the license
| specs standardized, and writing code to compute on these.  Time
| permitting, R 2.10.0 will feature code that allows specifying license
| filters which can be customized according to individuals' needs.  But I
| see no point in physically representing one particular license profile.
| 
| Btw, there are fewer non-free packages on CRAN than packages which claim
| to be free but have non-free installation dependencies: some would argue
| that the latter is impossible from a license perspective.  I feel little
| desire to start arguing about this, as being able to control package
| installation by license filters will resolve matters anyway.

Indeed, that would possibly solve some of our (as in cran2deb) worries too.  So
a big Thanks! for working on this, and of course for providing CRAN in the
first place.

Dirk
#
Dirk Eddelbuettel wrote:
I second that.  People all over the world are more quantitative 
than they would otherwise be because the R project including CRAN (and 
R-Forge) make it economically feasible for them to access and use high 
quality software to better understand their world and communicate that 
improved understanding  more effectively to others.  Knowledge is power, 
and this increased knowledge gives more people more control over their 
lives.  We are not laying brick but building a cathedral. 


      Spencer

  
    
#
License filters will work for me. My offer stands to help on 
documentation, or to act as a "stooge" to test tools in this area. Thanks 
to those who responded. And for myself, "GPL compatible" was my intended 
expression.

JN
#
John,
On Sep 11, 2009, at 9:07 , Prof. John C Nash wrote:

            
I would definitely vote against that - I think this is not what  
most people here agreed with (and the subject [non-GPL] and your  
wording [non-redistributable code] are two entirely different things).  
GPL is more restrictive than most open source licenses so with the  
above you'd throw out a lot of "real" open source packages (namely  
those with more permissive open source licenses). The point was open  
distribution as Peter pointed out so GPL-compatible licenses would be  
one possibility (although it also disallows some open source licenses).

Cheers,
Simon
#
Comrades,

When talk turns to the purity of the revolution, and purge of packages then
the guillotine can't be far behind.  We all remember Lenin berating the
"renegade Kautsky" for his "pragmatism," and we know where that led...

So let me put in a good word for pragmatism, and incidentally for saving one
of my own packages, SparseM, and perhaps eventually my neck.  Last week Kurt
asked me to look into a SparseM licensing quirk based on an inquiry from the
Fedora folks.  SparseM is GPL except for one routine, cholesky.f, written at
Oak Ridge Lab by E. Ng and B. Peyton.  Our version of the code was
redistributed in the package PCx, which was copyrighted by the U. of Chicago,
who specified that commercial users should contact someone at Argonne
National Lab.  Since the beginning we have retained this language in the
License file of SparseM, even though the code in question was not actually
developed as a part of PCx.
I contacted one of the original PCx developers who responded as follows:

	The routine you mention was distributed with PCx but not part 
	of it as you see from the legalese and not covered by the PCx 
	copyright.  I tried to interest the authors of that code 
	in legal issues in around 1997 but could not get them 
	motivated (frankly I also can't get too interested).

To which I heartily concurred.  If someone who is worried about getting sued
would like to dig into this can of worms, then fine.  But life is too short
for the rest of us.  This is quite a murky business, and we shouldn't create 
incentives to make it murkier by covering up relevant language on licensing.
But surely we can also all agree that CRAN has been a fantastic success, and
adding new constraints on its operation is ill-advised.

Roger


url:    www.econ.uiuc.edu/~roger            Roger Koenker
email    rkoenker at uiuc.edu            Department of Economics
vox:     217-333-4558                University of Illinois
fax:       217-244-6678                Urbana, IL 61801

______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
#
On Fri, Sep 11, 2009 at 1:48 PM, rudjer <rkoenker at uiuc.edu> wrote:
It is unfortunately common in the numerical analysis community,
especially those still using Fortran, to have a rather vague approach
to licensing. ("I'll send a copy of my code to anyone who asks for it
but put in some language that if someone is going to get fantastically
rich from it then they owe me money too")  In the Open Source
community licensing is very important - it is what makes Open Source
software, including CRAN, possible. Most non-lawyers don't find the
study and discussion of licenses to be terribly fascinating but they
are the foundation of Open Source software. If the authors of Fortran
subroutines feel that it is too much of a bother to pay attention to
licenses (or to learn post-1950's programming languages) then
evolution will run its course and they will be left behind.  It's
annoying in that so little software from the numerical analysis
community is covered by suitable licenses but that will change.

Tim Davis's C and C++-based sparse matrix code that is incorporated in
the Matrix package is licensed under the GPL or LGPL.  Why mess around
with antiquated software and vague or non-existent licenses when there
are better alternatives?  It is painful to need to recode old Fortran
routines in modern programming languages and under real licenses but
it is the only way we will ever bring numerical analysis into the
post-Beatles era.