License status of CRAN packages

20 messages · Gabor Grothendieck, Dirk Eddelbuettel, Marc Schwartz +5 more

#
(Subject: renamed as thread hijacked from the ParallelR thread   --Dirk)
On 23 April 2009 at 14:44, Gabor Grothendieck wrote:
| Aside from R there are the add-on packages.
| 
| A frequency table showing the licenses of the CRAN packages indicates
| that all or almost all packages have some sort of free software license
| with GPL licenses being most common. (A few packages have restrictions
| to noncommercial use and that may conflict with GPL, not sure.)   That is
| not to say that there are no other types of packages but any such packages
| are not on CRAN.

I fear that is not quite the case.  There are quite a few packages like that.

Charles Blundell and I have continued to work on his Google Summer of Code
2008 project of fully automatically creating Debian packages from CRAN
sources.  As this can be seen as redistributing (or at least as making
redistribution easier), we have tried to be careful about the licenses.  We
currently build about 1500 out of 1650 or so 'buildable' packages, but we
stop if we do not explicitly know the licenses (and Charles had e.g. created
several dozen variants of writing out 'GPL' in less clear ways so that we
could include those packages).  Moreover one big show stopper is 'file
LICENSE', shown below as applicable to 38 packages.  You need to explicitly
study all of those 'license' files; some are free and some aren't.  The
trouble is that you cannot tell and you end up labelling packages as 'maybe
not free' even if they are (an example here may be mlbench).

That does not scale.  Ultimately, I fear we need someone to sit down and
classify CRAN source packages into appropriate buckets of 'freeness' of use,
redistribution etc.  And/or to remap all packages to a smaller, saner set of
licenses (as e.g. those in licenses.db).  Section 1.1.1 of 'R Extensions' is
quite clear about this, but I would like this to go further.  

Given that install.packages() does not check, I am afraid that we are not
going far enough in preventing users from accessing packages that they may
not be able to access and use under the terms of the license file.
Ultimately, this may mean moving some packages to a 'non-free' repository
tree as well.
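
To make the idea of 'buckets' concrete, here is a minimal sketch in Python. It
is not what cran2deb does: the pattern lists, the bucket names and the
function names are all assumptions made for illustration. It also assumes
that '|' in a License field separates alternatives, so that one acceptable
alternative is enough. Anything the patterns do not recognise, including a
bare 'file LICENSE', lands in a manual-review bucket.

```python
import re

# Hypothetical patterns, chosen only for illustration.
FREE = [r"^A?GPL", r"^LGPL", r"^GNU General Public License",
        r"^BSD$", r"^MIT$", r"^X11$", r"^Artistic", r"^Apache License"]
RESTRICTED = [r"non-?commercial", r"non-?profit"]


def classify_alternative(text):
    """Bucket one alternative of a License field."""
    text = text.strip()
    # Usage restrictions win, even if the text also mentions GPL.
    if any(re.search(p, text, re.IGNORECASE) for p in RESTRICTED):
        return "restricted"
    if any(re.match(p, text, re.IGNORECASE) for p in FREE):
        return "free"
    # 'file LICENSE' and every unrecognised wording need a human.
    return "needs review"


def classify(license_field):
    """Bucket a whole License field; '|' is assumed to mean 'or'."""
    verdicts = {classify_alternative(a) for a in license_field.split("|")}
    for bucket in ("free", "needs review", "restricted"):
        if bucket in verdicts:
            return bucket


for field in ["GPL (>= 2)", "file LICENSE", "GPL-2 | file LICENSE",
              "Free for nonprofit use."]:
    print(f"{field!r}: {classify(field)}")
```

A prefix match like this is deliberately crude: it would call 'AGPL 3.0 (with
attribution)' free without looking at the attribution clause, which is exactly
the kind of case that still needs a person to read the terms.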

I'd love to hear comments and concrete suggestions.

Dirk


|          AGPL (>3.0), with attribution as per LICENSE file
|                                                             1
|                                   AGPL 3.0 (with attribution)
|                                                             1
|                                            Apache License 2.0
|                                                             2
|                                                  Artistic-2.0
|                                                             5
|                                              Artistic License
|                                                             2
|                                          Artistic License 2.0
|                                                             1
|                      avas is public domain, ace is on Statlib
|                                                             1
|                                                           BSD
|                                                            16
|                                                        CeCILL
|                                                             1
|                                                      CeCILL-2
|                                                             2
|                             Common Public License Version 1.0
|                                                             2
|        Distribution and use for non-commercial purposes only.
|                                                             1
|                                                  file LICENCE
|                                                             2
|                                                  file LICENSE
|                                                            38
|   Fortran code: ACM, free for non-commercial use, R functions
|                                                             1
|                              free for non-commercial purposes
|                                                             1
|                                       Free for nonprofit use.
|                                                             1
|                       Free. See the LICENCE file for details.
|                                                             1
|                                    GNU General Public License
|                                                             3
|                          GNU General Public License Version 2
|                                                             4
|                                                           GPL
|                                                           222
|                                                         GPL-2
|                                                           316
|                                          GPL-2 | file LICENCE
|                                                             1
|                                          GPL-2 | file LICENSE
|                                                             7
|                                                 GPL-2 | GPL-3
|                                                            13
|   GPL-2.  Contributions from Randall C. Johnson are Copyright
|                                                             1
|   GPL-2; incorporates by permission code of W. Bachman (wrtab
|                                                             1
|                                                         GPL-3
|                                                            38
|                                                  GPL (≥ 2)
|                                                           872
|                                   GPL (≥ 2) | file LICENSE
|                                                             1
|                                                GPL (≥ 2.0)
|                                                             2
|                                                  GPL (≥ 3)
|                                                            34
|                                                GPL (≥ 3.0)
|                                                             1
|                                                    GPL (== 2)
|                                                             1
|                                            GPL | file LICENSE
|                                                             1
|                                                    GPL | LGPL
|                                                             1
|                                                GPL 2 or newer
|                                                             1
|                                GPL AFFERO 3.0 (with citation)
|                                                             1
|      GPL version 2 or newer. Copyright statement for ptolemy:
|                                                             1
|    GPL version 2 or newer. The terms of this license are in a
|                                                             2
| GPL version 2 or newer. This library is Copyright (C) 2007 by
|                                                             1
|                                           GPL2 | file LICENSE
|                                                             1
|                                                          LGPL
|                                                            21
|                                                        LGPL-2
|                                                             3
|                                                      LGPL-2.1
|                                                             6
|                                                        LGPL-3
|                                                            19
|                                                 LGPL (≥ 2)
|                                                             2
|                                               LGPL (≥ 2.0)
|                                                             5
|                                               LGPL (≥ 2.1)
|                                                             9
|                                                           MIT
|                                                             8
|                                    Mozilla Public License 1.1
|                                                             1
|                Original ??, extensions GPL version 2 or newer
|                                                             1
|   R functions: GPL, Fortran code: ACM, free for noncommercial
|                                                             1
|                               S original available at statlib
|                                                             1
|    The caMassClass Software License, Version 1.0 (See COPYING
|                                                             1
|    The software may be distributed free of charge and used by
|                                                             3
| This package was written by Hans Peter Wolf. This software is
|                                                             1
|   This software may be re-distributed freely and used for any
|                                                             3
|              Unclear (Fortran) -- code in Statlib's ./S/adapt
|                                                             1
|                                                     Unlimited
|                                                            18
|                 Unlimited distribution for noncommercial use.
|                                                             1
|                                       Unlimited distribution.
|                                                             1
|                                                           X11
|                                                             7
| 
| ______________________________________________
| R-devel at r-project.org mailing list
| https://stat.ethz.ch/mailman/listinfo/r-devel
#
On Thu, Apr 23, 2009 at 3:08 PM, Dirk Eddelbuettel <edd at debian.org> wrote:
Not the case?  My post included a list of all License fields from the
DESCRIPTION file of every CRAN package so the list is definitive.
#
On 23 April 2009 at 15:32, Gabor Grothendieck wrote:
| On Thu, Apr 23, 2009 at 3:08 PM, Dirk Eddelbuettel <edd at debian.org> wrote:
| >
| > (Subject: renamed as thread hijacked from the ParallelR thread   --Dirk)
| >
| > On 23 April 2009 at 14:44, Gabor Grothendieck wrote:
| > | Aside from R there are the add-on packages.
| > |
| > | A frequency table showing the licenses of the CRAN packages indicates
| > | that all or almost all packages have some sort of free software license
| > | with GPL licenses being most common. (A few packages have restrictions
| > | to noncommercial use and that may conflict with GPL, not sure.)   That is
| > | not to say that there are no other types of packages but any such packages
| > | are not on CRAN.
| >
| > I fear that is not quite the case.  There are quite a few packages like that.
| 
| Not the case?  My post included a list of all License fields from the
| DESCRIPTION file of every CRAN package so the list is definitive.

Correct me if I am wrong, but in the paragraph you kindly left standing above
you seem to suggest that

	"all or almost all packages have some sort of free software license" 

and that while non-free licenses may exist, 

	"any such packages are not on CRAN".

I believe this statement to be false.

There are packages with restrictive licenses on CRAN.  They were contained in
the list of licenses you assembled, and my point is that it is overly hard to
identify them (if one were to try to avoid using these packages).

As a non-exhaustive list with possible misclassifications, cran2deb currently
has these packages as 'maybe not free' and does not build them:

     BARD,BayesDA,CoCo,ConvCalendar,FAiR,PTAk,RScaLAPACK,Rcsdp,SDDA,SGP,
     alphahull,ash,asypow,caMassClass,gpclib,mapproj,matlab,mclust,mclust02,
     mlbench,optmatch,rankreg,realized,rngwell19937,rtiff,rwt,scagnostics,
     sgeostat,spatialkernel,tlnise,xgobi

We are missing some recently added packages, and we may yet flag several from
the list above as free. Some may be listed because of non-free Depends.

But to take a concrete example, 'realized' is not something I am supposed to
install at work.  Yet install.packages() currently has no way of knowing that.

Are we approximately on the same page ?

Dirk
#
Of the 31 packages listed:
 [1] "BARD"          "BayesDA"       "CoCo"          "ConvCalendar"
 [5] "FAiR"          "PTAk"          "RScaLAPACK"    "Rcsdp"
 [9] "SDDA"          "SGP"           "alphahull"     "ash"
[13] "asypow"        "caMassClass"   "gpclib"        "mapproj"
[17] "matlab"        "mclust"        "mclust02"      "mlbench"
[21] "optmatch"      "rankreg"       "realized"      "rngwell19937"
[25] "rtiff"         "rwt"           "scagnostics"   "sgeostat"
[29] "spatialkernel" "tlnise"        "xgobi"

the license fields are AGPL or GPL for 3, and most of the rest point to a
separate file ("file LICENSE"), so about 30 of 1700 (< 2%) are question marks.
To me that is not inconsistent with all or nearly all being free software
licenses but at any rate this quantifies it a bit better.  (A couple are
not listed below as I got a read error when trying to access their summaries
from the CRAN site.  It's possible those 2 are not actually on CRAN.)

                            BARD
                               "AGPL 3.0 (with attribution)"
                                                     BayesDA
                                                 "GPL (≥ 2)"
                                                        CoCo
                                              "file LICENSE"
                                                ConvCalendar
                                              "file LICENCE"
                                                        FAiR
                                              "file LICENSE"
                                                        PTAk
                                              "file LICENSE"
                                                       Rcsdp
                                              "file LICENSE"
                                                        SDDA
                                              "file LICENSE"
                                                         SGP
                                              "file LICENSE"
                                                   alphahull
                                              "file LICENSE"
                                                         ash
                           "S original available at statlib"
                                                      asypow
                                              "file LICENSE"
                                                 caMassClass
"The caMassClass Software License, Version 1.0 (See COPYING"
                                                      gpclib
                                              "file LICENSE"
                                                     mapproj
    "Distribution and use for non-commercial purposes only."
                                                      matlab
                                              "file LICENSE"
                                                      mclust
                                              "file LICENSE"
                                                    mclust02
                                              "file LICENSE"
                                                     mlbench
                                              "file LICENSE"
                                                    optmatch
                                              "file LICENSE"
                                                     rankreg
                                   "Free for nonprofit use."
                                                    realized
                                              "file LICENSE"
                                                rngwell19937
                                              "file LICENSE"
                                                       rtiff
                                              "file LICENSE"
                                                         rwt
                                              "file LICENSE"
                                                 scagnostics
                                              "file LICENSE"
                                                    sgeostat
            "Original ??, extensions GPL version 2 or newer"
                                               spatialkernel
                                              "file LICENSE"
                                                      tlnise
                                              "file LICENSE"


2009/4/23 Dirk Eddelbuettel <edd at debian.org>:
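
The arithmetic in the message above can be checked with a short Python sketch.
The dictionary simply transcribes the 29 License fields listed there; the
helper name kind() and the two category labels are made up for this example.

```python
from collections import Counter

# License fields as listed in the message above (29 of the 31 packages;
# RScaLAPACK and xgobi are the two that are missing from the listing).
fields = {
    "BARD": "AGPL 3.0 (with attribution)",
    "BayesDA": "GPL (>= 2)",
    "CoCo": "file LICENSE",
    "ConvCalendar": "file LICENCE",
    "FAiR": "file LICENSE",
    "PTAk": "file LICENSE",
    "Rcsdp": "file LICENSE",
    "SDDA": "file LICENSE",
    "SGP": "file LICENSE",
    "alphahull": "file LICENSE",
    "ash": "S original available at statlib",
    "asypow": "file LICENSE",
    "caMassClass": "The caMassClass Software License, Version 1.0 (See COPYING",
    "gpclib": "file LICENSE",
    "mapproj": "Distribution and use for non-commercial purposes only.",
    "matlab": "file LICENSE",
    "mclust": "file LICENSE",
    "mclust02": "file LICENSE",
    "mlbench": "file LICENSE",
    "optmatch": "file LICENSE",
    "rankreg": "Free for nonprofit use.",
    "realized": "file LICENSE",
    "rngwell19937": "file LICENSE",
    "rtiff": "file LICENSE",
    "rwt": "file LICENSE",
    "scagnostics": "file LICENSE",
    "sgeostat": "Original ??, extensions GPL version 2 or newer",
    "spatialkernel": "file LICENSE",
    "tlnise": "file LICENSE",
}


def kind(field):
    """Separate licence file (either spelling) versus inline wording."""
    if field.lower().startswith("file licen"):
        return "separate file"
    return "inline text"


tally = Counter(kind(f) for f in fields.values())
share = 100 * 30 / 1700  # the "about 30 of 1700" from the message

print(len(fields), dict(tally))
print(f"{share:.2f}% of packages are question marks")
```

So 22 of the 29 listed fields defer to a separate licence file, and 30 out of
1700 is roughly 1.8%, consistent with the "< 2%" quoted above.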
#
On Apr 23, 2009, at 3:02 PM, Dirk Eddelbuettel wrote:

            
There is a list of acceptable entries that are defined as part of the  
specs in R-exts (see page 4). Perhaps this needs to be "tightened" a  
bit, at least in so far as packages passing R CMD check for the  
purpose of inclusion on CRAN. That would include perhaps altering the  
ability to use the 'file LICENSE' option, which at present leaves the  
door wide open for non-standard approaches. It may also have to check  
for DEPENDS and whether they too are on CRAN and passed the  
appropriate license checks.
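
As a rough illustration of the dependency part of such a check, here is a
minimal sketch in Python. The function name, the two lookup tables and the
package names are hypothetical; nothing here reflects what R CMD check
actually does. A package passes only if its own license is acceptable and
every package it depends on, directly or indirectly, passes as well. Packages
that are not in the table at all (for example because they are not on CRAN)
fail.

```python
def passes_check(pkg, license_ok, depends, _visiting=None):
    """True if pkg and everything it transitively depends on is acceptable.

    license_ok maps package name -> bool (own license accepted or not).
    depends maps package name -> list of packages it depends on.
    """
    visiting = _visiting if _visiting is not None else set()
    if pkg in visiting:
        # Already being checked further up the call chain (or earlier);
        # any failure there propagates to the caller anyway.
        return True
    if not license_ok.get(pkg, False):
        return False  # own license rejected, or package unknown
    visiting.add(pkg)
    return all(passes_check(dep, license_ok, depends, visiting)
               for dep in depends.get(pkg, []))


# Hypothetical repository: pkgA -> pkgB -> pkgC, and pkgC is not acceptable.
license_ok = {"pkgA": True, "pkgB": True, "pkgC": False, "pkgD": True}
depends = {"pkgA": ["pkgB"], "pkgB": ["pkgC"], "pkgD": []}

for name in sorted(license_ok):
    print(name, passes_check(name, license_ok, depends))
```

In this toy repository pkgA fails even though its own license is fine,
because it reaches pkgC through pkgB.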

Packages that fail this check should not be included on CRAN and the  
package author would then be obligated to find other distribution  
resources or contact the CRAN maintainers to advocate that their  
licensing schema should be acceptable.

Then the end user can at least have some comfort in knowing that  
anything they get from CRAN comes under a compatible license for  
general use without restriction. They would have to intentionally use  
other sources for packages that fail the CRAN requirements.

If other distribution venues, such as Debian/Ubuntu/Fedora elect to  
tighten those restrictions even further when making .debs or RPMs  
available, then that is a decision that they get to make and end users  
will need to be aware of those as well. Albeit I don't envision the  
aforementioned Linux distros including packages that should be a  
problem for most end users relative to usage restrictions given their  
own license review processes.

HTH,

Marc Schwartz
#
In some other software systems there are separate repositories for
free and non-free add-ons.  That way it's clear what you are downloading
yet there are good outlets for both types of software.  There has been some
discussion of future features that CRAN might have that might make
this even easier to do.   My opinion is that R would suffer if it were not
to support both types of software but at the same time it's reasonable
to make it clear which type you are getting before you download it.
On Thu, Apr 23, 2009 at 4:35 PM, Marc Schwartz <marc_schwartz at me.com> wrote:
#
Dirk Eddelbuettel <edd <at> debian.org> writes:
Small point: FAiR is free. The file LICENSE thing just clarifies that most of
the code is AGPL but a couple files can't be included under the AGPL and are
plain GPL. As far as I can see, R does not give me the option of saying so in a
"standard" way, e.g. putting License: AGPL (>= 3) in the DESCRIPTION file would
only be 95% accurate and putting License: AGPL (>= 3) | GPL (>= 3) is misleading.

Ben
#
On 23 April 2009 at 15:35, Marc Schwartz wrote:
| There is a list of acceptable entries that are defined as part of the  
| specs in R-exts (see page 4). Perhaps this needs to be "tightened" a  
| bit, at least in so far as packages passing R CMD check for the  
| purpose of inclusion on CRAN. That would include perhaps altering the  
| ability to use the 'file LICENSE' option, which at present leaves the  
| door wide open for non-standard approaches. It may also have to check  
| for DEPENDS and whether they too are on CRAN and passed the  
| appropriate license checks.

Exactly. 

| Packages that fail this check should not be included on CRAN and the  
| package author would then be obligated to find other distribution  
| resources or contact the CRAN maintainers to advocate that their  
| licensing schema should be acceptable.
| 
| Then the end user can at least have some comfort in knowing that  
| anything they get from CRAN comes under a compatible license for  
| general use without restriction. They would have to intentionally use  
| other sources for packages that fail the CRAN requirements.

Exactly.  I think we may have to work on tightening the standards of CRAN
re-distribution.

| If other distribution venues, such as Debian/Ubuntu/Fedora elect to  

cran2deb does not have inclusion in Debian in mind. What Charles and I are
thinking about is something akin to the Windows situation: suitable i386 and
amd64 binaries (for Debian Linux) provided from CRAN for as many packages as
possible.

Dirk
#
On 23 April 2009 at 16:35, Gabor Grothendieck wrote:
| Of the 31 packages listed:
|  [1] "BARD"          "BayesDA"       "CoCo"          "ConvCalendar"
|  [5] "FAiR"          "PTAk"          "RScaLAPACK"    "Rcsdp"
|  [9] "SDDA"          "SGP"           "alphahull"     "ash"
| [13] "asypow"        "caMassClass"   "gpclib"        "mapproj"
| [17] "matlab"        "mclust"        "mclust02"      "mlbench"
| [21] "optmatch"      "rankreg"       "realized"      "rngwell19937"
| [25] "rtiff"         "rwt"           "scagnostics"   "sgeostat"
| [29] "spatialkernel" "tlnise"        "xgobi"
| 
| the license fields are AGPL or GPL for 3 and specified in a separate
| file "file LICENSE" so about 30 of 1700 < 2% are question marks.

My point is that you currently need to manually parse 'file LICENSE'.  

And as I said, we did not claim that our set was exhaustive, current or
perfect. We just can't automate anything better given the current framework.
And I think we all should be able to do better in scripted approaches.  I
still think you're proving my point.  

| To me that is not inconsistent with all or nearly all being free software

I doubt that "all or nearly all" would be equated to "exactly all" by a
court. You only need one bad apple to spoil the lot.

Dirk
#
On Thu, Apr 23, 2009 at 4:59 PM, Ben Goodrich <goodrich at fas.harvard.edu> wrote:
How about:

License: AGPL except for 2 GPL files
#
Gabor Grothendieck wrote:
If that would make anyone's life easier without making anyone else's
life harder, I would be happy to put that in the DESCRIPTION file. I
have been doing file LICENSE because it parses on

http://cran.r-project.org/web/packages/FAiR/index.html

and people can click on the LICENSE link to read the details if they are
interested. But maybe that is not optimal. Dirk?

Ben
#
I don't know about the legal definitions of "all", but a few years back the British Medical Journal had a filler article that looked at some surveys of what people thought different words meant (you can get at the filler by going to http://www.bmj.com/cgi/content/full/333/7565/442, downloading the PDF version of the article, and scrolling to the end).

According to this, when people say "always" they could mean anywhere from 91-100% of the time, and when they say "never" it could be 0-2% of the time.

This doesn't prove anything, but I thought it was an interesting side note to the discussion.
#
This will be a non-canonical license spec and hence I would ask you to
change back to a canonical one (file LICENSE in your case).
The current scheme for license specs standardizes the markup to allow
for computing on the specs.  (And in fact, the code I put into 2.9.0
allows for standardizing most non-standard specs.)  For standard
licenses we can easily provide the text as well as maintain info on
whether the license was classified as "free" (e.g. by the FSF) or "open"
etc.  

AGPL, unfortunately, allows supplements, and hence cannot fully be
standardized.  We've been thinking about extending the current scheme to
indicate a base license plus supplements, but this is still work in
progress.

-k
#
Nothing, although the spec is not canonical as per R-exts, see
http://www.r-project.org/nosvn/R.check/r-devel-linux-ix86/BayesDA-00check.html:

* checking DESCRIPTION meta-information ... NOTE
Non-standard license specification:
GPL version 2 or any later version
Standardizable: TRUE
Standardized license specification:
GPL (>= 2)

But as I wrote, the new code in 2.9.0 standardizes when it can ...
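
For readers who want a feel for what such standardization involves, here is a
minimal sketch in Python. It is not the code that went into R 2.9.0; the
regular expression and the function name standardize() are assumptions for
illustration, and they only cover a few GPL-family wordings of the kind seen
in this thread. Anything else is reported as not standardizable.

```python
import re

# Hypothetical pattern covering forms such as "GPL2", "GPL-2",
# "GPL 2 or newer" and "GPL version 2 or any later version".
_PATTERN = re.compile(
    r"(?P<name>[AL]?GPL)"                    # GPL, LGPL or AGPL
    r"(?: version |-| )?"                    # optional separator
    r"(?P<ver>\d+(?:\.\d+)?)"                # version number, e.g. 2 or 2.1
    r"(?P<later> or (?:newer|later|any later version))?"
    r"\.?",                                  # optional trailing full stop
    re.IGNORECASE,
)


def standardize(spec):
    """Return a canonical-looking spec, or None if not recognised."""
    text = " ".join(spec.split())            # collapse runs of whitespace
    match = _PATTERN.fullmatch(text)
    if match is None:
        return None
    name = match.group("name").upper()
    version = match.group("ver")
    if match.group("later"):
        return f"{name} (>= {version})"
    return f"{name}-{version}"


for spec in ["GPL version 2 or any later version", "GPL 2 or newer",
             "GPL2", "file LICENSE"]:
    print(f"{spec!r} -> {standardize(spec)!r}")
```

Requiring the whole string to match is a deliberate choice: a field such as
"GPL version 2 or newer. Copyright statement for ptolemy:" carries extra
wording, so the sketch refuses to standardize it rather than guess.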

-k

            

        
#
On 24 April 2009 at 10:18, Kjetil Halvorsen wrote:
| On Thu, Apr 23, 2009 at 4:59 PM, Ben Goodrich <goodrich at fas.harvard.edu>wrote:
| 
| > Dirk Eddelbuettel <edd <at> debian.org> writes:
| > > As a non-exhaustive list with possible misclassifications, cran2deb currently
| > > has these packages as 'maybe not free' and does not build them:
| > >
| > >      BARD,BayesDA,CoCo,ConvCalendar,FAiR,PTAk,RScaLAPACK,Rcsdp,SDDA,SGP,
| >
| 
|  BayesDA has
|    License:       GPL version 2 or any later version
| 
| what is unclear about that?

Pick any one of

  a)  the license may have been different when that list above was established
 
  b)  the wording, which by the way is not in the format mandated by the R
      Extensions manual, does not match any of the more than a dozen free forms
      of saying GPL that we already accept automatically

  c)  the package is fine but it Depends: on another package flagged non-free
      and is hence non-free for us

  d)  lack of volunteer time to manually check and adjust among 1700 packages
      with a steady inflow of new packages

  e)  all of the above

  f)  none of the above.

Dirk
#
Kurt Hornik wrote:
This would be helpful. I would just reemphasize that a package that
includes some AGPL code and some GPL3 code is standard as far as the FSF
is concerned, e.g. from section 13 of the AGPL:

"Notwithstanding any other provision of this License, you have
permission to link or combine any covered work with a work licensed
under version 3 of the GNU General Public License into a single combined
work, and to convey the resulting work. The terms of this License will
continue to apply to the part which is the covered work, but the work
with which it is combined will remain governed by version 3 of the GNU
General Public License."

So, I think that CRAN should at least have a canonical spec that covers
*this* situation. Other situations may be more complicated to handle
elegantly.

Thanks,
Ben
#
On Fri, Apr 24, 2009 at 11:44 AM, Ben Goodrich <goodrich at fas.harvard.edu> wrote:
Another possibility is to simply standardize the set of licenses that CRAN
supports.  GPL licenses (GPL-2, LGPL-2.1, GPL-3, LGPL), MIT and
X11 already cover 98% of all packages on CRAN.   If there truly is an
advantage to the AGPL license perhaps a standard version could be offered
in the set.  Perhaps, for the 2% of packages that want a different license
a second repository could be made available.
#
Hi all,

I think for the common licences, we should also add the BSD licence... for
example my pkg randtoolbox (which currently has incompatible
licences) will probably be under the BSD licence in the near future.

Anyway I like the idea of two different repositories for GPL-like
licensed packages and other packages.

Christophe

On 24 Apr 09 at 18:20, Gabor Grothendieck wrote:
--
Christophe Dutang
Ph. D. student at ISFA, Lyon, France
website: http://dutangc.free.fr
#
I don't have a strong opinion about partitioning the repository, but I
don't think partitioning based on whether the license is commonly used
for R packages is terribly helpful. AGPL and AGPL + GPL3 are not common
licensing schemes for R packages currently, but from the perspective of
a useR, there is no relevant distinction between these two rare cases
and the more common case of GPL3. So why should packages be put in
separate repositories based on this non-distinction? A partition based
on whether the package is free according to the FSF definition seems
more plausible to me.

Ben
Christophe Dutang wrote: