R license for a derived data-only package

3 messages · Michael Friendly, Simon Urbanek, Paul Gilbert

#
I'm looking for guidance or advice about the R license to use in
preparing a package containing the Baseball Database from
http://baseball1.com/statistics/
My main purpose is to make it available to students in a course, and to
develop it with others. I'd like to put it on R-Forge, and then perhaps
make it public on CRAN.

However, the page above bears a very restrictive copyright notice and 
limited license:

This database is copyright 1996-2010 by Sean Lahman. A license is granted
for individual use for research purposes. It may not be re-distributed
without permission. Any commercial use, or other dissemination of the
database in part or in whole is prohibited. Use of this database
constitutes acceptance of these terms.

I've written several times to the author asking permission for my
intended wider use, but have received no reply.

What makes this perplexing is that I am apparently free to "distribute"
this by sending links in an email or posting them on a web page, so that
others actually download the files for personal use. The R package,
however, would be considered a "derived work", I think, since it contains
.RData files I created and .Rd documentation. Does the original limited
license apply to this?

AFAICS, none of the R licenses described at
http://www.r-project.org/Licenses/
seems to cover this situation; they appear to apply to the R package
itself, not to the data on which it is based.

The TeX archive CTAN defines a wider range of licenses, including a
bunch of non-free ones:
http://ctan.mirror.rafal.ca/help/Catalogue/licenses.html

But I don't know if any of these are acceptable in R packages (e.g.,
will pass R CMD check). I'd rather not have to consult a lawyer, so any
guidance is welcome.
#
On Sep 16, 2011, at 10:32 AM, Michael Friendly wrote:

            
The way people have dealt with this in the past is to create a package that displays the license and downloads the data. The way I read it (but I am not a lawyer and the wording is very ambiguous), you cannot redistribute it in any form (not even in its original form), so the only way to obtain it is to download it from the site. This also implies that the conversion to .RData has to be done at (or after) install time from the download, and can't be done at build time.
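
A minimal sketch of that approach in R, under stated assumptions: the function names fetch_data() and csv_dir_to_rdata() are hypothetical, the download URL is left as an argument because the exact file location is not given here, and the data are assumed to arrive as CSV files. It only illustrates the idea of showing the license, downloading from the original site, and converting on the user's machine; it is not taken from any existing package.

```r
# Sketch only: function names and arguments are hypothetical, not an existing API.

# License text as quoted from the data provider's page.
license_text <- paste(
  "This database is copyright 1996-2010 by Sean Lahman. A license is granted",
  "for individual use for research purposes. It may not be re-distributed",
  "without permission. Any commercial use, or other dissemination of the",
  "database in part or in whole is prohibited. Use of this database",
  "constitutes acceptance of these terms.",
  sep = "\n"
)

# Show the license, require acceptance, then download from the original site.
# `url` must point at the provider's own file; it is deliberately not
# hard-coded. `downloader` is injectable so the function can be exercised
# without network access.
fetch_data <- function(url, destdir = tempdir(), accept = NULL,
                       downloader = utils::download.file) {
  message(license_text)
  if (is.null(accept)) {
    accept <- interactive() &&
      tolower(trimws(readline("Accept these terms? [y/N] "))) == "y"
  }
  if (!isTRUE(accept)) {
    stop("License not accepted; nothing was downloaded.")
  }
  destfile <- file.path(destdir, basename(url))
  downloader(url, destfile, mode = "wb")
  destfile
}

# Read every CSV file in `dir` into a data frame and save them all in one
# .RData file on the user's machine (never shipped inside the package).
csv_dir_to_rdata <- function(dir, out) {
  files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE,
                      ignore.case = TRUE)
  if (length(files) == 0L) stop("No CSV files found in ", dir)
  tables <- lapply(files, read.csv, stringsAsFactors = FALSE)
  names(tables) <- tools::file_path_sans_ext(basename(files))
  save(list = names(tables), file = out, envir = list2env(tables))
  invisible(names(tables))
}
```

If the download turns out to be a zip archive of CSV files (an assumption), a user would call fetch_data() with the provider's URL, unpack the result with utils::unzip(), and then run csv_dir_to_rdata() on the unpacked directory. The package itself would then contain only code and documentation, with no copy of the data.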

This does not constitute legal advice; it is just my personal opinion.

Cheers,
Simon
#
Michael

You might look at my TSzip package on CRAN. There are examples in the Guide (http://cran.at.r-project.org/web/packages/TSzip/vignettes/Guide.pdf). The package may not work directly, since it is designed for time series data (the example is daily market data). But the general idea of getting the data directly from their web site, rather than redistributing it, should work with simple modifications. (If it is time series data you need for your course, there are also examples of other data sets in the guides for TSxls, TSgetSymbol, and TShistQuote.)

HTH,
Paul