Including a binary Python Interpreter into a binary R-package for MS Windows

5 messages · Gabor Grothendieck, gvsteen at yahoo.com, Simon Urbanek +1 more

#
2009/8/30 Uwe Ligges <ligges at statistik.tu-dortmund.de>:
[snip]
Hi Uwe, 

Note: I will send this email cc. to the R-devel list, which I joined today. I think it may be of interest to other people as well. 

Thank you for your answer, although it disappointed me a bit. I had already spent quite some time building a stand-alone Windows binary of a new package "write2xls". This package provides the same R interface to Python as the other package "dataframes2xls". As you know, it enables users to create xls files. The special thing about "write2xls" is that it does not have any dependencies at all. It is, so to speak, a turn-key solution.

Of course I should have read a bit more before I started. Only after your mail did I read the PDF file "Writing R Extensions". It says "A source package if possible should not contain binary executable files: they are not portable, and a security risk if they are of the appropriate architecture. R CMD check will warn about them unless they are listed (one filepath per line) in a file 'BinaryFiles' at the top level of the package or bundle. Note that CRAN will no longer accept submissions containing binary files even if they are listed."

So, yes, you are right. I was actually hoping that CRAN could make some exceptions, but after some thought I fully understand that many people would object to this for good reasons: R code that depends on a C compiler will not work without a C compiler either, and for security reasons we cannot allow packages to install a binary C compiler. So, yes, I understand the reasons, but it is still a pity.

The current situation is that many MS Windows users cannot easily use "dataframes2xls". There are a few reasons:

* Most users of MS Windows will be unfamiliar with Python, which will make them reluctant to install Python. 

* Installing Python will be impossible on many MS Windows platforms due to limited user rights. 

* A standard Python installer is a download of about 15 MB. My newly created "write2xls" package is only 1.3 MB.

So only a few R users can benefit from "dataframes2xls". An alternative to "dataframes2xls" is "write.xls". "dataframes2xls" is technically superior, as it allows the specification of proper formatting and fonts. "dataframes2xls" has also existed longer. However, "write.xls" is available to many more R users because it depends on Perl, which is installed as a part of the R-tools.

So, I think it would be a pity not to provide "write2xls", since I have it readily available now. Therefore, I will probably host "write2xls" on a different repository, as long as no Python interpreter is included in the R-tools. Does anyone know of an alternative repository that accepts "trustworthy" R packages with a binary Python interpreter?

Thanks! 

Best wishes, 

Guido van Steen 

P.S. For those who are interested or who would like to test it, at the moment "write2xls" can be downloaded as "http://www.heppel.net/write2xls_0.4.4.9.zip". The "source" package is available as "http://www.heppel.net/write2xls_0.4.4.9.tar.gz". 

P.P.S. I think that on MS Windows the combination of R and the R tools is just as much of a potential security risk as including a Python interpreter in a binary package. The R website should pay more serious attention to this.

P.P.P.S. Uwe also brings up the issue of licensing. However, this is not a problem at all. The Python license is one of the most permissive licenses around. For the Python interpreter that I included in the "write2xls" package, I used pyMingW, which is distributed under an MIT license. It is a version of Python compiled with the MinGW compiler. Thanks to this pyMingW distribution I also avoid the need for any Microsoft-owned DLLs. "dataframes2xls" and "write2xls" are also distributed under an MIT license.
#
On Tue, Sep 1, 2009 at 5:41 PM, <gvsteen at yahoo.com> wrote:
Note that the rSymPy package has an entire Jython interpreter
in it and provided your software is only R and pure python you
should be able to run it off that.

Of course this just trades one dependency for another, i.e. it
does not require python since that's included but it does require
java; however, most people have java installed already since a lot
of the free software out there requires java.  See:
http://rsympy.googlecode.com

Note that since java jar files are source files and since java itself
is not included it was possible to do that without any binaries.
#
Hi Gabor, 

Thank you very much! That is an excellent idea!

I had not thought about Jython at all. Moreover, I always had the impression that the latest Jython distribution was based on Python 2.2. But I just saw that they upgraded to 2.5 a few months ago. This is quite fortunate because both "dataframes2xls" and "write2xls" depend on Python 2.4 or later.

An R user will not notice much of a 15 MB download. And for most MS Windows users Java is an even more common thing than Perl. 

I will definitely check this out! 

Thanks a lot!!! 

Best wishes, 

Guido
--- On Wed, 9/2/09, Gabor Grothendieck <ggrothendieck at gmail.com> wrote:

            
#
On Sep 1, 2009, at 17:41 , gvsteen at yahoo.com wrote:

            
Just for other people - it is in general not necessary to make such  
exceptions because it's easy to pull any necessary binary dependencies  
at build or run time where needed and there are examples of packages  
that do exactly that (e.g. RGtk2 for run-time dependency installation  
of GTK+ and Cairo for build-time dependency installation). The rule on  
CRAN is just to make sure that the package can be compiled purely from  
sources and binaries are rather a convenience (Python can be equally  
well compiled from sources). Uwe's comment was about the necessity to  
make all sources available for GPL licensed packages in either case.

Cheers,
Simon
#
gvsteen at yahoo.com wrote:
Guido,

Why? If your package asks them to install Python when it cannot 
find it in the path, they will certainly do so if they find the 
functionality very useful.

I don't know Python on Windows so well, but why can't they install it? You 
can also install R with limited user privileges.

Well, but much of that space is useless/wasted for non-Windows users then.

Most R users under Windows won't have Rtools installed; only the 
developers will.

Can't you add some configure script / Makefile that allows building the 
binary from sources that you provide in your package?

Otherwise, what you could do is to install the binary on demand from 
another site you are hosting. E.g. library("write2xls") could check if 
the required binary is available and install it on demand if it is not. 
But don't forget to look carefully into all license issues.

R tools are only required for developers, and they won't be installed on 
platforms other than Windows, I believe. Additionally, we know who built 
those tools.

If the MIT license allows you to ship things the way you plan to, then 
that's fine, but no binaries in source packages on CRAN. We did quite some 
work to get rid of the packages that contained them (even my own package!) 
and won't make exceptions. We won't revert our decision.

Best wishes,
Uwe Ligges
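
Uwe's two suggestions above (ask the user to install Python when it is not on the path, or fetch a binary on demand) could be sketched roughly as follows. This is only an illustration in Python, not code from "write2xls" or "dataframes2xls": the function names `find_python` and `ensure_interpreter` and the `fetch` callback are hypothetical, and the actual download step is deliberately left to the caller.

```python
import shutil


def find_python(candidates=("python", "python3")):
    """Return the full path of the first interpreter found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)  # standard-library PATH lookup
        if path is not None:
            return path
    return None


def ensure_interpreter(candidates=("python", "python3"), fetch=None):
    """Locate an interpreter; fall back to an on-demand fetch if one is given.

    `fetch` is a hypothetical callback that installs a binary from a site
    the package author hosts and returns the path to it.
    """
    path = find_python(candidates)
    if path is not None:
        return path
    if fetch is None:
        # No fallback configured: tell the user what to do.
        raise RuntimeError(
            "No Python interpreter found on PATH; please install Python."
        )
    return fetch()
```

In an R package the same check would run when the package is loaded, so the user sees either a clear request to install Python or a one-time download, and the source package itself stays free of binaries.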