Summer of Code, LLVM, parallelization and R

3 messages · Florian Gross, Luke Tierney, Gabor Grothendieck

#
Hi everybody,

I'm currently working towards my Master's degree as a student of  
Computer Science at the University of Saarbrücken and highly
interested in compiler construction, interpretation techniques,  
optimization, programming languages and more. :)

Two professors of my university approached me about an interesting  
project just a few days ago: Developing an LLVM-based JIT compilation
back-end for R. The primary goal would be the generation of parallel /  
vectorized code, but other ways of increasing performance might be  
very interesting as well.

I've thought a bit about this and am now wondering if this would make  
sense as a project for Google's Summer of Code program -- I have seen  
that the R foundation was accepted as a mentoring organization in 2008  
and has applied to be one again this year.

I've already taken part in the SoC program thrice (working on Novell's  
JScript.NET compiler and run-time environment in 2005, writing a  
debugger for the Ruby programming language in 2006 and working on a  
detailed specification for the Ruby programming language in 2007) and  
it has always been a lot of fun and a great experience. One thing that  
was particularly helpful was getting into contact with the development  
communities so easily.

What do you folks think? Would this be of benefit to the R community?  
Would it be a good candidate for this year's SoC installment? :)

Also, if some thinking in this direction has already been done or if  
you have any other pointers, please don't hesitate to reply!

Thanks a lot in advance!

Kind regards,
Florian Gross
#
There is ongoing work on developing a byte code compiler for R.  A
preliminary implementation is available and the corresponding byte
code engine is part of the R distribution.  The initial engine has
been a useful proof of concept but is in the process of being
rewritten from scratch, in part with an eye to supporting
parallelization at least of vectorized math operations; I expect to
make significant progress on this over the coming summer.  There are
a lot of open design issues relating to changes or adjustments
(e.g. via declarations) in the R language that might be needed or help
in generating good code, which makes this too loosely specified to
make a good SoC project at the moment.  By summer 2010 it may have
jelled to the point where it is reasonable to spin off projects to,
for example, target lower level VMs like LLVM or JVM or .Net's VM from
the higher level R VM code.

Best,

luke
On Sun, 15 Mar 2009, Florian Gross wrote:
#
In addition to the work Luke is doing there is Ra:

http://www.milbo.users.sonic.net/ra
On Sun, Mar 15, 2009 at 11:25 AM, Florian Gross <Florian.S.Gross at web.de> wrote: