R on Large Data Sets (again)

8 messages · Lars Bishop, Daniel Nordlund, Duncan Murdoch +3 more

#
Hello Lars,
On 2009.11.28 18:53:09, Lars Bishop wrote:
I think you'll have to provide a more precise definition of
"large"---are we talking 1 GB of records or 100 GB? Also, it would help
to know what you are trying to do with the data. The documentation for
the biglm and bigmemory packages may provide some help.
I'm not familiar enough with the commercial version of R, but I do
believe it provides better support for parallelization, which may be of
some help. I don't think, however, that this version will "solve" your
problem.
Possibly, but Win64 should provide plenty of memory (I believe Windows 7
Ultimate can use up to 192 GB of memory). You just have to find the
system that can take that much... With Unix/Linux you can probably cut
back some overhead, and the memory management is most likely better, but
unless you need to go over 192GB of memory, you don't necessarily have
to move to a different platform. 

~Jason
#
Windows 64-bit can certainly handle large memory spaces, but unless something has changed recently, it is my understanding that Revolution Computing's 64-bit build is the only 64-bit version of R available for Windows (due to the unavailability of adequate open source compilers for 64-bit Windows).  So 64-bit R will need to be Revolution's solution or a non-Windows platform.

Hope this is helpful,

Dan 

Daniel Nordlund
Bothell, WA USA
#
On 28/11/2009 6:53 PM, Lars Bishop wrote:
There are several packages for handling datasets without keeping them in 
RAM:  bigmemory, ff, etc.  You may find that you need to write functions 
to handle your data a block at a time, or you may find they have already 
been written, e.g. biglm.  You can also keep your data in a database and 
just retrieve it a block at a time for processing.
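
[Editorial illustration, not part of the original message.] The "block at a time" idea above can be sketched in base R. This is a minimal sketch under stated assumptions: the data sit in a CSV file with a header row and numeric columns only, and the function name chunked_ols and its chunk_rows argument are hypothetical. It accumulates X'X and X'y over blocks and solves the normal equations, which is simpler but numerically less robust than what a dedicated package such as biglm does.

```r
# Hypothetical helper (not from any package): least-squares coefficients for
# a CSV file that is read one block of rows at a time.
# Assumes a header row and numeric columns only.
chunked_ols <- function(path, formula, chunk_rows = 10000L) {
  con <- file(path, open = "r")
  on.exit(close(con))

  # The first read consumes the header line and fixes the column names.
  block <- read.csv(con, nrows = chunk_rows)
  col_names <- names(block)

  xtx <- 0  # running sum of X'X over all blocks
  xty <- 0  # running sum of X'y over all blocks
  repeat {
    mf <- model.frame(formula, block)
    X <- model.matrix(formula, mf)
    y <- model.response(mf)
    xtx <- xtx + crossprod(X)
    xty <- xty + crossprod(X, y)

    # A short block means the end of the file has been reached.
    if (nrow(block) < chunk_rows) break

    # Later reads continue from the current position of the open connection.
    # read.csv signals an error when no lines remain, so treat that as the end.
    block <- tryCatch(
      read.csv(con, header = FALSE, col.names = col_names, nrows = chunk_rows),
      error = function(e) NULL
    )
    if (is.null(block)) break
  }

  # Solve the normal equations (X'X) b = X'y and return a named vector.
  drop(solve(xtx, xty))
}

# Usage: compare against lm() on simulated data small enough to fit in memory.
set.seed(1)
d <- data.frame(x = rnorm(1000))
d$y <- 2 + 3 * d$x + rnorm(1000)
tmp <- tempfile(fileext = ".csv")
write.csv(d, tmp, row.names = FALSE)

coef_chunked <- chunked_ols(tmp, y ~ x, chunk_rows = 100L)
coef_full <- coef(lm(y ~ x, data = d))
```

Only one block is held in memory at any time, so peak memory use is governed by chunk_rows rather than by the size of the file. Factor columns would need their levels fixed in advance so that every block yields the same model matrix columns.
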
It is compatible with Win64, but it is a 32 bit application.  It 
benefits from running on 64 bit Windows (because Windows can get out of 
the way and give it most of 4 GB to work in), but not as much as a true 
64 bit application.  So it probably doesn't solve your problem.
There are builds available for 64 bit Linux and MacOS (and maybe 
others); they'd likely help more than running 32 bit R in Win64.  I 
don't know how they compare to running Revolution's 64 bit R in Win64.

Duncan Murdoch
#
On 2009.11.28 21:50:09, Daniel Nordlund wrote:
It appears that GNU does have a project that has had some success at
compiling 64 bit Windows applications:

http://mingw-w64.sourceforge.net/

Not sure if all of the pieces are there for an R build, though.
#
Jason Morgan wrote:
Last time we tried, it was not sufficient.

Best wishes,
Uwe Ligges
#
On Sun, 29 Nov 2009, Jason Morgan wrote:

            
Or use a commercial Windows compiler.
Well, some interested people have a project to port GCC and binutils: 
as far as I am aware that is not an official GNU project.
You are welcome to show us how to do it (on the R-devel list): several 
people have spent man months attempting this (including submitting 
many patches to that project), and the rw-FAQ did tell you to do so in 
http://cran.r-project.org/bin/windows/base/rw-FAQ.html#How-can-I-compile-R-from-source_003f

  
    
#
On 2009.11.29 14:24:40, Prof Brian Ripley wrote:
Not a chance :)

I got away from Windows 10 years ago for exactly these reasons. I was
just trying to help point a poor guy in the right direction.