
big data file versus RAM

3 messages · Mauricio O Calvao, Stephan Kolassa, David Winsemius

#
Hi there

I am new to R and would like to ask some questions which may not make
perfect sense. Anyhow, here they are:

1) I would very much like to use R to process some big data files
(around 1.7 GB or more) for spatial analysis, wavelets, and power
spectrum estimation; is this possible with R? Within IDL, such a big
data set seems to be tractable...

2) I have heard/read that R "puts all its data in RAM". Does this
really mean my data file cannot be bigger than my RAM?

3) If I have enough RAM, would I be able to process any data set
whatsoever? What constrains the practical size of my data sets?

Thanks!
#
Hi Mauricio,

There are some packages for handling large datasets, e.g., bigmemory.
There were a couple of presentations on various ways to work with large
datasets at the last useR conference - take a look at the presentations at
http://www.statistik.uni-dortmund.de/useR-2008/
You'll probably be most interested in the "High Performance" streams.
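
For instance, here is a minimal sketch of the bigmemory approach
(untested; the file names, separator, and column index are made up for
illustration). read.big.matrix() parses the text file once into a
file-backed big.matrix, so the full 1.7 GB need not sit in ordinary R
memory:

library(bigmemory)

## Parse the text file once into a file-backed matrix on disk;
## afterwards only a small descriptor object lives in RAM.
x <- read.big.matrix("bigdata.txt", sep = " ", type = "double",
                     backingfile = "bigdata.bin",
                     descriptorfile = "bigdata.desc")

dim(x)        ## dimensions, without pulling the data into RAM
mean(x[, 1])  ## individual columns can be brought into RAM piecewise

Later sessions can reattach the same backing file with
attach.big.matrix("bigdata.desc") instead of re-parsing the text file.
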
R's philosophy is basically to keep all data in RAM. Working outside
RAM is not exactly heretical to R, but it does require some additional
effort. As for what constrains you once you have enough RAM: from what
I understand, little to nothing, beyond the time needed for the
computations.
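
If you want a rough sense of what fits, you can do some back-of-the-
envelope accounting from within R itself (a small sketch; the matrix
dimensions are arbitrary):

## Doubles take 8 bytes each, so 1e6 of them is roughly 8 MB.
x <- matrix(rnorm(1e6), nrow = 1000)
print(object.size(x), units = "MB")  ## size of this one object

gc()  ## report (and tidy up) overall memory use

Bear in mind that R frequently makes temporary copies during
computations, so peak memory use can be several times the size of the
largest object.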

HTH,
Stephan
#
On Dec 18, 2008, at 3:07 PM, Stephan Kolassa wrote:

Er, ... it depends. At a minimum, a person considering this should
have read the FAQs. If this is a question about Windows, see R for
Windows FAQ 2.9:

http://cran.r-project.org/bin/windows/base/rw-FAQ.html#There-seems-to-be-a-limit-on-the-memory-it-uses_0021
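
On Windows you can query the cap and raise it, within the limits that
FAQ entry describes (a sketch; the 2500 MB figure is just an example
value):

memory.size()              ## MB currently in use (Windows only)
memory.limit()             ## current cap, in MB (Windows only)
memory.limit(size = 2500)  ## request a higher cap (example value)

## or start R with a larger cap from the command line:
##   Rgui.exe --max-mem-size=2500M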

There has been quite a bit about this on the list over the last couple
of years. Search the archives:
http://search.r-project.org/