
>2GB dataset

3 messages · apollo wong, Brian Ripley, Ernesto Jardim

#
Hi, does anyone have experience with loading datasets
larger than 2GB into R? My organization is a
SAS-oriented shop and I'm in the process of switching
it to R. One of the complaints about R has always been
its inability to handle large datasets (>2GB)
efficiently. I would like some comments from someone
with experience working on >2GB datasets in R.
Thanks.
Apollo
#
Absolutely no problem on 64-bit OSes with enough memory. Many 32-bit OSes
have problems with >2GB files.
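Whether this applies to you depends on whether your R build itself is 64-bit, which you can check from within R (a quick sketch; `.Machine$sizeof.pointer` reports the pointer size of the running build):

```r
## Check whether this R build is 64-bit: a 64-bit build has 8-byte
## pointers and can address objects well beyond the 2GB barrier that
## constrains 32-bit builds.
.Machine$sizeof.pointer  # 8 on a 64-bit build, 4 on 32-bit
```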

Please do read the posting guide and tell us basic facts like which OS you 
are running on, so we don't have to speculate to answer your question.

Also, what do you want to do with the dataset?  This matters crucially.
On Wed, 24 Nov 2004, apollo wong wrote:

3 days later
#
Hi,

I've been using large (multi-gigabyte) datasets, and I've stored them in MySQL
databases and used RMySQL to access them. My feeling is that most of the
time you don't need to keep the whole dataset in your workspace; you need
to access parts of it, or aggregate it in some way, before running an
analysis. So use what is best from each world: databases to store and
perform partial selections and aggregations, and R for statistical
analysis.

You'll be amazed at the speed of these two together (R & MySQL).
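The workflow above can be sketched as follows. This is a minimal illustration, not EJ's actual code: the database name, credentials, table, and column names (`sales`, `region`, `amount`) are all hypothetical placeholders for your own schema.

```r
## Sketch of the database-backed workflow: keep the multi-GB table in
## MySQL, pull only a small aggregate into R. Connection details and
## the schema below are hypothetical examples.
library(DBI)
library(RMySQL)

con <- dbConnect(MySQL(), dbname = "mydb", host = "localhost",
                 user = "me", password = "secret")

## Let MySQL do the heavy lifting: the aggregation runs server-side,
## so only the small summary table travels back into R's workspace.
agg <- dbGetQuery(con,
  "SELECT region, AVG(amount) AS mean_amount, COUNT(*) AS n
   FROM sales GROUP BY region")

## 'agg' is an ordinary data.frame, ready for modelling or plotting.
summary(agg)

dbDisconnect(con)
```

The design point is that the selection/aggregation happens where the data lives; R never has to hold the >2GB table in memory at all.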

Regards

EJ
On Wed, 2004-11-24 at 15:37, apollo wong wrote: