Hi guys,

I am trying to use R to do some canonical analysis on large datasets. One of them is 42MB and can be read by R; the other data file is about 90MB, and this time R cannot read such a big file. My question is how I can deal with such a big dataset in R, or whether there are other statistical software packages that can read a huge data file such as I mentioned.

Thank you for your help!

Jianbing
2 messages · jbwu, Ott Toomet
Hi,

In general, there is no clear upper limit on how big a dataset R can handle. On a 32-bit computer the limit is probably around 2GB, though in practice it can be lower. In which form do you have your data, and how did you try to read it?

If it is an ASCII table, scan() should be much more memory-efficient than read.table(). You may also consider pre-processing your dataset with e.g. perl in order to get rid of much of what you do not need. The best (but not trivial) way is perhaps to read the data into a SQL database and use R to query only the necessary variables from there.

Perhaps you should start with a small subset of your data, try to read it into R, and do some exploratory analysis. Otherwise, SAS is known for its ability to handle large datasets.

Perhaps it helps.

Ott

| From: jbwu <jbwu at pangea.stanford.edu>
| Date: Wed, 04 Dec 2002 13:59:32 -0800
|
| My question is how I can deal with such a big dataset in R, or whether
| there are other statistical software packages that can read a huge data
| file such as I mentioned.
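To illustrate the scan() suggestion: when you pass `what` as a list of column templates, scan() fills one vector per column directly, avoiding the intermediate copies that read.table() makes. A minimal sketch, assuming a hypothetical whitespace-separated file "big.txt" with a header row and three numeric columns:

```r
## "big.txt" is a hypothetical file: one header line, then three
## numeric columns separated by whitespace.
dat <- scan("big.txt",
            what = list(x = numeric(0), y = numeric(0), z = numeric(0)),
            skip = 1)              # skip the header line
dat <- as.data.frame(dat)          # convert to a data frame if needed
```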
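For the exploratory-subset idea, read.table() can be told to stop after a fixed number of rows, so you can inspect the structure of the 90MB file without loading all of it. A sketch, again assuming the hypothetical "big.txt" above; declaring `colClasses` also spares read.table() the cost of guessing column types:

```r
## Read only the first 1000 rows for exploratory analysis.
sub <- read.table("big.txt", header = TRUE, nrows = 1000,
                  colClasses = "numeric")
summary(sub)
```

If the subset behaves well, you can then decide whether scan() or a database is the better route for the full file.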
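The database route could look like the following sketch, assuming the RODBC package, an ODBC data source named "bigdata", and a hypothetical table "measurements" holding the full dataset; only the variables the canonical analysis actually needs are pulled into R:

```r
library(RODBC)                     # assumes RODBC is installed
ch <- odbcConnect("bigdata")       # "bigdata" is a hypothetical DSN
## Fetch only the needed variables, not the whole table.
vars <- sqlQuery(ch, "SELECT x, y, z FROM measurements")
odbcClose(ch)
```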