R-beta: read.table and large datasets

3 messages · Rick White, Douglas Bates, Thomas Lumley

#
I find that read.table cannot handle large datasets. Suppose data is a
40000 x 6 dataset

R -v 100

x <- read.table("data")  gives
Error: memory exhausted
but
x <- as.data.frame(matrix(scan("data"), byrow=TRUE, ncol=6))
works fine.

read.table is less typing, lets me include the variable names in the first
line, and in Splus executes faster. Is there a fix for read.table on the
way?
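
The scan() workaround above can also pick up variable names from the first line. The following is only a sketch, assuming a current R version (where scan() accepts nlines, skip and quiet) and a whitespace-separated, all-numeric file; the temporary file and its contents are made-up stand-ins for "data".

```r
# Write a small stand-in for the poster's file (hypothetical contents).
f <- tempfile()
writeLines(c("a b c", "1 2 3", "4 5 6"), f)

# Read the variable names from the first line only.
nms <- scan(f, what = "", nlines = 1, quiet = TRUE)

# Read the remaining values as one numeric vector and reshape by row.
vals <- scan(f, skip = 1, quiet = TRUE)
x <- as.data.frame(matrix(vals, byrow = TRUE, ncol = length(nms)))
names(x) <- nms
```

This keeps the header convenience of read.table while still reading the body with a single scan() call.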


-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
#
Rick White <rick at stat.ubc.ca> writes:
You probably need to increase -n as well as -v to read in this table.
Try setting 
 gcinfo(TRUE)
to see what is happening with the garbage collector.  Most likely it
is running out of cons cells long before it runs out of heap storage.

I suspect this because I encountered exactly the same situation several
weeks ago and Thomas Lumley pointed this out to me.
#
On Mon, 9 Mar 1998, Rick White wrote:

            
You need to increase the number of cons cells as well as the vector heap
size, e.g.

R -v 40 -n 1000000

to allocate 1000000 cons cells instead of the standard 200000.

To see what sort of memory you are running out of, use gcinfo(T), which
tells R to report the memory status after each garbage collection. 
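
As a rough illustration of the gcinfo() suggestion, here is a minimal sketch assuming a current R version, where the report format may differ from what R printed in 1998. In current versions gc() labels the two kinds of memory "Ncells" (cons cells) and "Vcells" (vector heap).

```r
# Turn on reporting; gcinfo() returns the previous setting of the flag.
old <- gcinfo(TRUE)

# Force a collection so that a memory report is printed right away.
usage <- gc()

# Restore the previous reporting setting.
gcinfo(old)

# The summary has one row per kind of memory.
print(rownames(usage))
```

Running the failing read.table call with reporting switched on should show which of the two rows is close to its limit.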


Thomas Lumley
------------------------
Biostatistics		
Uni of Washington	
Box 357232		
Seattle WA 98195-7232	
------------------------

