I know this is something R isn't meant to do well, but I tried it anyway :)
I have this SPSS data file (31 MB in size). When I converted it to an R
object with read.spss("datafile.sav") I ended up with a .RData file that
was 229 MB. Is this considered normal?
Then I tried to dump that object into a database with the RPgSQL package
function db.write.table(object). (Memory ran out the first time I tried to
convert the SPSS file into an R object, so I was quite prepared for the
database manoeuvre; I increased the size of swap (working with Linux) to
2500 MB.) The process kept going and going and getting bigger and bigger.
After 6 hours and 30 minutes I aborted it. At that point the process had
grown to 1400 MB. Again, is this considered normal? And furthermore, am I
likely to succeed if I'm patient enough?
-perttu-
Thread: Huge memory consumption with foreign and RPgSQL (3 messages: Perttu Muurimäki, Peter Dalgaard, Tim Keitt)
Perttu Muurimäki <Perttu.Muurimaki at Helsinki.Fi> writes:
> I know this is something R isn't meant to do well, but I tried it anyway :)
> I have this SPSS data file (31 MB in size). When I converted it to an R
> object with read.spss("datafile.sav") I ended up with a .RData file that
> was 229 MB. Is this considered normal?
Doesn't sound completely unreasonable: If all your fields fit in a single byte to begin with and get converted to double in the process, you'll have an inflation by a factor of 8.
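To see where that factor of 8 comes from, here is a quick illustration (my
own example, not from the thread): a value that fits in one byte on disk
takes eight bytes once it sits in an R numeric (double) vector.

    n <- 1e6
    object.size(raw(n))      # roughly 1 MB: one byte per element
    object.size(numeric(n))  # roughly 8 MB: eight bytes per element (double)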
> Then I tried to dump that object into a database with the RPgSQL package
> function db.write.table(object). (Memory ran out the first time I tried to
> convert the SPSS file into an R object, so I was quite prepared for the
> database manoeuvre; I increased the size of swap (working with Linux) to
> 2500 MB.) The process kept going and going and getting bigger and bigger.
> After 6 hours and 30 minutes I aborted it. At that point the process had
> grown to 1400 MB. Again, is this considered normal? And furthermore, am I
> likely to succeed if I'm patient enough?
This, however, sounds a bit excessive, although I wouldn't know exactly what goes on inside RPgSQL... If it is converting every field in the entire data frame to string form before sending it to the database, then I might understand. Might it be possible to send it in smaller blocks?
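A rough sketch of the "smaller blocks" idea: split the data frame into row
blocks and push each block separately, so any string conversion only ever
touches a slice at a time. The chunk size, the per-block table names and the
`name` argument to db.write.table() are my own assumptions, not from the
thread -- check ?db.write.table for the actual interface in your RPgSQL
version, and combine the pieces on the PostgreSQL side afterwards.

    ## assumes an open RPgSQL connection; `dat` is the big data frame
    ## (called `object` in the original post)
    chunk  <- 10000                            # rows per block (arbitrary)
    starts <- seq(1, nrow(dat), by = chunk)
    for (i in seq(along = starts)) {
      rows  <- starts[i]:min(starts[i] + chunk - 1, nrow(dat))
      block <- dat[rows, ]
      ## hypothetical: write each block to its own table (big_part_1, ...)
      db.write.table(block, name = paste("big_part", i, sep = "_"))
    }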
Peter Dalgaard
Dept. of Biostatistics, University of Copenhagen
Blegdamsvej 3, 2200 Cph. N, Denmark
Ph: (+45) 35327918, FAX: (+45) 35327907
(p.dalgaard at biostat.ku.dk)
I usually import directly into PostgreSQL first and then read the data using
RPgSQL. In psql, create a table, e.g.,

    create table my_table (col1 int, col2 float, ...)

then format your data as a tab-separated ASCII file, one column per variable.
In psql,

    \copy my_table from 'filename'

or

    copy my_table from 'filename' using delimiters 'delim' with null as 'null string'
    \g

Once the data are in PostgreSQL, fire up R and read the tables with RPgSQL.

Tim
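One way to produce the tab-separated file Tim describes, if the data frame
does fit in R (exporting tab-delimited text straight from SPSS avoids R
entirely): a minimal sketch, where the file name and the \N null marker are
my own choices -- \N is PostgreSQL's default NULL marker for COPY.

    library(foreign)
    dat <- read.spss("datafile.sav", to.data.frame = TRUE)
    write.table(dat, file = "datafile.tab", sep = "\t",
                quote = FALSE, row.names = FALSE, col.names = FALSE,
                na = "\\N")   # writes \N for missing values

After the \copy, reading the table back into R goes through RPgSQL as Tim
says (e.g. db.read.table(), if your version provides it).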
Timothy H. Keitt
Department of Ecology and Evolution
State University of New York at Stony Brook
Phone: 631-632-1101, FAX: 631-632-7626
http://life.bio.sunysb.edu/ee/keitt/