-----Original Message-----
From: Tamas K Papp [mailto:tpapp at princeton.edu]
Sent: Saturday, August 13, 2005 4:03 AM
To: R-help mailing list
Subject: [R] retrieving large columns using RODBC
Hi,
I have a large table in Postgresql (result of an MCMC
simulation, with 1
million rows) and I would like to retrive colums (correspond
to variables)
using RODBC. I have a column called "index" which is used to
order rows.
Unfortunately, sqlQuery can't return all the values from a
column at once
(RODBC complains about lack of memory). So I am using the
following code:
getcolumns <- function(channel, tablename, colnames, totalrows,
ordered=TRUE,chunksize=1e5) {
r <- matrix(double(0),totalrows,length(colnames))
for (i in 1:ceiling(totalrows/chunksize)) {
cat(".")
r[((i-1)*chunksize+1):(i*chunksize)] <- as.matrix(
sqlQuery(channel, paste("SELECT", paste(colnames,collapse=", "),
"FROM", tablename,
"WHERE index <=", i*chunksize,
"AND index >", (i-1)*chunksize,
if (ordered) "ORDER BY index;"
else ";")))
}
cat("\n")
drop(r) # convert to vector if needed
}
to retrieve it in chunks. However, this is very slow --
takes about 15
minutes on my machine. Is there a way to speed it up?
I am running Linux on a powerbook, RODBC version 1.1-4, R 2.1.1. The
machine has only 512 Mb of RAM.
Thanks,
Tamas