R-sig-DB Digest, Vol 68, Issue 7
On Tue, Jun 29, 2010 at 8:19 PM, kMan <kchamberln at gmail.com> wrote:
Dear Jonathan,
I'm trying to cycle through a database, reading a chunk of data from one
table in the DB, processing this data chunk, >and then writing the output to another table in the database. ?I'm having to set the fetch(n=) option because the >database is too large to read the entire thing into memory at once. ?Here's my sample code: [snip] There is no need to open separate connections for reading vs. writing.
While it is true that you can read and write from a single connection, you can only have one result set open at a time and so if you want to iterate through a result set and then write, you do, I believe, need two connections. I suspect that the code you propose will be considerably slower as you are issuing multiple queries along with a WHERE constraint. + seth
Your code was corrupting the connection between R/driver, I think, and I
don't know why. But when I close a connection in R and my OS won't let me
delete the file, there is a different kind of problem. The code below works
on my system. I wish I knew enough to explain wtf happened with your code.
------------------------------------------------------------------------
##
## Prelim/Declarations
##
require(RSQLite)
mysqldb <- "test.sqlite"
m <- dbDriver("SQLite")
tbl.names <- c("TABLEA", "TABLEB")
q1<-paste("SELECT COUNT(*) FROM", tbl.names[1])
##
## Surrogate data (if not allready done)
##
(somedata=data.frame(data1=seq(1:10),data2=seq(1:10)*3))
con<-dbConnect(m, dbname=mysqldb)
dbWriteTable(con,"TABLEA",somedata, overwrite=TRUE)
dbDisconnect(con)
##
## Process TABLEA in parts, write to TABLEB
##
con<-dbConnect(m, dbname=mysqldb)
n.lines<-dbGetQuery(con, q1) #NOTE class is data.frame
chunk.size<-3
start.index<-seq(from = 1, to = n.lines[1,1], by = 3)
limit<-c(diff(start.index), n.lines[1,1]-sum(diff(start.index)))
i<-1
for(i in 1:length(limit)){
?if(i<length(limit)) { q2<-paste("SELECT * FROM TABLEA WHERE row_names >=",
? ? ?start.index[i], "LIMIT", limit[i])
?} else { q2<-paste("SELECT * FROM TABLEA WHERE row_names =",
? ? ?start.index[i], "LIMIT", limit[i])
?}
?dat<-dbGetQuery(con, q2)
?dbWriteTable(con, tbl.names[2], dat, row.names=FALSE, append=TRUE)
}
dbGetQuery(con, "SELECT * FROM TABLEB")
dbDisconnect(con)
------------------------------------------------------------------------
Sincerely,
KeithC.
Seth Falcon | @sfalcon | http://userprimary.net/