
concurrent reading/writing in "chunks" with RSQLite (need some help troubleshooting)

Dear Jonathan,

I do not know of a quick way to query the number of records the first time;
after that, it should be a simpler matter. I suggest looking into "CREATE
INDEX", issued via dbGetQuery(). With clever use of indexes you may work
around needing to create duplicate data in some cases.
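[Keith's suggestion might look like the following sketch. The database, table, and column names are hypothetical, and current RSQLite/DBI prefers dbExecute() for statements, though dbGetQuery() was the idiom at the time.]

```r
library(RSQLite)

con <- dbConnect(SQLite(), ":memory:")   # stand-in for your database file
dbWriteTable(con, "big_table",
             data.frame(key_col = 1:100, val = rnorm(100)))

## Index the column used in WHERE clauses so lookups walk the index
## instead of doing a full table scan. Names here are placeholders.
dbExecute(con, "CREATE INDEX IF NOT EXISTS idx_key ON big_table (key_col)")

hit <- dbGetQuery(con, "SELECT val FROM big_table WHERE key_col = 42")
## dbDisconnect(con) when finished
```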

I have not actually used it for this particular case yet, so I do not have a
working example. 

Sincerely,
KeithC.


-----Original Message-----
From: jgrn307 at gmail.com [mailto:jgrn307 at gmail.com] On Behalf Of Jonathan
Greenberg
Sent: Monday, July 05, 2010 1:37 PM
To: kMan
Cc: Seth Falcon; r-sig-db at stat.math.ethz.ch
Subject: Re: concurrent reading/writing in "chunks" with RSQLite (need some
help troubleshooting)

Keith and Seth (and r-sig-db'ers):

I somewhat understand the difference between your two solutions -- Seth, I'm
not a database person (quickly learning, though), but does your code hold the
output in memory before writing it all at the end (via dbCommit), or is each
iteration's output stored in some "waiting room" (on disk) where it gets
written at the very end?  The key (for my purposes) is to avoid keeping more
than a chunk's worth of input AND output in memory at any one time.
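[For what it's worth, here is a minimal sketch of the chunk-at-a-time pattern under discussion -- not Seth's actual code; the table names and per-chunk computation are made up. Reading through a held-open result set with fetch(n = ...) keeps only one chunk of input in memory, and appending each processed chunk to a table on a second connection flushes the output to disk as you go, so neither side accumulates:]

```r
library(RSQLite)

## Input and output live on separate connections/files so the pending
## read result set never conflicts with the interleaved writes.
con_in  <- dbConnect(SQLite(), ":memory:")  # stand-in for the input file
con_out <- dbConnect(SQLite(), ":memory:")  # stand-in for the output file
dbWriteTable(con_in, "input", data.frame(x = 1:10))

res <- dbSendQuery(con_in, "SELECT x FROM input")
repeat {
  chunk <- fetch(res, n = 4)            # at most 4 input rows in memory
  if (nrow(chunk) == 0) break
  chunk$y <- chunk$x^2                  # placeholder per-chunk computation
  dbWriteTable(con_out, "output", chunk, append = TRUE)  # flushed per chunk
}
dbClearResult(res)
## dbDisconnect(con_in); dbDisconnect(con_out) when finished
```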

Along the lines of Keith's solution, I noticed the COUNT query can be VERY
slow -- is there really no rapid way of getting the number of rows in a
database table?
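[On the row-count question: SQLite does not keep a stored row counter, so SELECT COUNT(*) walks the whole table or its smallest index. A common workaround, sketched below, is MAX(rowid), which reads just one end of the rowid B-tree -- but it equals the row count only when rowids run 1..n with no gaps (no deletions, no explicit rowid inserts), so treat that as an assumption to verify for your table:]

```r
library(RSQLite)

con <- dbConnect(SQLite(), ":memory:")
dbWriteTable(con, "big", data.frame(x = rnorm(1000)))

## Exact but O(n): scans the whole table.
n_exact <- dbGetQuery(con, "SELECT COUNT(*) AS n FROM big")$n

## Cheap: a single B-tree lookup. Only valid while rowids are
## contiguous from 1 (no deletions or explicit rowid inserts).
n_fast <- dbGetQuery(con, "SELECT MAX(rowid) AS n FROM big")$n

## dbDisconnect(con) when finished
```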

Thanks again!

--j
On Tue, Jun 29, 2010 at 9:50 PM, kMan <kchamberln at gmail.com> wrote:

[...] would not be surprised if I missed a goal implied by the code style
used or off-list correspondence.
[...] through the loop required (2) queries, a read and a write, where (I've
assumed) the result sets are handled one at a time anyway. The code I
proposed does the same thing, but overall there is (1) additional query
prior to the loop to get the number of rows. The rest is just case
processing b/c dbGetQuery() seemed to do funny things when LIMIT extended
beyond the end of the table, and it wouldn't remember its last read location
(e.g. didn't work like scan()). In addition, it works with dbWriteTable()
and gets around the need for creating the table first and filling it in
later.
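[A minimal sketch of the LIMIT/OFFSET pattern described above -- the table names and computation are placeholders. Each pass is an independent, complete query, so nothing has to "remember" the last read location, and the final pass simply returns fewer rows (or none) once the offset runs past the end of the table:]

```r
library(RSQLite)

con <- dbConnect(SQLite(), ":memory:")
dbWriteTable(con, "input", data.frame(x = 1:10))

chunk_size <- 3
offset <- 0
repeat {
  chunk <- dbGetQuery(con, sprintf(
    "SELECT x FROM input LIMIT %d OFFSET %d", chunk_size, offset))
  if (nrow(chunk) == 0) break            # offset ran past the end of the table
  chunk$y <- chunk$x * 2                 # placeholder per-chunk computation
  dbWriteTable(con, "output", chunk, append = TRUE)
  offset <- offset + chunk_size
}
## dbDisconnect(con) when finished
```

Note that OFFSET still has to step over the skipped rows internally, so for very large tables a keyset condition (e.g. WHERE rowid > last seen rowid) is cheaper than a growing offset.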
[...] particular record. The fetch() & fetch_n, were those intended to get
the read queries to start at the last read location (e.g. like serial reads
using scan())? Eliminating the need to condition-match or generate a db
index first would speed things up, and I'd very much like to learn that
trick. I'd think it would also make for an excellent R-wiki post.
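[To the fetch() question: with a result set held open via dbSendQuery(), successive fetch(res, n) calls do resume where the previous one stopped, much like serial reads with scan(). A small sketch, with made-up data:]

```r
library(RSQLite)

con <- dbConnect(SQLite(), ":memory:")
dbWriteTable(con, "t", data.frame(x = 1:6))

res <- dbSendQuery(con, "SELECT x FROM t ORDER BY x")
first  <- fetch(res, n = 2)$x   # rows 1 and 2
second <- fetch(res, n = 2)$x   # rows 3 and 4: resumes at the last position
dbClearResult(res)              # running another query first would invalidate res
## dbDisconnect(con) when finished
```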
[...] happened with your code.