Message-ID: <BFD09C33.2573%sdavis2@mail.nih.gov>
Date: 2005-12-22T23:08:51Z
From: Sean Davis
Subject: Reading in large file in pieces
I have a large file (millions of lines) and would like to read it in pieces.
The file is logically separated into little modules, but these modules do
not have a common size, so I have to scan the file to know where they are.
They are independent, so I don't have to read one at the end to interpret
one at the beginning. Is there a way to read one line at a time, parse it
on the fly, and still do so quickly, or do I need to read, say, 100k lines
at a time and then work with those? Only a small piece of each module will
remain in memory after that module has been parsed.
My direct question is: Is there a fast way to parse one line at a time,
looking for breaks between "modules", or am I better off taking large but
manageable chunks from the file and parsing each chunk all at once?
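
To make the line-at-a-time option concrete, here is a minimal sketch in
Python, chosen only for illustration. The "//" separator line, the
function names iter_modules and summarize, and the sample data are all
assumptions; the real file's module boundaries would need their own test.

```python
import io
from typing import Callable, Iterable, Iterator, List


def iter_modules(lines: Iterable[str],
                 is_break: Callable[[str], bool]) -> Iterator[List[str]]:
    """Yield one module (a list of lines) at a time from a stream of lines.

    Only the module currently being assembled is held in memory.
    """
    current: List[str] = []
    for line in lines:
        line = line.rstrip("\n")
        if is_break(line):
            # A separator closes the current module (skip empty ones).
            if current:
                yield current
            current = []
        else:
            current.append(line)
    # The last module may not be followed by a separator.
    if current:
        yield current


def summarize(module: List[str]) -> str:
    """Hypothetical reduction: keep only the first line of each module."""
    return module[0]


# Stand-in for a large file; assumes a line equal to "//" separates modules.
sample = io.StringIO("a1\na2\n//\nb1\n//\nc1\nc2\nc3\n")
kept = [summarize(m) for m in iter_modules(sample, lambda s: s == "//")]
print(kept)
```

For a real file, the same generator could be fed an open file handle
(for example, with open(path) as fh: ...), since Python file objects
yield lines lazily with internal buffering. Whether this is fast enough
compared with reading large chunks would still have to be measured.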
Thanks,
Sean