how to break while loop reading from connection with read.csv
On 21-01-2013, at 16:56, "Collins, Stephen" <Stephen.Collins at allstate.com> wrote:
Hello, I'm trying to read a file a few rows at a time, so as not to read the entire file into memory. From the "connections" and "readLines" help pages and the R-help archive, it seems this should be possible with read.csv and a file connection, using the "nrows" argument and checking whether "nrow()" of the new batch is zero.
From certain posts, it seemed that read.csv should return "character(0)" when the end of file is reached and there are no more rows to read. Instead, I get an error that there are "no lines available in input." Have I made a mistake with the file, or in calling read.csv?
What is the proper way to check the end-of-file condition with read.csv, such that I could break a while loop reading the data in?
# example: make a test file
con <- file("test.csv","wt")
cat("a,b,c\n", "1,2,3\n", "4,5,6\n", "7,6,5\n", "4,3,2\n", "3,2,1\n", file=con, sep="")
close(con)  # close() releases the connection; unlink() is for deleting files
# show the file is valid
con <- file("test.csv","rt")
read.csv(con,header=TRUE)
close(con)
# show that readLines ends with "character(0)", as expected
con <- file("test.csv","rt")
readLines(con,n=10)
readLines(con,n=10)
close(con)
# show that read.csv ends with an error
con <- file("test.csv","rt")
read.csv(con,header=TRUE,nrows=10)   # reads all five data rows
read.csv(con,header=FALSE,nrows=10)  # nothing left to read: this call raises the error
close(con)
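One way to stay with read.csv and "nrows" is to treat the end-of-file error as the stop signal. The following is only a minimal sketch under some assumptions: read_chunk is a hypothetical helper (not part of R), the chunk size of 2 is arbitrary, and the header is assumed to be a simple unquoted comma-separated line. Depending on the R version, read.csv at end of file may either raise the error or return a zero-row data frame when col.names is supplied, so the loop checks for both.

```r
# Recreate the small test file so the sketch is self-contained
writeLines(c("a,b,c", "1,2,3", "4,5,6", "7,6,5", "4,3,2", "3,2,1"), "test.csv")

# Hypothetical helper: return the next chunk, or NULL if read.csv fails.
# Caution: this treats ANY error as end of file, which is crude.
read_chunk <- function(con, n, col_names) {
  tryCatch(
    read.csv(con, header = FALSE, nrows = n, col.names = col_names),
    error = function(e) NULL
  )
}

con <- file("test.csv", "rt")
# Consume the header line once and keep the column names for every chunk
col_names <- strsplit(readLines(con, n = 1), ",")[[1]]

total <- 0
repeat {
  chunk <- read_chunk(con, 2, col_names)
  # Stop on error (NULL) or on an empty batch
  if (is.null(chunk) || nrow(chunk) == 0) break
  total <- total + nrow(chunk)  # "do something" with the chunk here
}
close(con)
```

Swallowing every error this way can hide genuine parsing problems, so in real use it may be worth inspecting conditionMessage(e) before deciding the file is finished.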
How about:
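con <- file("test.csv","rt")
while( length(tmp <- readLines(con,n=10)) > 0 ) {
    # note: header=TRUE takes the first line of each chunk as column names,
    # which is only right for the first chunk (the test file fits in one)
    qq <- read.csv(text=tmp, header=TRUE)
    # do something with qq
}
close(con)  # close() rather than unlink(), which does not close a connection
qq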
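For a file longer than one chunk, the same readLines loop can be adapted so the header is read only once. This is a sketch rather than a tested recipe: the chunk size of 2, the names col_names, chunks and all_rows, and the assumption of simple unquoted column names are all mine.

```r
# Recreate the small test file so the sketch is self-contained
writeLines(c("a,b,c", "1,2,3", "4,5,6", "7,6,5", "4,3,2", "3,2,1"), "test.csv")

con <- file("test.csv", "rt")
# Read the header line once; assumes plain comma-separated, unquoted names
col_names <- strsplit(readLines(con, n = 1), ",")[[1]]

chunks <- list()
# readLines returns character(0) at end of file, which ends the loop
while (length(tmp <- readLines(con, n = 2)) > 0) {
  # Every chunk is pure data now, so header = FALSE with explicit names
  qq <- read.csv(text = tmp, header = FALSE, col.names = col_names)
  chunks[[length(chunks) + 1]] <- qq  # or process qq and discard it
}
close(con)

# Only for checking the result; skip this if the file is too big for memory
all_rows <- do.call(rbind, chunks)
```

Splitting on lines assumes no quoted field contains an embedded newline; if one does, a chunk boundary could fall inside a record.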
Berend