read.dcf with no newline ending: gzfile drops last line
I don't know if this is a bug per se, but it is an undesired behavior
in read.dcf. read.dcf takes a file argument and passes it to gzfile if
it's a character:
    if (is.character(file)) {
        file <- gzfile(file)
        on.exit(close(file))
    }
This gzfile connection is passed to readLines (line #39):
lines <- readLines(file)
If no newline is at the end of the file, readLines on the gzfile
connection doesn't give a warning (I think appropriate behavior). If a
DESCRIPTION file doesn't happen to have a newline at the end of it
(odd, but it may happen), then the last tag is dropped:
x = "Package: test
Type: Package"
######################################
# No Newline in file
######################################
fname = tempfile()
writeLines(x, fname, sep = "")
### readlines with character - warning but all fields
readLines(fname)
[1] "Package: test" "Type: Package"
Warning message:
In readLines(fname) :
  incomplete final line found on '/var/folders/1s/wrtqcpxn685_zk570bnx9_rr0000gr/T//Rtmpz95dsT/file180a65a6b745'
### readlines with file connection - warning but all fields
file_con <- file(fname)
readLines(file_con)
[1] "Package: test" "Type: Package"
Warning message:
In readLines(file_con) :
  incomplete final line found on '/var/folders/1s/wrtqcpxn685_zk570bnx9_rr0000gr/T//Rtmpz95dsT/file180a65a6b745'
### readlines with gzfile connection
## no warning and drops last field
gz_con = gzfile(fname)
readLines(gz_con) # ONLY 1 lines!
[1] "Package: test"
######################################
# Newline at end of file - fine
######################################
### readlines with gzfile connection
## no warning and all fields are read
writeLines(x, fname, sep = "\n")
gz_con = gzfile(fname)
readLines(gz_con)
[1] "Package: test" "Type: Package"

Currently I use file(fname) before read.dcf to be sure a warning
occurs, but all fields are read. I didn't see anything in read.dcf
help about this. readLines states clearly: "If the final line is
incomplete (no final EOL marker) the behaviour depends on whether the
connection is blocking or not", but it's not 100% clear that read.dcf
uses gzfile if the file is not compressed.

Thanks
John
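
For reference, the file(fname) workaround mentioned above can be sketched roughly as follows. This is only a minimal illustration: read_dcf_plain is a hypothetical helper name (it is not part of base R), and whether a warning is emitted for the incomplete final line may depend on the R version.

```r
# Hypothetical helper (not part of base R): hand read.dcf a plain file()
# connection so it does not create its own gzfile() connection, which is
# what it does when given a character file name.
read_dcf_plain <- function(path, ...) {
  con <- file(path)
  on.exit(close(con))   # always release the connection
  read.dcf(con, ...)
}

x <- "Package: test\nType: Package"
fname <- tempfile()
writeLines(x, fname, sep = "")   # no newline at the end of the file

res <- read_dcf_plain(fname)
colnames(res)                    # inspect which fields were read
```

Extra arguments such as fields are simply passed through to read.dcf, so the helper can stand in wherever read.dcf(fname) was called with a file name.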