read.table() with quoted integers

I think this is not the right approach -- quoting is a transport-layer
feature of the CSV format, not part of the application layer. Quotes
should always be stripped from field data before that data is handed
to the application layer. (CSV itself does not _have_ an application
layer; type information is conspicuously absent.)

If quoting is incorrectly treated as a feature of the values rather
than of their encoding, the same problem will come up again with
datetime columns and with every other column type.

So I disagree -- parsing quotes is never the column data-converter's
job, it's read.table's job.
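
To illustrate the layering argued for above, here is a minimal sketch
in Python rather than R. It is not how read.table is implemented; the
names read_rows and convert, and the sample data, are hypothetical.
The CSV reader decodes quoting first, and the type converters only
ever see the decoded field text, so a quoted integer and a quoted date
convert exactly like their unquoted counterparts.

```python
import csv
import io
from datetime import date


def convert(text):
    """Guess a type for one field that has already been unquoted.

    Hypothetical helper: tries int, then float, then an ISO date,
    and falls back to the raw string.
    """
    for parse in (int, float, date.fromisoformat):
        try:
            return parse(text)
        except ValueError:
            pass
    return text


def read_rows(source):
    """Hypothetical two-layer reader: decode the CSV, then convert."""
    # Layer 1 (transport): csv.reader handles delimiters and quotes.
    rows = csv.reader(io.StringIO(source))
    # Layer 2 (application): converters never see a quote character
    # that was part of the encoding.
    return [[convert(field) for field in row] for row in rows]


# First row is fully quoted, second row is not; both should convert
# the same way. The quoted "a,b" keeps its embedded comma.
sample = '"1","2013-10-04","a,b"\n2,2013-10-05,c\n'
table = read_rows(sample)
```

With this split, supporting a new column type means adding one more
converter; nothing about quoting has to be repeated in it.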

Please refer to this specification of CSV:
http://kanspra.org/memberdirectory.csv

particularly this part:
"Fields may always be delimited with double quotes. The delimiters
will always be discarded."

and the implementation note which follows. Other CSV specs, like RFC
4180, contain similar statements. I think the only way to comply with
"always" discarding delimiters is to do it in read.table.

Peter
On Fri, Oct 4, 2013 at 6:58 AM, Milan Bouchet-Valat <nalimilan at club.fr> wrote: