A slight trap in read.table/read.csv.
On 9/03/2010, at 11:17 AM, Mike Prager wrote:
Rolf Turner <r.turner at auckland.ac.nz> wrote:
I solved the problem by putting in a colClasses argument in my call to read.csv(). But I really think that the read functions are being too clever by half here. If field entries are surrounded by quotes, shouldn't they be left as character? Even if they are all F's and T's? Furthermore using F's and T's to represent TRUE's and FALSE's is bad practice anyway. Since FALSE and TRUE are reserved words it would make sense for the read function to assume that a field is logical if it consists entirely of these words. But T's and F's .... I don't think so. I would argue that this behaviour should be changed. I can see no downside to such a change.
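For anyone who has not run into this, here is a minimal sketch of the trap and of the colClasses workaround described above. The data are made up for illustration (a hypothetical "sex" column in which every entry happens to be "F"), and the text= argument is used only so the example needs no file on disk.

```r
# Hypothetical data: a quoted column whose entries are all "F".
csv_text <- 'id,sex\n1,"F"\n2,"F"\n3,"F"\n'

# Default behaviour: the quotes are stripped and the column is then
# type-converted, so all-"F" entries are read as logical FALSE.
default <- read.csv(text = csv_text)
class(default$sex)

# Workaround: state the column classes explicitly, one per column.
fixed <- read.csv(text = csv_text, colClasses = c("integer", "character"))
class(fixed$sex)
```

A single unnamed value is recycled across columns, so colClasses = "character" should suppress the conversion for every column at once, at the cost of converting the numeric columns by hand afterwards.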
I agree with you, Rolf, that this is horrid behavior. It is such automatic devices that have made people hate (e.g.) Microsoft Word with a passion. Yet in R this is a designed-in bug (i.e., feature) that probably can't be changed without breaking some legacy code. But at least T and F could be removed soon as synonyms for TRUE and FALSE. We have seen that "_" was removed as an assignment operator, and the world did not crumble. The use of T and F is no less error-prone, and possibly more.
I would definitely support the removal of the use of T and F for TRUE and FALSE. Some code would break, but it would be easy to trace the source of the problem and easy to fix.
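To illustrate why T and F are error-prone, here is a small sketch (not taken from anyone's real code): T and F are ordinary variables that can be reassigned, whereas TRUE and FALSE are reserved words and cannot.

```r
# T is an ordinary variable bound to TRUE, so it can be silently masked.
T <- 0
masked <- isTRUE(T)   # no longer TRUE once T has been reassigned

# TRUE is a reserved word; trying to assign to it is an error.
assign_err <- tryCatch(
  eval(parse(text = "TRUE <- 0")),
  error = function(e) conditionMessage(e)
)

rm(T)                 # remove the mask; T resolves to base's TRUE again
restored <- isTRUE(T)
```

Any code that writes T where it means TRUE is at the mercy of whatever else has been assigned to T in the calling environment.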
The only immediate solution to this accretion of overly clever behavior would be for someone to write new functions (say, Read.csv) that didn't do all those conversions behind the scenes. I'm not about to do that. Are you?
NFL!!!
cheers,
Rolf