-----Original Message-----
From: r-devel-bounces at r-project.org
[mailto:r-devel-bounces at r-project.org] On Behalf Of Prof Brian Ripley
Sent: Wednesday, May 09, 2007 12:05 PM
To: John Fox
Cc: r-devel at r-project.org
Subject: Re: [Rd] Behaviour of read.table with empty columns
On Wed, 9 May 2007, John Fox wrote:
Dear r-devel list members,
I stumbled across the following behaviour of read.table() recently.
Suppose that I have the data
a " " ""
"" "" ""
in a file or copied to the clipboard, and issue the command
DF <- read.table("clipboard")
DF
  V1 V2 V3
1  a NA NA
2    NA NA
is.na(DF)
        V1   V2   V3
[1,] FALSE TRUE TRUE
[2,] FALSE TRUE TRUE
I was surprised by the NAs. Note that they occur only when a column
consists entirely of empty strings or strings composed of blanks.
On the other hand
data.frame(A=c("", "", ""))
A
1
2
3
works as I would have expected.
How did you expect R to know that "" meant a character
column? You are allowed to quote any type of column, so as
far as read.table is concerned the column is entirely empty
and so its type is unknown. It defaults to the simplest
possible type, logical.
The answer is, I think, to use colClasses="character".
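A minimal sketch of that suggestion, assuming a current version of R in which read.table() accepts a text= argument. The character vector stands in for the file or clipboard contents, and the object names are illustrative only:

```r
# The two data lines from the original post, held in a character vector
# (a stand-in for the file or clipboard; 'lines' is an illustrative name).
lines <- c('a " " ""', '"" "" ""')

# Default behaviour: column types are inferred after reading, so the
# columns that contain only empty or blank strings are not kept as character.
DF_default <- read.table(text = lines)
sapply(DF_default, class)

# Forcing every column to character keeps the empty strings as strings.
DF_char <- read.table(text = lines, colClasses = "character")
sapply(DF_char, class)
anyNA(DF_char)
```

colClasses can also be given as a vector with one entry per column if only some columns need to be forced to character.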
It is probably slightly more accurate to say that if
colClasses is not given, all columns are read as character
columns, and then converted to the simplest possible type.
In earlier versions of R you could get NULL columns (if there
were no rows at all), but now the simplest is logical.
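The conversion step described above can be tried in isolation. This is only a sketch: it assumes a current version of R, where the read.table() help page says the conversion is done with type.convert(), and the vector names are made up for illustration:

```r
# Character vectors as they might look after the initial read,
# before any conversion has taken place.
empty   <- c("", "")    # nothing but empty strings
numbers <- c("1", "2")  # digits only
mixed   <- c("a", "")   # at least one real string

# as.is = TRUE asks for character vectors to stay character
# rather than being turned into factors.
class(type.convert(empty,   as.is = TRUE))
class(type.convert(numbers, as.is = TRUE))
class(type.convert(mixed,   as.is = TRUE))
```

A vector in which every entry is missing has no information to suggest a richer type, which is why it ends up as logical.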
Brian
V1 V2 V3
1 a
2
But, as I said, I found the behaviour of read.table() puzzling.
All this is with R 2.5.0 on a Windows XP Pro SP 2 system.
Comments?
Thanks,
John
--------------------------------
John Fox, Professor
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox