read.table bug in Mac OS X (PR#2469)
George has supplied an example file, and it is CR-terminated. As far as I can see, the error arises when classic MacOS files are used on a foreign OS. I presume the report is about the Darwin port of R (confirmation please), where native files are LF-terminated whereas the example file is CR-terminated.

It's a bit of a wonder that it ever worked, but it was broken in fixing PR#2396. I've added a test example and a fix that covers both this and PR#2396; they will be in R-patched and R-devel shortly.

It does make me wonder about the testing process: do the testers of the Darwin port never use classic MacOS files? How does emacs manage to create CR-terminated files on a Unix-based OS? Or is this a case of using a Carbon MacOS application with Darwin R, and that's rare?
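For anyone who wants to check which terminator a given file uses, something along the following lines should do. It is only a sketch: count_terminators is a made-up helper name, not part of R, and the sample file contents are invented for illustration.

```r
# Hypothetical helper (not part of R): count the line-terminator
# styles found in a file by inspecting its raw bytes.
count_terminators <- function(path) {
  bytes <- readBin(path, what = "raw", n = file.info(path)$size)
  n  <- length(bytes)
  cr <- bytes == as.raw(0x0d)   # carriage return (classic MacOS)
  lf <- bytes == as.raw(0x0a)   # line feed (Unix, Darwin)
  # A CR immediately followed by an LF is one DOS/Windows terminator.
  crlf <- if (n > 1) sum(cr[-n] & lf[-1]) else 0L
  c(CR = sum(cr) - crlf, LF = sum(lf) - crlf, CRLF = crlf)
}

# Illustration on an invented CR-terminated file.
demo <- tempfile(fileext = ".txt")
writeBin(charToRaw("Trt\tDead\rVg\t2\rVg\t5\r"), demo)
print(count_terminators(demo))
```

A non-zero CR count on a file that is to be read under Darwin or Windows points to the situation described above.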
On Fri, 17 Jan 2003 gwgilc@wm.edu wrote:
Full_Name: George W. Gilchrist
Version: 1.6.2
OS: OS X
Submission from: (NULL) (128.239.124.126)
Start with a tab-delimited or comma-delimited text file created on the Mac and
use read.table("filename.txt", header=T) to read it in. When the first column of
the file contains a character vector, and there is a header line, the first
letter of the first column of the fifth row is appended to the start of the
column name and is omitted from the data entry. See the example below. This
appears to have something to do with the way text files are encoded on the Mac.
Text files created in Excel, emacs, Word, and TextEdit on OS X all seem to do
this, even when you copy the text file over to a PC and run R 1.6.2 there under
Windows. If you open the Mac text file in a text editor on the PC and save it
under a different name, the problem goes away. I have tried this with a half
dozen different files.
tmp1<-read.table("deadFly.txt", header=T)
tmp1[1:10,]
   VTrt Dead.X Dead.C Live.X Live.C N.X N.C P.Live.X P.Live.C
1    Vg      2      0      7     10   9  10     0.78    1.000
2    Vg      5      1      5      8  10   9     0.50    0.890
3    Vg      0      0      8     10   8  10     1.00    1.000
4    Vg      0      0      9      9   9   9     1.00    1.000
5     g      1      1      9      7  10   8     0.90    0.875
6    Vg      4      1      6      9  10  10     0.60    0.900
7    Vg      2      1      7      9   9  10     0.78    0.900
8    Vg      0      0      9      8   9   8     1.00    1.000
9    Vg      0      0     10     10  10  10     1.00    1.000
10   Vg      0      0      8      9   8   9     1.00    1.000
tmp2<-read.table("musselJen.txt", header=T)
tmp2[1:10,]
   LLoc  Size ID Bac Sec    N   PC
1    LS 120.0  1   T   1 32.7 92.0
2    LS 120.0  1   T   2 33.3 92.5
3    LS 120.0  1   T   3 39.3 96.9
4    LS 120.0  2   T   1 36.1 94.3
5     S 120.0  2   T   2 38.3 94.5
6    LS 120.0  2   T   3 34.3 94.1
7    LS 120.0  3   T   1 22.1 83.9
8    LS 120.0  3   T   2 25.5 93.1
9    LS 120.0  3   T   3 28.7 94.6
10   LS   4.2  1   T   1 48.5 93.7
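The report notes that re-saving the file in a text editor on the PC makes the problem go away. The same effect can probably be had from within R by rewriting lone CRs as LFs before calling read.table. The following is a sketch only: cr_to_lf is a hypothetical helper, the file contents are invented, and it assumes a version of R in which readBin and writeBin handle raw vectors.

```r
# Hypothetical helper (not part of R): copy 'src' to 'dest', turning
# every lone CR into an LF. CR-LF pairs are left untouched.
cr_to_lf <- function(src, dest) {
  bytes <- readBin(src, what = "raw", n = file.info(src)$size)
  n <- length(bytes)
  is_cr <- bytes == as.raw(0x0d)
  # TRUE where the following byte is an LF (never for the last byte).
  next_is_lf <- if (n > 0) c(bytes[-1] == as.raw(0x0a), FALSE) else logical(0)
  bytes[is_cr & !next_is_lf] <- as.raw(0x0a)
  writeBin(bytes, dest)
  invisible(dest)
}

# Illustration with an invented CR-terminated, tab-delimited file.
src <- tempfile(fileext = ".txt")
writeBin(charToRaw("Trt\tDead\rVg\t2\rVg\t5\r"), src)
dest <- tempfile(fileext = ".txt")
cr_to_lf(src, dest)
dat <- read.table(dest, header = TRUE, sep = "\t")
print(dat)
```

Working on raw bytes keeps the conversion independent of how a particular version of R treats CR when reading text.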
______________________________________________
R-devel@stat.math.ethz.ch mailing list
http://www.stat.math.ethz.ch/mailman/listinfo/r-devel
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595