Summary: [R] read.table on Mac OS X, CARBON vs. DARWIN
Adding that line didn't work for me. I get the same problem as before
(version 1.4.0):
'temp' is a two-line text file with three tab-delimited columns.
UNDER DARWIN:
> read.table('temp')
V1 V2 V3
1 AFFX-BioB-5_at -214 -139
2 AFFX-BioB-M_at -49 -11
> read.table('temp',as.is=TRUE)
stack imbalance in internal type.convert, 26 then 25
stack imbalance in Internal, 25 then 24
stack imbalance in if, 19 then 18
stack imbalance in <-, 17 then 16
stack imbalance in {, 15 then 14
stack imbalance in for, 8 then 7
stack imbalance in {, 6 then 5
V1 V2 V3
1 AFFX-BioB-5_at -214 -139
2 AFFX-BioB-M_at -49 -11
Error: unprotect(): stack imbalance
UNDER CARBON:
> read.table('temp')
V1 V2 V3
1 AFFX-BioB-5_at -214 -139
2 AFFX-BioB-M_at -49 -11
> read.table('temp',as.is=TRUE)
V1 V2 V3
1 AFFX-BioB-5_at -214 -139
2 AFFX-BioB-M_at -49 -11
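
For anyone who wants to repeat the comparison, the following is a minimal sketch of how the test could be set up. The file contents are reconstructed from the output shown above, and the use of tempfile() instead of a file literally named 'temp' is an assumption made so that nothing in the working directory is overwritten.

```r
# Recreate a two-line text file with three tab-delimited columns.
# The values are copied from the read.table output shown above;
# tmp is a hypothetical stand-in for the original file 'temp'.
tmp <- tempfile()
writeLines(c("AFFX-BioB-5_at\t-214\t-139",
             "AFFX-BioB-M_at\t-49\t-11"), tmp)

# The call that produced the stack imbalance messages under Darwin.
d <- read.table(tmp, as.is = TRUE)
print(d)

# With as.is = TRUE the first column should stay character
# rather than being converted to a factor.
str(d)
```

On a build without the problem, this should print the same two rows as the Carbon session, with no stack imbalance messages.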
On Friday, February 22, 2002, at 09:00 X, Meinhard Ploner wrote:
Thanks a lot, James!! The problem is fixed.

On version 1.4.0 for Mac/Darwin (the latest version available for this system), the function read.table (which is also called from read.delim etc.) has the bug you explained. Inserting the line nlines <- nlines+1 after lines <- c(lines, line) removes this bug.

M.

On Friday, February 22, 2002, at 02:33 PM, james.holtman at convergys.com wrote:
If you cannot get the latest 1.4.1, here is a patch (it adds one line to read.table) that will fix it on your current system.
The 'read.table' function appears to be up to 10X slower in R 1.4.0 than in R 1.3.1 for some of the data sets I read in. Comparing the source code for the two versions, I see that it was rewritten in R 1.4.0, and I think I have found what part of the problem might be: a statement appears to be missing from the R 1.4.0 code.

This is the section of code at the beginning of read.table. The loop starting with 'while (nlines < 5)' will read in the entire file, because there is no increment of 'nlines' in the loop. I traced through the code and this is what was happening. It then does a 'pushBack' of the entire file, and from tracing through the code, this is where it appears to be taking the time. With the change noted below, the speed was similar to R 1.3.1 and the results were the same.

Here is the current code with what I think is the additional statement needed:
=================part of read.table========
    nlines <- 0
    lines <- NULL
    while (nlines < 5) {
        line <- readLines(file, 1, ok = TRUE)
        if (length(line) == 0)
            break
        if (blank.lines.skip && length(grep("^[ \\t]*$", line)))
            next
        if (length(comment.char) && nchar(comment.char)) {
            pattern <- paste("^[ \\t]*", substring(comment.char,
                1, 1), sep = "")
            if (length(grep(pattern, line)))
                next
        }
        lines <- c(lines, line)
        #
        # additional line required
        #
        nlines <- nlines+1
    }
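
The effect of the missing increment can be seen in isolation with a small sketch. This is not the read.table source: peek_lines() is a hypothetical helper that keeps only the counting logic of the loop above, and the 1000-line input is made up for illustration.

```r
# peek_lines() is a hypothetical helper, not part of R. It mimics the loop
# above, minus the blank-line and comment handling, and lets the increment
# be switched off to imitate the 1.4.0 behaviour.
peek_lines <- function(con, increment = TRUE, max_lines = 5) {
    nlines <- 0
    lines <- NULL
    while (nlines < max_lines) {
        line <- readLines(con, 1, ok = TRUE)
        if (length(line) == 0)
            break                      # end of input reached
        lines <- c(lines, line)
        if (increment)
            nlines <- nlines + 1       # the statement missing in 1.4.0
    }
    lines
}

input <- paste("row", 1:1000)          # made-up 1000-line input

con <- textConnection(input)
fixed <- peek_lines(con, increment = TRUE)
close(con)

con <- textConnection(input)
buggy <- peek_lines(con, increment = FALSE)
close(con)

length(fixed)   # 5: the loop stops after five lines
length(buggy)   # 1000: the loop only stops at end of input
```

Without the increment, the only exit from the loop is the end-of-input 'break', so every line is read and would then have to be pushed back, which is consistent with the slowdown described above.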
--
Meinhard Ploner <meinhardploner at gmx.net> on 02/22/2002 03:17:34
To: james.holtman at convergys.com
cc:
Subject: Re: [R] read.table on Mac OS X, CARBON vs. DARWIN

Yes. Thanks a lot. I had the 1.4.0 because on Fink the latest version (1.4.1) is not available. However, I will download it from the CRAN.

Meinhard

On Thursday, February 21, 2002, at 10:29 PM, james.holtman at convergys.com wrote:
read.table did have a bug in it in 1.4.0. It was fixed in 1.4.1. Is that what you are running with?
http://www.mcg.edu/research/biostat/bickel.html
David R. Bickel, PhD
Assistant Professor
Medical College of Georgia
Office of Biostatistics and Bioinformatics
1120 Fifteenth St., AE-3037
Augusta, GA 30912-4900
Tel.: 706-721-4697; Fax: 706-721-6294
E-mail: dbickel at mail.mcg.edu or bickel at prueba.info