
Problems reading tab-delim files using read.table and read.delim

3 messages · mails, Jan van der Laan, Gabor Grothendieck

#
Hello,

I used read.xlsx to read in Excel files, but it turned out to be
inefficient for large files.
For that reason I now use a programme which writes each sheet of an Excel
file to a tab-delimited txt file.
I then tried read.table and read.delim to read in those txt files.
Unfortunately, the results are not as expected. To show you what I mean,
I created a tiny Excel sheet with a few rows and columns and read it in
using read.xlsx. I also used my script to write that sheet to a
tab-delimited txt file and read that file in with read.table and
read.delim. Here is the R output:
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  line 1 did not have 5 elements
c1 c2 c3  X
123 213 NA NA NA
234 asd NA NA NA
c1   c2  c3       NA.     NA..1     NA..2
1 123 <NA> 213                <NA>      <NA>
2 234  asd  NA      <NA>                    


The last output is how I would expect the file to be read in. Columns 4 to
6 do not have any header. In R1C4 I added some white space, as well as in
R2C5 and R2C6, and these are read in correctly by the read.xlsx function.

read.table and read.delim seem not to be able to handle such files. Is there
any workaround for that?


Cheers

--
View this message in context: http://r.789695.n4.nabble.com/Problems-reading-tab-delim-files-using-read-table-and-read-delim-tp4369195p4369195.html
Sent from the R help mailing list archive at Nabble.com.
#
I don't know if this completely solves your problem, but here are some
arguments to read.table/read.delim you might try:
row.names=NULL
fill=TRUE
The Details section of ?read.table also suggests using the col.names
argument, as the number of columns is determined from the first 5 lines of
input, which may not be correct.
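
A minimal sketch of how these arguments might be combined. The file
contents are a hypothetical reconstruction of the sheet described in the
question (the original file was not posted), and the names X4 to X6 for the
unnamed columns are made up for illustration. row.names = NULL is used
because it forces plain row numbering, and col.names is supplied so that
the column count does not depend on what the first lines look like.

```r
# Hypothetical tab-delimited file: three named columns followed by up to
# three unnamed columns that contain only white space or nothing at all.
lines <- c(
  "c1\tc2\tc3",
  "123\t\t213\t   ",
  "234\tasd\t\t\t  \t  "
)
path <- tempfile(fileext = ".txt")
writeLines(lines, path)

dat <- read.table(
  path,
  sep = "\t",
  header = FALSE,     # the header line has only 3 names, so skip it ...
  skip = 1,
  col.names = c("c1", "c2", "c3", "X4", "X5", "X6"),  # ... and name all 6
  fill = TRUE,        # pad short lines instead of raising an error
  row.names = NULL,   # plain row numbering
  quote = "",
  comment.char = "",
  stringsAsFactors = FALSE
)

str(dat)
```

Fields that contain only white space may come back as NA when R infers the
column type. Passing colClasses = "character" keeps them as text if that
matters.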

HTH

Jan



mails <mails00000 at gmail.com> wrote:
#
On Wed, Feb 8, 2012 at 7:09 AM, mails <mails00000 at gmail.com> wrote:
Note that this is how read.xls in the gdata package works: it uses a
Perl program to convert the spreadsheet to a text file and then reads
in the text file.