Unusual separators

4 messages · Matt Curcio, Peter Dalgaard, Jim Holtman

#
Hi all,
I have a list that I got from a web page that I would like to crunch.
Unfortunately, the list has some unusual separators in it.  I believe
the columns are separated by one space followed by one tab.  I tried
passing this to read.table( ..., sep=" \t", ...) but got an error that
said something like 'only one-byte separators can be used'.
I have thought about using gsub to swap out the "space + tab" and
replace it with commas, etc., but thought there might be another way.
Any suggestions?
M
#
Just read in the file using the tab as the separator.  If this is a problem because a tab might appear by itself, then use readLines to read in the file, gsub to replace the space/tab pair with a new separator, and writeLines to write the result out to a temporary file, then read in from that temporary file.
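
A minimal sketch of both suggestions, for illustration only. The sample data, the column names, and the choice of a comma as the replacement separator are assumptions, not part of the original file; strip.white = TRUE is added in the first option to trim the space left in front of each tab.

```r
# Hypothetical sample data: two columns separated by a space followed by a tab.
src <- tempfile(fileext = ".txt")
writeLines(c("a \t1", "b \t2"), src)

# Option 1: use the tab alone as the separator. strip.white = TRUE trims the
# leftover trailing space from the character fields.
df_tab <- read.table(src, sep = "\t", strip.white = TRUE,
                     col.names = c("id", "value"), stringsAsFactors = FALSE)

# Option 2: replace each "space + tab" pair with a comma, write the result
# to a temporary file, and read that file back in.
raw   <- readLines(src)
fixed <- gsub(" \t", ",", raw, fixed = TRUE)
tmp   <- tempfile(fileext = ".csv")
writeLines(fixed, tmp)
df_tmp <- read.table(tmp, sep = ",",
                     col.names = c("id", "value"), stringsAsFactors = FALSE)

# Clean up the temporary files.
unlink(c(src, tmp))
```

The comma only works as a replacement if it never occurs inside a field; any single-byte character that is absent from the data would do instead.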

On Aug 16, 2011, at 11:02, Matt Curcio <matt.curcio.ri at gmail.com> wrote:

            
3 days later
#
On Aug 17, 2011, at 05:57 , Jim Holtman wrote:

            
You can skip the write and read back step by reading from a text connection. In R 2.14-to-be, there's a text= argument to read.table (and scan too), so you'll be able to do the whole thing on the fly: 

read.table(.... text=gsub(readLines(.....)....))
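
Spelled out, the two variants might look something like the sketch below. The sample data, the column names, and the comma as the replacement separator are assumptions made for the example; the text= form needs R 2.14.0 or later.

```r
# Hypothetical sample data with "space + tab" between the columns.
src <- tempfile(fileext = ".txt")
writeLines(c("a \t1", "b \t2"), src)

# Replace the two-character separator with a single comma, in memory.
fixed <- gsub(" \t", ",", readLines(src), fixed = TRUE)

# Variant 1: read from a text connection instead of a temporary file.
# textConnection() returns an already open connection, so close it afterwards.
con <- textConnection(fixed)
df_con <- read.table(con, sep = ",", col.names = c("id", "value"),
                     stringsAsFactors = FALSE)
close(con)

# Variant 2 (R >= 2.14.0): pass the character vector through text=.
df_txt <- read.table(text = fixed, sep = ",", col.names = c("id", "value"),
                     stringsAsFactors = FALSE)

unlink(src)
```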

  
    
#
In the current version (2.13.1), textConnection is much slower than a temporary output file if you have a large file (more than 10,000 lines).  Try timing a script using the two different approaches to get an appreciation for the difference.
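
One rough way to run that comparison is sketched below. The 20,000-line test file is made up for the purpose, and the timings will depend on the R version and the machine, so no particular result is implied.

```r
# Hypothetical test data: 20000 lines with "space + tab" between two columns.
n <- 20000
src <- tempfile()
writeLines(paste("row", seq_len(n), " \t", seq_len(n), sep = ""), src)
fixed <- gsub(" \t", ",", readLines(src), fixed = TRUE)

# Approach A: read from a text connection.
time_con <- system.time({
  con <- textConnection(fixed)
  df_con <- read.table(con, sep = ",", stringsAsFactors = FALSE)
  close(con)
})

# Approach B: write to a temporary file and read it back.
time_tmp <- system.time({
  tmp <- tempfile()
  writeLines(fixed, tmp)
  df_tmp <- read.table(tmp, sep = ",", stringsAsFactors = FALSE)
  unlink(tmp)
})

# Show user, system and elapsed times side by side.
print(rbind(textConnection = time_con, tempfile = time_tmp)[, 1:3])
unlink(src)
```

Both approaches should return the same data frame, so any difference is purely in run time.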

On Aug 20, 2011, at 3:39, peter dalgaard <pdalgd at gmail.com> wrote: