Reading word by word in a dataset

Mon, Nov 1, 2004 2:53 PM

Dear Andy & Tony: 

      That's great.  Unfortunately, I still spend most of my life in the 
S-Plus world, and read.table in S-Plus 6.2 does not have the "fill" 
argument.  However, Tony's solution (and my ugly hack) work in both 
S-Plus 6.2 and R 2.0.0. 

      Thanks again. 
      Spencer Graves
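
For readers on a recent R, the "fill" approach mentioned above might look like the sketch below. It assumes a version of R whose read.table() accepts a text= argument (R 2.0.0 does not have it), and it uses an inline string in place of the clipboard. The names txt, tab and first_words are illustrative only.

```r
# Sample data from the thread; the second line has only one field.
txt <- "i1-apple        10$   New_York
i2-banana
i3-strawberry   7$    Japan"

# fill = TRUE pads short rows with blank fields instead of stopping
# with "line 2 did not have 3 elements".
# Reading every column as character avoids any type guessing.
tab <- read.table(text = txt, header = FALSE, fill = TRUE,
                  colClasses = "character")

# Keep the first column and drop everything up to the first "-".
first_words <- sub("^[^-]*-", "", tab$V1)
print(first_words)
```

With data copied to the clipboard, text = txt would be replaced by "clipboard" as in the rest of the thread.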

Tony Plate wrote:

Making it work when not all rows have the same number of fields 
seems like a good place to use the "flush" argument to scan() 
(to skip everything after the first field on each line):

With the following copied to the clipboard:

i1-apple        10$   New_York
i2-banana
i3-strawberry   7$    Japan

do:

scan("clipboard", "", flush=T)

Read 3 items
[1] "i1-apple"      "i2-banana"     "i3-strawberry"

sub("^[A-Za-z0-9]*-", "", scan("clipboard", "", flush=T))

Read 3 items
[1] "apple"      "banana"     "strawberry"

-- Tony Plate
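
To try the same idea without a clipboard, something like the following sketch may help. It assumes a recent R in which scan() accepts a text= argument (not the case in R 2.0.0). The inline string txt stands in for the clipboard contents, and quiet=TRUE only suppresses the "Read 3 items" message.

```r
# Inline copy of the three sample lines; line 2 has a single field.
txt <- "i1-apple        10$   New_York
i2-banana
i3-strawberry   7$    Japan"

# what = "" asks for one character field per record; flush = TRUE then
# discards the rest of each line, so short lines cause no error.
first_fields <- scan(text = txt, what = "", flush = TRUE, quiet = TRUE)

# Remove the alphanumeric prefix and the dash, as in the message above.
words <- sub("^[A-Za-z0-9]*-", "", first_fields)
print(words)
```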

At Monday 01:59 PM 11/1/2004, Spencer Graves wrote:

     Uwe and Andy's solutions are great for many applications but 
won't work if not all rows have the same number of fields.  Consider, 
for example, the following modification of Lee's example:

i1-apple        10$   New_York
i2-banana
i3-strawberry   7$    Japan

     If I copy this to "clipboard" and run Andy's code, I get the 
following:

read.table("clipboard", colClasses=c("character", "NULL", "NULL"))

Error in scan(file = file, what = what, sep = sep, quote = quote, dec 
= dec,  :
   line 2 did not have 3 elements

     We can get around this using "scan", then splitting things apart 
similar to the way Uwe described:

dat <- scan("clipboard", character(0), sep="\n")
Read 3 items

dash <- regexpr("-", dat)
dat2 <- substring(dat, pmax(0, dash)+1)

blank <- regexpr(" ", dat2)
if(any(blank<0))
  blank[blank<0] <- nchar(dat2[blank<0])

substring(dat2, 1, blank)

[1] "apple "      "banana"      "strawberry "

     hope this helps.  spencer graves
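
The trailing blank in "apple " above appears because substring() keeps the blank it stops at. A possible variant that avoids it, using only sub() on the whole lines, is sketched below. The vector dat is typed in by hand here rather than read with scan(), and the sketch has been considered only for R, not for S-Plus.

```r
# The three lines as scan(..., sep = "\n") would return them.
dat <- c("i1-apple        10$   New_York",
         "i2-banana",
         "i3-strawberry   7$    Japan")

# Drop everything from the first blank or tab to the end of the line.
# Lines without any blank are left unchanged.
first_field <- sub("[ \t].*$", "", dat)

# Drop everything up to and including the first dash.
words <- sub("^[^-]*-", "", first_field)
print(words)
```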

Uwe Ligges wrote:

Liaw, Andy wrote:

Using R-2.0.0 on WinXPPro, cut-and-pasting the data you have:

read.table("clipboard", colClasses=c("character", "NULL", "NULL"))

             V1
1      i1-apple
2     i2-banana
3 i3-strawberry

... and if only the words after "-" are of interest, the statement 
can be followed by

 sapply(strsplit(...., "-"), "[", 2)


Uwe Ligges

HTH,
Andy
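
Putting the two steps above together, and writing the result to a new file as the original question asks, could look roughly like this. The vector v1 is typed in by hand to stand for the V1 column shown above, and the output file is a temporary file chosen only for illustration.

```r
# Stand-in for the V1 column returned by read.table().
v1 <- c("i1-apple", "i2-banana", "i3-strawberry")

# Split each entry at "-" and keep the second piece.
words <- sapply(strsplit(v1, "-"), "[", 2)

# Write one word per line to a new file (a temporary file here).
out_file <- tempfile(fileext = ".txt")
writeLines(words, out_file)

# Read the file back to check what was written.
print(readLines(out_file))
```

Note that taking only the second piece would cut off a word that itself contains a dash.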

From: j lee

Hello All,

I'd like to read the first word of each line into a new file.
If I have a data file like the following, how can I get the
first words: apple, banana, strawberry?

i1-apple        10$   New_York
i2-banana       5$    London
i3-strawberry   7$    Japan

Is there any similar question already posted to the
list? I am a bit new to R, having a few months of
experience now.

Cheers,

John

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! 
http://www.R-project.org/posting-guide.html


Spencer Graves, PhD, Senior Development Engineer
O:  (408)938-4420;  mobile:  (408)655-4567
