Reading word by word in a dataset

John

Thu, Nov 4, 2004 5:00 AM

Thanks, Tony.
Your reply gave me a very good idea of how to use "flush" in
scan(), and with it I got my little job done.
My next question is how to extract only the price items in
the 2nd column of my example.
I did it the following way. Is this the right way to do it?
Or is there a smarter or more efficient way?

i1-apple 10$ New_York
i2-banana 5$ London
i3-strawberry 7$ Japan

flush=T)[[2]]
Read 3 records
[1] "10$" "5$"  "7$"
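
The beginning of the call above did not survive in the archive, so the following is only a sketch of one way to pull out the second column. It assumes scan() is given a list for "what", so that each line is read as a three-field record (consistent with the "Read 3 records" message). The text= argument stands in for the clipboard to keep the example self-contained, and the variable names are illustrative.

```r
# Sample data from the post, supplied inline instead of via the clipboard.
lines <- c("i1-apple 10$ New_York",
           "i2-banana 5$ London",
           "i3-strawberry 7$ Japan")

# A list for 'what' makes scan() read each line as a record of three
# character fields; flush = TRUE discards anything after the third field.
fields <- scan(text = lines, what = list("", "", ""),
               flush = TRUE, quiet = TRUE)

# Second field of every record.
prices <- fields[[2]]
print(prices)

# Optional: drop the literal "$" and convert to numbers.
amounts <- as.numeric(sub("$", "", prices, fixed = TRUE))
print(amounts)
```

If the prices are needed as numbers rather than strings, the last step shows one way to strip the currency sign first.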

Cheers,

John

--- Tony Plate <tplate at acm.org> wrote:

Trying to make it work when not all rows have the same numbers of fields
seems like a good place to use the "flush" argument to scan() (to skip
everything after the first field on the line):

With the following copied to the clipboard:

i1-apple        10$   New_York
i2-banana
i3-strawberry   7$    Japan

do:

 > scan("clipboard", "", flush=T)
Read 3 items
[1] "i1-apple"      "i2-banana"     "i3-strawberry"

 > sub("^[A-Za-z0-9]*-", "", scan("clipboard", "", flush=T))
Read 3 items
[1] "apple"      "banana"     "strawberry"
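
To try the flush approach without a clipboard, a minimal sketch can pass the ragged lines to scan() through its text= argument, as available in current R. The variable names here are illustrative and not part of the original posts.

```r
# Ragged input: the second line has only one field.
ragged <- c("i1-apple        10$   New_York",
            "i2-banana",
            "i3-strawberry   7$    Japan")

# what = "" reads character items; flush = TRUE skips the rest of each
# line after the first field, so one item is kept per line.
first <- scan(text = ragged, what = "", flush = TRUE, quiet = TRUE)

# Remove the leading "i1-", "i2-", ... prefix up to and including the dash.
words <- sub("^[A-Za-z0-9]*-", "", first)
print(words)
```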

-- Tony Plate

At Monday 01:59 PM 11/1/2004, Spencer Graves wrote:

     Uwe and Andy's solutions are great for many applications but won't
work if not all rows have the same numbers of fields.  Consider for
example the following modification of Lee's example:

i1-apple        10$   New_York
i2-banana
i3-strawberry   7$    Japan

     If I copy this to "clipboard" and run Andy's code, I get the following:

> read.table("clipboard", colClasses=c("character", "NULL", "NULL"))
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  :
   line 2 did not have 3 elements

     We can get around this using "scan", then splitting things apart
similar to the way Uwe described:

> dat <-
+ scan("clipboard", character(0), sep="\n")
Read 3 items
> dash <- regexpr("-", dat)
> dat2 <- substring(dat, pmax(0, dash)+1)
> blank <- regexpr(" ", dat2)
> if(any(blank<0))
+   blank[blank<0] <- nchar(dat2[blank<0])
> substring(dat2, 1, blank)
[1] "apple "      "banana"      "strawberry "

     hope this helps.  spencer graves
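
The output above keeps a trailing space on "apple " and "strawberry " because substring() runs up to and including the first blank. A minimal sketch of a variant that stops one character earlier, using the same regexpr() idea with the data supplied inline; the names "end" and "first_words" are illustrative.

```r
# Same ragged data as in the post, inline rather than from the clipboard.
dat <- c("i1-apple        10$   New_York",
         "i2-banana",
         "i3-strawberry   7$    Japan")

# Position of the first "-" in each line (-1 if there is none).
dash <- as.integer(regexpr("-", dat))
# Keep everything after the dash; lines without a dash are kept whole.
dat2 <- substring(dat, pmax(0, dash) + 1)

# Position of the first blank (-1 if the line has no blank).
blank <- as.integer(regexpr(" ", dat2))
# End just before the blank, or at the end of the string if no blank.
end <- ifelse(blank < 0, nchar(dat2), blank - 1)

first_words <- substring(dat2, 1, end)
print(first_words)
```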

Uwe Ligges wrote:

Liaw, Andy wrote:

Using R-2.0.0 on WinXPPro, cut-and-pasting the data you have:

> read.table("clipboard", colClasses=c("character", "NULL", "NULL"))
             V1
1      i1-apple
2     i2-banana
3 i3-strawberry



... and if only the words after "-" are of interest, the statement can be
followed by

 sapply(strsplit(...., "-"), "[", 2)
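
As a sketch of how the two steps might fit together, the following reads only the first column and then keeps the part after the dash. It assumes the data are supplied through read.table()'s text= argument instead of the clipboard; "tab" and "words" are illustrative names.

```r
# Sample data from the original question, supplied inline.
lines <- c("i1-apple        10$   New_York",
           "i2-banana       5$    London",
           "i3-strawberry   7$    Japan")

# colClasses = "NULL" drops a column, so only the first column (V1) is kept.
tab <- read.table(text = lines,
                  colClasses = c("character", "NULL", "NULL"))

# Split each entry at "-" and take the second piece of every split.
words <- sapply(strsplit(tab$V1, "-"), "[", 2)
print(words)
```

This works only when every line has all three fields, which is the limitation discussed elsewhere in the thread.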


Uwe Ligges

HTH,
Andy

From: j lee

Hello All,

I'd like to read the first words of the lines in a data file into a new
file. If I have a data file like the following, how can I get the
first words: apple, banana, strawberry?

i1-apple        10$   New_York
i2-banana       5$    London
i3-strawberry   7$    Japan

Has a similar question already been posted to the list? I am a bit new
to R, having a few months of experience now.

Cheers,

John

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! 
http://www.R-project.org/posting-guide.html


--
Spencer Graves, PhD, Senior Development Engineer
O:  (408)938-4420;  mobile:  (408)655-4567

Thread (6 messages)

Liaw, Andy: Reading word by word in a dataset (Nov 1)
Uwe Ligges: Reading word by word in a dataset (Nov 1)
Spencer Graves: Reading word by word in a dataset (Nov 1)
Tony Plate: Reading word by word in a dataset (Nov 1)
Spencer Graves: Reading word by word in a dataset (Nov 1)
John: Reading word by word in a dataset (Nov 4)