Reading word by word in a dataset
Dear Andy & Tony:
That's great. Unfortunately, I still spend most of my life in the
S-Plus world, and read.table in S-Plus 6.2 does not have the "fill"
argument. However, Tony's solution (and my ugly hack) work in both
S-Plus 6.2 and R 2.0.0.
Thanks again.
Spencer Graves
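For readers on the R side, here is a minimal sketch of the "fill" route mentioned above. It assumes a recent version of R, where read.table() accepts a text= argument; the thread itself pasted from the clipboard, and the variable names below are only illustrative. With fill=TRUE, short rows are padded rather than rejected (R decides the number of columns from the first five lines of input).

```r
# Sample lines from the thread, held in a character vector instead of the clipboard
txt <- c("i1-apple 10$ New_York",
         "i2-banana",
         "i3-strawberry 7$ Japan")

# fill = TRUE pads rows that have fewer fields than the widest row seen
tab <- read.table(text = txt, fill = TRUE, colClasses = "character")

# Keep only the first column, then drop everything up to the first dash
first <- tab[[1]]
words <- sub("^[^-]*-", "", first)
print(words)
```

This does not help in S-Plus 6.2, which, as noted above, lacks the "fill" argument.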
Tony Plate wrote:
Trying to make it work when not all rows have the same number of fields seems like a good place to use the "flush" argument to scan() (to skip everything after the first field on the line). With the following copied to the clipboard:
i1-apple 10$ New_York
i2-banana
i3-strawberry 7$ Japan
do:
scan("clipboard", "", flush=T)
Read 3 items
[1] "i1-apple" "i2-banana" "i3-strawberry"
sub("^[A-Za-z0-9]*-", "", scan("clipboard", "", flush=T))
Read 3 items
[1] "apple" "banana" "strawberry"
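The same two steps can be tried without a clipboard. This is a sketch that assumes a recent R, where scan() accepts a text= argument; the names txt, first, and words are made up for the example.

```r
# Sample lines from the thread; the second line has only one field
txt <- c("i1-apple 10$ New_York",
         "i2-banana",
         "i3-strawberry 7$ Japan")

# what = "" asks for one character field per line;
# flush = TRUE discards whatever follows that field on the line
first <- scan(text = txt, what = "", flush = TRUE, quiet = TRUE)

# Strip the leading alphanumeric tag and the dash that follows it
words <- sub("^[A-Za-z0-9]*-", "", first)
print(words)
```

Because flush discards the rest of each line, it does not matter how many fields follow the first one.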
-- Tony Plate
At Monday 01:59 PM 11/1/2004, Spencer Graves wrote:
Uwe's and Andy's solutions are great for many applications but
won't work if not all rows have the same number of fields. Consider
for example the following modification of Lee's example:
i1-apple 10$ New_York
i2-banana
i3-strawberry 7$ Japan
If I copy this to "clipboard" and run Andy's code, I get the
following:
read.table("clipboard", colClasses=c("character", "NULL", "NULL"))
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
  line 2 did not have 3 elements
We can get around this using "scan", then splitting things apart
similar to the way Uwe described:
dat <- scan("clipboard", character(0), sep="\n")
Read 3 items
dash <- regexpr("-", dat)
dat2 <- substring(dat, pmax(0, dash)+1)
blank <- regexpr(" ", dat2)
if(any(blank<0))
  blank[blank<0] <- nchar(dat2[blank<0])
substring(dat2, 1, blank)
[1] "apple " "banana" "strawberry "
Hope this helps.
Spencer Graves
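The result above keeps the blank that follows each word. As an alternative sketch, not the code posted in this message, two sub() calls give the same split without the trailing blank. The vector dat here stands in for what scan(..., sep="\n") returned, and the other names are illustrative.

```r
# Whole lines, as scan(..., sep = "\n") would return them
dat <- c("i1-apple 10$ New_York",
         "i2-banana",
         "i3-strawberry 7$ Japan")

# Drop everything up to and including the first dash
# (a line with no dash is left unchanged)
no_prefix <- sub("^[^-]*-", "", dat)

# Drop the first blank and everything after it
# (a line with no blank is left unchanged)
words <- sub(" .*$", "", no_prefix)
print(words)
```

Lines with a single field need no special case here, since sub() returns its input untouched when the pattern does not match.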
Uwe Ligges wrote:
Liaw, Andy wrote:
Using R-2.0.0 on WinXPPro, cut-and-pasting the data you have:
read.table("clipboard", colClasses=c("character", "NULL", "NULL"))
V1
1 i1-apple
2 i2-banana
3 i3-strawberry
... and if only the words after "-" are of interest, the statement can be followed by
sapply(strsplit(...., "-"), "[", 2)
Uwe Ligges
HTH, Andy
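To make the strsplit() step concrete, here is a small sketch applied to a stand-in for the V1 column shown above. The vector v1 is typed in by hand for illustration rather than read from the clipboard.

```r
# Stand-in for the V1 column that read.table returned
v1 <- c("i1-apple", "i2-banana", "i3-strawberry")

# strsplit() returns a list with one character vector per element of v1;
# "[" with index 2 picks the piece after the first dash from each
pieces <- strsplit(v1, "-")
words <- sapply(pieces, "[", 2)
print(words)
```

Note that a word that itself contains a dash would be cut at that dash, since only the second piece is kept.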
From: j lee
Hello All,
I'd like to read the first words in lines into a new file. If I have a data file like the following, how can I get the first words: apple, banana, strawberry?
i1-apple 10$ New_York
i2-banana 5$ London
i3-strawberry 7$ Japan
Is there any similar question already posted to the list? I am a bit new to R, having a few months of experience now.
Cheers, John
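The question also asks for the words to end up in a new file. A minimal end-to-end sketch follows, assuming temporary files in place of the real data file and output file; infile and outfile are hypothetical names, and scan() with flush is just one of the approaches from this thread.

```r
# Hypothetical input and output files, created in the session's temp directory
infile  <- tempfile(fileext = ".txt")
outfile <- tempfile(fileext = ".txt")
writeLines(c("i1-apple 10$ New_York",
             "i2-banana 5$ London",
             "i3-strawberry 7$ Japan"), infile)

# Read only the first field of each line
first <- scan(infile, what = "", flush = TRUE, quiet = TRUE)

# Remove the tag before the dash, then write one word per line
words <- sub("^[^-]*-", "", first)
writeLines(words, outfile)

print(readLines(outfile))
```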
______________________________________________ R-help at stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
-- Spencer Graves, PhD, Senior Development Engineer O: (408)938-4420; mobile: (408)655-4567