
Reading word by word in a dataset

Uwe's and Andy's solutions are great for many applications but won't 
work unless every row has the same number of fields.  Consider, for 
example, the following modification of Lee's example: 

i1-apple        10$   New_York
i2-banana
i3-strawberry   7$    Japan

      If I copy this to "clipboard" and run Andy's code, I get the 
following: 

 > read.table("clipboard", colClasses=c("character", "NULL", "NULL"))
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  :
    line 2 did not have 3 elements
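
      A possible alternative, not tried in this thread, is the "fill" argument of read.table, which pads short rows with blank fields instead of stopping. The sketch below assumes the three sample lines are typed into a character vector (the names "lines", "tab" and "first_field" are made up for illustration) so it runs without a clipboard:

```r
# Sample data typed in directly; on the original setup this came from "clipboard"
lines <- c("i1-apple        10$   New_York",
           "i2-banana",
           "i3-strawberry   7$    Japan")

# fill = TRUE pads rows that have fewer fields than the longest row
tab <- read.table(text = lines, fill = TRUE, colClasses = "character")

# The first column holds the "i1-apple" style field of every row
first_field <- tab$V1
print(first_field)
```

      This only helps when the short rows are missing fields at the end; a row with a gap in the middle would still be misaligned.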

      We can get around this using "scan", then splitting things apart 
similar to the way Uwe described: 

 > dat <-
+ scan("clipboard", character(0), sep="\n")
Read 3 items
 > dash <- regexpr("-", dat)
 > dat2 <- substring(dat, pmax(0, dash)+1)
 >
 > blank <- regexpr(" ", dat2)
 > if(any(blank<0))
+   blank[blank<0] <- nchar(dat2[blank<0])
 > substring(dat2, 1, blank)
[1] "apple "      "banana"      "strawberry "
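
      Note that the result above keeps the trailing blank in "apple " and "strawberry ", because substring runs through the position of the blank itself. A minimal sketch of one way to drop it, using sub with regular expressions in place of the regexpr/substring steps (the vector "dat" is typed in here instead of being read from the clipboard, and "first_word" is a name chosen for illustration):

```r
# Sample lines, typed in directly rather than scanned from "clipboard"
dat <- c("i1-apple        10$   New_York",
         "i2-banana",
         "i3-strawberry   7$    Japan")

# Drop everything up to and including the first dash;
# a line without a dash is left unchanged
dat2 <- sub("^[^-]*-", "", dat)

# Drop the first whitespace character and everything after it;
# a line without whitespace is left unchanged
first_word <- sub("[[:space:]].*$", "", dat2)
print(first_word)
```

      Because sub leaves non-matching strings alone, the single-field row "i2-banana" needs no special handling.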

      hope this helps.  spencer graves
Uwe Ligges wrote: