"Andy Bunn" <abunn at whrc.org> writes:
> Hi all:
> I have acquired hundreds of data files that I need to preprocess to make
> them usable in R. The files are fixed width (to a point) and contain 1 to 3
> lines of header, followed by a variable number of fixed width data lines
> (that I can read with read.fwf). I want to read through the files and remove
> every _line_ where the characters in columns 83-86 do not equal "STD". If I
> can do that and store the result in a text file, then I can get the data I
> need using read.fwf. I can't figure out how to do this because of the
> irregular header info buried in the files. It seems like the kind of thing
> perl or emacs would be good at, but I'd like to do it all in R if possible.
> Any pointers appreciated.

How large are the files? With today's RAM sizes, it could be feasible
to do something along the lines of
1) x <- readLines(....); i <- read.fwf(...col83-86...)
2) read.fwf(textConnection(x[i %in% "STD"]), ......)
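
To make the two steps concrete, here is a minimal sketch on a made-up file. Everything about the file is an assumption for illustration: the layout (widths 10, 72, 4), the column names, and the helper make_line() are hypothetical, not taken from the real data. It also assumes the four-character field in columns 83-86 holds "STD" padded with a blank, and it uses substr() instead of read.fwf() to pull out the flag, since that skips a parsing pass and copes with short header lines.

```r
# Hypothetical demo file: 2 header lines, then fixed-width data lines whose
# columns 83-86 hold a flag such as "STD".
make_line <- function(id, value, flag) {
  sprintf("%-10s%-72s%-4s", id, value, flag)  # widths 10 + 72 + 4 = 86
}
demo <- c(
  "Header line one",
  "Header line two",
  make_line("A001", "1.5", "STD"),
  make_line("A002", "2.5", "XXX"),
  make_line("A003", "3.5", "STD")
)
infile <- tempfile(fileext = ".txt")
writeLines(demo, infile)

# 1) Read every line as plain text, headers included.
x <- readLines(infile)

# 2) Extract columns 83-86 and keep only the lines flagged "STD".
#    Lines shorter than 83 characters (e.g. headers) give "" and are dropped.
flag <- trimws(substr(x, 83, 86))
keep <- x[flag == "STD"]

# 3) Either save the cleaned lines to a text file ...
outfile <- tempfile(fileext = ".txt")
writeLines(keep, outfile)

# ... or parse them directly from memory.
dat <- read.fwf(textConnection(keep), widths = c(10, 72, 4),
                col.names = c("id", "value", "flag"),
                strip.white = TRUE, stringsAsFactors = FALSE)
```

For many files, the same steps could be wrapped in a function and applied over the vector returned by list.files(), provided each file fits comfortably in memory.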