Reading multiple tables from one .txt doc

4 messages · Mark Fingerle, Bert Gunter, William Dunlap +1 more

#
Dear all,
I have a .txt file which contains multiple tables and I would like to read these tables separately in order to create graphs for each one.
The tables are separated by a blank line, have a variable number of rows and a fixed number of columns, and each has a header preceded by a comment line (#) containing a specific word that identifies the table (see example below). The layout of the .txt data could be changed a bit if needed (add a word, remove the comment, etc.).
I would be extremely grateful if anyone could help me with this daunting task :-)
Example:
# CoordA
Image    X             Y             Z             MeasuredMove MachineMove
vf_36.png            -114.345475       -89.043448         556.073402         0             0
vf_37.png            -111.118715       -89.978534         606.040764         50.080172           50.000000
vf_38.png            -107.911209       -90.901958         656.025557         50.096111           50.000000
vf_39.png            -104.693931       -91.814392         705.982620         50.068868           50.000000
vf_40.png            -101.459549       -92.730113         755.983835         50.114082           50.000000

# CoordB
Image    X             Y             Z             MeasuredMove MachineMove
vf_36.png            -115.345475       -89.043448         556.073402         0             0
vf_37.png            -115.118715       -89.978534         606.040764         50.080172           50.000000
vf_38.png            -134.911209       -90.901958         656.025557         50.096111           50.000000
vf_39.png            -164.693931       -91.814392         705.982620         50.068868           50.000000
vf_40.png            -134.459549       -92.730113         755.983835         50.114082           50.000000

# CoordC
Image    X             Y             Z             MeasuredMove MachineMove
vf_36.png            -168.345475       -89.043448         556.073402         0             0
vf_37.png            -115.118715       -89.978534         606.040764         50.080172           50.000000
vf_38.png            -146.911209       -90.901958         656.025557         50.096111           50.000000
vf_39.png            -187.693931       -91.814392         705.982620         50.068868           50.000000
vf_40.png            -185.459549       -92.730113         755.983835         50.114082           50.000000
#
One approach would be to use ?readLines to read the lines into a
character vector. You could then use indexing to remove all the blank
lines and header lines (those beginning with "Image"). You can then find the
indices of where the separate data blocks begin (they all begin with
"#", right?) and then sequentially read them into a list of data
frames, using ?strsplit to delineate columns and ?as.numeric to
convert the numeric columns. I leave it to you to work out what I
think are the straightforward details.
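
A minimal base R sketch of the steps described above, not necessarily how Bert would write it. It assumes the layout from the original post: six whitespace-separated columns, no spaces inside the Image values, and at least one data row per block. The helper read_block and the file name in the comment are hypothetical names used only for illustration.

```r
# Sample input in the poster's layout (two short tables, values from the example).
lines <- c(
  "# CoordA",
  "Image X Y Z MeasuredMove MachineMove",
  "vf_36.png -114.345475 -89.043448 556.073402 0 0",
  "vf_37.png -111.118715 -89.978534 606.040764 50.080172 50.000000",
  "",
  "# CoordB",
  "Image X Y Z MeasuredMove MachineMove",
  "vf_36.png -115.345475 -89.043448 556.073402 0 0",
  "vf_37.png -115.118715 -89.978534 606.040764 50.080172 50.000000"
)
# With a real file: lines <- readLines("yourFile.txt")  # hypothetical file name

# Drop blank lines and the repeated header lines (those beginning with "Image").
keep <- nzchar(trimws(lines)) & !grepl("^Image\\s", lines)
lines <- lines[keep]

# Each data block begins at a "#" line and runs to the line before the next one.
starts <- grep("^#", lines)
ends <- c(starts[-1] - 1, length(lines))
col_names <- c("Image", "X", "Y", "Z", "MeasuredMove", "MachineMove")

# Hypothetical helper: turn the rows between one "#" line and the next
# into a data frame. Assumes the block has at least one data row.
read_block <- function(start, end) {
  # Split each data row on runs of whitespace; one matrix row per line.
  fields <- do.call(rbind, strsplit(trimws(lines[(start + 1):end]), "\\s+"))
  df <- data.frame(fields, stringsAsFactors = FALSE)
  names(df) <- col_names
  df[-1] <- lapply(df[-1], as.numeric)  # every column except Image is numeric
  df
}

tables <- Map(read_block, starts, ends)
names(tables) <- sub("^#\\s*", "", lines[starts])  # "CoordA", "CoordB"
```

After this, tables$CoordA and tables$CoordB are ordinary data frames. The column names are hard-coded here because the header lines were removed in the first step, as the description suggests.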

Do note, however, that there may be better ways to do this, especially
using some of the text manipulation packages and functions.  Have a
look at the "stringr" package or any others that searching might bring
up (rseek.org is a good search site for R, as is Google of course).
And wait a bit for better suggestions before proceeding, as my brute
force approach should probably be considered a last resort.


Cheers,
Bert




Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, May 3, 2016 at 7:45 AM, Mark Fingerle
<mark.fingerle at brainlab.com> wrote:
#
The following base R code does roughly what Bert suggests.  It does
no checking that the data is in the format you describe.  The
split-by-cumsum trick is a handy idiom.

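# lines <- readLines(yourFile), or, for this example:
lines <- c("#One", "X Y Z", "1 2 3", "4 5 6", "",
           "#Two", "X Y Z", "11 12 13", "",
           "#Three", "X Y Z", "21 22 23")
# cumsum() over the "starts with #" flags gives every line its table number
tables <- split(lines, cumsum(grepl("^#", lines)))
# name each table after its comment line, dropping "#" and any spaces after it
names(tables) <- vapply(tables, function(table) sub("^#\\s*", "", table[1]), "")
# skip=1 drops the comment line; read.table ignores blank lines by default
lapply(tables, function(text) read.table(text=text, header=TRUE, skip=1))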
#$One
#  X Y Z
#1 1 2 3
#2 4 5 6
#
#$Two
#   X  Y  Z
#1 11 12 13
#
#$Three
#   X  Y  Z
#1 21 22 23
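
To make the split-by-cumsum idiom above more concrete, here is a small sketch that breaks it into its intermediate steps on the same example lines. Since the original question asked for one graph per table, it ends with a hypothetical helper, plot_each, that is not part of Bill's code; its default column names are assumptions to be adapted to the real data.

```r
lines <- c("#One", "X Y Z", "1 2 3", "4 5 6", "",
           "#Two", "X Y Z", "11 12 13", "",
           "#Three", "X Y Z", "21 22 23")

is_start <- grepl("^#", lines)  # TRUE only on the comment line that opens a table
group <- cumsum(is_start)       # running count of table starts = table number per line
# group is 1 1 1 1 1 2 2 2 2 3 3 3

tables <- lapply(split(lines, group), function(text)
  read.table(text = text, header = TRUE, skip = 1))
names(tables) <- sub("^#\\s*", "", lines[is_start])

# Hypothetical helper: one scatter plot per table, titled with the table name.
plot_each <- function(tables, x = "X", y = "Z") {
  for (nm in names(tables)) {
    plot(tables[[nm]][[x]], tables[[nm]][[y]], xlab = x, ylab = y, main = nm)
  }
}
# plot_each(tables)  # draws one plot per table on the current graphics device
```

Every line between one "#" line and the next gets the same group number, so split() hands each table's lines to read.table as one chunk.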


Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Tue, May 3, 2016 at 8:41 AM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
#
Mark Fingerle <mark.fingerle at brainlab.com> writes:
This is not a solution if you really want separate graphs of the
individual tables, but you could add column "Coord", roll everything
into a single table and then create one or more facet plots from that.
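
A rough sketch of that idea, assuming the tables have already been read into a named list of data frames (two cut-down tables with values from the example stand in for it here). The helper plot_facets is a hypothetical name, and using lattice for the facet plot is my assumption; the suggestion above does not name a plotting package.

```r
# Two small tables standing in for the parsed result (values from the example).
tables <- list(
  CoordA = data.frame(Image = c("vf_36.png", "vf_37.png"),
                      X = c(-114.345475, -111.118715),
                      Z = c(556.073402, 606.040764)),
  CoordB = data.frame(Image = c("vf_36.png", "vf_37.png"),
                      X = c(-115.345475, -115.118715),
                      Z = c(556.073402, 606.040764))
)

# Tag each table's rows with the table name, then stack into one data frame.
combined <- do.call(rbind, Map(function(df, nm) cbind(Coord = nm, df),
                               tables, names(tables)))
rownames(combined) <- NULL
combined$Coord <- factor(combined$Coord)

# Hypothetical helper: one panel per Coord value (assumes lattice is installed).
plot_facets <- function(d) {
  print(lattice::xyplot(X ~ Z | Coord, data = d))
}
# plot_facets(combined)
```

The Coord column plays the role of the "#" comment line, so a single table can still be recovered with subset(combined, Coord == "CoordA").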

Cheers,

Loris