Way to handle variable length and numbers of columns using read.table(...)

Jim, 

You guessed it.  There are other "problems" with the data.  Here is a closer representation of the data:
Total time and location 
are listed below.

Time Loc1 Loc2
---------------
1 22.33 44.55
2 66.77 88.99
3 222.33344.55
4 66.77 88.99

Avg. Loc1 = 77.88
Avg. Loc2 = 55.66
Final Time = 4

Right now I am using "nrows" in order to only read Time 1-4 & "skip" to skip over the unusable header info, e.g.

read.table('clipboard', header=FALSE, fill=TRUE, skip=5, nrows=4)

Unfortunately, sometimes the number of "Time" rows varies, so I need to also account for that.  
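
One possible way around hard-coding "skip" and "nrows" is to read everything as text and keep only the lines that look like data rows. This is a rough sketch, not a tested solution for the real files: it assumes every data row starts with an integer Time value followed by a space, and that no header or footer line does. The vector "raw" is a stand-in for readLines('clipboard'), filled with the sample shown above.

```r
# Stand-in for: raw <- readLines('clipboard')
raw <- c(
  "Total time and location",
  "are listed below.",
  "",
  "Time Loc1 Loc2",
  "---------------",
  "1 22.33 44.55",
  "2 66.77 88.99",
  "3 222.33344.55",
  "4 66.77 88.99",
  "",
  "Avg. Loc1 = 77.88",
  "Avg. Loc2 = 55.66",
  "Final Time = 4"
)

# Assumption: data rows are exactly the lines that begin with
# one or more digits followed by a space.
data_lines <- grep("^[0-9]+ ", raw, value = TRUE)

# Parse only the selected lines; fill=TRUE pads the short (run-together) row.
dat <- read.table(text = data_lines, header = FALSE, fill = TRUE,
                  stringsAsFactors = FALSE)
```

This only takes care of the varying row count. The run-together value in row 3 still comes through as a character string and has to be handled separately.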

Maybe I need to look into what Gabor suggested as well, i.e. library(gsubfn)

Thanks again for any feedback and advice on this one. The data I receive is out of my control, but I am also working with them to get it fixed.
--- On Mon, 5/4/09, jim holtman <jholtman at gmail.com> wrote:

            
As you can see, the value that has two decimal points is read in as a character string, which causes the whole column to be converted to a factor.  It appears that you have some fixed-length fields that are overflowing.  Now you could read in the data and use regular expressions to parse it; you just have to match on the first part having two decimal places and then extract the rest.  The question is, is this the only "problem" you have in the data?  If so, parsing it is not hard.
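
A minimal sketch of the regular-expression idea described above, assuming every value is printed with exactly two decimal places (true of the sample, but an assumption about the real data). It inserts a space wherever a two-decimal number is immediately followed by another digit, then parses the repaired lines. The column names are taken from the sample header; "lines" is a stand-in for the data rows.

```r
# Stand-in for the data rows, including the run-together row 3.
lines <- c("1 22.33 44.55",
           "2 66.77 88.99",
           "3 222.33344.55",
           "4 66.77 88.99")

# Match digits, a decimal point and exactly two decimals, but only when
# another digit follows directly (the lookahead needs perl = TRUE).
# Assumption: all values have exactly two decimal places.
fixed <- gsub("([0-9]+\\.[0-9]{2})(?=[0-9])", "\\1 ", lines, perl = TRUE)

# Every row now has three whitespace-separated fields.
dat <- read.table(text = fixed, header = FALSE,
                  col.names = c("Time", "Loc1", "Loc2"))
```

Rows that were already well formed are left untouched, since their two-decimal values are followed by a space or the end of the line rather than a digit.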