Marc Schwartz wrote:
On Mon, 2006-10-30 at 19:51 +0100, Gregor Gorjanc wrote:
Hi! I have data (also in attached file) in the following form:

num1 num2 num3 int1 fac1 fac2 cha1 cha2 Date POSIXt
1 1 f q 1900-01-01 1900-01-01 01:01:01
2 1.0 1316666.5 2 a g r z 1900-01-01 01:01:01
3 1.5 1188830.5 3 b h s y 1900-01-01 1900-01-01 01:01:01
4 2.0 1271846.3 4 c i t x 1900-01-01 1900-01-01 01:01:01
5 2.5 829737.4 d j u w 1900-01-01
6 3.0 1240967.3 5 e k v v 1900-01-01 1900-01-01 01:01:01
7 3.5 919684.4 6 f l w u 1900-01-01 1900-01-01 01:01:01
8 4.0 968214.6 7 g m x t 1900-01-01 1900-01-01 01:01:01
9 4.5 1232076.4 8 h n y s 1900-01-01 1900-01-01 01:01:01
10 5.0 1141273.4 9 i o z r 1900-01-01 1900-01-01 01:01:01
5.5 988481.4 10 j q 1900-01-01 1900-01-01 01:01:01

This is a FWF (fixed width format) file. I cannot use read.table here because of the missing values. I have tried the following:
read.fwf(file="test.txt", widths=c(3, 4, 10, 3, 2, 2, 2, 2, 11, 20),
         header=TRUE)
Error in read.table(file = FILE, header = header, sep = sep, as.is = as.is, :
  more columns than column names

I could use:
read.fwf(file="test.txt", widths=c(3, 4, 10, 3, 2, 2, 2, 2, 11, 20),
         header=FALSE, skip=1)
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
1 1 NA NA 1 f q 1900-01-01 1900-01-01 01:01:01
2 2 1.0 1316666.5 2 a g r z 1900-01-01 01:01:01
3 3 1.5 1188830.5 3 b h s y 1900-01-01 1900-01-01 01:01:01
4 4 2.0 1271846.3 4 c i t x 1900-01-01 1900-01-01 01:01:01
5 5 2.5 829737.4 NA d j u w 1900-01-01
6 6 3.0 1240967.3 5 e k v v 1900-01-01 1900-01-01 01:01:01
7 7 3.5 919684.4 6 f l w u 1900-01-01 1900-01-01 01:01:01
8 8 4.0 968214.6 7 g m x t 1900-01-01 1900-01-01 01:01:01
9 9 4.5 1232076.4 8 h n y s 1900-01-01 1900-01-01 01:01:01
10 10 5.0 1141273.4 9 i o z r 1900-01-01 1900-01-01 01:01:01
11 NA 5.5 988481.4 10 j q 1900-01-01 1900-01-01 01:01:01

Does anyone have a clue how to get the above result with a header? Thanks!
The attachment did not come through. Perhaps it was too large?
Argh, my fault as I forgot to attach it :(
Not sure if this is the most efficient way, but how about this:
DF <- read.fwf("test.txt",
               widths = c(3, 4, 10, 3, 2, 2, 2, 2, 11, 20),
               skip = 1, strip.white = TRUE,
               col.names = read.table("test.txt",
                                      nrow = 1, as.is = TRUE)[1, ])
That is a very nice compromise! No need for [1, ], due to nrow=1.
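For anyone who wants to try the two-step idea without the original attachment, here is a minimal self-contained sketch. The sample file, its three columns, and the widths (4, 5, 3) are made up for illustration and are not the data from this thread; colClasses = "character" is an added safeguard so that the header fields always come back as plain strings.

```r
# Made-up sample: a whitespace-separated header line, then fixed-width
# records (widths 4, 5, 3). The blank 'val' field in the second record
# stands for a missing value.
path <- tempfile(fileext = ".txt")
writeLines(c(
  "id val grp",
  sprintf("%-4s%-5s%-3s", c("1", "2", "3"), c("1.5", "", "3.5"), c("a", "b", "c"))
), path)

# Step 1: read only the header line as whitespace-separated character fields.
hdr <- unlist(read.table(path, nrows = 1, colClasses = "character"),
              use.names = FALSE)

# Step 2: skip the header line and read the body by fixed widths,
# supplying the names collected in step 1.
DF <- read.fwf(path, widths = c(4, 5, 3), skip = 1,
               strip.white = TRUE, col.names = hdr)
str(DF)

unlink(path)
```

Passing a plain character vector as col.names keeps the call independent of how a one-row data frame happens to be coerced.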
Of course, with the limited number of columns, you can always just set
colnames(DF) <- c("num1", "num2", "num3", "int1", "fac1",
                  "fac2", "cha1", "cha2", "Date", "POSIXt")
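When there are many columns, or the names are not known in advance, typing them out becomes tedious. One possible alternative is a small wrapper that cuts the header line with the same widths as the data. The sketch below rests on two assumptions: read_fwf_header is a hypothetical helper name (nothing like it exists in base R), and the header fields are laid out with the same widths as the records, which may not hold for every file. The sample data are made up.

```r
# Hypothetical helper (not part of base R): read a fixed-width file whose
# first line holds column names laid out with the same widths as the data.
read_fwf_header <- function(file, widths, ...) {
  stopifnot(all(widths > 0))   # negative (skipped) widths are not handled here
  last  <- cumsum(widths)      # end position of each field
  first <- last - widths + 1   # start position of each field
  header <- readLines(file, n = 1)
  nms <- trimws(substring(header, first, last))
  read.fwf(file, widths = widths, skip = 1, col.names = nms, ...)
}

# Made-up sample file: header and records share the widths 4, 5, 3.
path <- tempfile(fileext = ".txt")
writeLines(c(
  sprintf("%-4s%-5s%-3s", "id", "val", "grp"),
  sprintf("%-4s%-5s%-3s", c("1", "2", "3"), c("1.5", "", "3.5"), c("a", "b", "c"))
), path)

DF2 <- read_fwf_header(path, widths = c(4, 5, 3), strip.white = TRUE)
str(DF2)

unlink(path)
```

Cutting the header by position rather than by whitespace means a column name may itself contain spaces, at the cost of requiring an aligned header line.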
I fully agree here, but I miss having this directly in read.fwf. I hope that someone from R-core is also listening to this ;)

Thank you!

Gregor

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: test.txt
Url: https://stat.ethz.ch/pipermail/r-devel/attachments/20061030/88560f7c/attachment-0004.txt