Read TXT file with variable separation
Jan,

This is what I'm looking for! Thank you very much!
On Tue, Nov 29, 2011 at 9:24 AM, Jan van der Laan <rhelp at eoos.dds.nl> wrote:
Raphael,

This looks like fixed-width format, which you can read with read.fwf. In fixed-width format the columns are not separated by white space (or other characters) but are identified by their position in the file. So in your file, for example, the first field looks to be contained in the first 2 columns of your file (the first 2 characters of every line), the second field in the next five columns, etc.

Regards,
Jan

Quoting Raphael Saldanha <saldanha.plangeo at gmail.com>:
Hi!

I have to import some TXT files into R. The columns are separated by varying numbers of blank spaces, but every file uses the same separation. Example:

31  104 5 0   11RUA                 SAO SEBASTIAO                                                         25  BAIRRO FILETO                                                  01                                                      00200338540000

The pattern is the same in each file. There are two sample files attached to this message. I would like to figure out how to import a single file, and then use some code to import several files (like this: http://www.ats.ucla.edu/stat/r/code/read_multiple.htm).

When I try read.table, I receive this:

cnefe <- read.table("sample1.txt", header=FALSE)
Erro em scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  linha 1 não tinha 17 elementos

(In English: "line 1 did not have 17 elements".)

Information about my session:
sessionInfo()
R version 2.12.1 (2010-12-16)
Platform: i386-pc-mingw32/i386 (32-bit)
locale:
[1] LC_COLLATE=Portuguese_Brazil.1252  LC_CTYPE=Portuguese_Brazil.1252
[3] LC_MONETARY=Portuguese_Brazil.1252 LC_NUMERIC=C
[5] LC_TIME=Portuguese_Brazil.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

--
Regards,
Raphael Saldanha
saldanha.plangeo at gmail.com
Regards,
Raphael Saldanha
saldanha.plangeo at gmail.com
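
For readers of the archive, here is a minimal sketch of the read.fwf approach Jan suggests, extended to several files as asked in the original question. The widths, column names, and sample rows below are hypothetical, chosen only to make the example runnable; the real values have to come from the layout of the actual files. read_one is a helper name introduced here for illustration and does not appear in the thread.

```r
# Hypothetical layout (an assumption, not the real layout of the attached files):
# 2 characters, then 5, then 20.
widths    <- c(2, 5, 20)
col_names <- c("uf", "code", "street")

# Write two small fixed-width files to a temporary directory so the
# sketch is self-contained.
sample_lines <- c(
  sprintf("%-2s%5s%-20s", "31", "104", "RUA SAO SEBASTIAO"),
  sprintf("%-2s%5s%-20s", "31", "25",  "BAIRRO FILETO")
)
tmp_dir <- tempfile("fwf_")
dir.create(tmp_dir)
writeLines(sample_lines, file.path(tmp_dir, "sample1.txt"))
writeLines(sample_lines, file.path(tmp_dir, "sample2.txt"))

# Read one file: fields are cut by character position, not by a separator.
# colClasses and strip.white are passed on to read.table.
read_one <- function(path) {
  read.fwf(path, widths = widths, col.names = col_names,
           colClasses = c("character", "integer", "character"),
           strip.white = TRUE)
}

# A single file
cnefe <- read_one(file.path(tmp_dir, "sample1.txt"))

# Several files with the same layout, stacked into one data frame
files     <- list.files(tmp_dir, pattern = "\\.txt$", full.names = TRUE)
all_files <- do.call(rbind, lapply(files, read_one))
```

Reading the first field as character keeps any leading zeros, which matters if the codes are identifiers rather than quantities. A negative entry in widths tells read.fwf to skip that many characters, which can be handy for padding columns that are not needed.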