Message-ID: <CACSWb426omYkN11LCeAU4ck_u2QeNJP7c37fDTHE8kV+5k2eZQ@mail.gmail.com>
Date: 2011-11-29T12:43:38Z
From: Raphael Saldanha
Subject: Read TXT file with variable separation
In-Reply-To: <20111129122425.Horde.tMcTJsLqK_BO1MDpbPyRpaA@webmailnew.dds.nl>

Jan,

This is what I'm looking for! Thank you very much!
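
For the archive, here is a minimal sketch of the read.fwf approach Jan describes. The sample data, the widths and the column names are placeholders made up for illustration, not the real layout of my files; the actual widths have to be taken from the file's documentation. The helper read_one() is likewise just a name chosen for this example.

```r
# Build a small fixed-width file (invented data, hypothetical layout:
# 2-character state code, 5-character number, 10-character street type).
sample_file <- tempfile(fileext = ".txt")
writeLines(sprintf("%2s%5d%-10s",
                   c("31", "31"),
                   c(104L, 205L),
                   c("RUA", "AVENIDA")),
           sample_file)

# Hypothetical widths and names; replace with the real layout.
widths <- c(2, 5, 10)
col_names <- c("uf", "number", "street_type")

# Read one fixed-width file; fields are cut by character position.
read_one <- function(path) {
  read.fwf(path, widths = widths, col.names = col_names,
           strip.white = TRUE, stringsAsFactors = FALSE)
}

cnefe <- read_one(sample_file)

# Several files with the same layout: read each and stack the rows.
files <- c(sample_file, sample_file)  # e.g. list.files(pattern = "\\.txt$")
cnefe_all <- do.call(rbind, lapply(files, read_one))
```

Because every file uses the same layout, the widths only need to be defined once and the same function can be applied to each file.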

On Tue, Nov 29, 2011 at 9:24 AM, Jan van der Laan <rhelp at eoos.dds.nl> wrote:
>
> Raphael,
>
> This looks like fixed width format which you can read with read.fwf.
>
> In fixed width format the columns are not separated by white space (or other
> characters), but are identified by their position in the file. So in your
> file, for example, the first field looks to be contained in the first 2 columns
> of your file (the first 2 characters of every line), the second field in the
> next five columns, etc.
>
> Regards,
> Jan
>
>
> Quoting Raphael Saldanha <saldanha.plangeo at gmail.com>:
>
>> Hi!
>>
>> I have to import some TXT files into R. The columns are separated by
>> varying numbers of blank spaces, but every file uses the same
>> layout. Example:
>>
>> 31 ?104 5 0 ? 11RUA ? ? ? ? ? ? ? ? SAO
>> SEBASTIAO ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? 25
>>
>>
>>
>> ?BAIRRO FILETO
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?01
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?00200338540000
>>
>> The pattern is the same on each file.
>>
>> There are two sample files attached to this message.
>>
>> I would like to figure out how to import a single file, and then use
>> some code to import several files (like this
>> http://www.ats.ucla.edu/stat/r/code/read_multiple.htm)
>>
>> When I try read.table, I receive this:
>>
>> cnefe <- read.table("sample1.txt", header=FALSE)
>> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
>>   line 1 did not have 17 elements
>>
>>
>> Information about my session:
>>
>> > sessionInfo()
>> R version 2.12.1 (2010-12-16)
>> Platform: i386-pc-mingw32/i386 (32-bit)
>>
>> locale:
>> [1] LC_COLLATE=Portuguese_Brazil.1252  LC_CTYPE=Portuguese_Brazil.1252
>> [3] LC_MONETARY=Portuguese_Brazil.1252 LC_NUMERIC=C
>> [5] LC_TIME=Portuguese_Brazil.1252
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>> --
>> Regards,
>>
>> Raphael Saldanha
>> saldanha.plangeo at gmail.com
>
>
>
>



-- 
Regards,

Raphael Saldanha
saldanha.plangeo at gmail.com