Hi!
I have to import some TXT files into R. The columns are separated by
varying numbers of blank spaces, but every file uses the same
layout. Example:
31 104 5 0 11RUA SAO
SEBASTIAO 25
BAIRRO FILETO
01
00200338540000
The pattern is the same in each file.
Two sample files are attached to this message.
I would like to figure out how to import a single file, and then use
some code to import several files (like this:
http://www.ats.ucla.edu/stat/r/code/read_multiple.htm).
When I try read.table, I get this:
cnefe <- read.table("sample1.txt", header=FALSE)
Erro em scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
  linha 1 não tinha 17 elementos
(in English: "line 1 did not have 17 elements")
Information about my session:
> sessionInfo()
R version 2.12.1 (2010-12-16)
Platform: i386-pc-mingw32/i386 (32-bit)
locale:
[1] LC_COLLATE=Portuguese_Brazil.1252  LC_CTYPE=Portuguese_Brazil.1252
[3] LC_MONETARY=Portuguese_Brazil.1252 LC_NUMERIC=C
[5] LC_TIME=Portuguese_Brazil.1252
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base
--
Sincerely,
Raphael Saldanha
saldanha.plangeo at gmail.com
-------------- next part --------------
31 104 5 0 11AVENIDA JOSE ESTEVES
BORGES 375
BAIRRO FILETO
06OLARIA 1
00100238540000
31 104 5 0 11AVENIDA JOSE ESTEVES
BORGES 241
BAIRRO FILETO
04CRECHE ENSINO LETRAMENTO INFANIL 1
00100238540000
31 104 5 0 11AVENIDA JOSE ESTEVES
BORGES 225
BAIRRO FILETO
01
00100238540000
31 104 5 0 11AVENIDA JOSE ESTEVES
BORGES 219
BAIRRO FILETO
01
00100238540000
31 104 5 0 11RUA EZEQUIEL C DOS
SANTOS 10
BAIRRO FILETO
01
00100538540000
31 104 5 0 11RUA JOSE TOMAZ DA
CUNHA 20
BAIRRO FILETO
01
00100638540000
31 104 5 0 11RUA JOSE TOMAZ DA
CUNHA 26
BAIRRO FILETO
01
00100638540000
31 104 5 0 11RUA JOSE TOMAZ DA
CUNHA 30
BAIRRO FILETO
01
00100638540000
31 104 5 0 11AVENIDA JOSE ESTEVES
BORGES 107
BAIRRO FILETO
01
00100738540000
31 104 5 0 11AVENIDA JOSE ESTEVES
BORGES 97
BAIRRO FILETO
01
00100738540000
31 104 5 0 11AVENIDA JOSE ESTEVES
BORGES 77
BAIRRO FILETO
01
00100738540000
31 104 5 0 11AVENIDA JOSE ESTEVES
BORGES 67
BAIRRO FILETO
01
00100738540000
31 104 5 0 11RUA LAFAETE
ESTEVES CRUVINEL 8
BAIRRO FILETO
01
00100938540000
31 104 5 0 11RUA LAFAETE
ESTEVES CRUVINEL 10
BAIRRO FILETO
01
00100938540000
31 104 5 0 11AVENIDA JOSE ESTEVES
BORGES 45
BAIRRO FILETO
01
00101038540000
31 104 5 0 11AVENIDA JOSE ESTEVES
BORGES 35
BAIRRO FILETO
01
00101038540000
31 104 5 0 11AVENIDA JOSE ESTEVES
BORGES 25
BAIRRO FILETO
01
00101038540000
31 104 5 0 11AVENIDA JOSE ESTEVES
BORGES 15
BAIRRO FILETO
01
00101038540000
31 104 5 0 11AVENIDA JOSE ESTEVES
BORGES 0SN
BAIRRO FILETO
01
00101038540000
31 104 5 0 11AVENIDA SANTOS
368
CENTRO
01
00200238540000
31 104 5 0 11AVENIDA SANTOS
324
CENTRO
01
00200238540000
31 104 5 0 11AVENIDA SANTOS
220
CENTRO
01
00200238540000
31 104 5 0 11AVENIDA SANTOS
200
CENTRO
01
00200238540000
31 104 5 0 11AVENIDA SANTOS
190
CENTRO
06OFICINA MECANICA 1
00200238540000
31 104 5 0 11AVENIDA SANTOS
190 APARTAMENTO
CENTRO
06LOJA DE PECAS AUTOMOTIVAS 2
00200238540000
31 104 5 0 11AVENIDA SANTOS
190 APARTAMENTO
CENTRO
01
00200238540000
31 104 5 0 11AVENIDA SANTOS
208
CENTRO
06OFICINA MECANICA 1
00200238540000
31 104 5 0 11AVENIDA SANTOS
0SN
CENTRO ULTIMO BARRACAO
FACE1 06BARRACAO DEPOSITO
1 00200238540000
-------------- next part --------------
12 435 5 0 11RUA DOS MILITAR
18FNS
CIDADE NOVA
01
00202469955000
12 435 5 0 11RUA DOS MILITAR
14FNS
CIDADE NOVA
01
00202469955000
12 435 5 0 11TRAVESSA 1
0SN
CENTRO
06PSTO DE GASOLINA 1
00202669955000
12 435 5 0 11RUA CORONEL JOSE FERREIRA
0SN
CIDADE NOVA
02 PELOTAO DO EXERCITO
00202269955000
12 435 5 0 11RUA CORONEL JOSE FERREIRA
3FNS
CIDADE NOVA
06LOJA MOVES E MATERIAL DE ONSRUCAO 1
00202269955000
Raphael,
This looks like fixed-width format, which you can read with read.fwf. In fixed-width format the columns are not separated by white space (or other characters) but are identified by their position in the line. In your file, for example, the first field appears to be contained in the first 2 columns (the first 2 characters of every line), the second field in the next five columns, and so on.
Regards,
Jan
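A minimal sketch of the read.fwf approach, covering both a single file and a whole directory of files. The column widths and column names below are hypothetical and only illustrate the mechanics; the spacing of the attached samples was lost in the archive, so the real widths have to come from the file's layout documentation or from counting characters in an untouched copy. The script writes its own small fixed-width files to a temporary directory so that it runs on its own.

```r
# ASSUMPTION: these widths and names are made up for illustration.
# Replace them with the real field widths of your files.
widths    <- c(2, 5, 2, 30, 6)
col_names <- c("uf", "municipio", "distrito", "logradouro", "numero")

# Build two tiny fixed-width files matching the hypothetical layout.
fmt <- "%2d%5d%2d%-30s%6d"
rows <- c(
  sprintf(fmt, 31L, 104L, 5L, "RUA SAO SEBASTIAO", 25L),
  sprintf(fmt, 31L, 104L, 5L, "AVENIDA SANTOS", 368L)
)
tmp_dir <- tempfile("cnefe_")
dir.create(tmp_dir)
writeLines(rows,    file.path(tmp_dir, "sample1.txt"))
writeLines(rows[2], file.path(tmp_dir, "sample2.txt"))

# Read one file: fields are cut by character position, not by separator.
read_one <- function(path) {
  read.fwf(path, widths = widths, col.names = col_names,
           header = FALSE, strip.white = TRUE, as.is = TRUE)
}

cnefe <- read_one(file.path(tmp_dir, "sample1.txt"))
str(cnefe)

# Read every .txt file in the directory and stack the results.
files    <- list.files(tmp_dir, pattern = "\\.txt$", full.names = TRUE)
all_rows <- do.call(rbind, lapply(files, read_one))
print(all_rows)
```

Here strip.white = TRUE trims the padding around character fields, and as.is = TRUE keeps text columns as character vectors instead of factors. If some fields hold codes with leading zeros (such as the long numeric strings at the end of each record), passing colClasses = "character" for those columns should keep the zeros intact.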