On 10/09/2010 10:03 AM, Marcelo Estácio wrote:
>
> Dear,
>
> When I try to execute the following command, R doesn't read all the lines
> (it reads only 57658 lines when the file has 814125 lines):
>
>
>
> dados2 <- read.table(
>     "C:\\Documents and Settings\\mgoncalves\\Desktop\\Tábua IFPD\\200701_02_03_04\\SegurosClube.txt",
>     header=FALSE, sep="^",
>     colClasses=c("character","character","NULL",NA,"NULL","NULL","NULL",
>                  "character","character","NULL","NULL","NULL","NULL",NA,
>                  "NULL","NULL","NULL","NULL",NA,"NULL","NULL"),
>     quote="", comment.char="", skip=1, fill=TRUE)
>
> If I exclude "fill=TRUE", R gives the message
>
>
>
> Warning message:
> In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
> número de itens não é múltiplo do número de colunas (number of items is
> not a multiple of the number of columns)
>
>
>
> I identified that the problem is the following line of my data (line 57659
> of my file):
>
>
>
>
> 13850074571^01/01/1940^00000000000^93101104^^^1^01/05/2006^30/06/2006^13479^13479^13479^0^0^0^0^^66214-Previdência privada fechada^MARIA^DA CONCEI`O FERREIRA LOBATO^CORPORATE
>
>
> As you can see, my data contain a "square" character like this: (I don't
> know if you can see the character, but it looks like a white square).
> It looks as though R interprets this character as the end of the file.
>
> I opened my data in Notepad and copied the character. When I paste this
> character into R, it tries to close, asking whether I want to save my work.
> What is happening?

That symbol is how some systems display the hex 1A character (Ctrl-Z),
which marked the end of a text file in DOS. Judging by the pathname, you
are working on Windows, which has inherited that behaviour.

The best way around it is to correct those bad characters: they are
almost certainly errors in the data file. If you want to keep them, you
could try reading the file in binary mode rather than text mode. You do
this using

    con <- file("filename", open="rb")
    read.table(con, header=FALSE, ...)
    close(con)
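
As a self-contained illustration of the binary-mode idea, the sketch below
builds a small throwaway file containing a 1A byte, reads its lines through a
connection opened with open="rb", and hands them to read.table via
textConnection. The file contents and the names tmp, linhas and dados are made
up for the example, and going through readLines is only a variation to fall
back on in case read.table does not accept a binary-mode connection directly
in your version of R.

```r
# Hypothetical test data: three '^'-separated lines, with a 0x1A byte
# embedded in the second line (similar to line 57659 of the real file).
tmp <- tempfile(fileext = ".txt")
bytes <- c(charToRaw("a^b\nc"), as.raw(0x1a), charToRaw("d^e\nf^g\n"))
writeBin(bytes, tmp)

# Binary mode: the 0x1A byte is passed through instead of ending the read.
con <- file(tmp, open = "rb")
linhas <- readLines(con)
close(con)

# Parse the lines that were read; the 0x1A byte stays inside its field.
tc <- textConnection(linhas)
dados <- read.table(tc, header = FALSE, sep = "^", quote = "",
                    comment.char = "", colClasses = "character")
close(tc)

nrow(dados)  # number of lines parsed
```

With the real file you would replace tmp by its path and keep your own
colClasses, skip and fill arguments. If the file is not in your session's
native encoding, the accented characters may need extra care.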

You could also try reading it on a different OS; I don't think Linux
treats 1A characters specially.
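
If you go the route of correcting the bad characters, one option is to remove
the 1A bytes at the byte level and write a cleaned copy before calling
read.table. This is a minimal sketch, assuming the file fits in memory and
that simply deleting the bytes is acceptable (you may prefer to replace them
with the character that was intended). The helper strip_sub and the file names
are hypothetical, not part of R.

```r
# strip_sub() is a hypothetical helper: it copies a file byte for byte,
# dropping every 0x1A byte along the way.
strip_sub <- function(infile, outfile) {
  bytes <- readBin(infile, what = "raw", n = file.info(infile)$size)
  bad <- bytes == as.raw(0x1a)
  writeBin(bytes[!bad], outfile)
  sum(bad)  # number of bytes removed
}

# Demonstration on a throwaway file (contents invented for the example).
dirty <- tempfile(fileext = ".txt")
clean <- tempfile(fileext = ".txt")
writeBin(c(charToRaw("a^b\nc"), as.raw(0x1a), charToRaw("d^e\nf^g\n")), dirty)

removed <- strip_sub(dirty, clean)

# The cleaned copy can be read in ordinary text mode.
dados <- read.table(clean, header = FALSE, sep = "^", quote = "",
                    comment.char = "", colClasses = "character")
```

The returned count tells you how many bad bytes the file contained, which is
worth checking before you decide whether deleting them is the right repair.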

Duncan Murdoch

>
>
> Thanks very much.
>
> Marcelo Estácio