On Jun 30, 2010, at 12:21 AM, ARRRRRR wrote:
>
>
> http://r.789695.n4.nabble.com/file/n2272669/FT20100626_%2420_%2B_%242_Sit_%26_Go_-_%28169112900%29_-_Summary.txt
> FT20100626_%2420_%2B_%242_Sit_%26_Go_-_%28169112900%29_-_Summary.txt
>
> I have a lot of experience with Stata, but I'm new to R. I'm trying
> to read the attached file into R on my Mac. My goal is to have it as
> a list, with each element a string - from there I can parse out the
> data I need and add it as an observation in a data frame.
>
> I've tried scan, readLines, etc. but I'm stumped. I've been adding
> encoding="UTF-16", but that doesn't seem to help much.
> The closest I've come is:
>
> test <- scan(file="FT20100626 $20 + $2 Sit & Go - (169112900) - Summary.txt",
>              what=list(""), flush=FALSE, skip=0, encoding="UTF-16",
>              quote="\n")
>
> which gives me a list wherein each element is the first letter of a row.
>
>> test
> [[1]]
>  [1] "\xff\xfeF" "T" "P" "T" "S" "$" "+" "$" "S"
I believe you are being bitten by an encoding issue, one that is
described in this section of the help page ?connections:

"The encoding "UCS-2LE" is treated specially, as it is the appropriate
value for Windows 'Unicode' text files. If the first two bytes are the
Byte Order Mark 0xFFFE then these are removed as most implementations
of iconv do not accept BOMs. Note that some implementations will
handle BOMs using encoding "UCS-2" but many will not."

Notice that your first two entries are \xff\xfe, which I believe is a
representation of 0xFFFE. When you look at that page with Firefox and
request encoding information, you are given UTF-16. I am not
sufficiently educated on encoding issues, even though we share
platforms. I tried a few different encoding specifications, including
"UTF-16", "UCS-2" and "UCS-2LE", with scan and readLines, but failed to
work through to a solution. Another possibility might be to subscribe
to the R-SIG-Mac mailing list and post the question there.
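
For what it is worth, here is a minimal sketch of one thing that may be
worth trying. It rests on two points from the documentation: the
encoding= argument of scan and readLines only marks the strings that
are read and does not re-encode the input, whereas an encoding given
to file() is applied while reading. The sketch assumes the file really
is little-endian UTF-16 with a byte order mark. The sample lines, the
temporary file and the variable names are invented for illustration
and are not taken from the attachment.

```r
# Build a small UTF-16LE file with a BOM so the sketch is self-contained.
# These lines are made up; they are not the contents of the attachment.
sample_lines <- c("Tournament Summary", "Buy-In: $20 + $2", "1: SomePlayer")
txt  <- paste0(paste(sample_lines, collapse = "\r\n"), "\r\n")
body <- iconv(txt, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]]
path <- tempfile(fileext = ".txt")
writeBin(c(as.raw(c(0xff, 0xfe)), body), path)

# Look at the first two bytes: ff fe suggests little-endian UTF-16.
first_bytes <- readBin(path, what = "raw", n = 2)
print(first_bytes)

# Put the encoding on the connection so the input is re-encoded as it
# is read; ?connections says the BOM is dropped for "UCS-2LE".
con   <- file(path, open = "r", encoding = "UCS-2LE")
lines <- readLines(con)
close(con)

# One string per line of the file, held in a list as requested.
as_list <- as.list(lines)
print(lines)
```

With the real file, path would simply be the name of the summary file.
Whether "UCS-2LE" is accepted depends on the iconv implementation on
the machine, so this may still need adjusting.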
--
David.
> [10] "&" "G" "(" "H" "N" "L" "B" "u" "$"
> [19] "+" "$" "B" "u" "C" "1" "6" "E" "T"
> [28] "o" "P" "P" "$" "T" "o" "s" "2" "0"
> [37] "E" "T" "o" "f" "2" "1" "E" "\n" "1"
> [46] "B" "$" "2" ":" "J" "$" "3" ":" "b"
> [55] "4" ":" "s" "c" "2" "5" ":" "R" "6"
> [64] ":" "S" "B" "o" "f" "i" "1" "p"
>
> Any help would be greatly appreciated.
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Reading-in-a-transcript-like-file-tp2272669p2272669.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD
West Hartford, CT