Hi,

I have the following input file.

$ cat main.txt
CEL_A	CELL_B
1	4
2	5
2	6

Then I run read.table in R.

> f=read.table('main.txt', header=T, check.names=F, sep='\t')
> head(f)
  \ufeffCEL_A CELL_B
1          1      4
2          2      5
3          2      6
> f$CEL_A
NULL

I'm not sure where the special character \ufeff comes from. Could anybody let me know what is the problem?

Thanks,
John

[[alternative HTML version deleted]]
On 22/02/2011 10:43 AM, John Edwards wrote:
> Hi,
>
> I have the following input file.
> $ cat main.txt
> CEL_A	CELL_B
> 1	4
> 2	5
> 2	6
>
> Then I run read.table in R.
>
> > f=read.table('main.txt', header=T, check.names=F, sep='\t')
> > head(f)
>   \ufeffCEL_A CELL_B
> 1          1      4
> 2          2      5
> 3          2      6
> > f$CEL_A
> NULL
>
> I'm not sure where the special character \ufeff comes from. Could anybody
> let me know what is the problem?

The Unicode character "\uFEFF" is the "byte-order mark". This is commonly used in Windows systems, not so commonly on others, which tend to get confused by it. You didn't say what system you are working on and what encoding was used for the file; those are likely both important.

Duncan Murdoch
On Tue, Feb 22, 2011 at 7:43 AM, John Edwards <jhnedwards603 at gmail.com> wrote:
> Hi,
>
> I have the following input file.
> $ cat main.txt
> CEL_A	CELL_B
> 1	4
> 2	5
> 2	6
>
> Then I run read.table in R.
>
>> f=read.table('main.txt', header=T, check.names=F, sep='\t')
>> head(f)
>   \ufeffCEL_A CELL_B
> 1          1      4
> 2          2      5
> 3          2      6
>> f$CEL_A
> NULL
>
> I'm not sure where the special character \ufeff comes from. Could anybody
> let me know what is the problem?

Looks like the Unicode character called 'byte order mark' (BOM), cf.

http://en.wikipedia.org/wiki/Byte_order_mark

It looks like your 'main.txt' text file was created by software that saves it as a Unicode-encoded text file. If you need a plain old-style ASCII text file, see if you can resave it as such. With last year's development in R, it is also not unlikely that you can tell R to read in the existing file by specifying the encoding, but since I don't know how to do that I leave that as a search-the-help exercise for you.

/Henrik

>
> Thanks,
> John
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
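The "specify the encoding" route mentioned above can be sketched as follows. This is only a minimal sketch under stated assumptions: the file is UTF-8 with a leading BOM, and the R version is 3.0.0 or later, where connections accept the encoding name "UTF-8-BOM" (releases current at the time of this thread did not). The temporary file is a stand-in for main.txt, not the poster's actual file.

```r
# Build a small tab-separated file that starts with a UTF-8 BOM (bytes EF BB BF).
# The temporary path is a hypothetical stand-in for 'main.txt'.
path <- tempfile(fileext = ".txt")
con <- file(path, "wb")
writeBin(as.raw(c(0xEF, 0xBB, 0xBF)), con)
writeBin(charToRaw("CEL_A\tCELL_B\n1\t4\n2\t5\n2\t6\n"), con)
close(con)

# Assumption: R >= 3.0.0, where "UTF-8-BOM" makes the connection drop the BOM.
f <- read.table(path, header = TRUE, check.names = FALSE, sep = "\t",
                fileEncoding = "UTF-8-BOM")

names(f)   # column names, read without the BOM prefix
f$CEL_A    # the first column, looked up by its plain name
```

If the file turns out to use a different encoding (for example UTF-16), the fileEncoding value would need to change accordingly.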
What you describe could be a bug (in which case providing your OS and R version info per the posting guidelines would be a minimum requirement to get it fixed) or a control character that is actually in your file (which you might need a binary editor to see).

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil@dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
---------------------------------------------------------------------------
Sent from my phone. Please excuse my brevity.

John Edwards <jhnedwards603@gmail.com> wrote:

Hi,

I have the following input file.

$ cat main.txt
CEL_A	CELL_B
1	4
2	5
2	6

Then I run read.table in R.

> f=read.table('main.txt', header=T, check.names=F, sep='\t')
> head(f)
  \ufeffCEL_A CELL_B
1          1      4
2          2      5
3          2      6
> f$CEL_A
NULL

I'm not sure where the special character \ufeff comes from. Could anybody let me know what is the problem?

Thanks,
John
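Instead of a binary editor, the first bytes of the file can be inspected from R itself. The sketch below assumes the suspected character is a UTF-8 BOM (bytes EF BB BF); has_utf8_bom is a hypothetical helper written for this illustration, and the temporary files stand in for main.txt.

```r
# Hypothetical helper: TRUE if the file starts with the UTF-8 BOM bytes EF BB BF.
has_utf8_bom <- function(path) {
  first <- readBin(path, what = "raw", n = 3L)
  identical(first, as.raw(c(0xEF, 0xBB, 0xBF)))
}

# Self-contained demonstration on two temporary files (stand-ins for 'main.txt').
with_bom <- tempfile()
writeBin(c(as.raw(c(0xEF, 0xBB, 0xBF)), charToRaw("CEL_A\tCELL_B\n")), with_bom)

without_bom <- tempfile()
writeBin(charToRaw("CEL_A\tCELL_B\n"), without_bom)

has_utf8_bom(with_bom)      # file written with the three BOM bytes in front
has_utf8_bom(without_bom)   # file written as plain ASCII
```

On the real file, readBin('main.txt', what = "raw", n = 16) would show the leading bytes directly, which also helps if the file uses some other encoding.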