Hi,

I hope you are doing well!
I have a CSV file whose encoding is ANSI. How can I change its encoding to UTF-8 in R?

Many thanks!
With best regards,

--
Mehdi Dadkhah
On 2020-05-05 15:12 +0430, Mehdi Dadkhah wrote:
> I have a CSV file which its encoding is
> ANSI. How can i change its encoding to
> UTF-8 in R?

Hi!

I do not know about ANSI, but to read latin1-encoded CSV files with readr, do this:

First, determine that your file is latin1-encoded:

    rasmus at twentyfive ~ % file -i SAA.csv
    SAA.csv: application/csv; charset=iso-8859-1

Then read it in using readr::read_csv:

    locale <- readr::locale(encoding = "latin1")
    SAA <- suppressMessages(
      readr::read_csv(file = "SAA.csv", locale = locale))

Best,
Rasmus
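
The readr approach above only reads the file; the original question was about ending up with UTF-8. A minimal sketch of the full round trip follows, assuming the readr package is installed. The temporary file and its two-column contents are made up for illustration and stand in for SAA.csv. It relies on readr::write_csv writing its output as UTF-8.

```r
library(readr)

# Hypothetical sample data: build a small latin1-encoded CSV from raw bytes,
# so the example does not depend on the session locale.
src <- tempfile(fileext = ".csv")
txt <- "name,city\nZo\u00eb,M\u00e1laga\n"
writeBin(iconv(txt, from = "UTF-8", to = "latin1", toRaw = TRUE)[[1]], src)

# Read the file, declaring its encoding as latin1.
dat <- suppressMessages(
  read_csv(src, locale = locale(encoding = "latin1")))

# Write it back out; readr writes UTF-8.
out <- tempfile(fileext = ".csv")
write_csv(dat, out)
```

Running `file -i` on the output file should then report a UTF-8 charset rather than iso-8859-1.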
What do you mean by "ANSI"?
Do you mean ASCII? In that case there is nothing to be done.
Do you mean some member of the ISO 8859 family of 8-bit character sets?
Do you mean some Microsoft-specific code page, such as CP-1252?
(Microsoft calls CP-437 and CP-1252 "ANSI", but if they have any connection
whatever with ANSI I would appreciate being informed of it.)
If you really do mean the ANSI Extended Latin (ANSEL) character
set, you are out of luck.

If it is supported in your environment, the easiest way to convert is to use the
iconv() function. That's what it is for. See ?iconv.

But there is something easier, and that is not to convert at all.
Just let R know what the external encoding is, and just read the file.
If you check the documentation of read.csv (?read.csv),
you will find the fileEncoding="..." argument.

fileEncoding: character string: if non-empty declares the encoding used
          on a file (not a connection) so the character data can be
          re-encoded.  See the 'Encoding' section of the help for
          'file', the 'R Data Import/Export Manual' and 'Note'.

At a guess, you want fileEncoding="WINDOWS-1252".

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
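
Both routes described above (iconv() on the text, or fileEncoding on read) can be sketched in base R. This is only an illustration under stated assumptions: the file really is CP-1252, the platform's iconv accepts the name "WINDOWS-1252", and the R session runs in a UTF-8 locale. The temporary file and its contents are made up for the example.

```r
# Hypothetical sample data: a small CP-1252 file built from raw bytes.
# 0xE9 is e-acute and 0x80 is the euro sign in CP-1252.
src <- tempfile(fileext = ".csv")
writeBin(c(charToRaw("item,price\ncaf"), as.raw(0xE9),
           charToRaw(","), as.raw(0x80), charToRaw("5\n")), src)

# Route 1: convert the text itself with iconv() and write the bytes back out.
lines      <- readLines(src, warn = FALSE)
utf8_lines <- iconv(lines, from = "WINDOWS-1252", to = "UTF-8")
out1 <- tempfile(fileext = ".csv")
writeLines(utf8_lines, out1, useBytes = TRUE)

# Route 2: declare the external encoding when reading, then write UTF-8.
dat  <- read.csv(src, fileEncoding = "WINDOWS-1252")
out2 <- tempfile(fileext = ".csv")
write.csv(dat, out2, row.names = FALSE, fileEncoding = "UTF-8")
```

Route 1 leaves the CSV layout untouched, since it never parses the file. Route 2 is the natural choice when the data are needed in R anyway, but write.csv may change quoting and number formatting relative to the original file.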
Thank you! It works for me.
With best regards,

On Tue, May 5, 2020 at 3:27 PM Rasmus Liland <jral at posteo.no> wrote:
> read it in using readr::read_csv
>
>     locale <- readr::locale(encoding = "latin1")

--
Mehdi Dadkhah
Thank you!
With best regards,

On Tue, May 5, 2020 at 3:47 PM Richard O'Keefe <raoknz at gmail.com> wrote:
> At a guess, you want fileEncoding="WINDOWS-1252".

--
Mehdi Dadkhah
PhD candidate & Research assistant
Department of Management,
Faculty of Economics and Administrative Sciences,
Ferdowsi University of Mashhad, Mashhad, Iran
Email addresses:
mehdidadkhah91 at gmail.com
Mehdidadkhah at mail.um.ac.ir