Hi,

I am using RpgSQL to retrieve data from a PostgreSQL database which uses UTF8 encoding, and one of the columns contains Chinese characters. Unfortunately, R does not display them correctly.

> df <- dbGetQuery(con, "select * from test")
> df
  a b
1 1 ????????\xa2
2 2 ???? EURO\xa1

I see the following option. Do I need to change the encoding option to display the text correctly? If so, how should I set it in my case?

$encoding
[1] "native.enc"

Thanks,
Xiaobo Gu
But Sys.setlocale tries to change the setting for the whole OS. I want only R to use a specified encoding. How can I do this?

Xiaobo.Gu

>>-----Original Message-----
>>From: Gabor Grothendieck [mailto:ggrothendieck at gmail.com]
>>Sent: Monday, November 29, 2010 8:57 PM
>>To: Xiaobo Gu
>>Subject: Re: FW: R encoding question
>>
>>I have never played with encodings myself. I suggest you read the PostgreSQL
>>documentation and try different arguments to Sys.setlocale in R. You
>>probably have to do that before you initiate the database connection, since it
>>might not have any effect afterwards. I am not sure this is the problem, but
>>it's worth a try. Here are some examples.
>>
>>Sys.setlocale(locale="C")
>>Sys.setlocale(locale="en_NZ.iso88591")
>>Sys.setlocale("LC_ALL", "en_US")
>>Sys.setlocale("LC_TIME", "English")
>>Sys.setlocale('LC_ALL','fr_FR')
>>Sys.setenv("LANGUAGE"="EN");Sys.setlocale("LC_ALL","EN")
>>Sys.setenv("LANGUAGE"="FR");Sys.setlocale("LC_ALL","FR")
>>
>>2010/11/29 Xiaobo Gu <guxiaobo1982 at gmail.com>:
>>> Hi,
>>> Can you help with this?
>>>
>>> Regards,
>>> Xiaobo Gu
>>>
>>> -----Original Message-----
>>> From: Xiaobo Gu [mailto:guxiaobo1982 at gmail.com]
>>> Sent: Wednesday, November 24, 2010 10:19 PM
>>> To: r-help at r-project.org
>>> Subject: R encoding question
>>>
>>> Hi,
>>> I am using RpgSQL to retrieve data from a PostgreSQL database which
>>> uses UTF8 encoding, and one of the columns contains Chinese
>>> characters. Unfortunately, R does not display them correctly.
>>>
>>> > df <- dbGetQuery(con, "select * from test")
>>> > df
>>>   a b
>>> 1 1 ????\xa2
>>> 2 2 ?? EURO\xa1
>>>
>>> I see the following option. Do I need to change the encoding option to
>>> display the text correctly? If so, how should I set it in my case?
>>>
>>> $encoding
>>> [1] "native.enc"
>>>
>>> Thanks,
>>> Xiaobo Gu
>>
>>--
>>Statistics & Software Consulting
>>GKX Group, GKX Associates Inc.
>>tel: 1-877-GKX-GROUP
>>email: ggrothendieck at gmail.com
Do you know what values I should set for the category and locale parameters in order to use UTF-8 encoding in a Chinese Windows XP SP3 environment?

Sys.setlocale(category = "LC_ALL", locale = "")

Xiaobo Gu

>>-----Original Message-----
>>From: Gabor Grothendieck [mailto:ggrothendieck at gmail.com]
>>Sent: Monday, November 29, 2010 9:27 PM
>>To: Xiaobo Gu
>>Subject: Re: FW: R encoding question
>>
>>I believe the R Sys.setlocale function only changes the locale in R, not in
>>the entire OS. For example, here we set it to German, but the messages from
>>the OS still come out in English:
>>
>>> Sys.getlocale()
>>[1] "LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252"
>>> Sys.setlocale(locale="German")
>>[1] "LC_COLLATE=German_Germany.1252;LC_CTYPE=German_Germany.1252;LC_MONETARY=German_Germany.1252;LC_NUMERIC=C;LC_TIME=German_Germany.1252"
>>> shell("date")
>>The current date is: 29/11/2010
>>Enter the new date: (dd-mm-yy) Warning message:
>>In shell("date") : 'date' execution failed with error code 1
>>
>>On Mon, Nov 29, 2010 at 8:18 AM, Xiaobo Gu <guxiaobo1982 at gmail.com> wrote:
>>> But Sys.setlocale tries to change the setting for the whole OS. I want
>>> only R to use a specified encoding. How can I do this?
>>>
>>> Xiaobo.Gu
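The quoted reply above says that Sys.setlocale affects only the running R process. As a minimal sketch of that idea, the snippet below saves the current LC_CTYPE setting, switches to another locale, and then restores the saved value. The variable names (old_ctype, during, after) are made up for illustration, and the "C" locale is used only because it should be available on every platform; it is not the locale asked about in this thread.

```r
# Remember the current character-type locale of this R session.
old_ctype <- Sys.getlocale("LC_CTYPE")

# Switch the session to the "C" locale (chosen only because it exists everywhere).
Sys.setlocale("LC_CTYPE", "C")
during <- Sys.getlocale("LC_CTYPE")

# Restore the setting that was active before the change.
Sys.setlocale("LC_CTYPE", old_ctype)
after <- Sys.getlocale("LC_CTYPE")
```

Both calls act on the R session only, so other programs keep their own locale settings.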
On Tue, Nov 30, 2010 at 7:30 AM, Xiaobo Gu <guxiaobo1982 at gmail.com> wrote:
> Do you know what values I should set for the category and locale parameters in order to use UTF-8 encoding in a Chinese Windows XP SP3 environment?
>
> Sys.setlocale(category = "LC_ALL", locale = "")
>

It's OS-dependent, but you could try:

Sys.setlocale(locale = "Chinese")

and

Sys.setlocale(locale = "")

to set it back.

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
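Because locale names are OS-dependent, it can help to check whether a request was actually honored. Sys.setlocale returns an empty string (and issues a warning) when the OS rejects the requested locale. The helper below is a hypothetical sketch, not part of any package; the name locale_available and the bogus locale name in the usage lines are assumptions made for illustration.

```r
# Hypothetical helper: TRUE if the OS accepts `locale` for `category`.
locale_available <- function(locale, category = "LC_CTYPE") {
  old <- Sys.getlocale(category)
  # Put the session back the way it was, whatever happens below.
  on.exit(Sys.setlocale(category, old), add = TRUE)
  # A rejected request returns "" and a warning, which is silenced here.
  result <- suppressWarnings(Sys.setlocale(category, locale))
  nzchar(result)
}

ok_c     <- locale_available("C")                   # a name every platform knows
ok_bogus <- locale_available("no_such_locale_xyz")  # deliberately invalid name
```

The same check could be run with "Chinese" or "zh_CN.UTF-8"; the result depends on the operating system.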
But the locale "Chinese" uses GBK encoding by default. How can I use UTF-8 encoding? I have tried the following, but neither call works:

Sys.setlocale(locale = "zh_CN.UTF-8")
Sys.setlocale(category = "LC_CTYPE", locale = "zh_CN.UTF-8")

Xiaobo Gu

>>-----Original Message-----
>>From: Gabor Grothendieck [mailto:ggrothendieck at gmail.com]
>>Sent: Tuesday, November 30, 2010 8:57 PM
>>To: Xiaobo Gu
>>Cc: r-help at r-project.org
>>Subject: Re: FW: R encoding question
>>
>>On Tue, Nov 30, 2010 at 7:30 AM, Xiaobo Gu <guxiaobo1982 at gmail.com> wrote:
>>> Do you know what values I should set for the category and locale
>>> parameters in order to use UTF-8 encoding in a Chinese Windows XP SP3
>>> environment?
>>>
>>> Sys.setlocale(category = "LC_ALL", locale = "")
>>
>>It's OS-dependent, but you could try:
>>
>>Sys.setlocale(locale = "Chinese")
>>
>>and
>>
>>Sys.setlocale(locale = "")
>>
>>to set it back.
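Windows XP does not provide UTF-8 locales, which is consistent with both calls above failing. One alternative that is sometimes worth trying is to leave the locale alone and tell R that particular strings are UTF-8 by using Encoding(). The sketch below assumes the strings arrive as raw UTF-8 bytes with no encoding mark. Whether RpgSQL returns them that way is not established in this thread, and if the characters have already been replaced by "?" before reaching R, this will not recover them. The byte values are a made-up stand-in for a database value.

```r
# Stand-in for a value from the database: the UTF-8 bytes of two Chinese
# characters (U+4E2D, U+6587), built from raw bytes so that the script
# file itself stays pure ASCII.
x <- rawToChar(as.raw(c(0xe4, 0xb8, 0xad, 0xe6, 0x96, 0x87)))

# Declare the bytes to be UTF-8. This changes only the mark, not the bytes.
Encoding(x) <- "UTF-8"

n_bytes <- nchar(x, type = "bytes")  # number of bytes stored
n_chars <- nchar(x, type = "chars")  # number of characters R sees
is_utf8 <- validUTF8(x)              # TRUE if the bytes form valid UTF-8
```

For a data frame column, the same assignment can be applied to the whole character vector, for example Encoding(df$b) <- "UTF-8", again under the assumption that the bytes really are UTF-8.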