Hi all,

I am having difficulty understanding how R sorts strings. If I do

R) sort(c("X.","X0B"))
[1] "X." "X0B"

then, as far as lexicographic order is concerned, I would expect that I can append anything to the end of both strings and the order will remain the same. But:

R) sort(c("X.Z","X0B.Z"))
[1] "X0B.Z" "X.Z"

Can somebody give me a trick to make the order lexicographic?

--
View this message in context: http://r.789695.n4.nabble.com/Sorting-strings-tp4403696p4403696.html
Sent from the R help mailing list archive at Nabble.com.
On Mon, Feb 20, 2012 at 02:18:42AM -0800, statquant2 wrote:
> Hi all, I am having difficulties to understand how R sort strings:
>
> If I do
> R) sort(c("X.","X0B"))
> [1] "X." "X0B"
>
> So for me, as far as lexicographic order is concerned I can add whatever to
> the end, the order will remain the same, but :

Hi.

This need not be true for strings of different length. For example

  ab
  abc

become, after appending z to each,

  abcz
  abz

so the relative order of the two strings is reversed.

Petr Savicky.
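The point about strings of different length can be checked directly. The following is a minimal sketch, not part of the original exchange. It assumes R 3.3.0 or later, where `sort()` accepts `method = "radix"`; that method collates character vectors in the C locale regardless of the session locale, so the comparison is a plain character-by-character one. The variable names are illustrative only.

```r
# Two strings where one is a prefix of the other.
x <- c("ab", "abc")

# Append the same suffix to both.
y <- paste0(x, "z")

# method = "radix" (assumed available: R >= 3.3.0) collates in the C locale,
# i.e. character by character using byte values.
sorted_x <- sort(x, method = "radix")
sorted_y <- sort(y, method = "radix")

print(sorted_x)  # "ab" comes first: a prefix sorts before its extension
print(sorted_y)  # "abcz" comes first: "c" < "z" at the third character
```

Note that this only illustrates the prefix effect; it does not by itself explain the `"X.Z"` versus `"X0B.Z"` result, where neither string is a prefix of the other.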
OK, so it changed from 2.12.2 to 2.14.1? Can somebody tell me how to modify my sort, or whatever else is needed, to get the same result that I would get in 2.14.1?

Cheers
See ?Comparison, which holds some warnings about what to expect when sorting strings.

On 20.02.2012 11:51, Petr Savicky wrote:
> On Mon, Feb 20, 2012 at 02:18:42AM -0800, statquant2 wrote:
>> Hi all, I am having difficulties to understand how R sort strings:
>>
>> If I do
>> R) sort(c("X.","X0B"))
>> [1] "X." "X0B"
>>
>> So for me, as far as lexicographic order is concerned I can add whatever to
>> the end, the order will remain the same, but :
>
> Hi.
>
> This need not be true for strings of different length.
> For example
>
> ab
> abc
>
> become by concatenation with z
>
> abcz
> abz
>
> Petr Savicky.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Enrico Schumann
Lucerne, Switzerland
http://nmof.net/
I did, but this does not answer my question... Does anybody know how to tweak the behaviour of sort, or how else to do this?
On Mon, Feb 20, 2012 at 05:55:30AM -0800, statquant2 wrote:
> I did, but this does not give the answer to my question...
> Anybody knows how to tweack the behaviour of sort or how to do ?

Hi.

Try this

  Sys.setlocale("LC_COLLATE", "C")

This comes from ?locale and reads there

  Sys.setlocale("LC_COLLATE", "C")   # turn off locale-specific sorting,
                                     # usually

See also

?sort

  The sort order for character vectors will depend on the collating
  sequence of the locale in use: see ‘Comparison’.

?Comparison

  Comparison of strings in character vectors is lexicographic within
  the strings using the collating sequence of the locale in use: see
  ‘locales’. The collating sequence of locales such as ‘en_US’ is
  normally different from ‘C’ (which should use ASCII) and can be
  surprising. Beware of making _any_ assumptions about the
  collation order: ...

Hope this helps.

Petr Savicky.
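Because `Sys.setlocale()` changes the collation for the whole session, it can be convenient to switch to the C locale only for the duration of one sort. The following is a minimal sketch under that assumption; `sort_c` is a hypothetical helper name, not a function from base R or from this thread.

```r
# Hypothetical helper: sort a character vector using C-locale collation,
# then restore whatever LC_COLLATE setting was active before the call.
sort_c <- function(x) {
  old <- Sys.getlocale("LC_COLLATE")
  # Restore the previous collation when the function exits, even on error.
  on.exit(Sys.setlocale("LC_COLLATE", old), add = TRUE)
  Sys.setlocale("LC_COLLATE", "C")
  sort(x)
}

before <- Sys.getlocale("LC_COLLATE")
result <- sort_c(c("X0B.Z", "X.Z"))
after  <- Sys.getlocale("LC_COLLATE")

print(result)  # "." (byte 46) sorts before "0" (byte 48) in the C locale
```

Scripts that must give the same order on every platform may prefer to set the collation once at the top instead; the helper is only useful when the rest of the session should keep its locale-specific sorting.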
NICE DUUUUDE

It solves my problem! Awesome stuff
It seems OS-dependent. I got different results when trying it on Windows XP and Red Hat Linux.

> R.version
               _
platform       x86_64-unknown-linux-gnu
arch           x86_64
os             linux-gnu
system         x86_64, linux-gnu
status
major          2
minor          9.1
year           2009
month          06
day            26
svn rev        48839
language       R
version.string R version 2.9.1 (2009-06-26)
> sort(c("X.","X0B"))
[1] "X."  "X0B"
> sort(c("X.Z","X0B.Z"))
[1] "X.Z"   "X0B.Z"

> R.version
               _
platform       x86_64-unknown-linux-gnu
arch           x86_64
os             linux-gnu
system         x86_64, linux-gnu
status
major          2
minor          9.1
year           2009
month          06
day            26
svn rev        48839
language       R
version.string R version 2.9.1 (2009-06-26)
> sort(c("X.","X0B"))
[1] "X."  "X0B"
> sort(c("X.Z","X0B.Z"))
[1] "X0B.Z" "X.Z"

On 2012-2-20 23:27, statquant2 wrote:
> Ok I have :
>
> R) str(R.Version())
> List of 13
>  $ platform      : chr "x86_64-unknown-linux-gnu"
>  $ arch          : chr "x86_64"
>  $ os            : chr "linux-gnu"
>  $ system        : chr "x86_64, linux-gnu"
>  $ status        : chr ""
>  $ major         : chr "2"
>  $ minor         : chr "12.2"
>  $ year          : chr "2011"
>  $ month         : chr "02"
>  $ day           : chr "25"
>  $ svn rev       : chr "54585"
>  $ language      : chr "R"
>  $ version.string: chr "R version 2.12.2 (2011-02-25)"
>
> R) sort(c("X.","X0B"))
> [1] "X." "X0B"
> R) sort(c("X.Z","X0B.Z"))
> [1] "X0B.Z" "X.Z"
>
> I am using a linux redHat
> $ uname -a
> Linux 2.6.18-238.9.1.el5 #1 SMP Fri Mar 18 12:42:39 EDT 2011 x86_64 x86_64
> x86_64 GNU/Linux
On Mon, Feb 20, 2012 at 04:56:21PM +0100, Petr Savicky wrote:
> On Mon, Feb 20, 2012 at 05:55:30AM -0800, statquant2 wrote:
> > I did, but this does not give the answer to my question...
> > Anybody knows how to tweack the behaviour of sort or how to do ?
>
> Hi.
>
> Try this
>
> Sys.setlocale("LC_COLLATE", "C")
>
> This comes from ?locale and reads there

This is not in ?locale, but in ?locales.

> Sys.setlocale("LC_COLLATE", "C") # turn off locale-specific sorting,
> # usually

This is in the examples section at the end.

Also try

  Sys.getlocale()

to see the current settings. LC_CTYPE can also be relevant:

  Sys.setlocale("LC_CTYPE", "C")

Hope this helps.

Petr Savicky.
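When two machines disagree about the order, comparing their locale settings is usually the first diagnostic step. The following is a small sketch of such a check; the variable names are illustrative, and the printed values depend entirely on the machine it runs on, so no expected output is shown.

```r
# Report the collation and character-type categories separately.
collate <- Sys.getlocale("LC_COLLATE")
ctype   <- Sys.getlocale("LC_CTYPE")
cat("LC_COLLATE:", collate, "\n")
cat("LC_CTYPE:  ", ctype, "\n")

# String comparison uses the collating sequence of the locale in use
# (see ?Comparison), so this may be TRUE or FALSE depending on the locale.
dot_first <- "X.Z" < "X0B.Z"
cat('"X.Z" < "X0B.Z":', dot_first, "\n")
```

Running this on each machine and comparing the `LC_COLLATE` lines should show whether a difference in sort order can be attributed to the locale rather than to the R version.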
Sorry, just made a mistake. This is the result from Windows XP.

> sort(c("X.","X0B"))
[1] "X."  "X0B"
> sort(c("X.Z","X0B.Z"))
[1] "X.Z"   "X0B.Z"
> R.version
               _
platform       i386-pc-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status
major          2
minor          13.0
year           2011
month          04
day            13
svn rev        55427
language       R
version.string R version 2.13.0 (2011-04-13)

On 2012-2-21 0:13, De-Jian Zhao wrote:
> It seems OS-dependent. I got different results when trying it on
> windows xp and Redhat linux.