Hi all, I am having difficulty understanding how R sorts strings.
If I do
R) sort(c("X.","X0B"))
[1] "X." "X0B"
So for me, as far as lexicographic order is concerned, I should be able to append
anything to the end of both strings and the order would remain the same, but:
R) sort(c("X.Z","X0B.Z"))
[1] "X0B.Z" "X.Z"
Can somebody give me a trick to make the order lexicographic?
--
View this message in context:
http://r.789695.n4.nabble.com/Sorting-strings-tp4403696p4403696.html
Sent from the R help mailing list archive at Nabble.com.
On Mon, Feb 20, 2012 at 02:18:42AM -0800, statquant2 wrote:
> Hi all, I am having difficulties to understand how R sort strings:
>
> If I do
> R) sort(c("X.","X0B"))
> [1] "X." "X0B"
>
> So for me, as far as lexicographic order is concerned I can add whatever to
> the end, the order will remain the same, but :

Hi.

This need not be true for strings of different length. For example

ab
abc

become, after appending z to each,

abcz
abz

so the relative order of the two strings is reversed.

Petr Savicky.
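The ab/abc example can be checked directly. This is only a small sketch: it assumes R >= 3.3.0, where `sort(..., method = "radix")` orders character vectors in the C locale regardless of the session locale, so the result does not depend on the machine it runs on.

```r
# Two strings of different length, where one is a prefix of the other.
x <- c("abc", "ab")

# method = "radix" collates in the C locale (assumes R >= 3.3.0).
before <- sort(x, method = "radix")
after  <- sort(paste0(x, "z"), method = "radix")

print(before)  # "ab" sorts before "abc"
print(after)   # "abcz" sorts before "abz", so the pair has swapped
```

So even with plain byte-wise ordering, appending the same suffix to both strings is not guaranteed to preserve their order.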
OK, so it changed from 2.12.2 to 2.14.1? Can somebody tell me how to modify my sort, or whatever else, to get the same result that I would get in 2.14.1? Cheers
See ?Comparison, which holds some warnings about what to expect when sorting strings.

On 20.02.2012 11:51, Petr Savicky wrote:
> On Mon, Feb 20, 2012 at 02:18:42AM -0800, statquant2 wrote:
>> Hi all, I am having difficulties to understand how R sort strings:
>>
>> If I do
>> R) sort(c("X.","X0B"))
>> [1] "X." "X0B"
>>
>> So for me, as far as lexicographic order is concerned I can add whatever to
>> the end, the order will remain the same, but :
>
> Hi.
>
> This need not be true for strings of different length.
> For example
>
> ab
> abc
>
> become by concatenation with z
>
> abcz
> abz
>
> Petr Savicky.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Enrico Schumann
Lucerne, Switzerland
http://nmof.net/
I did, but this does not answer my question... Does anybody know how to tweak the behaviour of sort, or what else to do?
On Mon, Feb 20, 2012 at 05:55:30AM -0800, statquant2 wrote:
> I did, but this does not give the answer to my question...
> Anybody knows how to tweak the behaviour of sort or how to do ?

Hi.

Try this

  Sys.setlocale("LC_COLLATE", "C")

This comes from ?locale and reads there

  Sys.setlocale("LC_COLLATE", "C") # turn off locale-specific sorting,
                                   # usually

See also

?sort

  The sort order for character vectors will depend on the collating
  sequence of the locale in use: see 'Comparison'.

?Comparison

  Comparison of strings in character vectors is lexicographic within
  the strings using the collating sequence of the locale in use: see
  'locales'. The collating sequence of locales such as 'en_US' is
  normally different from 'C' (which should use ASCII) and can be
  surprising. Beware of making _any_ assumptions about the collation
  order: ...

Hope this helps.

Petr Savicky.
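Since Sys.setlocale() changes the collation for the rest of the session, one option is to switch to the C locale only for the duration of a single sort. The sketch below assumes exactly that approach; `sort_c` is a hypothetical helper name, not a base R function.

```r
# Hypothetical helper: sort a character vector under LC_COLLATE = "C",
# then put back whatever collation the session had before.
sort_c <- function(x) {
  old <- Sys.getlocale("LC_COLLATE")
  on.exit(Sys.setlocale("LC_COLLATE", old), add = TRUE)
  Sys.setlocale("LC_COLLATE", "C")
  sort(x)
}

print(sort_c(c("X.", "X0B")))
print(sort_c(c("X.Z", "X0B.Z")))  # "X.Z" first, since "." precedes "0" in ASCII
```

With this, the order of the two strings stays the same before and after appending ".Z", which is the behaviour asked for in the original question.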
NICE DUUUUDE It solves my problem! Awesome stuff
It seems OS-dependent. I got different results when trying it on Windows
XP and Red Hat Linux.
> R.version
               _
platform       x86_64-unknown-linux-gnu
arch           x86_64
os             linux-gnu
system         x86_64, linux-gnu
status
major          2
minor          9.1
year           2009
month          06
day            26
svn rev        48839
language       R
version.string R version 2.9.1 (2009-06-26)
> sort(c("X.","X0B"))
[1] "X." "X0B"
> sort(c("X.Z","X0B.Z"))
[1] "X.Z" "X0B.Z"
> R.version
               _
platform       x86_64-unknown-linux-gnu
arch           x86_64
os             linux-gnu
system         x86_64, linux-gnu
status
major          2
minor          9.1
year           2009
month          06
day            26
svn rev        48839
language       R
version.string R version 2.9.1 (2009-06-26)
> sort(c("X.","X0B"))
[1] "X." "X0B"
> sort(c("X.Z","X0B.Z"))
[1] "X0B.Z" "X.Z"
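To see whether a given machine is affected, one can compare the locale-dependent sort with a C-locale sort of the same vector. This is a rough diagnostic only, and it assumes R >= 3.3.0 so that `method = "radix"` is available and collates character vectors in the C locale.

```r
x <- c("X.Z", "X0B.Z")

locale_order <- sort(x)                    # follows the session's LC_COLLATE
c_order      <- sort(x, method = "radix")  # C-locale order (assumes R >= 3.3.0)

# TRUE means the session collation agrees with the C locale for these strings.
same <- identical(locale_order, c_order)

cat("LC_COLLATE:  ", Sys.getlocale("LC_COLLATE"), "\n")
cat("locale order:", locale_order, "\n")
cat("C order:     ", c_order, "\n")
cat("identical:   ", same, "\n")
```

A FALSE here would suggest that the difference between systems comes from their collation settings rather than from the R version.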
On 2012-2-20 23:27, statquant2 wrote:
> Ok I have :
>
> R) str(R.Version())
> List of 13
> $ platform : chr "x86_64-unknown-linux-gnu"
> $ arch : chr "x86_64"
> $ os : chr "linux-gnu"
> $ system : chr "x86_64, linux-gnu"
> $ status : chr ""
> $ major : chr "2"
> $ minor : chr "12.2"
> $ year : chr "2011"
> $ month : chr "02"
> $ day : chr "25"
> $ svn rev : chr "54585"
> $ language : chr "R"
> $ version.string: chr "R version 2.12.2 (2011-02-25)"
>
> R) sort(c("X.","X0B"))
> [1] "X." "X0B"
> R) sort(c("X.Z","X0B.Z"))
> [1] "X0B.Z" "X.Z"
>
> I am using Red Hat Linux
> $ uname -a
> Linux 2.6.18-238.9.1.el5 #1 SMP Fri Mar 18 12:42:39 EDT 2011 x86_64 x86_64
> x86_64 GNU/Linux
On Mon, Feb 20, 2012 at 04:56:21PM +0100, Petr Savicky wrote:
> On Mon, Feb 20, 2012 at 05:55:30AM -0800, statquant2 wrote:
> > I did, but this does not give the answer to my question...
> > Anybody knows how to tweak the behaviour of sort or how to do ?
>
> Hi.
>
> Try this
>
>   Sys.setlocale("LC_COLLATE", "C")
>
> This comes from ?locale and reads there

This is not in ?locale, but in ?locales

>   Sys.setlocale("LC_COLLATE", "C") # turn off locale-specific sorting,
>                                    # usually

This is in the examples section at the end. Also try looking at

  Sys.getlocale()

LC_CTYPE can also be relevant:

  Sys.setlocale("LC_CTYPE", "C")

Hope this helps.

Petr Savicky.
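As a small illustration of the Sys.getlocale() suggestion, the two categories mentioned above can be queried individually. The variable names are only for this sketch, and the values printed will differ from system to system.

```r
# Full locale string for the session (format is platform-dependent).
print(Sys.getlocale())

# The two categories discussed in this thread.
collate <- Sys.getlocale("LC_COLLATE")  # governs string comparison and sort()
ctype   <- Sys.getlocale("LC_CTYPE")    # governs character classification

cat("LC_COLLATE:", collate, "\n")
cat("LC_CTYPE:  ", ctype, "\n")
```

Including this output alongside R.version when reporting a sorting difference makes it easier to tell whether two machines differ in locale.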
Sorry, I just made a mistake. This is the result from Windows XP.
> sort(c("X.","X0B"))
[1] "X." "X0B"
> sort(c("X.Z","X0B.Z"))
[1] "X.Z" "X0B.Z"
> R.version
               _
platform       i386-pc-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status
major          2
minor          13.0
year           2011
month          04
day            13
svn rev        55427
language       R
version.string R version 2.13.0 (2011-04-13)