Avi Gross
2022-Jan-31 17:32 UTC
[R] [External] Weird behaviour of order() when having multiple ties
Tim,

I thought I saw someone tell you that the order in which EQUAL items are presented is not deterministic, in that any order, whether given by order() or sort() or anything else, is VALID. Only when you supply additional constraints, such as a second key to sort by, does the order become deterministic, and then only up to the point where both keys are the same.

Here is a dumb suggestion. Place your Dat1 vector in a data.frame alongside another vector of 1:length(Dat1) and use some method that orders by Dat1 and then by the second vector, ascending. You have now forced it to take the first of a matching set before any others.

Let me try another. Say I give you a problem that might have multiple answers, such as a quadratic equation with solutions of 2 and 10. You ask me for AN answer and I say 10. Am I wrong? You ask me for all answers and I say [10,2] and someone else says [2,10] and you wonder which of us is right. Well, we are both right. The proper way to test is not to ask whether the lists or tuples or anything else ordered are equal, but to use something like a set and show that the two sets are equivalent, or that each is a subset of the other.

Back to your topic: you are suggesting two independent developers should come up with algorithms to solve similar but different tasks the same way. Do you have any idea how many methods there are for sorting things? This site lists eleven and I am sure there are many more.

https://www.javatpoint.com/sorting-algorithms

What is considered more important is choosing an algorithm that works well on the kinds of data involved. Some of those methods are not stable, that is, they do not keep equal items in their original relative order, so different methods need not return ties in the same order. What you are pointing out is not an error but an inconsistency. The message to you is to not depend on a UNIQUE solution.
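The tiebreaker suggestion above can be sketched in R roughly as follows. This is only an illustration: the data frame `df` and its column names `value` and `pos` are made up for this example, and `setequal()` is shown as one possible way of comparing answers as sets rather than as ordered vectors.

```r
# Dat1 is the vector from the quoted message; df, value and pos are
# illustrative names that do not appear in the original thread.
Dat1 <- c(0.6, 0.5, 0.3, 0.2, 0.1, 0.1, 0.2)
df <- data.frame(value = Dat1, pos = seq_along(Dat1))

# Order by value first, then by original position as an explicit tiebreaker.
idx <- order(df$value, df$pos)
sorted_df <- df[idx, ]
print(idx)              # 5 6 4 7 3 2 1

# Comparing two answers as sets instead of as ordered vectors.
a <- c(10, 2)
b <- c(2, 10)
print(identical(a, b))  # FALSE: same elements, different order
print(setequal(a, b))   # TRUE: same elements regardless of order
```

The explicit second key states the intended tie handling in the code itself, whatever sorting method is used underneath. The help page for order() describes how ties are treated by each method.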
-----Original Message-----
From: Ebert,Timothy Aaron <tebert at ufl.edu>
To: Martin Maechler <maechler at stat.math.ethz.ch>; Stefan Fleck <stefan.b.fleck at gmail.com>
Cc: r-help at r-project.org <r-help at r-project.org>
Sent: Mon, Jan 31, 2022 10:07 am
Subject: Re: [R] [External] Weird behaviour of order() when having multiple ties

Dat1 <- c(0.6, 0.5, 0.3, 0.2, 0.1, 0.1, 0.2)
print(order(Dat1))
print(sort(Dat1))

Compare output

-----Original Message-----
From: R-help <r-help-bounces at r-project.org> On Behalf Of Martin Maechler
Sent: Monday, January 31, 2022 9:04 AM
To: Stefan Fleck <stefan.b.fleck at gmail.com>
Cc: r-help at r-project.org
Subject: Re: [R] [External] Weird behaviour of order() when having multiple ties

>>>>> Stefan Fleck
>>>>>     on Sun, 30 Jan 2022 21:07:19 +0100 writes:

    > it's not about the sort order of the ties, shouldn't all the 1s in
    > order(c(2,3,4,1,1,1,1,1)) come before 2,3,4? because that's not what
    > happening

aaah.. now we are getting somewhere:
It looks you have always confused order() with sort() ... have you ?

    > On Sun, Jan 30, 2022 at 9:00 PM Richard M. Heiberger <rmh at temple.edu> wrote:

    >> when there are ties it doesn't matter which is first.
    >> in a situation where it does matter, you will need a tiebreaker column.
    >> ------------------------------
    >> *From:* R-help <r-help-bounces at r-project.org> on behalf of Stefan Fleck <stefan.b.fleck at gmail.com>
    >> *Sent:* Sunday, January 30, 2022 4:16:44 AM
    >> *To:* r-help at r-project.org <r-help at r-project.org>
    >> *Subject:* [External] [R] Weird behaviour of order() when having multiple ties
    >>
    >> I am experiencing a weird behavior of `order()` for numeric vectors. I
    >> tested on 3.6.2 and 4.1.2 for windows and R 4.0.2 on ubuntu. Can anyone
    >> confirm?
    >>
    >> order(
    >>   c(
    >>     0.6,
    >>     0.5,
    >>     0.3,
    >>     0.2,
    >>     0.1,
    >>     0.1
    >>   )
    >> )
    >> ## Result [should be in order]
    >> [1] 5 6 4 3 2 1
    >>
    >> The sort order is obviously wrong. This only occurs if i have multiple
    >> ties. The problem does _not_ occur for decreasing = TRUE.

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
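To make the distinction in the quoted exchange concrete, here is a small sketch using the two vectors from Stefan's messages. The variable names `x` and `y` are illustrative, and the comparison with rank() is an added guess at the kind of result that may have been expected, not something stated in the thread.

```r
# The two vectors that appear in Stefan's messages.
x <- c(0.6, 0.5, 0.3, 0.2, 0.1, 0.1)
y <- c(2, 3, 4, 1, 1, 1, 1, 1)

# order() returns positions in the original vector, smallest value first.
print(order(x))   # 5 6 4 3 2 1
print(order(y))   # 4 5 6 7 8 1 2 3

# sort() returns the values themselves.
print(sort(x))    # 0.1 0.1 0.2 0.3 0.5 0.6

# rank() returns, for each element in place, where it falls in the sorted result.
print(rank(y, ties.method = "min"))  # 6 7 8 1 1 1 1 1
```

Read this way, the reported result 5 6 4 3 2 1 is a list of positions: the smallest values sit at positions 5 and 6 of the input, the next smallest at position 4, and so on.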
Ebert,Timothy Aaron
2022-Jan-31 20:36 UTC
[R] [External] Weird behaviour of order() when having multiple ties
Dear Avi,

1. I made no comment, question, or statement about EQUAL items. That was elsewhere in posts by you and others in this thread.
2. My intent was only to show a simple example of Stefan's post wherein the outcomes of sort() and order() are different. Not why, or how, just that they are different. If you type order() when you mean sort(), things will not work as expected, as shown below.
3. Yes, both are right, due to the commutative law of multiplication. I don't see the point.

I am suggesting nothing! I am simply observing the behavior of a function. If it satisfies my need, then great. If not, I need to write my own or find a different function. It does help if I clearly understand the output of the function, and sometimes the documentation is not as helpful as hoped for, given the range of readers from novice to expert.

Here is the data in a different format, where Order = 1 means it is the first observation in the data.

Dat1
Order  1    2    3    4    5    6    7
Data   0.6  0.5  0.3  0.2  0.1  0.1  0.2

print(order(Dat1)) returns
[1] 5 6 4 7 3 2 1

So I sort the raw data by 'Data' so that the values of Order remain with each observed data point.

Original Order  5    6    4    7    3    2    1
Data            0.1  0.1  0.2  0.2  0.3  0.5  0.6

Now, reading off the values in the row named 'Original Order', I get the result of print(order(Dat1)). order() does not return the sorted data; it returns the location of each sorted value in the original dataset. At least that is what it looks like. I assume that this is what the documentation means by "'order' returns a permutation which rearranges its first argument into ascending or descending order", but I am afraid that I still do not get that connection from the text provided: https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/order.

As far as I can tell there is no error or inconsistency. I am not quite skilled enough in R to take DatSort <- order(Dat1) and then return the sorted data, but I need to move to other tasks.

I am really sorry that my post makes you mad.
Regards,
Tim
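For the step Tim mentions of taking DatSort <- order(Dat1) and getting back the sorted data, one minimal sketch is to use the permutation as an index into the original vector. `DatSort` follows the name used in the message; `sorted_by_index` and `both` are names made up here for illustration.

```r
Dat1 <- c(0.6, 0.5, 0.3, 0.2, 0.1, 0.1, 0.2)

# order() gives the permutation of positions.
DatSort <- order(Dat1)

# Indexing with that permutation rearranges the data into ascending order.
sorted_by_index <- Dat1[DatSort]
print(sorted_by_index)                         # 0.1 0.1 0.2 0.2 0.3 0.5 0.6
print(identical(sorted_by_index, sort(Dat1)))  # TRUE

# Keep the original position next to each value, as in Tim's second table.
both <- rbind(Order = DatSort, Data = sorted_by_index)
print(both)
```

This is one reading of the documentation's phrase about a permutation that rearranges its first argument: the result of order() is meant to be used as an index, which also makes it possible to reorder other vectors or data frame rows by the same key.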