Aaron Ballman via llvm-dev
2019-Jun-12 06:55 UTC
[llvm-dev] [RFC] Coding Standards: "prefer `int` for regular arithmetic, use `unsigned` only for bitmask and when you intend to rely on wrapping behavior."
On Tue, Jun 11, 2019, 9:59 PM Zachary Turner <zturner at roblox.com> wrote:> On Tue, Jun 11, 2019 at 12:24 PM Mehdi AMINI <joker.eph at gmail.com> wrote: > >> I agree that readability, maintainability, and ability to debug/find >> issues are key. >> I haven't found myself in a situation where unsigned was helping my >> readability: on the opposite actually I am always wondering where is the >> expecting wrap-around behavior and that is one more thing I have to keep in >> mind when I read code that manipulate unsigned. So YMMV but using unsigned >> *increases* my mental load when reading code. >> > I'm on the other end. I'm always reading the code wondering "is this > going to warn?" "Why could a container ever have a negative number of > elements?" "The maximum value representable by the return type (unsigned) > is larger than that of the value i'm storing it in (signed), so an overflow > could happen even if there were no error. What then?" >Strong +1 to this. ~Aaron> > On Tue, Jun 11, 2019 at 12:26 PM Michael Kruse <llvmdev at meinersbur.de> > wrote: > >> Am Di., 11. Juni 2019 um 11:45 Uhr schrieb Zachary Turner via llvm-dev >> <llvm-dev at lists.llvm.org>: >> > >> > I'm personally against changing everything to signed integers. To me, >> this is an example of making code strictly less readable and more confusing >> in order to fight deficiencies in the language standard. I get the problem >> that it's solving, but I view this as mostly a theoretical problem, whereas >> being able to read the code and have it make sense is a practical problem >> that we must face on a daily basis. If you change everything to signed >> integers, you may catch a real problem with it a couple of times a year. >> And by "real problem" here, I'm talking about a miscompile or an actual bug >> that surfaces in production somewhere, rather than a "yes, it seems >> theoretically possible for this to overflow". >> >> Doesn't it make it already worth it? >> > vector.size() returns a size_t, which on 64-bit platforms can represent > types values larger than those that can fit into an int64_t. So to turn > your argument around, since it's theoretically possible to have a vector > with more items than an int64_t can represent, isn't it already worth it to > use size_t, which is an unsigned type? > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190612/793fd999/attachment.html>
Renato Golin via llvm-dev
2019-Jun-12 08:01 UTC
[llvm-dev] [RFC] Coding Standards: "prefer `int` for regular arithmetic, use `unsigned` only for bitmask and when you intend to rely on wrapping behavior."
+1 to both points here. On Wed, 12 Jun 2019, 07:55 Aaron Ballman via llvm-dev, < llvm-dev at lists.llvm.org> wrote:> > > On Tue, Jun 11, 2019, 9:59 PM Zachary Turner <zturner at roblox.com> wrote: > >> On Tue, Jun 11, 2019 at 12:24 PM Mehdi AMINI <joker.eph at gmail.com> wrote: >> >>> I agree that readability, maintainability, and ability to debug/find >>> issues are key. >>> I haven't found myself in a situation where unsigned was helping my >>> readability: on the opposite actually I am always wondering where is the >>> expecting wrap-around behavior and that is one more thing I have to keep in >>> mind when I read code that manipulate unsigned. So YMMV but using unsigned >>> *increases* my mental load when reading code. >>> >> I'm on the other end. I'm always reading the code wondering "is this >> going to warn?" "Why could a container ever have a negative number of >> elements?" "The maximum value representable by the return type (unsigned) >> is larger than that of the value i'm storing it in (signed), so an overflow >> could happen even if there were no error. What then?" >> > > Strong +1 to this. > > ~Aaron > > >> >> On Tue, Jun 11, 2019 at 12:26 PM Michael Kruse <llvmdev at meinersbur.de> >> wrote: >> >>> Am Di., 11. Juni 2019 um 11:45 Uhr schrieb Zachary Turner via llvm-dev >>> <llvm-dev at lists.llvm.org>: >>> > >>> > I'm personally against changing everything to signed integers. To me, >>> this is an example of making code strictly less readable and more confusing >>> in order to fight deficiencies in the language standard. I get the problem >>> that it's solving, but I view this as mostly a theoretical problem, whereas >>> being able to read the code and have it make sense is a practical problem >>> that we must face on a daily basis. If you change everything to signed >>> integers, you may catch a real problem with it a couple of times a year. >>> And by "real problem" here, I'm talking about a miscompile or an actual bug >>> that surfaces in production somewhere, rather than a "yes, it seems >>> theoretically possible for this to overflow". >>> >>> Doesn't it make it already worth it? >>> >> vector.size() returns a size_t, which on 64-bit platforms can represent >> types values larger than those that can fit into an int64_t. So to turn >> your argument around, since it's theoretically possible to have a vector >> with more items than an int64_t can represent, isn't it already worth it to >> use size_t, which is an unsigned type? >> >> _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190612/8fd6c45d/attachment.html>
Hubert Tong via llvm-dev
2019-Jun-12 14:17 UTC
[llvm-dev] [RFC] Coding Standards: "prefer `int` for regular arithmetic, use `unsigned` only for bitmask and when you intend to rely on wrapping behavior."
On Wed, Jun 12, 2019 at 4:01 AM Renato Golin via llvm-dev < llvm-dev at lists.llvm.org> wrote:> +1 to both points here. > > On Wed, 12 Jun 2019, 07:55 Aaron Ballman via llvm-dev, < > llvm-dev at lists.llvm.org> wrote: > >> >> >> On Tue, Jun 11, 2019, 9:59 PM Zachary Turner <zturner at roblox.com> wrote: >> >>> On Tue, Jun 11, 2019 at 12:24 PM Mehdi AMINI <joker.eph at gmail.com> >>> wrote: >>> >>>> I agree that readability, maintainability, and ability to debug/find >>>> issues are key. >>>> I haven't found myself in a situation where unsigned was helping my >>>> readability: on the opposite actually I am always wondering where is the >>>> expecting wrap-around behavior and that is one more thing I have to keep in >>>> mind when I read code that manipulate unsigned. So YMMV but using unsigned >>>> *increases* my mental load when reading code. >>>> >>> I'm on the other end. I'm always reading the code wondering "is this >>> going to warn?" >>> >> Indeed, using comparison against 0u from the "obvious context" has savedme from warnings.> "Why could a container ever have a negative number of elements?" "The >>> maximum value representable by the return type (unsigned) is larger than >>> that of the value i'm storing it in (signed), so an overflow could happen >>> even if there were no error. What then?" >>> >> >> Strong +1 to this. >> > clang::ASTContext::toCharUnitsFromBits(int64_t) makes me cringe. It tendsto be called with uint64_t values like the Width field in the instance of clang::TypeInfo.> >> ~Aaron >> >> >>> >>> On Tue, Jun 11, 2019 at 12:26 PM Michael Kruse <llvmdev at meinersbur.de> >>> wrote: >>> >>>> Am Di., 11. Juni 2019 um 11:45 Uhr schrieb Zachary Turner via llvm-dev >>>> <llvm-dev at lists.llvm.org>: >>>> > >>>> > I'm personally against changing everything to signed integers. To >>>> me, this is an example of making code strictly less readable and more >>>> confusing in order to fight deficiencies in the language standard. I get >>>> the problem that it's solving, but I view this as mostly a theoretical >>>> problem, whereas being able to read the code and have it make sense is a >>>> practical problem that we must face on a daily basis. If you change >>>> everything to signed integers, you may catch a real problem with it a >>>> couple of times a year. And by "real problem" here, I'm talking about a >>>> miscompile or an actual bug that surfaces in production somewhere, rather >>>> than a "yes, it seems theoretically possible for this to overflow". >>>> >>>> Doesn't it make it already worth it? >>>> >>> vector.size() returns a size_t, which on 64-bit platforms can represent >>> types values larger than those that can fit into an int64_t. So to turn >>> your argument around, since it's theoretically possible to have a vector >>> with more items than an int64_t can represent, isn't it already worth it to >>> use size_t, which is an unsigned type? >>> >>> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org >> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190612/6ed1ec56/attachment.html>
Chandler Carruth via llvm-dev
2019-Jun-13 01:20 UTC
[llvm-dev] [RFC] Coding Standards: "prefer `int` for regular arithmetic, use `unsigned` only for bitmask and when you intend to rely on wrapping behavior."
FWIW, the talks linked by Mehdi really do talk about these things and why I don't think the really are the correct trade-off. Even if you imagine an unsigned type that doesn't allow wrapping, I think this is a really bad type. The problem is that you have made the most common value of the type (zero in every study I'm aware of) be a boundary condition. Today, it wraps to a huge value if you cross it. Afterward, it would trap. Both are super surprising. Another way of looking at the same lens: do you subtract these values? Should `a + (b - c)` be the same as `(a + b) - c`? You either need a signed type or wrapping to have reasonable answers here. And if you solve this with wrapping, then it makes any attempt to write assertions or other checks in the same type system very difficult. The fact that you write an assert to check for "did I accidentally go past zero?" by conjuring some "it's probably too large" value and then comparing if it is *greater* than that is ... extraordinarily confusing. Meanwhile, with signed types, it is quite easy to write asserts that check for non-negative values in the correct places. They are easy to read and produce easily understood errors. The boundary conditions are uncommon. Even on the C++ standards committee, there is remarkably strong consensus that in the *absence* of unsigned types coming back from `.size()` methods and such, we should be using signed types for the reasons above. The fact that we have unsigned `size_t` in a bunch of places is, IMO, a concern and it is important to have good ways of avoiding warnings. But I think we have so very many ways that don't require us to just use unsigned types everywhere and deal with the above issues: - Change the return types of our containers `size()` methods. - Add a `ssize()` method. (This is the direction the committee is moving AFAICT, but they are constrained by a powerful desire to break zero code, where as LLVM's containers have much more API freedom.) - Use idioms like the one I suggested with `llvm::seq`. Any or all of these seem significantly preferable to the readability concerns I outline above, at least to me. This is why I am still *strongly* in favor of signed types and assertions around value at known points where the value should obey that assertion. -Chandler On Wed, Jun 12, 2019 at 1:01 AM Renato Golin via llvm-dev < llvm-dev at lists.llvm.org> wrote:> +1 to both points here. > > On Wed, 12 Jun 2019, 07:55 Aaron Ballman via llvm-dev, < > llvm-dev at lists.llvm.org> wrote: > >> >> >> On Tue, Jun 11, 2019, 9:59 PM Zachary Turner <zturner at roblox.com> wrote: >> >>> On Tue, Jun 11, 2019 at 12:24 PM Mehdi AMINI <joker.eph at gmail.com> >>> wrote: >>> >>>> I agree that readability, maintainability, and ability to debug/find >>>> issues are key. >>>> I haven't found myself in a situation where unsigned was helping my >>>> readability: on the opposite actually I am always wondering where is the >>>> expecting wrap-around behavior and that is one more thing I have to keep in >>>> mind when I read code that manipulate unsigned. So YMMV but using unsigned >>>> *increases* my mental load when reading code. >>>> >>> I'm on the other end. I'm always reading the code wondering "is this >>> going to warn?" "Why could a container ever have a negative number of >>> elements?" "The maximum value representable by the return type (unsigned) >>> is larger than that of the value i'm storing it in (signed), so an overflow >>> could happen even if there were no error. What then?" >>> >> >> Strong +1 to this. >> >> ~Aaron >> >> >>> >>> On Tue, Jun 11, 2019 at 12:26 PM Michael Kruse <llvmdev at meinersbur.de> >>> wrote: >>> >>>> Am Di., 11. Juni 2019 um 11:45 Uhr schrieb Zachary Turner via llvm-dev >>>> <llvm-dev at lists.llvm.org>: >>>> > >>>> > I'm personally against changing everything to signed integers. To >>>> me, this is an example of making code strictly less readable and more >>>> confusing in order to fight deficiencies in the language standard. I get >>>> the problem that it's solving, but I view this as mostly a theoretical >>>> problem, whereas being able to read the code and have it make sense is a >>>> practical problem that we must face on a daily basis. If you change >>>> everything to signed integers, you may catch a real problem with it a >>>> couple of times a year. And by "real problem" here, I'm talking about a >>>> miscompile or an actual bug that surfaces in production somewhere, rather >>>> than a "yes, it seems theoretically possible for this to overflow". >>>> >>>> Doesn't it make it already worth it? >>>> >>> vector.size() returns a size_t, which on 64-bit platforms can represent >>> types values larger than those that can fit into an int64_t. So to turn >>> your argument around, since it's theoretically possible to have a vector >>> with more items than an int64_t can represent, isn't it already worth it to >>> use size_t, which is an unsigned type? >>> >>> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org >> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190612/40d574e3/attachment.html>