I'm having trouble understanding how APInt should be used. The APInt documentation states that it 'is a functional replacement for common case unsigned integer type', but I'm not seeing this, because the internal logic always treats the value as negative if the most significant bit is set. I'm interested in an add or sub where one operand could be negative.

I have the following snippet of code to demonstrate the issue:

    APInt Positive(8, 128, false);
    APInt Negative(8, -128, true);
    LLVM_DEBUG(dbgs() << "Positive: " << Positive << "\n");
    LLVM_DEBUG(dbgs() << "Negative: " << Negative << "\n");
    LLVM_DEBUG(dbgs() << "0 + Positive = " << 0 + Positive << "\n");
    LLVM_DEBUG(dbgs() << "0 - Positive = " << 0 - Positive << "\n");
    LLVM_DEBUG(dbgs() << "0 + Negative = " << 0 + Negative << "\n");
    LLVM_DEBUG(dbgs() << "0 - Negative = " << 0 - Negative << "\n");

The output is:

    Positive: -128
    Negative: -128
    0 + Positive = -128
    0 - Positive = -128
    0 + Negative = -128
    0 - Negative = -128

I know there are operators for when the sign matters, but from my example, either my understanding or the functionality is broken. If an abstract structure exists, why does the MSB still represent the sign? Especially when it's supposed to be an unsigned type!

Thanks,
Sam

Sam Parker
Compilation Tools Engineer | Arm
Arm.com

On Thu, 31 Jan 2019 at 12:56, Sam Parker via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> The APInt documentation states that it 'is a functional replacement for common
> case unsigned integer type', but I'm not seeing this because the internal logic
> is that the value is always treated as negative if the most significant bit is
> set.

I take that as saying it's a 2s-complement type rather than overflow being UB, but the statement may still be misleading.

> I know there are operators for when the sign matters, but from my example,
> either my understanding or the functionality is broken.

It's definitely quirky that it's always printed as a signed integer. My guess would be that it stems from a very early decision about the friendliest way to print IR's iN types, which was probably its first use case (i.e. most people would prefer to see i64 -1 over i64 18446744073709551615). But I haven't done the archaeology to confirm it.

> If an abstract
> structure exists, why does the MSB still represent the sign? Especially
> when it's supposed to be an unsigned type!

I think it'd be more correct to say it's an arbitrary precision type that could be of either signedness (again, much like LLVM's iN). There's a separate APSInt for a type that genuinely is either signed or unsigned in all cases.

Cheers.

Tim.

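The point that the bits themselves carry no signedness can be illustrated without LLVM. The following is a minimal sketch in standard C++, not APInt itself: `Bits8`, `fromValue`, `asUnsigned`, `asSigned`, `render` and `zeroMinus` are hypothetical names invented for this illustration, and the fixed 8-bit width is an assumption taken from the example in the original message. The two reader functions are meant to be roughly analogous to APInt's `getZExtValue()` and `getSExtValue()`, and `render` to `APInt::print(OS, isSigned)`.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for an 8-bit APInt: it stores bits only,
// with no record of whether they were meant as signed or unsigned.
struct Bits8 {
  std::uint8_t bits;
};

// Truncate any integer to 8 bits. Both 128 and -128 end up as 0x80.
inline Bits8 fromValue(std::int64_t v) {
  return Bits8{static_cast<std::uint8_t>(v)};
}

// Read the bits as an unsigned value (zero-extension).
inline std::uint64_t asUnsigned(Bits8 v) { return v.bits; }

// Read the same bits as a two's-complement value (sign-extension).
inline std::int64_t asSigned(Bits8 v) {
  return v.bits >= 0x80 ? static_cast<std::int64_t>(v.bits) - 256
                        : static_cast<std::int64_t>(v.bits);
}

// The caller chooses the interpretation at print time.
inline std::string render(Bits8 v, bool isSigned) {
  return isSigned ? std::to_string(asSigned(v))
                  : std::to_string(asUnsigned(v));
}

// 0 - v with wrap-around at 8 bits (modular arithmetic).
inline Bits8 zeroMinus(Bits8 v) {
  return Bits8{static_cast<std::uint8_t>(0u - v.bits)};
}
```

Under these assumptions, `fromValue(128)` and `fromValue(-128)` hold identical bits, and `zeroMinus` maps 0x80 back to 0x80, which matches every line of the output in the original message being the same. Whether that pattern reads as 128 or -128 depends only on the flag passed to `render`.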
Cheers Tim,

The real problem is not just in the printing, though: any code can misinterpret the true value if it queries isNegative(), and negate() on my example value also produces the original value.

I didn't know about APSInt. It seems I have been misled, and I think I will have to go back to some of my past patches... I know I'm not the only one to be caught out by this behaviour, though; APSInt looks like a safer type to use.

Thanks again,

Sam Parker
Compilation Tools Engineer | Arm
Arm.com

________________________________
From: Tim Northover <t.p.northover at gmail.com>
Sent: 31 January 2019 13:25:32
To: Sam Parker
Cc: llvm-dev at lists.llvm.org; nd
Subject: Re: [llvm-dev] Behaviour of APInt
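The isNegative() and negate() concerns raised above can be sketched in the same spirit. The code below is a hedged illustration in standard C++ of the general idea of storing signedness next to the bits; it is not APSInt's implementation, and `Tagged8`, `value`, `isNegative` and `widenedNegate` are hypothetical names assumed for this example only.

```cpp
#include <cstdint>

// Hypothetical 8-bit value that remembers how its bits should be read.
struct Tagged8 {
  std::uint8_t bits;
  bool isUnsigned;
};

// Numeric value under the recorded signedness.
inline int value(Tagged8 v) {
  if (v.isUnsigned || v.bits < 0x80)
    return v.bits;
  return static_cast<int>(v.bits) - 256;
}

// Only a signed value with the top bit set counts as negative.
inline bool isNegative(Tagged8 v) {
  return !v.isUnsigned && (v.bits & 0x80) != 0;
}

// Negate in a wider type so that -(-128) = 128 stays representable,
// instead of wrapping back to the original 8-bit pattern.
inline std::int16_t widenedNegate(Tagged8 v) {
  return static_cast<std::int16_t>(-value(v));
}
```

With the signedness recorded, the pattern 0x80 is negative only when it was declared signed, and widening before negation gives 128 for the signed case and -128 for the unsigned case rather than returning the input unchanged.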