Esperanza de Escobar
2012-Jan-24 16:36 UTC
[LLVMdev] Use of 'ldrd' instructions with unaligned addresses on armv7 (Major bug in LLVM optimizer?)
No one is arguing that there aren't ABI specs or LLVM design guidelines that say that unaligned accesses "should not", "could not" or "aren't guaranteed to" work, because it's beside the point.

The point is that unaligned 32-bit loads and stores *work in practice* on every single ARM device Apple has ever manufactured. I'm not a hardware person, but I'm guessing it takes a non-negligible amount of silicon to support them. With Xcode's switch to LLVM, this deployed silicon has suddenly become off-limits because of a single overzealous optimization. The only possible workarounds are assembly code and turning the optimizer off altogether.

It would be one thing if the optimizer generated ldrd/strd for 64-bit loads and stores only. But it actually goes as far as taking two separate 32-bit accesses and merging them into one silently-incompatible 64-bit access. These two accesses could be unrelated to one another in the context of the code at hand, and even syntactically distant.

This introduces a stealth crash into Apple-only code that was bug-free under gcc. And realizing what's going on requires familiarity with ARM assembly. With a bit of googling you'll find other support board posts asking "Why is my code suddenly crashing after upgrading Xcode?"

Unfortunately, there is no way to turn this optimization off. Not to mention that it's not much of an optimization at all. I'd be surprised if you could measure a performance improvement on a real-world program with fewer than a million iterations, maybe orders of magnitude more.

You can measure its damage pretty significantly, though. We've already spent a lot of hours tracking down unaligned accesses and wrapping them in assembly macros. Which of course ends up disabling other, actually useful optimizations. And with a large codebase, we can't be sure we've found every last one yet.

Whether this optimization is academically acceptable or not, its net impact in real-world terms is exceedingly negative.
EdE

On 1/24/12, David Blaikie <dblaikie at gmail.com> wrote:
>> Note that this code compiled with GCC 4.2 runs perfectly whereas LLVM will
>> produce a binary that crashes: LLVM breaks existing source code.
>
> On this point:
> This is not uncommon - and the very nature of "Undefined Behaviour".
> This reason alone is not enough to justify a change to Clang. We/you
> would need to show that the behaviour is defined & Clang is violating
> that definition.
>
> Chris has posted about the general principle at length starting here:
> http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
Joerg Sonnenberger
2012-Jan-24 17:15 UTC
[LLVMdev] Use of 'ldrd' instructions with unaligned addresses on armv7 (Major bug in LLVM optimizer?)
On Tue, Jan 24, 2012 at 08:36:17AM -0800, Esperanza de Escobar wrote:
> No one is arguing that there aren't ABI specs or LLVM design
> guidelines that say that unaligned accesses "should not", "could not"
> or "aren't guaranteed to" work, because it's beside the point.

No, it is the core of the issue. Standard C gives the compiler certain guarantees, and one of them is correct alignment of pointers to whatever the platform wants. On some architectures this is normally not enforced (x86, ppc), on some a violation results in traps (SPARC), and on some it results in unexpected behavior. Early ARM generations, for example, fall into the last category.

> This introduces a stealth crash into Apple-only code that was bug-free
> under gcc. And realizing what's going on requires familiarity with ARM
> assembly. With a bit of googling you'll find other support board posts
> asking "Why is my code suddenly crashing after upgrading Xcode?"

Your code is buggy. Stop justifying it by saying that GCC doesn't utilise optimisation potential. It has been proven over and over again that half of the cases where GCC misoptimises code turned out to be completely broken assumptions by the code in question. Newer versions of GCC have tended to expose new bugs. It's not a problem in the compiler. I wouldn't be surprised if you get the same behavior with a recent GCC, too.

> You can measure its damage pretty significantly, though. We've already
> spent a lot of hours tracking down unaligned accesses and wrapping
> them in assembly macros. Which of course ends up disabling other,
> actually useful optimizations. And with a large codebase, we can't be
> sure we've found every last one yet.

There is a warning about casts that change the alignment for a reason. Sorry, but I have absolutely no sympathy for this case.

Joerg
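For a large codebase where, as described above, not every unaligned access has been found yet, a runtime check in debug builds can complement the compile-time warning (in GCC and Clang that warning is enabled with -Wcast-align). The following is only a minimal sketch: the helper name is_aligned is made up for illustration and does not come from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns true when p is a multiple of `alignment`.
 * `alignment` must be a non-zero power of two (4 for a 32-bit word,
 * 8 for a doubleword). */
bool is_aligned(const void *p, size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```

Something like assert(is_aligned(ptr, sizeof(uint32_t))) placed just before a char*-to-int* cast would turn a silent misaligned access into an immediate, easy-to-locate failure during testing.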
Joerg Sonnenberger
2012-Jan-24 17:29 UTC
[LLVMdev] Use of 'ldrd' instructions with unaligned addresses on armv7 (Major bug in LLVM optimizer?)
BTW as the question was asked on IRC: it is possible to force LLVM to forget about the natural alignment of pointers. Examples are the packed attribute on structures or using the align attribute on pointers. Consider the attached example for the latter. Notice the significant difference in code size...

Joerg

-------------- next part --------------
A non-text attachment was scrubbed...
Name: unalignedptr.c
Type: text/x-csrc
Size: 106 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20120124/5cfbce64/attachment.c>
-------------- next part --------------
        .syntax unified
        .eabi_attribute 20, 1
        .eabi_attribute 21, 1
        .eabi_attribute 23, 3
        .eabi_attribute 24, 1
        .eabi_attribute 25, 1
        .file   "unalignedptr.c"
        .text
        .globl  foo
        .align  2
        .type   foo,%function
foo:
        ldr     r0, .LCPI0_0
        ldrb    r1, [r0]
        ldrb    r2, [r0, #1]
        ldrb    r3, [r0, #2]
        ldrb    r0, [r0, #3]
        orr     r0, r3, r0, lsl #8
        orr     r1, r1, r2, lsl #8
        orr     r0, r1, r0, lsl #16
        ldr     r0, [r0]
        bx      lr
        .align  2
.LCPI0_0:
        .long   f
.Ltmp0:
        .size   foo, .Ltmp0-foo

        .globl  foo2
        .align  2
        .type   foo2,%function
foo2:
        ldr     r0, .LCPI1_0
        ldr     r0, [r0]
        ldr     r0, [r0]
        bx      lr
        .align  2
.LCPI1_0:
        .long   g
.Ltmp1:
        .size   foo2, .Ltmp1-foo2

        .type   f,%object
        .comm   f,4,4
        .type   g,%object
        .comm   g,4,4
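The attached unalignedptr.c was scrubbed from the archive, so its contents are not known. As an illustration of the other option Joerg mentions, the packed attribute on structures, here is a small sketch. The struct wire_record and the function record_value are invented for this example; __attribute__((packed)) is a GCC/Clang extension, not standard C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: a one-byte tag followed immediately by a 32-bit
 * value. With the packed attribute there is no padding, so `value`
 * sits at offset 1 and is not naturally aligned. */
struct __attribute__((packed)) wire_record {
    uint8_t  tag;
    uint32_t value;
};

/* Because the compiler knows the member may be misaligned, it must
 * generate an access that is safe on the target, instead of assuming
 * 4-byte alignment for the load. */
uint32_t record_value(const struct wire_record *r) {
    assert(r != NULL);
    return r->value;
}
```

The cost is the one visible in the listing above: on targets where the compiler cannot assume hardware support for unaligned loads, such an access may be expanded into several byte loads.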
James Molloy
2012-Jan-24 17:31 UTC
[LLVMdev] Use of 'ldrd' instructions with unaligned addresses on armv7 (Major bug in LLVM optimizer?)
Hi Esperanza,

> No one is arguing that there aren't ABI specs or LLVM design
> guidelines that say that unaligned accesses "should not", "could not"
> or "aren't guaranteed to" work, because it's beside the point.

The C standard, 6.3.2.3 clause 7, states:

"A pointer to an object or incomplete type may be converted to a pointer to a different object or incomplete type. If the resulting pointer is not correctly aligned for the pointed-to type, the behaviour is undefined."

Your program is exhibiting undefined behaviour, and just because others have also written code that exhibits the same undefined behaviour does not make yours right.

> It would be one thing if the optimizer generated ldrd/strd for 64-bit
> loads and stores only. But it actually goes as far as taking two
> separate 32-bit accesses and merging them into one
> silently-incompatible 64-bit access.

That's two accesses *to the same array*, right? Which has incorrect alignment. If you really want to force this behaviour, you can possibly mark the array as an array of volatile ints. But it's still undefined and has no guarantee.

> The point is that unaligned 32-bit loads and stores *work in practice*
> on every single ARM device Apple has ever manufactured.

They're slow and can cover multiple cache lines. They require two memory accesses in the worst case for a word access. If you don't care about having optimised code, why not turn optimisations off?

TL;DR: Don't blindly cast between char*/void* and int*. The fact that it works on x86, and that recent ARM hardware supports unaligned access, does not make it standards compliant.

Cheers,

James
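One way to follow the advice against blindly casting char*/void* to int* without resorting to assembly is to copy the bytes with memcpy, which is defined for any source alignment. The thread itself does not propose this; it is a commonly used portable idiom, sketched here with made-up helper names load_u32 and store_u32.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit value from an address with no
 * alignment guarantee. memcpy copies byte-wise as far as the C
 * standard is concerned, so no misaligned int* is ever formed. */
uint32_t load_u32(const void *p) {
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical helper: write a 32-bit value to a possibly unaligned
 * address, in the host's native byte order. */
void store_u32(void *p, uint32_t v) {
    assert(p != NULL);
    memcpy(p, &v, sizeof v);
}
```

Optimizing compilers typically turn such fixed-size copies into whatever load or store sequence is legal for the target, so the source stays within defined behaviour while the code generator remains free to pick the instructions.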
Joe Groff
2012-Jan-24 17:43 UTC
[LLVMdev] Use of 'ldrd' instructions with unaligned addresses on armv7 (Major bug in LLVM optimizer?)
On Tue, Jan 24, 2012 at 8:36 AM, Esperanza de Escobar <esperanzitadeescobar at gmail.com> wrote:
> You can measure its damage pretty significantly, though. We've already
> spent a lot of hours tracking down unaligned accesses and wrapping
> them in assembly macros. Which of course ends up disabling other,
> actually useful optimizations. And with a large codebase, we can't be
> sure we've found every last one yet.

Instead of resorting to assembler, have you tried __attribute__((align(n))) ?

-Joe
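In GCC and Clang the attribute is spelled aligned rather than align. A sketch of how it might replace an assembly macro follows; the typedef name unaligned_u32 and both helper functions are hypothetical, and the added may_alias attribute is an assumption of this example (it lets the typedef be used on byte buffers without violating the aliasing rules), not something suggested in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A 32-bit integer type whose required alignment is lowered to 1.
 * Used in a typedef, GCC and Clang allow `aligned` to decrease
 * alignment; this is a compiler extension, not standard C. */
typedef uint32_t unaligned_u32 __attribute__((aligned(1), may_alias));

/* Hypothetical helper: load through a pointer the compiler may not
 * assume to be 4-byte aligned. */
uint32_t load_unaligned_u32(const void *p) {
    assert(p != NULL);
    return *(const unaligned_u32 *)p;
}

/* Hypothetical helper: the matching store. */
void store_unaligned_u32(void *p, uint32_t v) {
    assert(p != NULL);
    *(unaligned_u32 *)p = v;
}
```

Because the reduced alignment is part of the pointed-to type, the compiler has no basis for assuming that two neighbouring accesses made this way are 4-byte aligned.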