Reed Kotler
2013-Sep-17 19:36 UTC
[LLVMdev] [llvm] r190328 - Revert patches to add case-range support for PR1255.
Hi Bob,

This has turned out to be what appears to be a very obscure binutils bug. I'm working on a test case for it now.

I have a patch for Mips16 llvm which works around the issue for now.

In general, pure RISC architectures have no pity for compiler and toolchain developers. Mips16 is way more extreme in this way than mips32.

In mips32, there is no PC register and there are no PC-relative instructions. To make PIC work, the caller loads register T9 with the address of the function that is being called.

To reference the GOT, you then have to emit a two-instruction sequence as the first two instructions of the function so that this T9 register can be used to load from the GOT.

In mips16 there is no PC register either, but there are some PC-relative instructions, so in principle this two-instruction sequence can be placed anywhere (and if we know we are calling another mips16 function then we don't need to load T9 even though it's part of the PIC ABI).

Gcc for mips16, though, always places this sequence as the first two instructions of the function.

In llvm, I did not do that because there is no reason to be tied to that; if it happens it happens, but it does not need to be like that, and you can end up with slower code in that case. The full GOT pointer (GP) load takes 4 instructions, so if you have paths in the function which don't make external calls, then you are paying a lot to force all paths to execute this sequence.

But the calculation for these kinds of external symbol offsets is complicated in mips, and if you are not careful you can end up with some strange boundary condition errors.

This seems to be what the problem is here.

When the two-instruction sequence is placed at the beginning of the function, you are guaranteed that it is longword aligned, but in other places in mips16 code you can have instructions starting on halfword boundaries.

That is what appears to be the problem.

In principle, you can put this sequence anywhere, but in some strange cases, if it is not longword aligned, the address calculation overflows and comes out wrong.

My patch forces these sequences to be longword aligned and it fixes the problem.

Reed
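A minimal sketch of the alignment arithmetic behind a workaround of this kind, assuming instruction addresses that are always at least halfword aligned. The helper names and the addresses are hypothetical; this is not the actual llvm patch, only an illustration of why a longword-aligned sequence sidesteps any question about the two low address bits.

```python
LONGWORD = 4  # bytes

def padding_to_longword(addr: int) -> int:
    """Bytes of padding needed before addr so that it becomes a multiple of 4."""
    return -addr % LONGWORD

def clear_low_two_bits(addr: int) -> int:
    """Address with its two low bits cleared."""
    return addr & ~(LONGWORD - 1)

# Hypothetical addresses: one already longword aligned, one on a halfword boundary.
for addr in (0x400100, 0x400102):
    pad = padding_to_longword(addr)
    aligned = addr + pad
    # After padding, clearing the low two bits no longer changes the address.
    unchanged = clear_low_two_bits(aligned) == aligned
    print(hex(addr), "padding:", pad, "aligned:", hex(aligned), "mask is no-op:", unchanged)
```

With halfword-aligned input the padding is either 0 or 2 bytes, so the cost of forcing alignment is at most one extra halfword per sequence.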
Reed Kotler
2013-Sep-19 19:40 UTC
[LLVMdev] [llvm] r190328 - Revert patches to add case-range support for PR1255.
For reasons that are too long to explain here, for our testing of llvm for Mips, we use an older binutils and libc.

This apparent regression was actually a linker bug related to Mips16 PIC that has since been fixed.

The bug is fixed by this patch:

2011-12-13 Chung-Lin Tang <cltang at codesourcery.com>

* elfxx-mips.c (mips_elf_calculate_relocation): Correct R_MIPS16_HI16/R_MIPS16_LO16 handling of two cleared lower bits, update comments.

https://sourceware.org/ml/binutils/2011-12/msg00123.html

I have pushed a workaround patch to llvm for Mips16 which deals with this.

The original patch reversion just coincidentally triggered this bug.

I think you are more likely to flip a coin 20 times and get all heads than to run into this address boundary problem.

Reed
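A hedged sketch of how a boundary error of this kind can arise with a hi/lo relocation pair. It assumes the "two cleared lower bits" in the changelog refers to a PC-relative base whose low two bits are masked off, and it models a hypothetical mismatch in which the high half is computed from the unmasked address while the low half uses the masked one. The function names and addresses are made up for illustration; this is not the binutils code.

```python
MASK32 = 0xFFFFFFFF

def hi16(value: int) -> int:
    """Upper 16 bits, rounded so that a sign-extended low half adds back correctly."""
    return ((value + 0x8000) >> 16) & 0xFFFF

def lo16(value: int) -> int:
    """Lower 16 bits of the value."""
    return value & 0xFFFF

def sign_extend16(half: int) -> int:
    """Interpret a 16-bit field as a signed quantity."""
    return half - 0x10000 if half & 0x8000 else half

def combine(hi: int, lo: int) -> int:
    """What the instruction sequence reconstructs: (hi << 16) + signed lo."""
    return ((hi << 16) + sign_extend16(lo)) & MASK32

def offset_from(gp: int, pc: int, clear_low_bits: bool) -> int:
    """Offset from the PC-relative base to gp, with or without masking the base."""
    base = pc & ~3 if clear_low_bits else pc
    return (gp - base) & MASK32

def mismatched(gp: int, pc: int) -> int:
    """Hypothetical bug: hi half from the unmasked base, lo half from the masked base."""
    hi = hi16(offset_from(gp, pc, clear_low_bits=False))
    lo = lo16(offset_from(gp, pc, clear_low_bits=True))
    return combine(hi, lo)

# (gp, pc) pairs: halfword-aligned pc away from a carry boundary,
# halfword-aligned pc exactly at one, and a longword-aligned pc.
for gp, pc in ((0x418010, 0x400002), (0x418000, 0x400002), (0x418004, 0x400004)):
    expected = offset_from(gp, pc, clear_low_bits=True)
    print(hex(pc), "expected", hex(expected), "got", hex(mismatched(gp, pc)))
```

In this model the two-byte disagreement is harmless for almost every offset and only shows up, as an error of 0x10000, when the offset sits right at the point where the low half carries into the high half. With a longword-aligned address the two bases coincide and the mismatch cannot occur.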