Phil Tomson via llvm-dev
2016-Dec-23 01:45 UTC
[llvm-dev] struct bitfield regression between 3.6 and 3.9 (using -O0)
Given that this is compiled with -O0, would there be a way to skip the
optimization of the Type-legalized selection DAG? It's fine until it
optimizes the Type-legalized selection DAG into the Optimized
Type-legalized selection DAG.

Phil

On Thu, Dec 22, 2016 at 10:29 AM, Friedman, Eli <efriedma at codeaurora.org> wrote:

> On 12/21/2016 4:45 PM, Phil Tomson via llvm-dev wrote:
>
>> Here's our testcase:
>>
>> #include <stdio.h>
>>
>> struct flags {
>>     unsigned frog: 1;
>>     unsigned foo : 1;
>>     unsigned bar : 1;
>>     unsigned bat : 1;
>>     unsigned baz : 1;
>>     unsigned bam : 1;
>> };
>>
>> int main() {
>>     struct flags flags;
>>     flags.bar = 1;
>>     flags.foo = 1;
>>     if (flags.foo == 1) {
>>         printf("Pass\n");
>>         return 0;
>>     } else {
>>         printf("FAIL\n");
>>         return 1;
>>     }
>> }
>>
>> When we compile this using LLVM 3.9 we get the "FAIL" message. However,
>> when we compile with LLVM 3.6 it passes. (This is only an issue with -O0;
>> higher levels of optimization work fine.)
>>
>> After some investigation we discovered the problem. Here's the relevant
>> part of our assembly generated by LLVM 3.9:
>>
>>     load r0, r510, 24, 8
>>     slr r0, r0, 1, 8
>>     cmpimm r0, r0, 1, 0, 8, SNE
>>     bitop1 r0, r0, 1<<0, AND, 64
>>     jct .LBB0_2, r0, 0, N
>>     jrel .LBB0_1
>>
>> Notice that the slr (shift logical right) instruction shifts right by 1
>> position in order to get flags.foo into bit 0 of r0. The problem is that
>> the compare (cmpimm) compares not just that single bit but the whole
>> value in r0 (an 8-bit value) against 1. If we insert a logical AND with
>> '1' to mask r0 just prior to the compare, it works fine.
>>
>> And as it turns out, the LLVM IR generated using -O0 and -emit-llvm does
>> include the *and*:
>>
>>     ...
>>     %bf.lshr = lshr i8 %bf.load4, 1
>>     *%bf.clear5 = and i8 %bf.lshr, 1*
>>     %bf.cast = zext i8 %bf.clear5 to i32
>>     %cmp = icmp eq i32 %bf.cast, 1
>>     br i1 %cmp, label %if.then, label %if.else
>>
>> (compiled with: clang -O0 -emit-llvm -S failing.c -o failing.ll )
>>
>> I reran passing -debug to llc to see what's happening at various stages
>> of DAG optimization:
>>
>>     clang -O0 -mllvm -debug -S failing.c -o failing.s
>>
>> The initial selection DAG has the *and* node:
>>
>>     t22: i8 = srl t19, Constant:i64<1>
>>     *t23: i8 = and t22, Constant:i8<1>*
>>     t24: i32 = zero_extend t23
>>     t27: i1 = setcc t24, Constant:i32<1>, seteq:ch
>>     t29: i1 = xor t27, Constant:i1<-1>
>>     t31: ch = brcond t18, t29, BasicBlock:ch<if.else 0xa5f8d48>
>>     t33: ch = br t31, BasicBlock:ch<if.then 0xa5f8c98>
>>
>> The Optimized lowered selection DAG does not contain the *and* node, but
>> it does have a truncate, which would seem to stand in for it given that
>> the result is only 1 bit wide and the xor following it operates on
>> 1-bit-wide values:
>>
>>     t22: i8 = srl t19, Constant:i64<1>
>>     t35: i1 = truncate t22
>>     t29: i1 = xor t35, Constant:i1<-1>
>>     t31: ch = brcond t18, t29, BasicBlock:ch<if.else 0xa5f8d48>
>>     t33: ch = br t31, BasicBlock:ch<if.then 0xa5f8c98>
>>
>> Next we get to the Type-legalized selection DAG:
>>
>>     t22: i8 = srl t19, Constant:i64<1>
>>     t40: i8 = xor t22, Constant:i8<1>
>>     t31: ch = brcond t18, t40, BasicBlock:ch<if.else 0xa5f8d48>
>>     t33: ch = br t31, BasicBlock:ch<if.then 0xa5f8c98>
>>
>> The truncate is now gone.
>>
>> Next we have the Optimized type-legalized selection DAG:
>>
>>     t22: i8 = srl t19, Constant:i64<1>
>>     t43: i8 = setcc t22, Constant:i8<1>, setne:ch
>>     t31: ch = brcond t18, t43, BasicBlock:ch<if.else 0xa5f8d48>
>>     t33: ch = br t31, BasicBlock:ch<if.then 0xa5f8c98>
>>
>> The *xor* has been replaced with a *setcc*. The legalized selection DAG
>> is essentially the same, as is the optimized legalized selection DAG.
>>
>> So if t19 contains 0b00000110, then t22 contains 0b00000011.
>> setcc then compares t22 with the constant 1, and since they're not equal
>> (setne) it sets bit 0 of t43.
>> brcond then tests bit 0 of t43, and since it's set it branches to the
>> else branch (prints FAIL in this case).
>>
>> If instead t22 contained 0b00000001 (as would be the case if the mask
>> were still there), the setcc would find both values equal, and since
>> setne is specified the branch in brcond would not be taken (the correct
>> behavior).
>>
>> Things seem to have gone wrong when the Type-legalized selection DAG was
>> optimized and the *xor* node was changed to a *setcc* (and actually, the
>> *xor* seems like it was more optimal than the *setcc* anyway).
>>
>> Any ideas about why this is happening?
>
> I would suggest starting with DAGTypeLegalizer::PromoteIntOp_BRCOND, I
> think...
>
> -Eli
>
> --
> Employee of Qualcomm Innovation Center, Inc.
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a
> Linux Foundation Collaborative Project
Friedman, Eli via llvm-dev
2016-Dec-23 03:11 UTC
[llvm-dev] struct bitfield regression between 3.6 and 3.9 (using -O0)
On 12/22/2016 5:45 PM, Phil Tomson wrote:

> Given that this is compiled with -O0, would there be a way to skip the
> optimization of the Type-legalized selection DAG? It's fine until it
> optimizes the Type-legalized selection DAG into the Optimized
> Type-legalized selection DAG.

Umm, I wouldn't really suggest shoving the problem under the rug... I
mean, turning off the optimization might make this particular testcase
work the way you want it to, but the problem will still be lurking,
waiting to be triggered by a different configuration.

There are patches floating around to turn off DAGCombine, and various
parts of it, at -O0; you should be able to find past email threads on
llvm-dev discussing it. IIRC it causes problems for various targets
because it exercises different codepaths.

-Eli
Phil Tomson via llvm-dev
2016-Dec-23 03:34 UTC
[llvm-dev] struct bitfield regression between 3.6 and 3.9 (using -O0)
On Thu, Dec 22, 2016 at 7:11 PM, Friedman, Eli <efriedma at codeaurora.org> wrote:

> On 12/22/2016 5:45 PM, Phil Tomson wrote:
>
>> Given that this is compiled with -O0, would there be a way to skip the
>> optimization of the Type-legalized selection DAG? It's fine until it
>> optimizes the Type-legalized selection DAG into the Optimized
>> Type-legalized selection DAG.
>
> Umm, I wouldn't really suggest shoving the problem under the rug... I
> mean, turning off the optimization might make this particular testcase
> work the way you want it to, but the problem will still be lurking,
> waiting to be triggered by a different configuration.

Possibly, but this testcase was distilled from some larger libraries
which we found to be failing. We need to be able to compile those larger
libraries correctly by year end (they need them built with -O0 -g, i.e.
a debug build; otherwise I'd tell them to just compile with -O2, which
works). Since this happens after the Type-legalized selection DAG is
optimized and prior to instruction selection, I'm thinking this is an
upstream LLVM bug and thus won't be fixed in time for us to get done what
we need to get done. Of course I could be wrong about this assessment;
how might this be a target-specific bug?

> There are patches floating around to turn off DAGCombine, and various
> parts of it, at -O0; you should be able to find past email threads on
> llvmdev discussing it. IIRC it causes problems for various targets
> because it exercises different codepaths.

I suspect the best short-term solution is probably to turn off DAGCombine
when -O0 is specified.

Phil