Are there any implicit assumptions about where alloca instructions
can appear?  I've got a failing test where the only difference between
a passing test and a failing test is one application of this code in
instcombine:

    // Convert: malloc Ty, C - where C is a constant != 1 into: malloc [C x Ty], 1

Seems pretty harmless to me.

Later on the instcombine code does this:

    // Scan to the end of the allocation instructions, to skip over a block of
    // allocas if possible...

That comment makes me a bit suspicious regarding assumptions about alloca
placement.

The interesting thing about this testcase is that the extra instcombine makes
the test pass.  If I omit it, the test fails.  The only differences in the
asm are stack offsets, which leads me to believe that in the failing test
codegen is not accounting for all allocas properly.

Before this critical instcombine the input code looks like this:

; Fails
    %"t$1" = alloca %DV1, align 8    ; <%DV1*> [#uses=3]
    %"t$2" = alloca [1 x [1 x <2 x float>]]*, i32 12, align 8    ; <[1 x [1 x <2 x float>]]**> [#uses=2]
    %tmpcast5 = bitcast [1 x [1 x <2 x float>]]** %"t$2" to %DV2*    ; <%DV2*> [#uses=2]
    %"t$34" = alloca [9 x [1 x <2 x float>]*], align 8    ; <[9 x [1 x <2 x float>]*]*> [#uses=3]

Afterward it looks like this:

; Passes
    %"t$1" = alloca %DV1, align 8    ; <%DV1*> [#uses=3]
    %"t$26" = alloca [12 x [1 x [1 x <2 x float>]]*], align 8    ; <[12 x [1 x [1 x <2 x float>]]*]*> [#uses=1]
    %"t$26.sub" = getelementptr [12 x [1 x [1 x <2 x float>]]*]* %"t$26", i32 0, i32 0    ; <[1 x [1 x <2 x float>]]**> [#uses=2]
    %tmpcast5 = bitcast [1 x [1 x <2 x float>]]** %"t$26.sub" to %DV2*    ; <%DV2*> [#uses=2]
    %"t$34" = alloca [9 x [1 x <2 x float>]*], align 8    ; <[9 x [1 x <2 x float>]*]*> [#uses=3]

Any thoughts on why this might be a problem?

-Dave
On Mon, Oct 12, 2009 at 1:07 PM, David Greene <dag at cray.com> wrote:
> Are there any implicit assumptions about where alloca instructions
> can appear.

Static allocas should appear as a continuous chunk in the entry block,
otherwise other passes might make bad assumptions.

> The interesting thing about this testcase is that the extra instcombine makes
> the test pass.  If I omit it, the test fails.  The only differences in the
> asm are stack offsets, which leads me to believe that in the failing test
> codegen is not accounting for all allocas properly.

If running a testcase through -instcombine -instcombine gives a result
that isn't identical to -instcombine, that's a bug.  Please file it if
you have a reduced testcase.

-Eli
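[Editor's sketch] To make the "continuous chunk" point concrete, here is a toy model of a pass that only scans the leading run of allocas in the entry block. It does not use LLVM's classes: ToyBlock, leadingAllocas, totalAllocas and missedAllocas are hypothetical names invented for this illustration, and the thread does not establish that any particular pass behaves this way.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model (assumption): an entry block is just a list of opcode names.
// This is a stand-in for illustration, not LLVM's BasicBlock.
using ToyBlock = std::vector<std::string>;

// Number of allocas seen by a scan that starts at the top of the block
// and stops at the first instruction that is not an alloca.
std::size_t leadingAllocas(const ToyBlock &entry) {
    std::size_t n = 0;
    while (n < entry.size() && entry[n] == "alloca")
        ++n;
    return n;
}

// Number of allocas anywhere in the block.
std::size_t totalAllocas(const ToyBlock &entry) {
    std::size_t n = 0;
    for (const std::string &op : entry)
        if (op == "alloca")
            ++n;
    return n;
}

// Allocas that a leading-run scan would overlook.
// Zero means the allocas form one contiguous chunk at the top.
std::size_t missedAllocas(const ToyBlock &entry) {
    const std::size_t leading = leadingAllocas(entry);
    const std::size_t total = totalAllocas(entry);
    assert(leading <= total);  // the leading run is a subset of all allocas
    return total - leading;
}
```

With the opcode order alloca, alloca, bitcast, alloca (the shape of the "Fails" snippet in the original post), such a scan would overlook one alloca. The "Passes" snippet also has an alloca after non-alloca instructions, so this sketch illustrates the assumption only; it is not a diagnosis of the failure.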
Hi,

> On Mon, Oct 12, 2009 at 1:07 PM, David Greene <dag at cray.com> wrote:
>> Are there any implicit assumptions about where alloca instructions
>> can appear.
>
> Static allocas should appear as a continuous chunk in the entry block,
> otherwise other passes might make bad assumptions.

An alloca can appear anywhere, but when allocas are outside the entry
block some optimizations may not occur.  The important distinction
is between allocas that appear in a loop and those that do not.
Rather than detect loops, optimizers tend to just check whether allocas
are in the entry block or not (the entry block is never part of a loop).

Ciao,

Duncan.
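[Editor's sketch] The entry-block shortcut described above can be modelled in a few lines. This is a toy, not LLVM code: ToyAlloca, isTreatedAsStatic and countStatic are hypothetical names, and the constant-size requirement is an added assumption that the message itself does not mention.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy description of an alloca (assumption: not LLVM's AllocaInst).
struct ToyAlloca {
    std::size_t block;   // index of the containing block; 0 is the entry block
    bool constantSize;   // true when the element count is a compile-time constant
};

// Conservative check: no loop detection at all. An alloca is treated as
// "static" only if it sits in the entry block, which can never be part of
// a loop, and (added assumption) has a constant size.
bool isTreatedAsStatic(const ToyAlloca &a) {
    return a.block == 0 && a.constantSize;
}

// Count how many allocas the conservative check accepts.
std::size_t countStatic(const std::vector<ToyAlloca> &allocas) {
    std::size_t n = 0;
    for (const ToyAlloca &a : allocas)
        if (isTreatedAsStatic(a))
            ++n;
    return n;
}
```

The check is deliberately pessimistic: an alloca in a later block that is not inside any loop is still rejected, which is the missed-optimization cost mentioned above.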
On Monday 12 October 2009 22:12, Eli Friedman wrote:
> On Mon, Oct 12, 2009 at 1:07 PM, David Greene <dag at cray.com> wrote:
> > Are there any implicit assumptions about where alloca instructions
> > can appear.
>
> Static allocas should appear as a continuous chunk in the entry block,
> otherwise other passes might make bad assumptions.

Ok, we should document this.

> > The interesting thing about this testcase is that the extra instcombine
> > makes the test pass.  If I omit it, the test fails.  The only differences
> > in the asm are stack offsets, which leads me to believe that in the
> > failing test codegen is not accounting for all allocas properly.
>
> If running a testcase through -instcombine -instcombine gives a result
> that isn't identical to -instcombine, that's a bug.  Please file it if
> you have a reduced testcase.

No, that's not what I'm doing.  I'm limiting the number of transformations
instcombine does in order to do a binary search and narrow down the
specific transformation that causes the problem (or in this case, masks it).

-Dave
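[Editor's sketch] The binary search over a transformation limit could look roughly like the following. Everything here is hypothetical: firstFlippingTransform and the outcomeWithLimit callback are invented names, and the callback stands for "build and run the test with at most `limit` instcombine transformations applied", however that limit is actually enforced. The sketch also assumes the test outcome flips exactly once as the limit grows.

```cpp
#include <cassert>
#include <functional>

// Returns the smallest limit in (lo, hi] whose outcome differs from the
// outcome at `lo`. Under the single-flip assumption, that limit identifies
// the transformation that causes (or masks) the problem.
unsigned firstFlippingTransform(
    unsigned lo, unsigned hi,
    const std::function<bool(unsigned)> &outcomeWithLimit) {
    const bool baseline = outcomeWithLimit(lo);
    const bool atHi = outcomeWithLimit(hi);
    assert(lo < hi && atHi != baseline);  // the flip must lie inside (lo, hi]
    (void)atHi;                           // silence unused warning under NDEBUG

    // Invariant: outcome(lo) == baseline and outcome(hi) != baseline.
    while (hi - lo > 1) {
        const unsigned mid = lo + (hi - lo) / 2;
        if (outcomeWithLimit(mid) == baseline)
            lo = mid;   // flip happens after mid
        else
            hi = mid;   // flip happens at or before mid
    }
    return hi;
}
```

In the situation described above, the outcome would be "fails" for small limits and "passes" once the masking transformation is allowed, so the returned limit points at that transformation. Each probe costs one build-and-run, so about log2(hi - lo) probes are needed.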