Jakob Stoklund Olesen
2013-Aug-05 16:19 UTC
[LLVMdev] Missing optimization - constant parameter
On Aug 5, 2013, at 8:34 AM, Maurice Marks <maurice.marks at gmail.com> wrote:

> Are you sure that's it? I commented that block out, rebuilt llvm 3.3, and it still duplicates the constant.
> My concern is that long constant loads increase code size and if they can be avoided by better targeting it would be a win. My project's application of llvm tends to use a lot of long constants so this can be a significant optimization.
> I'll do some more debugging now that you have pointed me in the right direction.

It is also possible that the coalescer is duplicating the instruction like Rafael suggested. It will do that if it can't eliminate a copy.

/jakob
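To make the "duplicating the instruction" behaviour concrete, here is a minimal toy sketch in Python. It is not LLVM's RegisterCoalescer: the `Instr` class, the `REMATERIALIZABLE` set and the `can_coalesce` callback are all hypothetical names invented for illustration. It only models the idea stated above, that a copy which cannot be eliminated may be replaced by a clone of the cheap instruction that defined its source.

```python
from dataclasses import dataclass

@dataclass
class Instr:
    opcode: str          # e.g. "MOV64ri" or "COPY"
    dst: str             # register defined by this instruction
    src: object = None   # immediate for MOV64ri, source register for COPY

# Assumption: opcodes this toy model treats as cheap enough to re-execute.
REMATERIALIZABLE = {"MOV64ri"}

def rematerialize_copies(instrs, can_coalesce):
    """Replace each copy that cannot be coalesced with a clone of the
    instruction that defined its source, when that definition is cheap."""
    defs = {}   # register -> instruction that last defined it
    out = []
    for ins in instrs:
        if ins.opcode == "COPY" and not can_coalesce(ins.dst, ins.src):
            d = defs.get(ins.src)
            if d is not None and d.opcode in REMATERIALIZABLE:
                # Duplicate the defining instruction, retargeted at the copy's dst.
                ins = Instr(d.opcode, ins.dst, d.src)
        out.append(ins)
        defs[ins.dst] = ins
    return out

prog = [
    Instr("MOV64ri", "%vreg0", 12345123400),
    Instr("COPY", "%RDI", "%vreg0"),
]
# Pretend the coalescer refuses every merge, as in the case discussed here.
result = rematerialize_copies(prog, can_coalesce=lambda dst, src: False)
```

With the merge refused, the second instruction in `result` is a second `MOV64ri` into `%RDI`, which is the duplicated constant load being discussed.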
I went back to this problem after looking at some other things. Turning on debugging, I noticed that the register coalescer is trying to do the "right thing" and merge vreg0 (the previous load of the constant) with the first parameter of the call (%rdi), which is exactly what gcc does in this case, eliminating a second constant load. But it's refusing to do the merge.

Here's the debug output for that part of the compilation:

********** SIMPLE REGISTER COALESCING **********
********** Function: caller
********** JOINING INTERVALS ***********
entry:
64B     %RDI<def> = COPY %vreg0; GR64:%vreg0
        Considering merging %vreg0 with %RDI
        Can only merge into reserved registers.
Remat: %RDI<def> = MOV64ri 12345123400
Shrink: [32r,64r:0)  0@32r
Shrunk: [32r,48r:0)  0@32r
Trying to inflate 0 regs.
********** INTERVALS **********
%vreg0 = [32r,48r:0)  0@32r
RegMasks: 80r

Jakob, what does "Can only merge into reserved registers" mean in this instance? I don't see any reason for it not to do the merge.
Jakob Stoklund Olesen
2013-Oct-10 16:16 UTC
[LLVMdev] Missing optimization - constant parameter
On Oct 10, 2013, at 7:59 AM, Maurice Marks <maurice.marks at gmail.com> wrote:

> I went back to this problem after looking at some other things. Turning on debugging I noticed that the register coalescer is trying to do the "right thing" and merge vreg0 (the previous load of the constant) with the first parameter of the call (%rdi), which is exactly what gcc does in this case - eliminating a second constant load, but it's refusing to do the merge:
>
> Here's the debug output for that part of the compilation:
>
> ********** SIMPLE REGISTER COALESCING **********
> ********** Function: caller
> ********** JOINING INTERVALS ***********
> entry:
> 64B     %RDI<def> = COPY %vreg0; GR64:%vreg0
>         Considering merging %vreg0 with %RDI
>         Can only merge into reserved registers.
> Remat: %RDI<def> = MOV64ri 12345123400
> Shrink: [32r,64r:0)  0@32r
> Shrunk: [32r,48r:0)  0@32r
> Trying to inflate 0 regs.
> ********** INTERVALS **********
> %vreg0 = [32r,48r:0)  0@32r
> RegMasks: 80r
>
> Jakob, what does "can only merge into reserved registers" mean in this instance. I don't see any reason for it not to do the merge.

The coalescer won’t merge virtual and allocatable physical registers because that will extend the live range of the physical registers, constraining register allocation.

Instead, the register allocator will attempt to choose %rdi for %vreg0 so the copy can be eliminated after register allocation.

/jakob
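As a rough illustration of the hinting idea described above, the following Python sketch picks a physical register for a virtual register, preferring the register its COPY targets. It is a toy model under stated assumptions, not LLVM's allocator: the `assign` function, the half-open `(start, end)` ranges and the candidate order are hypothetical, and the slot numbers are only loosely borrowed from the debug output.

```python
def overlaps(a, b):
    """True if two half-open ranges [start, end) intersect."""
    return a[0] < b[1] and b[0] < a[1]

def assign(vreg_range, hint, busy, candidates):
    """Choose a physical register for a virtual register.

    vreg_range: (start, end) live range of the virtual register
    hint:       physical register connected to the vreg by a COPY
    busy:       dict mapping physreg -> list of ranges where it is occupied
    candidates: fallback allocation order (an assumption of this sketch)
    """
    def free(reg):
        return not any(overlaps(vreg_range, r) for r in busy.get(reg, []))

    if free(hint):
        return hint          # the COPY becomes an identity copy and can go away
    for reg in candidates:
        if free(reg):
            return reg       # hint unavailable, so the COPY stays
    return None              # a real allocator would spill or split here

order = ["RAX", "RCX", "RDI"]

# %vreg0 is live over [32, 64); %RDI is only needed from the copy to the call.
hinted = assign((32, 64), "RDI", {"RDI": [(64, 80)]}, order)

# If something else occupies %RDI while %vreg0 is live, the hint cannot be used.
blocked = assign((32, 64), "RDI", {"RDI": [(40, 50)]}, order)
```

In the first case the hint is honoured, so `%RDI = COPY %vreg0` would turn into `%RDI = COPY %RDI` and could be deleted. In the second case the allocator falls back to another register and the copy remains. Deferring the decision this way keeps %rdi's live range short until the allocator knows whether the longer range is harmless.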