Hi all,
I am slowly working on a SwitchInst optimizer (http://llvm.org/PR8125)
and now I am running into a deficiency of the x86
peephole optimizer (or jump-threader?). Here is what I get:
    andl $3, %edi
    je .LBB0_4
# BB#2: # %nz
# in Loop: Header=BB0_1 Depth=1
    cmpl $2, %edi
    je .LBB0_6
# BB#3: # %nz.non-middle
# in Loop: Header=BB0_1 Depth=1
    cmpl $2, %edi
    jbe .LBB0_4
# BB#5: # %sw.bb6
    ret
The second 'cmpl' is totally redundant. Which pass is
(or would be) in charge of removing it?
Cheers,
Gabor
On Oct 6, 2010, at 6:16 PM, Gabor Greif wrote:

> Hi all,
>
> I am slowly working on a SwitchInst optimizer (http://llvm.org/PR8125)
> and now I am running into a deficiency of the x86
> peephole optimizer (or jump-threader?). Here is what I get:
>
>
> andl $3, %edi
> je .LBB0_4
> # BB#2: # %nz
> # in Loop: Header=BB0_1 Depth=1
> cmpl $2, %edi
> je .LBB0_6
> # BB#3: # %nz.non-middle
> # in Loop: Header=BB0_1 Depth=1
> cmpl $2, %edi
> jbe .LBB0_4
> # BB#5: # %sw.bb6
> ret
>
> the second 'cmpl' is totally redundant, which pass is
> (or would be) in charge of removing it?

MachineCSE should be in charge of zapping it.

-Chris
On 07.10.2010 at 19:50, Chris Lattner wrote:

>
> On Oct 6, 2010, at 6:16 PM, Gabor Greif wrote:
>
>> Hi all,
>>
>> I am slowly working on a SwitchInst optimizer (http://llvm.org/PR8125)
>> and now I am running into a deficiency of the x86
>> peephole optimizer (or jump-threader?). Here is what I get:
>>
>>
>> andl $3, %edi
>> je .LBB0_4
>> # BB#2: # %nz
>> # in Loop: Header=BB0_1 Depth=1
>> cmpl $2, %edi
>> je .LBB0_6
>> # BB#3: # %nz.non-middle
>> # in Loop: Header=BB0_1 Depth=1
>> cmpl $2, %edi
>> jbe .LBB0_4
>> # BB#5: # %sw.bb6
>> ret
>>
>> the second 'cmpl' is totally redundant, which pass is
>> (or would be) in charge of removing it?
>
> MachineCSE should be in charge of zapping it.

Hi Chris,

I had a look into MachineCSE, but it looks MBB-oriented, and the
above problem is an inter-block one. MCSE also seems to perform value
numbering on virtual/physical registers, which does not map very well
to status register bits that are implicitly defined.

Any chance to recast this issue as a target-independent (but
cmp-specific) peephole problem that just looks into predecessor
blocks and applies (target-hook-like) subsumption checks for 'cmp'
instructions?

I am thankful for any hint,

cheers,

Gabor

>
> -Chris
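
To make the proposal above concrete, here is a minimal sketch of such a predecessor-looking check. It is a toy model, not LLVM code: Inst, Block, cmpSubsumes, clobbers and leadingCmpIsRedundant are all hypothetical names invented for illustration, and the "target hook" only recognizes identical compares. It also assumes that the block holding the second 'cmpl' has exactly one predecessor, which the listing suggests but does not prove.

```cpp
#include <string>
#include <vector>

// Toy stand-ins for machine instructions and blocks; these are NOT LLVM classes.
struct Inst {
  std::string opcode;  // e.g. "cmpl", "je", "andl"
  std::string src;     // first operand in AT&T order, e.g. "$2"
  std::string dst;     // second operand, e.g. "%edi"
  bool isCompare;      // only sets flags, leaves dst untouched
  bool writesFlags;    // redefines the status register
  bool writesDst;      // overwrites the dst register
};

struct Block {
  std::vector<Inst> insts;
  const Block *singlePred = nullptr;  // non-null only with exactly one predecessor
};

// Hypothetical target hook: do the flags left behind by `earlier` already
// cover what `later` would compute? This sketch accepts identical compares only.
bool cmpSubsumes(const Inst &earlier, const Inst &later) {
  return earlier.isCompare && later.isCompare &&
         earlier.opcode == later.opcode &&
         earlier.src == later.src && earlier.dst == later.dst;
}

// True if `inst` invalidates the flags or an operand that `cmp` depends on.
bool clobbers(const Inst &inst, const Inst &cmp) {
  if (inst.writesFlags)
    return true;
  return inst.writesDst && (inst.dst == cmp.src || inst.dst == cmp.dst);
}

// Checks whether the compare that opens `block` repeats a compare in the
// single predecessor with nothing in between touching flags or operands.
bool leadingCmpIsRedundant(const Block &block) {
  if (block.insts.empty() || !block.insts.front().isCompare)
    return false;
  const Block *pred = block.singlePred;
  if (!pred)
    return false;  // conservative: give up on merge points
  const Inst &cmp = block.insts.front();
  // Walk the predecessor backwards from its terminator.
  for (auto it = pred->insts.rbegin(); it != pred->insts.rend(); ++it) {
    if (cmpSubsumes(*it, cmp))
      return true;
    if (clobbers(*it, cmp))
      return false;
  }
  return false;  // no matching compare found in the predecessor
}

// Models blocks %nz and %nz.non-middle from the listing in this thread.
bool exampleFromThread() {
  Block nz;
  nz.insts.push_back(Inst{"cmpl", "$2", "%edi", true, true, false});
  nz.insts.push_back(Inst{"je", ".LBB0_6", "", false, false, false});
  Block nonMiddle;
  nonMiddle.insts.push_back(Inst{"cmpl", "$2", "%edi", true, true, false});
  nonMiddle.insts.push_back(Inst{"jbe", ".LBB0_4", "", false, false, false});
  nonMiddle.singlePred = &nz;
  return leadingCmpIsRedundant(nonMiddle);
}
```

The check treats the status register as an implicit value that stays live across the conditional jump, which is the part that register-based value numbering does not capture directly. A real pass would also have to handle blocks with several predecessors and compares that are subsumed without being identical; both are left out of this sketch.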