Jatin Bhateja via llvm-dev
2017-Jul-09 06:20 UTC
[llvm-dev] RFC: Identification of LEA instructions with complex addressing mode
Current state: 1/ LEA with complex addressing mode are supported by Intel micro architectures after Nehalem. 2/ LEA detection is being done during SelectionDAG based instruction selection through addressing mode based complex pattern matching. 3/ This does not identify LEA patterns beyond Scaling factor 1. e.g. T1 = A + B; T2 = T1 + B; T3 = T2 + B; T4 = T3 + 10; T5 = T4 + 20; T6 = T5 + B Above sequence can be folded to LEA 30( A , 4 , B); where BASE = A, SCALE = 4, INDEX = B and OFFSET = 30 4/ Control flow information is not present at SelectionDAG level, as SelectionDAG based selection work over a single BasicBlock at a time. Which makes it difficult to avoid generation of complex LEA with 3 operands (even with Scale=1) within Loops. Proposal: 1/ To have a pre-RA pass to identify LEAs with dense folding. By dense folding I mean scale factor greater than 1. 2/ Since this pass will run over MachineInstrs so it will be usable for FastISel and Global ISel based flows also which bypass SelectionDAG. 3/ At MI level we have control flow Analysis info in the form of MachineDominatorTree and MachineLoopInfo to avoid formation of LEAs in loops. 4/ Perform CSE over dense LEAs (which have Scale factor > 1) to factor out overlapping computations. When two LEAs share BASE , INDEX and OFFSET but have different SCALE we can take out the common complex LEA and generate a simple LEA with legal Scale. e.g. LEA1 : RES1 = LEA 10( A , 4 , B) LEA2 : RES2 = LEA 10( A , 8 , B) can be converted to LEA1 : RES1 = LEA 10( A , 4 , B) LEA2 : RES2 = LEA (RES1 , 4 , B) 5/ Disintegration of complex LEAs with Scale 1 to simple LEA + ADD which is already being handled during FixupLEAPass. Kindly drop your suggestions/comments. Thanks, Jatin Bhateja -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170709/27b26a18/attachment.html>
Saba, Lama via llvm-dev
2017-Jul-09 07:07 UTC
[llvm-dev] Identification of LEA instructions with complex addressing mode
Is this an RFC for (an expansion of) the patch you already have https://reviews.llvm.org/D35014? Or are you planning on making a different design? Thanks, Lama From: Jatin Bhateja [mailto:jatin.bhateja at gmail.com] Sent: Sunday, July 9, 2017 9:21 AM To: llvm-dev <llvm-dev at lists.llvm.org> Cc: llvm-dev at redking.me.uk; Saba, Lama <lama.saba at intel.com> Subject: RFC: Identification of LEA instructions with complex addressing mode Current state: 1/ LEA with complex addressing mode are supported by Intel micro architectures after Nehalem. 2/ LEA detection is being done during SelectionDAG based instruction selection through addressing mode based complex pattern matching. 3/ This does not identify LEA patterns beyond Scaling factor 1. e.g. T1 = A + B; T2 = T1 + B; T3 = T2 + B; T4 = T3 + 10; T5 = T4 + 20; T6 = T5 + B Above sequence can be folded to LEA 30( A , 4 , B); where BASE = A, SCALE = 4, INDEX = B and OFFSET = 30 4/ Control flow information is not present at SelectionDAG level, as SelectionDAG based selection work over a single BasicBlock at a time. Which makes it difficult to avoid generation of complex LEA with 3 operands (even with Scale=1) within Loops. Proposal: 1/ To have a pre-RA pass to identify LEAs with dense folding. By dense folding I mean scale factor greater than 1. 2/ Since this pass will run over MachineInstrs so it will be usable for FastISel and Global ISel based flows also which bypass SelectionDAG. 3/ At MI level we have control flow Analysis info in the form of MachineDominatorTree and MachineLoopInfo to avoid formation of LEAs in loops. 4/ Perform CSE over dense LEAs (which have Scale factor > 1) to factor out overlapping computations. When two LEAs share BASE , INDEX and OFFSET but have different SCALE we can take out the common complex LEA and generate a simple LEA with legal Scale. e.g. LEA1 : RES1 = LEA 10( A , 4 , B) LEA2 : RES2 = LEA 10( A , 8 , B) can be converted to LEA1 : RES1 = LEA 10( A , 4 , B) LEA2 : RES2 = LEA (RES1 , 4 , B) 5/ Disintegration of complex LEAs with Scale 1 to simple LEA + ADD which is already being handled during FixupLEAPass. Kindly drop your suggestions/comments. Thanks, Jatin Bhateja --------------------------------------------------------------------- Intel Israel (74) Limited This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170709/25965544/attachment.html>
Jatin Bhateja via llvm-dev
2017-Jul-09 07:57 UTC
[llvm-dev] Identification of LEA instructions with complex addressing mode
Hi Lama, This is related to PR 32755 filed by you. But patch will not be able to fold beyond scaling factor 2. So to take care of cases which fold to scaling factor > 2 we may have to move extraction logic LEAs to a new/existing pass before MI Scheduling as has been proposed. Kindly provide your comments/suggestions. Thanks, Jatin On Sunday, July 9, 2017, Saba, Lama <lama.saba at intel.com> wrote:> Is this an RFC for (an expansion of) the patch you already have > https://reviews.llvm.org/D35014? Or are you planning on making a > different design? > > > > Thanks, > > Lama > > > > *From:* Jatin Bhateja [mailto:jatin.bhateja at gmail.com > <javascript:_e(%7B%7D,'cvml','jatin.bhateja at gmail.com');>] > *Sent:* Sunday, July 9, 2017 9:21 AM > *To:* llvm-dev <llvm-dev at lists.llvm.org > <javascript:_e(%7B%7D,'cvml','llvm-dev at lists.llvm.org');>> > *Cc:* llvm-dev at redking.me.uk > <javascript:_e(%7B%7D,'cvml','llvm-dev at redking.me.uk');>; Saba, Lama < > lama.saba at intel.com <javascript:_e(%7B%7D,'cvml','lama.saba at intel.com');>> > *Subject:* RFC: Identification of LEA instructions with complex > addressing mode > > > > Current state: > > > > 1/ LEA with complex addressing mode are supported by Intel micro > architectures after Nehalem. > > > > 2/ LEA detection is being done during SelectionDAG based instruction > selection through > > addressing mode based complex pattern matching. > > > > 3/ This does not identify LEA patterns beyond Scaling factor 1. > > e.g. > > T1 = A + B; > > T2 = T1 + B; > > T3 = T2 + B; > > T4 = T3 + 10; > > T5 = T4 + 20; > > T6 = T5 + B > > > > Above sequence can be folded to > > > > LEA 30( A , 4 , B); > > > > where BASE = A, SCALE = 4, INDEX = B and OFFSET = 30 > > > > 4/ Control flow information is not present at SelectionDAG level, as > SelectionDAG based selection > > work over a single BasicBlock at a time. Which makes it difficult > to avoid generation of > > complex LEA with 3 operands (even with Scale=1) within Loops. > > > > Proposal: > > > > 1/ To have a pre-RA pass to identify LEAs with dense folding. By > dense folding I mean scale factor greater than 1. > > > > 2/ Since this pass will run over MachineInstrs so it will be usable > for FastISel and Global ISel based flows also which > > bypass SelectionDAG. > > > > 3/ At MI level we have control flow Analysis info in the form of > MachineDominatorTree and MachineLoopInfo > > to avoid formation of LEAs in loops. > > > > 4/ Perform CSE over dense LEAs (which have Scale factor > 1) to > factor out overlapping computations. > > When two LEAs share BASE , INDEX and OFFSET but have different > SCALE we can > > take out the common complex LEA and generate a simple LEA with > legal Scale. > > e.g. > > > > LEA1 : RES1 = LEA 10( A , 4 , B) > > LEA2 : RES2 = LEA 10( A , 8 , B) > > > > can be converted to > > > > LEA1 : RES1 = LEA 10( A , 4 , B) > > LEA2 : RES2 = LEA (RES1 , 4 , B) > > > > 5/ Disintegration of complex LEAs with Scale 1 to simple LEA + ADD > which is already being handled during > > FixupLEAPass. > > > > Kindly drop your suggestions/comments. > > > > Thanks, > > Jatin Bhateja > > > > --------------------------------------------------------------------- > Intel Israel (74) Limited > > This e-mail and any attachments may contain confidential material for > the sole use of the intended recipient(s). Any review or distribution > by others is strictly prohibited. If you are not the intended > recipient, please contact the sender and delete all copies. >-- Jatin Bhateja -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170709/67cf497a/attachment.html>