Chandler Carruth
2011-Oct-18  09:53 UTC
[LLVMdev] Question regarding basic-block placement optimization
Hello, I'm working on basic-block placement optimizations based on branch probability information. I've run into a stumbling block though. One of the existing passes to do this, essentially a dead pass 'block-placement' operates on the IR, reordering the blocks of the function, and relying on the code generator to preserve that order unless it has a reason to rearrange it. This pass is entirely untested AFAICT, and it doesn't work any more... That said, I've replicated its functionality with a much updated transformation algorithm (no longer O(N^2), and finds better orderings!) and predicated it on the new frequency and probability information. It "works" great, but it doesn't actually optimize anything. It re-arranges all the blocks of the IR function, laying them out perfectly. And then the code generation layer throws that away and seems to establish its own ordering altogether. What's the best approach here? Is ignoring the order of blocks in the IR function desired behavior? (It does seem reasonable, it just seems to have not always been that way, so I wondered.) Should I sink my pass down to a codegen pass? If this really should live as a CodeGen pass, then what should become of the existing code-placement pass in the CodeGen layer? Currently that pass does two things: 1) It aligns loop bodies. This seems like a completely separate concern, although likely a good one to do after any other placement optimizations. 2) It rearranges very specific loop structures using very specific heuristics. Some of these seem redundant in the presence of probability based information, but others seem orthogonal. Honestly, I don't understand any of this very well, so it's likely I'm just misreading it. I'm a bit wary of implementing this in the CodeGen layer as I worry that the layout of the MBBs and their various branches will have already made assumptions regarding the structure and shape of the CFG of the function... But it would also perhaps give more stability to the resulting orderings, improving the benefits of the pass. Thoughts? -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20111018/6535f897/attachment.html>
Sergei Larin
2011-Oct-18  15:37 UTC
[LLVMdev] Question regarding basic-block placement optimization
Chandler, Can you please elaborate on the following statement?>>” I've replicated its functionality with a much updated transformation algorithm (no longer O(N^2), and finds better orderings!)”Is this a new algorithm or something you can refer to. Thanks. Sergei From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Chandler Carruth Sent: Tuesday, October 18, 2011 4:54 AM To: LLVM Developers Mailing List Subject: [LLVMdev] Question regarding basic-block placement optimization Hello, I'm working on basic-block placement optimizations based on branch probability information. I've run into a stumbling block though. One of the existing passes to do this, essentially a dead pass 'block-placement' operates on the IR, reordering the blocks of the function, and relying on the code generator to preserve that order unless it has a reason to rearrange it. This pass is entirely untested AFAICT, and it doesn't work any more... That said, I've replicated its functionality with a much updated transformation algorithm (no longer O(N^2), and finds better orderings!) and predicated it on the new frequency and probability information. It "works" great, but it doesn't actually optimize anything. It re-arranges all the blocks of the IR function, laying them out perfectly. And then the code generation layer throws that away and seems to establish its own ordering altogether. What's the best approach here? Is ignoring the order of blocks in the IR function desired behavior? (It does seem reasonable, it just seems to have not always been that way, so I wondered.) Should I sink my pass down to a codegen pass? If this really should live as a CodeGen pass, then what should become of the existing code-placement pass in the CodeGen layer? Currently that pass does two things: 1) It aligns loop bodies. This seems like a completely separate concern, although likely a good one to do after any other placement optimizations. 2) It rearranges very specific loop structures using very specific heuristics. Some of these seem redundant in the presence of probability based information, but others seem orthogonal. Honestly, I don't understand any of this very well, so it's likely I'm just misreading it. I'm a bit wary of implementing this in the CodeGen layer as I worry that the layout of the MBBs and their various branches will have already made assumptions regarding the structure and shape of the CFG of the function... But it would also perhaps give more stability to the resulting orderings, improving the benefits of the pass. Thoughts?
Cameron Zwarich
2011-Oct-18  21:59 UTC
[LLVMdev] Question regarding basic-block placement optimization
On Oct 18, 2011, at 2:53 AM, Chandler Carruth wrote:> Hello, > > I'm working on basic-block placement optimizations based on branch probability information. I've run into a stumbling block though. One of the existing passes to do this, essentially a dead pass 'block-placement' operates on the IR, reordering the blocks of the function, and relying on the code generator to preserve that order unless it has a reason to rearrange it. This pass is entirely untested AFAICT, and it doesn't work any more... > > That said, I've replicated its functionality with a much updated transformation algorithm (no longer O(N^2), and finds better orderings!) and predicated it on the new frequency and probability information. It "works" great, but it doesn't actually optimize anything. It re-arranges all the blocks of the IR function, laying them out perfectly. And then the code generation layer throws that away and seems to establish its own ordering altogether. > > What's the best approach here? Is ignoring the order of blocks in the IR function desired behavior? (It does seem reasonable, it just seems to have not always been that way, so I wondered.) > > Should I sink my pass down to a codegen pass?I think this should really live as a CodeGen pass. Is there any good reason to make it an IR pass? Cameron
Chandler Carruth
2011-Oct-18  22:07 UTC
[LLVMdev] Question regarding basic-block placement optimization
On Tue, Oct 18, 2011 at 2:59 PM, Cameron Zwarich <zwarich at apple.com> wrote:> On Oct 18, 2011, at 2:53 AM, Chandler Carruth wrote: > > > Hello, > > > > I'm working on basic-block placement optimizations based on branch > probability information. I've run into a stumbling block though. One of the > existing passes to do this, essentially a dead pass 'block-placement' > operates on the IR, reordering the blocks of the function, and relying on > the code generator to preserve that order unless it has a reason to > rearrange it. This pass is entirely untested AFAICT, and it doesn't work any > more... > > > > That said, I've replicated its functionality with a much updated > transformation algorithm (no longer O(N^2), and finds better orderings!) and > predicated it on the new frequency and probability information. It "works" > great, but it doesn't actually optimize anything. It re-arranges all the > blocks of the IR function, laying them out perfectly. And then the code > generation layer throws that away and seems to establish its own ordering > altogether. > > > > What's the best approach here? Is ignoring the order of blocks in the IR > function desired behavior? (It does seem reasonable, it just seems to have > not always been that way, so I wondered.) > > > > Should I sink my pass down to a codegen pass? > > I think this should really live as a CodeGen pass. Is there any good reason > to make it an IR pass?So, as it happens, I was *completely* wrong here. CodeGen correctly preserves the ordering of blocks from IR, *unless* it can do folding, etc. My original test cases ended up being folded into other structures, hiding the fact that the ordering wasn't what was changing. I've run much more real-world style test cases through the pass, and it is working beautifully. I'll be attaching a patch shortly, I'm currently cleaning it up based on some over-the-shoulder feedback from Nick Lewycky. As for why it should be an IR pass, mostly because once the selection dag runs through the code, we can never recover all of the freedom we have at the IR level. To start with, splicing MBBs around requires known about the terminators (which we only some of the time do), and it requires re-writing them a touch to account for the different fall-through pattern. To make matters worse, at this point we don't have the nicely analyzable 'switch' terminator (I think), and so the existing MBB placement code just bails on non-branch-exit blocks. Operating at the IR level makes writing the pass a breeze, and we can quickly and efficiently simply lay the blocks out in the ideal ordering. Then the SelectionDAG and other layers can preserve this modulo folding and other optimization opportunities. At the most, we should likely add probability awareness to some of these CodeGen passes to turn off (or on) their transforms around unlikely (or likely) edges. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20111018/fb7f9bf7/attachment.html>
Possibly Parallel Threads
- [LLVMdev] Question regarding basic-block placement optimization
- [LLVMdev] Question regarding basic-block placement optimization
- [LLVMdev] Question regarding basic-block placement optimization
- [LLVMdev] Question regarding basic-block placement optimization
- [LLVMdev] Question regarding basic-block placement optimization