thr3ads.net - llvm dev - [llvm-dev] Oddity w/MachineBlockPlacement and Loops [Feb 2016]

If this information is useful, please help other people find it:
Share via:

Philip Reames via llvm-dev

2016-Feb-24 01:19 UTC

[llvm-dev] Oddity w/MachineBlockPlacement and Loops

I'm getting some odd behavior out of MBP and was hoping someone 
knowledge of the code might be able to give some guidance.  Fair 
warning, I'm trying to describe a problem in code I don't really 
understand, so if something doesn't make sense, assume I misunderstood 
something.

The problematic case I'm seeing is that cold blocks are being placed 
between the preheader and header of a hot loop.  This has the result of 
adding a bunch of cold code spread through out the code rather than 
grouped all together at the end of the function.

 From what I can tell tracing through the code, the critical decision 
that goes wrong is when we're visiting the preheader after forming a 
(correct) chain for the loop itself.  When selecting a successor to 
merge with, we appear to not be considering the loop even though the 
loop hasn't been rotated and the header would be ideal for fallthrough.

In particular, we're printing the "(prob) (non-cold CFG conflict)"
debug
output message for the successor of the preheader which is the header.  
If I'm reading this code correctly, it's identifying the fact
there's a
global more important predecessor for the header (i.e. the latch block), 
but it doesn't seem to be account for the fact that the latch block has 
already been combined into the header's chain.  Unless I'm misreading 
something, we *should* be able to merge the loop chain with the 
preheader chain in it's entirety right?

At least one on example I've looked at, adding an early exit from 
BadCFGConflict loop when Pred and Succ are part of the same chain does 
appear to give the expected result, but I don't understand the code well 
enough to reason about whether that is generally a correct thing to do 
or not.

Philip

Xinliang David Li via llvm-dev

2016-Feb-24 05:48 UTC

head link

[llvm-dev] Oddity w/MachineBlockPlacement and Loops

Hi Phillip,
http://reviews.llvm.org/D10825 tries to fix similar issues. Looks like
there are missing cases. Can you create a small reproducer?

Cong has also improved loop rotation based on better cost model -- it is
not yet enabled (-mllvm -precise-rotation-cost=true to turn on). If
possible, can you also give it a try?

thanks,

David

On Tue, Feb 23, 2016 at 5:19 PM, Philip Reames via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> I'm getting some odd behavior out of MBP and was hoping someone
knowledge
> of the code might be able to give some guidance.  Fair warning, I'm
trying
> to describe a problem in code I don't really understand, so if
something
> doesn't make sense, assume I misunderstood something.
>
> The problematic case I'm seeing is that cold blocks are being placed
> between the preheader and header of a hot loop.  This has the result of
> adding a bunch of cold code spread through out the code rather than grouped
> all together at the end of the function.
>
> From what I can tell tracing through the code, the critical decision that
> goes wrong is when we're visiting the preheader after forming a
(correct)
> chain for the loop itself.  When selecting a successor to merge with, we
> appear to not be considering the loop even though the loop hasn't been
> rotated and the header would be ideal for fallthrough.
>
> In particular, we're printing the "(prob) (non-cold CFG
conflict)" debug
> output message for the successor of the preheader which is the header.  If
> I'm reading this code correctly, it's identifying the fact
there's a global
> more important predecessor for the header (i.e. the latch block), but it
> doesn't seem to be account for the fact that the latch block has
already
> been combined into the header's chain.  Unless I'm misreading
something, we
> *should* be able to merge the loop chain with the preheader chain in
it's
> entirety right?
>
> At least one on example I've looked at, adding an early exit from
> BadCFGConflict loop when Pred and Succ are part of the same chain does
> appear to give the expected result, but I don't understand the code
well
> enough to reason about whether that is generally a correct thing to do or
> not.
>
> Philip
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160223/4ad06988/attachment.html>

John Criswell via llvm-dev

2016-Feb-24 15:47 UTC

head link

[llvm-dev] Oddity w/MachineBlockPlacement and Loops

Dear Philip,

On a related note, Rahman Lavaee has been working on optimizing code 
layout with LLVM.  His approach uses a trace of the program, so it may 
not be applicable to what you're doing, but I thought I should let you 
know since it works on a similar problem.

I believe his code is at https://github.com/rlavaee/code-layout-optimizer.

Rahman, does this code do the function-level code layout, the basic 
block layout optimization, or both?

Regards,

John Criswell

On 2/23/16 8:19 PM, Philip Reames via llvm-dev wrote:> I'm getting some odd behavior out of MBP and was hoping someone 
> knowledge of the code might be able to give some guidance.  Fair 
> warning, I'm trying to describe a problem in code I don't really 
> understand, so if something doesn't make sense, assume I misunderstood 
> something.
>
> The problematic case I'm seeing is that cold blocks are being placed 
> between the preheader and header of a hot loop.  This has the result 
> of adding a bunch of cold code spread through out the code rather than 
> grouped all together at the end of the function.
>
> From what I can tell tracing through the code, the critical decision 
> that goes wrong is when we're visiting the preheader after forming a 
> (correct) chain for the loop itself.  When selecting a successor to 
> merge with, we appear to not be considering the loop even though the 
> loop hasn't been rotated and the header would be ideal for fallthrough.
>
> In particular, we're printing the "(prob) (non-cold CFG
conflict)"
> debug output message for the successor of the preheader which is the 
> header.  If I'm reading this code correctly, it's identifying the
fact
> there's a global more important predecessor for the header (i.e. the 
> latch block), but it doesn't seem to be account for the fact that the 
> latch block has already been combined into the header's chain.  Unless 
> I'm misreading something, we *should* be able to merge the loop chain 
> with the preheader chain in it's entirety right?
>
> At least one on example I've looked at, adding an early exit from 
> BadCFGConflict loop when Pred and Succ are part of the same chain does 
> appear to give the expected result, but I don't understand the code 
> well enough to reason about whether that is generally a correct thing 
> to do or not.
>
> Philip
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

-- 
John Criswell
Assistant Professor
Department of Computer Science, University of Rochester
http://www.cs.rochester.edu/u/criswell

Philip Reames via llvm-dev

2016-Feb-24 17:21 UTC

head link

[llvm-dev] Oddity w/MachineBlockPlacement and Loops

On 02/23/2016 09:48 PM, Xinliang David Li wrote:> Hi Phillip,
> http://reviews.llvm.org/D10825 tries to fix similar issues. Looks like 
> there are missing cases. Can you create a small reproducer?This doesn't quite look the same.  That review is about continuing the 
loop chain itself into the most profitable successor.  My case is about 
continuing the preheader's chain into the (previously formed) loop
chain.>
> Cong has also improved loop rotation based on better cost model -- it 
> is not yet enabled (-mllvm -precise-rotation-cost=true to turn on). If 
> possible, can you also give it a try?Loop rotation is not the problem here.  The layout of the loop 
(header->latch->header cycle) is exactly what it should
be.>
> thanks,
>
> David
>
> On Tue, Feb 23, 2016 at 5:19 PM, Philip Reames via llvm-dev 
> <llvm-dev at lists.llvm.org <mailto:llvm-dev at
lists.llvm.org>> wrote:
>
>     I'm getting some odd behavior out of MBP and was hoping someone
>     knowledge of the code might be able to give some guidance. Fair
>     warning, I'm trying to describe a problem in code I don't
really
>     understand, so if something doesn't make sense, assume I
>     misunderstood something.
>
>     The problematic case I'm seeing is that cold blocks are being
>     placed between the preheader and header of a hot loop.  This has
>     the result of adding a bunch of cold code spread through out the
>     code rather than grouped all together at the end of the function.
>
>     From what I can tell tracing through the code, the critical
>     decision that goes wrong is when we're visiting the preheader
>     after forming a (correct) chain for the loop itself.  When
>     selecting a successor to merge with, we appear to not be
>     considering the loop even though the loop hasn't been rotated and
>     the header would be ideal for fallthrough.
>
>     In particular, we're printing the "(prob) (non-cold CFG
conflict)"
>     debug output message for the successor of the preheader which is
>     the header.  If I'm reading this code correctly, it's
identifying
>     the fact there's a global more important predecessor for the
>     header (i.e. the latch block), but it doesn't seem to be account
>     for the fact that the latch block has already been combined into
>     the header's chain.  Unless I'm misreading something, we
*should*
>     be able to merge the loop chain with the preheader chain in it's
>     entirety right?
>
>     At least one on example I've looked at, adding an early exit from
>     BadCFGConflict loop when Pred and Succ are part of the same chain
>     does appear to give the expected result, but I don't understand
>     the code well enough to reason about whether that is generally a
>     correct thing to do or not.
>
>     Philip
>
>     _______________________________________________
>     LLVM Developers mailing list
>     llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>     http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160224/7c47c8b5/attachment.html>

Philip Reames via llvm-dev

2016-Mar-02 23:10 UTC

head link

[llvm-dev] Oddity w/MachineBlockPlacement and Loops

I've posted a patch which addresses this:
http://reviews.llvm.org/D17830

On 02/23/2016 05:19 PM, Philip Reames wrote:> I'm getting some odd behavior out of MBP and was hoping someone 
> knowledge of the code might be able to give some guidance.  Fair 
> warning, I'm trying to describe a problem in code I don't really 
> understand, so if something doesn't make sense, assume I misunderstood 
> something.
>
> The problematic case I'm seeing is that cold blocks are being placed 
> between the preheader and header of a hot loop.  This has the result 
> of adding a bunch of cold code spread through out the code rather than 
> grouped all together at the end of the function.
>
> From what I can tell tracing through the code, the critical decision 
> that goes wrong is when we're visiting the preheader after forming a 
> (correct) chain for the loop itself.  When selecting a successor to 
> merge with, we appear to not be considering the loop even though the 
> loop hasn't been rotated and the header would be ideal for fallthrough.
>
> In particular, we're printing the "(prob) (non-cold CFG
conflict)"
> debug output message for the successor of the preheader which is the 
> header.  If I'm reading this code correctly, it's identifying the
fact
> there's a global more important predecessor for the header (i.e. the 
> latch block), but it doesn't seem to be account for the fact that the 
> latch block has already been combined into the header's chain.  Unless 
> I'm misreading something, we *should* be able to merge the loop chain 
> with the preheader chain in it's entirety right?
>
> At least one on example I've looked at, adding an early exit from 
> BadCFGConflict loop when Pred and Succ are part of the same chain does 
> appear to give the expected result, but I don't understand the code 
> well enough to reason about whether that is generally a correct thing 
> to do or not.
>
> Philip
>

llvm dev - Feb 2016 - Oddity w/MachineBlockPlacement and Loops

[llvm-dev] Oddity w/MachineBlockPlacement and Loops

[llvm-dev] Oddity w/MachineBlockPlacement and Loops

[llvm-dev] Oddity w/MachineBlockPlacement and Loops

[llvm-dev] Oddity w/MachineBlockPlacement and Loops

[llvm-dev] Oddity w/MachineBlockPlacement and Loops