thr3ads.net - llvm dev - [llvm-dev] Problem with "[SimplifyCFG] Handle tail-sinking of more than 2 incoming branches" [Sep 2016]

If this information is useful, please help other people find it:
Share via:

Marcello Maggioni via llvm-dev

2016-Sep-02 21:40 UTC

[llvm-dev] Problem with "[SimplifyCFG] Handle tail-sinking of more than 2 incoming branches"

Probably the issue is solvable in some Codegen prepare pass.

That said I still believe some kind of control on if we would like to implement
this or not could be useful.

Just a question. Why implementing it in SimplifyCFG and not as a separate pass
like JumpThreading or something like that? The transformation itself doesn’t
seem to fit much in SimplifyCFG.> On 2 Sep 2016, at 13:35, Michael Kuperstein <mkuper at google.com>
wrote:
> 
> Hello Marcello,
> 
> We're currently planning to disable this optimization - at least
temporarily - for cases where we'll end up creating PHIs of pointers for
performance reasons (see PR30188).
> 
> Having said that - I don't believe this is something you can rely on
for correctness.
> From an IR semantics perspective, this is a completely legal transformation
that can be introduced by any optimization pass, for whatever reason. We
can't control that throughout the entire optimization stack with a TTI hook.
Two potential solutions I can see are to either (a) teach your backend to handle
it (by disambiguating these loads in a pre-ISel IR pass), or (b) use some kind
of type-system trick to make it impossible (say, assign a different address
space to each distinct pointer in the front end).
> 
> Thanks,
>   Michael
> 
> On Fri, Sep 2, 2016 at 11:39 AM, Marcello Maggioni via llvm-dev
<llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>>
wrote:
> Hello,
> 
> I’m getting some failures on our internal testing happening after this
commit.
> 
> I dug a little bit into it and it ended up begin caused by the fact that
this optimization is sinking some instructions that shouldn’t be sunk
> 
> In particular we have some loads that need to have a constant address to be
selected, because they load from a “special address space”.
> 
> What the optimization is doing is extracting the “getelementptr” , making
an instruction out of it, sinking the load and using a PHI to select between the
address.
> This breaks our selection of this types of loads.
> 
> We need a way to tell this code to not do that.
> 
> The solutions I kind of see for this is:
> 
> - Add some TTI hook that stops this sinking from happening for instructions
that don’t support that like in this case (maybe added in canSinkInstructions())
> 
> - Add some TTI hook that is a little bit coarser that stops the
optimization to run altogether. In particular this thing is creating
“unstructured control-flow” in some cases and this is not always good (and
actually generally bad) on GPU targets.
> 
> - Put the optimization in another separate pass. It seems to be doing stuff
that is a little bit more than just “Simplifying the CFG” actually. It is
actually making it more complex by splitting blocks around. I wonder if it would
make sense for it to live it in a separate pass that can be enabled or disabled
at the compiler pipeline level so that one that wants to opt out can just avoid
to add the pass to the pipeline.
> 
> Open for suggestions on what the best solution for this is.
> 
> Thanks,
> Marcello
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
<http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
> 
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160902/88267cce/attachment.html>

James Molloy via llvm-dev

2016-Sep-02 22:05 UTC

head link

[llvm-dev] Problem with "[SimplifyCFG] Handle tail-sinking of more than 2 incoming branches"

Hi,

On 2 Sep 2016, at 22:40, Marcello Maggioni <mmaggioni at
apple.com<mailto:mmaggioni at apple.com>> wrote:

What the optimization is doing is extracting the “getelementptr” , making an
instruction out of it, sinking the load and using a PHI to select between the
address.
This breaks our selection of this types of loads.

Are you referring to a downstream intrinsic, an upstream intrinsic or are you
somehow adding strange restrictions yourself on what a load instruction can and
can’t do?

There’s a function canReplaceOperandWithVariable() that this optimisation uses
to decide if it’s possible to do this. We use this to avoid making things that
must be constant (like your example) variable. Perhaps this is doing something
wrong? If you’re adding arbitrary restrictions on what loads can do though
that’s never going to go well - I’m sorry but we just don’t support that use
case. That’s what intrinsics are for.
> Just a question. Why implementing it in SimplifyCFG and not as a separate
pass like JumpThreading or something like that?
Because sinking instructions into successors already exists in SimplifyCFG and
has done for years. This is a small modification to make it a bit more clever.
SimplifyCFG does lots of this kind of stuff already (switch->lookup table for
example).

Cheers,

James
IMPORTANT NOTICE: The contents of this email and any attachments are
confidential and may also be privileged. If you are not the intended recipient,
please notify the sender immediately and do not disclose the contents to any
other person, use it for any purpose, or store or copy the information in any
medium. Thank you.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160902/e98ea900/attachment.html>

Marcello Maggioni via llvm-dev

2016-Sep-02 23:54 UTC

head link

[llvm-dev] Problem with "[SimplifyCFG] Handle tail-sinking of more than 2 incoming branches"

It’s not about weird restrictions we are adding, it’s about the fact that for
that specific address space we cannot dynamically index into the memory.

The address needs to be a constant of some sort in the selection (it’s an
hardware limitation, there’s nothing we can do about it) and the fact that it is
transformed into a PHI makes it not constant anymore.

We will need to undo this into Codegen prepare by either reverse that putting
the load back into the predecessors and creating a PHI of the result of the
loads.

Having a way to opt-out or control what you want to sink I still think would be
a added value to the optimization.

Marcello > On 2 Sep 2016, at 15:05, James Molloy <James.Molloy at arm.com>
wrote:
> 
> Hi,
> 
>> On 2 Sep 2016, at 22:40, Marcello Maggioni <mmaggioni at apple.com
<mailto:mmaggioni at apple.com>> wrote:
>> 
>>> What the optimization is doing is extracting the “getelementptr” ,
making an instruction out of it, sinking the load and using a PHI to select
between the address.
>>> This breaks our selection of this types of loads.
> 
> Are you referring to a downstream intrinsic, an upstream intrinsic or are
you somehow adding strange restrictions yourself on what a load instruction can
and can’t do?
> 
> There’s a function canReplaceOperandWithVariable() that this optimisation
uses to decide if it’s possible to do this. We use this to avoid making things
that must be constant (like your example) variable. Perhaps this is doing
something wrong? If you’re adding arbitrary restrictions on what loads can do
though that’s never going to go well - I’m sorry but we just don’t support that
use case. That’s what intrinsics are for.
> 
> > Just a question. Why implementing it in SimplifyCFG and not as a
separate pass like JumpThreading or something like that?
> 
> Because sinking instructions into successors already exists in SimplifyCFG
and has done for years. This is a small modification to make it a bit more
clever. SimplifyCFG does lots of this kind of stuff already (switch->lookup
table for example).
> 
> Cheers,
> 
> James
> IMPORTANT NOTICE: The contents of this email and any attachments are
confidential and may also be privileged. If you are not the intended recipient,
please notify the sender immediately and do not disclose the contents to any
other person, use it for any purpose, or store or copy the information in any
medium. Thank you.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160902/7516e99b/attachment.html>

Possibly Parallel Threads

Search for more seemingly similar threads

llvm dev - Sep 2016 - Problem with "[SimplifyCFG] Handle tail-sinking of more than 2 incoming branches"

[llvm-dev] Problem with "[SimplifyCFG] Handle tail-sinking of more than 2 incoming branches"

[llvm-dev] Problem with "[SimplifyCFG] Handle tail-sinking of more than 2 incoming branches"

[llvm-dev] Problem with "[SimplifyCFG] Handle tail-sinking of more than 2 incoming branches"

Possibly Parallel Threads