thr3ads.net - llvm dev - [LLVMdev] Load serialisation during selection DAG building [Aug 2012]

If this information is useful, please help other people find it:
Share via:

Steve Montgomery

2012-Aug-13 10:29 UTC

[LLVMdev] Load serialisation during selection DAG building

I've got a question about how SelectionDAGBuilder treats loads.

The LLVM Language Reference Manual explicitly states that the order of volatile
operations may be changed relative to non-volatile operations. However, when the
SelectionDAGBuilder in LLVM 3.1 encounters a volatile load, it flushes all
pending loads and then chains the volatile load onto them meaning that the
volatile load must be scheduled after those loads. While this behaviour
isn't wrong, it seems to reduce the scope for efficient instruction
selection.

Is there any reason not to modify the behaviour of
SelectionDAGBuilder::visitLoad() so that volatile loads don't cause
SelectionDAGBuilder::getRoot() to be called. Instead, they can be chained
together with the head of the chain being stored in PendingLoads. Then when
something else calls SelectionDAGBuilder::getRoot(), the chain of volatile loads
is TokenFactored together with the non-volatile loads. I've tried this out
and it seems to do what I want but as I'm fairly inexperienced with LLVM,
I'm not sure whether there's something else preventing this strategy
from working.

The reason I noticed this is because I have been developing a back-end for a
target in which some instructions are implemented as pseudos which will be
expanded into a pair of instructions. Each of the two operands of the pseudo can
be a load but as the expansion accesses the right operand twice, I don't
want to match a volatile load for this operand. For example, in:

  %0 = load i16* @y, align 2, !tbaa !0
  %1 = load volatile i16* @x, align 2, !tbaa !0
  %add = add i16 %0, %1 

the volatile load is sequenced after the non-volatile load which means that the
non-volatile load can't match the left operand of the add because this would
create a scheduling cycle. This means both loads are selected as load
instructions, resulting in use of an extra register and an extra instruction. If
I change the behaviour of SelectionDAGBuilder::visitLoad() as described then I
get just two instructions.

Hal Finkel

2012-Aug-13 19:09 UTC

head link

[LLVMdev] Load serialisation during selection DAG building

Steve,

I had created a patch last year that does something similar to what you
describe for regular loads and stores using aliasing information. I
think that the last message in the thread was:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120402/140299.html

This approach has worked for me, but it is not the preferred solution
going forward. The preferred solution is to keep the "critical
chain" as is (partly because there are places in some of the backend
lowering code that assume it exists in its current form), and just
update the scheduler to be more intelligent about how it forms the
dependency graph. This has since been developed for the new MI
scheduling framework by Sergei Larin, and was committed in r156842
(last message in the discussion thread was
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120507/142659.html)

I would recommend trying something like Sergei's solution first, and
fall back to trying to play with the critical chain only if that can't
or won't work.

 -Hal

On Mon, 13 Aug 2012 11:29:18 +0100
Steve Montgomery <stephen.montgomery3 at btinternet.com> wrote:
> I've got a question about how SelectionDAGBuilder treats loads.
> 
> The LLVM Language Reference Manual explicitly states that the order
> of volatile operations may be changed relative to non-volatile
> operations. However, when the SelectionDAGBuilder in LLVM 3.1
> encounters a volatile load, it flushes all pending loads and then
> chains the volatile load onto them meaning that the volatile load
> must be scheduled after those loads. While this behaviour isn't
> wrong, it seems to reduce the scope for efficient instruction
> selection.
> 
> Is there any reason not to modify the behaviour of
> SelectionDAGBuilder::visitLoad() so that volatile loads don't cause
> SelectionDAGBuilder::getRoot() to be called. Instead, they can be
> chained together with the head of the chain being stored in
> PendingLoads. Then when something else calls
> SelectionDAGBuilder::getRoot(), the chain of volatile loads is
> TokenFactored together with the non-volatile loads. I've tried this
> out and it seems to do what I want but as I'm fairly inexperienced
> with LLVM, I'm not sure whether there's something else preventing
> this strategy from working.
> 
> The reason I noticed this is because I have been developing a
> back-end for a target in which some instructions are implemented as
> pseudos which will be expanded into a pair of instructions. Each of
> the two operands of the pseudo can be a load but as the expansion
> accesses the right operand twice, I don't want to match a volatile
> load for this operand. For example, in:
> 
>   %0 = load i16* @y, align 2, !tbaa !0
>   %1 = load volatile i16* @x, align 2, !tbaa !0
>   %add = add i16 %0, %1 
> 
> the volatile load is sequenced after the non-volatile load which
> means that the non-volatile load can't match the left operand of the
> add because this would create a scheduling cycle. This means both
> loads are selected as load instructions, resulting in use of an extra
> register and an extra instruction. If I change the behaviour of
> SelectionDAGBuilder::visitLoad() as described then I get just two
> instructions. _______________________________________________ LLVM
> Developers mailing list LLVMdev at cs.uiuc.edu
> http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


-- 
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory

Steve Montgomery

2012-Aug-13 19:15 UTC

head link

[LLVMdev] Load serialisation during selection DAG building

Thanks Hal,

I'll look into it Sergei's solution.

Steve

On 13 Aug 2012, at 20:09, Hal Finkel wrote:
> Steve,
> 
> I had created a patch last year that does something similar to what you
> describe for regular loads and stores using aliasing information. I
> think that the last message in the thread was:
>
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120402/140299.html
> 
> This approach has worked for me, but it is not the preferred solution
> going forward. The preferred solution is to keep the "critical
> chain" as is (partly because there are places in some of the backend
> lowering code that assume it exists in its current form), and just
> update the scheduler to be more intelligent about how it forms the
> dependency graph. This has since been developed for the new MI
> scheduling framework by Sergei Larin, and was committed in r156842
> (last message in the discussion thread was
>
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120507/142659.html)
> 
> I would recommend trying something like Sergei's solution first, and
> fall back to trying to play with the critical chain only if that can't
> or won't work.
> 
> -Hal
> 
> On Mon, 13 Aug 2012 11:29:18 +0100
> Steve Montgomery <stephen.montgomery3 at btinternet.com> wrote:
> 
>> I've got a question about how SelectionDAGBuilder treats loads.
>> 
>> The LLVM Language Reference Manual explicitly states that the order
>> of volatile operations may be changed relative to non-volatile
>> operations. However, when the SelectionDAGBuilder in LLVM 3.1
>> encounters a volatile load, it flushes all pending loads and then
>> chains the volatile load onto them meaning that the volatile load
>> must be scheduled after those loads. While this behaviour isn't
>> wrong, it seems to reduce the scope for efficient instruction
>> selection.
>> 
>> Is there any reason not to modify the behaviour of
>> SelectionDAGBuilder::visitLoad() so that volatile loads don't cause
>> SelectionDAGBuilder::getRoot() to be called. Instead, they can be
>> chained together with the head of the chain being stored in
>> PendingLoads. Then when something else calls
>> SelectionDAGBuilder::getRoot(), the chain of volatile loads is
>> TokenFactored together with the non-volatile loads. I've tried this
>> out and it seems to do what I want but as I'm fairly inexperienced
>> with LLVM, I'm not sure whether there's something else
preventing
>> this strategy from working.
>> 
>> The reason I noticed this is because I have been developing a
>> back-end for a target in which some instructions are implemented as
>> pseudos which will be expanded into a pair of instructions. Each of
>> the two operands of the pseudo can be a load but as the expansion
>> accesses the right operand twice, I don't want to match a volatile
>> load for this operand. For example, in:
>> 
>>  %0 = load i16* @y, align 2, !tbaa !0
>>  %1 = load volatile i16* @x, align 2, !tbaa !0
>>  %add = add i16 %0, %1 
>> 
>> the volatile load is sequenced after the non-volatile load which
>> means that the non-volatile load can't match the left operand of
the
>> add because this would create a scheduling cycle. This means both
>> loads are selected as load instructions, resulting in use of an extra
>> register and an extra instruction. If I change the behaviour of
>> SelectionDAGBuilder::visitLoad() as described then I get just two
>> instructions. _______________________________________________ LLVM
>> Developers mailing list LLVMdev at cs.uiuc.edu
>> http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> 
> 
> 
> -- 
> Hal Finkel
> Postdoctoral Appointee
> Leadership Computing Facility
> Argonne National Laboratory

Steve Montgomery

2012-Aug-14 15:43 UTC

head link

[LLVMdev] Load serialisation during selection DAG building

I looked into those patches but I don't think they will help in my situation
because my problems occur during instruction selection rather than scheduling.

A simple and concrete example is a pattern like:

    [(set GR:$dst (add GR:$src (nvload addr:$mem)))]

where nvload matches a load provided that isVolatile() is false.

If the selection DAG looks like:

     |      |
    LD1    LD2
     ^      ^
     |      |
     \      /
       add
        ^
        |
        \  /
         ST
                 
and the chain like:

    LD1    LD2
     ^      ^
     |      |
     \      /
    TokenFactor
        ^
        |
        ST
     
then the add instruction is selected. However, if operand 1 of the add is a
volatile load, then the chain looks like:

    LD1
     ^ 
     |
    LD2Volatile
     |
     ST

In this case the add instruction cannot be selected. The operator is
commutative, so instruction selection matches the non-volatile load against the
operand 1 but then fails because to select this instruction would mean that the
volatile load into a register (to match operand 0) would occur before the
non-volatile load that's folded into the instruction.

I can get this instruction to be selected by changing the way loads are built
into the selection DAG, i.e. making the chain:

    LD1    LD2Volatile
     ^      ^
     |      |
     \      /
    TokenFactor
        ^
        |
        ST

Given what the LRM says about volatile accesses, this seems valid but I take
Hal's point about there being code that might not be expecting such a chain.

It's a matter of changing a few lines in SelectionDAGBuilder.cpp to achieve
what I want so I think I'll go ahead and see what happens.

If Hal or anyone else has any further thoughts on the potential problems with
doing so then I'd be glad to know.

Thanks

Steve Montgomery

On 13 Aug 2012, at 20:09, Hal Finkel wrote:
> Steve,
> 
> I had created a patch last year that does something similar to what you
> describe for regular loads and stores using aliasing information. I
> think that the last message in the thread was:
>
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120402/140299.html
> 
> This approach has worked for me, but it is not the preferred solution
> going forward. The preferred solution is to keep the "critical
> chain" as is (partly because there are places in some of the backend
> lowering code that assume it exists in its current form), and just
> update the scheduler to be more intelligent about how it forms the
> dependency graph. This has since been developed for the new MI
> scheduling framework by Sergei Larin, and was committed in r156842
> (last message in the discussion thread was
>
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120507/142659.html)
> 
> I would recommend trying something like Sergei's solution first, and
> fall back to trying to play with the critical chain only if that can't
> or won't work.
> 
> -Hal
> 
> On Mon, 13 Aug 2012 11:29:18 +0100
> Steve Montgomery <stephen.montgomery3 at btinternet.com> wrote:
> 
>> I've got a question about how SelectionDAGBuilder treats loads.
>> 
>> The LLVM Language Reference Manual explicitly states that the order
>> of volatile operations may be changed relative to non-volatile
>> operations. However, when the SelectionDAGBuilder in LLVM 3.1
>> encounters a volatile load, it flushes all pending loads and then
>> chains the volatile load onto them meaning that the volatile load
>> must be scheduled after those loads. While this behaviour isn't
>> wrong, it seems to reduce the scope for efficient instruction
>> selection.
>> 
>> Is there any reason not to modify the behaviour of
>> SelectionDAGBuilder::visitLoad() so that volatile loads don't cause
>> SelectionDAGBuilder::getRoot() to be called. Instead, they can be
>> chained together with the head of the chain being stored in
>> PendingLoads. Then when something else calls
>> SelectionDAGBuilder::getRoot(), the chain of volatile loads is
>> TokenFactored together with the non-volatile loads. I've tried this
>> out and it seems to do what I want but as I'm fairly inexperienced
>> with LLVM, I'm not sure whether there's something else
preventing
>> this strategy from working.
>> 
>> The reason I noticed this is because I have been developing a
>> back-end for a target in which some instructions are implemented as
>> pseudos which will be expanded into a pair of instructions. Each of
>> the two operands of the pseudo can be a load but as the expansion
>> accesses the right operand twice, I don't want to match a volatile
>> load for this operand. For example, in:
>> 
>>  %0 = load i16* @y, align 2, !tbaa !0
>>  %1 = load volatile i16* @x, align 2, !tbaa !0
>>  %add = add i16 %0, %1 
>> 
>> the volatile load is sequenced after the non-volatile load which
>> means that the non-volatile load can't match the left operand of
the
>> add because this would create a scheduling cycle. This means both
>> loads are selected as load instructions, resulting in use of an extra
>> register and an extra instruction. If I change the behaviour of
>> SelectionDAGBuilder::visitLoad() as described then I get just two
>> instructions. _______________________________________________ LLVM
>> Developers mailing list LLVMdev at cs.uiuc.edu
>> http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> 
> 
> 
> -- 
> Hal Finkel
> Postdoctoral Appointee
> Leadership Computing Facility
> Argonne National Laboratory
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20120814/54546950/attachment.html>

Reasonably Related Threads

Search for more apparently analagous threads

llvm dev - Aug 2012 - [LLVMdev] Load serialisation during selection DAG building

[LLVMdev] Load serialisation during selection DAG building

[LLVMdev] Load serialisation during selection DAG building

[LLVMdev] Load serialisation during selection DAG building

[LLVMdev] Load serialisation during selection DAG building

Reasonably Related Threads