thr3ads.net - llvm dev - [LLVMdev] Replacing Platform Specific IR Codes with Generic Implementation and Introducing Macro Facilities [May 2014]

David Chisnall

2014-May-10 17:29 UTC

[LLVMdev] Replacing Platform Specific IR Codes with Generic Implementation and Introducing Macro Facilities

On 10 May 2014, at 18:14, Tim Northover <t.p.northover at gmail.com> wrote:
>> The easiest solution would be to extend the cmpxchg instruction with a
>> weak variant.  It is then trivial to map load, modify, weak-cmpxchg to
>> load-linked, modify, store-conditional (that is what weak cmpxchg was
>> intended for in the C[++]11 memory model).
> 
> That would certainly be the easiest. But you'd get less scope for
> optimising control flow around the instructions (say an early return
> on failure or something). I think quite a bit can be done if LLVM
> *really* knows what's going to be going on with these atomic ops on
> LL/SC architectures.
I am not aware of any transforms we'd want to do, other than
microarchitecture-specific ones, that need to know about the difference
between ll-modify-sc and load-modify-weak-cmpxchg.
>>> I don't suppose you have any plans to port Mips to the IR-level LL/SC
>>> expansion? Now that the infrastructure is present it's quite a
>>> simplification (r206490 in ARM64 for example, though you need existing
>>> target-specific intrinsics at the moment). It would be good to iron
>>> out any ARM-specific assumptions I've made.
>>
>> I'd rather avoid it, because doing it that late precludes a lot of
>> optimisations that we're interested in.  I'd much rather extend the IR
>> to support them at a generic level.
> 
> I think you might be misinterpreting what the change actually is.
> Currently the expansion happens post-ISel (emitAtomicBinary and
> friends building the control flow and MachineInstrs directly).
> 
> This moves it to before ISel but still late in the pipeline (actually,
> you could even put it earlier: I didn't because of fears of opaque
> @llvm.arm.ldrex intrinsics pessimising mid-end optimisations).
> Strictly earlier than what happens now, and a reasonable
> stepping-stone to generic load-linked instructions or intrinsics.
The problem is that the optimisations that we're most interested in should
be done by the mid-level optimisers and are architecture agnostic.
> In my experience, CodeGen has improved with the change. ISelDAG gets
> to make use of more information when choosing how to do the operation:
> values already known to be sign/zero extended, immediates, etc.
Yes, it's definitely an improvement in the short term, but I'm not
convinced by the approach in the long term.  It's a useful hack that works
around a shortcoming in the IR, not a solution.

David

Tim Northover

2014-May-10 17:41 UTC

>> In my experience, CodeGen has improved with the change. ISelDAG gets
>> to make use of more information when choosing how to do the operation:
>> values already known to be sign/zero extended, immediates, etc.
>
> Yes, it's definitely an improvement in the short term, but I'm not
> convinced by the approach in the long term.  It's a useful hack that
> works around a shortcoming in the IR, not a solution.
Hmm, so it sounds like you're not actually after an IR-level LL/SC,
but a higher-level "cmpxchg weak". Fair enough, I suppose I'd
envisaged putting that burden on Clang.

Tim.

David Chisnall

2014-May-10 18:01 UTC

On 10 May 2014, at 18:41, Tim Northover <t.p.northover at gmail.com> wrote:
>>> In my experience, CodeGen has improved with the change. ISelDAG gets
>>> to make use of more information when choosing how to do the operation:
>>> values already known to be sign/zero extended, immediates, etc.
>>
>> Yes, it's definitely an improvement in the short term, but I'm not
>> convinced by the approach in the long term.  It's a useful hack that
>> works around a shortcoming in the IR, not a solution.
>
> Hmm, so it sounds like you're not actually after an IR-level LL/SC,
> but a higher-level "cmpxchg weak". Fair enough, I suppose I'd
> envisaged putting that burden on Clang.
Yes.  The weak cmpxchg is what the C[++]11 memory model provides, so there's
a lot of work proving soundness for various transforms involving it.  Once it
gets to pre-codegen IR passes, it's trivial to map a load that's paired
with a weak cmpxchg to a ll / ldrex and the cmpxchg to the sc / strex.  This
could be a generic IR pass that is parameterised with the names of the ll / sc
intrinsics (or even some architecture-agnostic intrinsics for ll / sc, since
they're fairly common), but ideally the optimisation would be on something
that closely resembles the memory model of the source language.  There are also
microarchitectural optimisations that can happen later.
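
For illustration, the load / modify / weak-cmpxchg shape discussed here looks
roughly like this at the C++11 source level.  This is only a minimal sketch:
`fetch_multiply` is a hypothetical helper invented for the example, not
something from LLVM or Clang, and the default (sequentially consistent)
ordering on the exchange is an assumption.

```cpp
#include <atomic>

// Hypothetical helper: atomically multiply `value` by `factor` and return
// the value that was stored before the update.
int fetch_multiply(std::atomic<int>& value, int factor) {
    // The initial load. Under the mapping described above, this is the
    // part that would become ll / ldrex on an LL/SC target.
    int observed = value.load(std::memory_order_relaxed);

    // The weak compare-exchange may fail spuriously, so it sits in a loop.
    // On failure it writes the current contents of `value` into `observed`,
    // and the new value is recomputed from that on the next iteration.
    // This is the part that would become sc / strex.
    while (!value.compare_exchange_weak(observed, observed * factor)) {
        // Nothing to do here: `observed` has already been refreshed.
    }
    return observed;
}
```

Because the loop already retries on any failure, a spurious failure costs
nothing extra, which is why the weak form is the natural fit for this pattern.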

In clang currently, we approximate a weak cmpxchg with a strong cmpxchg, but
that approximation is not quite semantically valid for all architectures (strong
cmpxchg is permitted to block, weak is not) and is not ideal for optimisation
either.
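
As a rough source-level sketch of why the two forms are not interchangeable:
in C++11, `compare_exchange_weak` may fail spuriously while
`compare_exchange_strong` fails only when the stored value differs from the
expected one.  The function names below are hypothetical and exist only for
this example.

```cpp
#include <atomic>

// Hypothetical one-shot claim. There is no retry loop, so the strong form is
// needed: a failure must mean that `claimed` was already true.
bool try_claim(std::atomic<bool>& claimed) {
    bool expected = false;
    return claimed.compare_exchange_strong(expected, true);
}

// Hypothetical retry loop. The loop absorbs spurious failures, so the weak
// form is sufficient here.
void increment(std::atomic<unsigned>& counter) {
    unsigned seen = counter.load(std::memory_order_relaxed);
    while (!counter.compare_exchange_weak(seen, seen + 1)) {
        // `seen` now holds the current value; try again.
    }
}
```

Lowering the call in `increment` as a strong cmpxchg still gives the right
result, but it hides from the optimiser that the caller tolerates spurious
failure.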

David
