thr3ads.net - llvm dev - [LLVMdev] compare and swap [Feb 2008]

Andrew Lenharth

2008-Feb-20 00:51 UTC

[LLVMdev] compare and swap

I was working on compare and swap and ran into the following problem.
Several architectures implement this with a load locked, store
conditional sequence.  This is good, for those archs I can write
generic code to legalize a compare and swap (and most other atomic
ops) to load locked store conditional sequences (then the arch only
had to give the instr for ldl, stc to support all atomic ops (this
applies to mips, arm, ppc, and alpha)).  However, I have to split the
basic block at the CAS instruction and create two more basic blocks.
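[Editor's sketch] To make the control flow concrete, here is a minimal single-threaded C++ model of a compare and swap expanded into a load locked / store conditional retry loop. ToyWord, forced_failures, sc_attempts, and cas_via_llsc are hypothetical names invented for this illustration; they are not part of the patch, of LLVM, or of any real ISA. The comments mark the blocks the expansion needs: the loop must be a branch target, so the code before the CAS and the code after it end up in separate blocks from the two new ones.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of one memory word with an LL/SC reservation (assumption:
// single-threaded, reservation modeled as a flag).
struct ToyWord {
    std::int64_t value = 0;
    bool reserved = false;
    int forced_failures = 0;  // make the next N store conditionals fail
    int sc_attempts = 0;      // how many store conditionals were tried

    // Load locked: read the word and take a reservation on it.
    std::int64_t load_locked() {
        reserved = true;
        return value;
    }

    // Store conditional: writes only if the reservation is still held.
    bool store_conditional(std::int64_t v) {
        ++sc_attempts;
        if (forced_failures > 0) {  // simulate a lost reservation
            --forced_failures;
            reserved = false;
            return false;
        }
        if (!reserved) return false;
        value = v;
        reserved = false;
        return true;
    }
};

// CAS as an LL/SC loop. Returns the value that was loaded from memory.
inline std::int64_t cas_via_llsc(ToyWord &w, std::int64_t expected,
                                 std::int64_t desired) {
    std::int64_t old;
    do {
        // new block "loop": load locked, compare, branch to exit on mismatch
        old = w.load_locked();
        if (old != expected) break;
        // new block "store": store conditional, branch back to "loop" on failure
    } while (!w.store_conditional(desired));
    // block "exit": the rest of the original basic block continues here
    return old;
}
```

On a real target the two branches are machine branches between basic blocks, which is why the expansion cannot stay inside a single block.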

This isn't currently possible during legalize, nor during the initial
SelectionDAG formation  (the tricks switch lowering uses only work for
terminator instructions).

Anyone have an idea?  The patch as it stands is attached below.  X86
is a pseudo instruction because the necessary ones and prefixes aren't
in the code gen yet, but I would imagine they will be (so ignore that
ugliness).  The true ugliness can be seen in the alpha impl which open
codes it, including a couple relative branches.  The code sequence for
alpha is identical to ppc, mips, and arm, so it would be nice to lower
these to the correct sequences before code gen rather than splitting
(or hiding as I did here) basic blocks after code gen.

Andrew
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lcs.patch
Type: text/x-diff
Size: 15127 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20080219/9554a14a/attachment.patch>

Evan Cheng

2008-Feb-20 01:33 UTC

[LLVMdev] compare and swap

The current *hack* solution is to mark your pseudo instruction with  
usesCustomDAGSchedInserter = 1. That allows the targets to expand it  
at scheduling time by providing an EmitInstrWithCustomInserter() hook.
You can create new basic blocks then.
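[Editor's sketch] The shape of such an expansion can be illustrated with a toy model. The code below does not use the real LLVM classes or the actual hook signature; ToyBlock, expand_cas_pseudo, the "CAS_PSEUDO" marker, and the instruction strings are all hypothetical. It only shows the block surgery: the original block is split at the pseudo instruction, and two new blocks (the load locked loop and the store conditional) are placed between the halves.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a machine basic block (NOT LLVM's class).
struct ToyBlock {
    std::string name;
    std::vector<std::string> instrs;
    std::vector<std::string> successors;
};

// Expand the pseudo instruction at position cas_idx of bb into four blocks:
// head (code before the CAS), loop, store, and tail (code after the CAS).
inline std::vector<ToyBlock> expand_cas_pseudo(const ToyBlock &bb,
                                               std::size_t cas_idx) {
    assert(cas_idx < bb.instrs.size() && bb.instrs[cas_idx] == "CAS_PSEUDO");
    const auto split = bb.instrs.begin() + static_cast<std::ptrdiff_t>(cas_idx);

    const std::string loop_name = bb.name + ".loop";
    const std::string store_name = bb.name + ".store";
    const std::string tail_name = bb.name + ".tail";

    // Head keeps everything before the pseudo and falls through to the loop.
    ToyBlock head{bb.name, {bb.instrs.begin(), split}, {loop_name}};

    // Loop: load locked, compare, leave the loop if the values differ.
    ToyBlock loop{loop_name,
                  {"load_locked", "compare", "branch_if_not_equal tail"},
                  {store_name, tail_name}};

    // Store: store conditional, retry from the loop if it failed.
    ToyBlock store{store_name,
                   {"store_conditional", "branch_if_failed loop"},
                   {loop_name, tail_name}};

    // Tail takes the remaining instructions and the original successors.
    ToyBlock tail{tail_name, {split + 1, bb.instrs.end()}, bb.successors};

    return {head, loop, store, tail};
}
```

For example, expanding a block "entry" holding {"a", "CAS_PSEUDO", "b"} yields entry, entry.loop, entry.store, and entry.tail, with "b" and the original successors moved to the tail.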

Evan

On Feb 19, 2008, at 4:51 PM, Andrew Lenharth wrote:
> I was working on compare and swap and ran into the following problem.
> Several architectures implement this with a load locked, store
> conditional sequence.  This is good, for those archs I can write
> generic code to legalize a compare and swap (and most other atomic
> ops) to load locked store conditional sequences (then the arch only
> had to give the instr for ldl, stc to support all atomic ops (this
> applies to mips, arm, ppc, and alpha)).  However, I have to split the
> basic block at the CAS instruction and create two more basic blocks.
>
> This isn't currently possible during legalize, nor during the initial
> SelectionDAG formation  (the tricks switch lowering uses only work for
> terminator instructions).
>
> Anyone have an idea?  The patch as it stands is attached below.  X86
> is a pseudo instruction because the necessary ones and prefixes aren't
> in the code gen yet, but I would imagine they will be (so ignore that
> ugliness).  The true ugliness can be seen in the alpha impl which open
> codes it, including a couple relative branches.  The code sequence for
> alpha is identical to ppc, mips, and arm, so it would be nice to lower
> these to the correct sequences before code gen rather than splitting
> (or hiding as I did here) basic blocks after code gen.
>
> Andrew
> <lcs.patch>_______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Andrew Lenharth

2008-Feb-20 01:41 UTC

[LLVMdev] compare and swap

On 2/19/08, Evan Cheng <evan.cheng at apple.com> wrote:
> The current *hack* solution is to mark your pseudo instruction with
> usesCustomDAGSchedInserter = 1. That allows the targets to expand it
> at scheduling time by providing an EmitInstrWithCustomInserter() hook.
> You can create new basic blocks then.

I guess that can work in the short term.  It just seems wasteful for
each target that uses ldl/stc sequences to have to all implement it.
But if that is what we can do right now, I'll give that a shot.

Thanks,

Andrew

Torvald Riegel

2008-Feb-21 09:19 UTC

[LLVMdev] compare and swap

On Wednesday 20 February 2008 01:51, Andrew Lenharth wrote:
> Anyone have an idea?  The patch as it stands is attached below.  X86
> is a pseudo instruction because the necessary ones and prefixes aren't
> in the code gen yet, but I would imagine they will be (so ignore that
> ugliness).  The true ugliness can be seen in the alpha impl which open
> codes it, including a couple relative branches.  The code sequence for
> alpha is identical to ppc, mips, and arm, so it would be nice to lower
> these to the correct sequences before code gen rather than splitting
> (or hiding as I did here) basic blocks after code gen.
Andrew,

why is the intrinsic name not CAS? And having another version that returns 
just a bool might be better in some cases ( 1. does  CAS return the value on 
all architectures? 2. you can just jump based on a flag and don't need to 
compare it again). Just my 2 cents though ...

torvald

Andrew Lenharth

2008-Feb-21 17:34 UTC

[LLVMdev] compare and swap

On 2/21/08, Torvald Riegel <torvald at se.inf.tu-dresden.de> wrote:
> why is the intrinsic name not CAS? And having another version that returns
> just a bool might be better in some cases ( 1. does  CAS return the value on
> all architectures? 2. you can just jump based on a flag and don't need to
> compare it again). Just my 2 cents though ...

I was going from chandler's docs, but it could be renamed trivially
(and I almost did at several points).

1) yes, but on some it may be easier to have a bool version than others.
2.a) to get the bool, the x86 (and some others) backend would have to
generate the compare instruction anyway, so you don't save anything by
having a bool version.
2.b) in the case of a load locked store conditional based backend, the
bool version would save a compare if the store conditional has the
typical semantics of returning success or failure.

So, yes, a CAS that returned bool could be useful.  However, it is
pretty easy to pattern match
CAS -> Compare
in those backends that can save the compare by testing the result of
the store conditional.
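
[Editor's sketch] The relationship between the two flavors can be shown with standard C++ atomics rather than the LLVM intrinsic under discussion. cas_value and cas_bool are hypothetical helper names; std::atomic and compare_exchange_strong are the standard library facilities. The bool form is just the value form followed by a compare, which is the pattern a backend could match and fold away when its store conditional already reports success.

```cpp
#include <atomic>
#include <cassert>

// Value-returning CAS: returns what was in memory before the operation.
// If that equals `expected`, `desired` has been stored.
inline int cas_value(std::atomic<int> &mem, int expected, int desired) {
    // On failure, compare_exchange_strong overwrites `expected` with the
    // value it observed; on success `expected` already equals the old value.
    mem.compare_exchange_strong(expected, desired);
    return expected;
}

// Bool-returning CAS built on top: the CAS followed by one compare.
inline bool cas_bool(std::atomic<int> &mem, int expected, int desired) {
    return cas_value(mem, expected, desired) == expected;
}
```

A caller that only branches on success uses cas_bool; a caller that needs the observed value (for example, to retry with it) uses cas_value.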

Andrew

Chandler Carruth

2008-Feb-21 18:19 UTC

[LLVMdev] compare and swap

Torvald Riegel wrote:
> On Wednesday 20 February 2008 01:51, Andrew Lenharth wrote:
>> Anyone have an idea?  The patch as it stands is attached below.  X86
>> is a pseudo instruction because the necessary ones and prefixes aren't
>> in the code gen yet, but I would imagine they will be (so ignore that
>> ugliness).  The true ugliness can be seen in the alpha impl which open
>> codes it, including a couple relative branches.  The code sequence for
>> alpha is identical to ppc, mips, and arm, so it would be nice to lower
>> these to the correct sequences before code gen rather than splitting
>> (or hiding as I did here) basic blocks after code gen.
> 
> Andrew,
> 
> why is the intrinsic name not CAS?
Because, fundamentally, it loads, compares, and conditionally stores. 
There is no concept of a "swap" in SSA, so removing that aspect of the
atomic primitives makes the *LLVM* representation easier to understand.
> And having another version that returns
> just a bool might be better in some cases ( 1. does  CAS return the value on
> all architectures?
Check the page (http://chandlerc.net/llvm_atomics.html -- the 
implementation info is still current, even though the docs are not) for 
how this gets implemented. As Andrew has already pointed out, on x86, 
the LLVM behavior maps to the underlying architecture. Other 
architectures which might avoid a compare can easily do so by pattern 
matching in the codegen. I'm not saying this is a 100% correct mapping for
all architectures, but it seems very clean and does not seem to introduce
performance issues on any.
> 2. you can just jump based on a flag and don't need to 
> compare it again). Just my 2 cents though ...
Again, pattern matching can enable the architectures which don't need to 
compare again, to in fact not do so, but some architectures will *need* 
to compare again in order to determine the bool value.

My strongest feeling is that "swap" has no place in an SSA IR, and the
idea of atomically loading, comparing, and storing is far more in 
keeping. In fact, I thought the "swap" intrinsic had even been renamed
to "ls" for load-store at some point this summer. Do you have those
changes Andrew? In any event, those are the reasons I had for moving 
away from "swap" in the design process, and as Andrew said he was 
primarily basing the implementation on that work.

-Chandler
> 
> torvald
