thr3ads.net - llvm dev - [LLVMdev] Proposal for atomic and synchronization instructions [Jul 2007]


Scott Michel

2007-Jul-09 23:38 UTC

[LLVMdev] Proposal for atomic and synchronization instructions

Torvald Riegel wrote:
> On Monday 09 July 2007 19:33, Scott Michel wrote:
>> Torvald Riegel wrote:
>>> Hi,
>>>
>>> I'd like to see support for something like this. I have some comments,
>>> and I think there is existing work that you can reuse.
>> "reuse within the compiler."
>
> within the LLVM compiler framework, to be precise.
>
>>> "While the processor may spin and attempt the atomic operation more than
>>> once before it is successful, research indicates this is extremely
>>> uncommon." I don't understand this sentence, what do you mean?
>> I'm not sure I can pinpoint the paper from which the statement is based,
>> but I seem to recall something similar in the original LL-SC papers
>> (Maurice Herlihy, DEC Western Research Labs?) It's a foundation for
>> lock-free algorithms.
>
> Well, the statement says that often you have low contention. But that's
> something you want, not necessarily something you will get, and depends on
> the workload/algorithm. I'm missing the context. Is the actual statement as
> obvious as that you should try to use the atomic instructions offered by your
> processor, instead of doing blocking algorithms?
As Chandler pointed out, LL/SC isn't blocking. It belongs to the
optimistic concurrency class of constructs. One of the earliest papers
(IIRC, the first paper) on LL/SC was:

Herlihy, M. 1993. A methodology for implementing highly concurrent data
objects. ACM Trans. Program. Lang. Syst. 15, 5 (Nov. 1993), 745-770.
DOI= http://doi.acm.org/10.1145/161468.161469

LL/SC on the various RISC architectures are used for spin locks, but
they don't have to be used that way. I suspect that current work on
software transactional memory is LL/SC-like on memory regions -- if you
look at the paper, there is a chunk of code in the examples that rolls
back or restarts a computation if the SC operation fails.
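
For readers following along, the "restart the computation if the SC fails" pattern described above can be sketched roughly as follows. This is only an illustration, not code from the proposal or from Herlihy's paper: it uses the later-standardized C++11 `std::atomic` compare-and-swap as a stand-in for an LL/SC pair (the two are not equivalent; CAS compares values, while SC fails on any intervening write), and the names `counter` and `add_with_retry` are hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical shared value, used only for this illustration.
std::atomic<int> counter{0};

// Optimistic update: read the current value, compute a new one, and try to
// publish it. If another thread changed the value in between (or the attempt
// fails spuriously, as a store-conditional may), recompute and try again.
// Returns the number of failed attempts, which is 0 when there is no
// contention and no spurious failure.
int add_with_retry(std::atomic<int>& target, int delta) {
    int retries = 0;
    int observed = target.load(std::memory_order_relaxed);  // plays the role of LL
    // On failure, compare_exchange_weak stores the current value into
    // `observed`, so the next attempt recomputes from fresh data.
    while (!target.compare_exchange_weak(observed, observed + delta)) {
        ++retries;  // the "roll back and restart" step
    }
    return retries;
}
```

Nothing here blocks: a thread that loses the race simply repeats its own computation, and the returned retry count is one way to observe how often that actually happens under a given workload.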
> Please have a real look at atomic_ops first. It does have a library part to
> it -- but that's just for a nonblocking stack.
It's a lot like Apple's (and gcc's) work to reconcile the Intel and PPC
vector intrinsics. Nice work but an unnecessary dependency, in my
personal and not so humble opinion.
> Second, I guess there has been some serious effort put into selecting the 
> specific model. So, for example, if you look at some of Hans' published
> slides etc., there are some arguments in favor of associating membars with 
> specific instructions. Do you know reasons why LLVM shouldn't do this?
You mean the papers that don't have to do with garbage collection? :-)

Seriously, I think that's the overall purpose for some of this work so
that llvm can do a better job in instruction-level parallelism.
> Has anyone looked at the memory models that are being in discussion for
> C/C++? Although there is no consensus yet AFAIK, it should be good for
> LLVM to stay close.
Even when consensus is achieved, it still has to be implemented on the
hardware. As you point out, LL/SC is used to create spinlocks. But LL/SC
is somewhat more powerful than that.
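
As a point of comparison, the spinlock use of such primitives can be sketched as below. This is a minimal illustration under assumptions, not anything from the proposal: it uses C++11 `std::atomic_flag` (test-and-set) rather than a real LL/SC pair, and the `SpinLock` type is hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical minimal spinlock built on an atomic test-and-set.
class SpinLock {
public:
    // Single attempt: returns true if the lock was free and is now held.
    bool try_lock() {
        // test_and_set returns the previous state; false means it was free.
        return !flag_.test_and_set(std::memory_order_acquire);
    }

    // Blocking acquire: spin until a single attempt succeeds.
    void lock() {
        while (!try_lock()) {
            // busy-wait; other threads make no progress on this lock
            // until the holder calls unlock()
        }
    }

    void unlock() {
        flag_.clear(std::memory_order_release);
    }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};
```

The difference from a lock-free retry loop is that a waiter here depends on the lock holder making progress, which is what makes this use blocking.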



-scooter

Torvald Riegel

2007-Jul-10 09:57 UTC


[LLVMdev] Proposal for atomic and synchronization instructions

On Tuesday 10 July 2007 01:38, Scott Michel wrote:
> As Chandler pointed out, LL/SC isn't blocking. It belongs to the
> optimistic concurrency class of constructs. One of the earliest papers
> (IIRC, the first paper) on LL/SC was:
>
> Herlihy, M. 1993. A methodology for implementing highly concurrent data
> objects. ACM Trans. Program. Lang. Syst. 15, 5 (Nov. 1993), 745-770.
> DOI= http://doi.acm.org/10.1145/161468.161469
>
> LL/SC on the various RISC architectures are used for spin locks, but
> they don't have to be used that way. I suspect that current work on
> software transactional memory is LL/SC-like on memory regions -- if you
> look at the paper, there is a chunk of code in the examples that rolls
> back or restarts a computation if the SC operation fails.
First of all, I know LL/SC. Did I say it's equivalent to get-and-set? No.
So what are you trying to say, why is the paragraph in the proposal? You 
seem to be speculating about architectures in general in one paragraph. 
IMHO, I wouldn't try that, because I would have to be either imprecise or 
don't state anything new.

> > Please have a real look at atomic_ops first. It does have a library
> > part to it -- but that's just for a nonblocking stack.
>
> It's a lot like Apple's (and gcc's) work to reconcile the Intel and PPC
> vector intrinsics. Nice work but an unnecessary dependency, in my
> personal and not so humble opinion.
So when you are reinventing the wheel, it doesn't give you a dependency on 
the wheel, is that what you're saying?

The idea is to review the atomic_ops model, and if it makes sense, just 
reuse it. (e.g., atomic_ops seems to have (basic?) support for Alpha).

> > Second, I guess there has been some serious effort put into selecting
> > the specific model. So, for example, if you look at some of Hans'
> > published slides etc., there are some arguments in favor of associating
> > membars with specific instructions. Do you know reasons why LLVM
> > shouldn't do this?
>
> You mean the papers that don't have to do with garbage collection? :-)
>
> Seriously, I think that's the overall purpose for some of this work so
> that llvm can do a better job in instruction-level parallelism.
Can I get an answer to the actual question, please?

> > Has anyone looked at the memory models that are being in discussion for
> > C/C++? Although there is no consensus yet AFAIK, it should be good for
> > LLVM to stay close.
>
> Even when consensus is achieved, it still has to be implemented on the
> hardware. As you point out, LL/SC is used to create spinlocks. But LL/SC
> is somewhat more powerful than that.
Well, the proposals actually consider the implementation (and, e.g., 
atomic_ops is an actual implementation). How stupid would be a model that 
doesn't map well to hardware?
And, sorry, I don't see how the particular LL/SC mechanism is related to my 
question at all. Could we stay on topic, please?

torvald

Scott Michel

2007-Jul-10 16:43 UTC


[LLVMdev] Proposal for atomic and synchronization instructions

Torvald Riegel wrote:
> First of all, I know LL/SC. Did I say it's equivalent to get-and-set? No.
> So what are you trying to say, why is the paragraph in the proposal? You
> seem to be speculating about architectures in general in one paragraph.
> IMHO, I wouldn't try that, because I would have to be either imprecise or
> don't state anything new.
I was rebutting your point regarding spin locks going through the loop
once; spinning for more than one iteration is generally rare. And no, I
am not speculating about architectures in general, for that matter. I
simply like LL/SC and think it's superior to most other primitives,
being a matter of good taste.

BTW: It's not my proposal. I merely work with Chandler.
>>> Please have a real look at atomic_ops first. It does have a library
>>> part to it -- but that's just for a nonblocking stack.
>> It's a lot like Apple's (and gcc's) work to reconcile the Intel and PPC
>> vector intrinsics. Nice work but an unnecessary dependency, in my
>> personal and not so humble opinion.
> 
> So when you are reinventing the wheel, it doesn't give you a dependency on
> the wheel, is that what you're saying?
No. I'm not saying that at all. If you actually took a look at LLVM,
you'd notice that it stands alone. It has very few dependencies upon
outside code and it generates __no__ dependencies to outside code or
libraries. From previous experience developing for LLVM, I happen to
know that unnecessary dependencies are not viewed favorably.
> The idea is to review the atomic_ops model, and if it makes sense, just 
> reuse it. (e.g., atomic_ops seems to have (basic?) support for Alpha).
atomic_ops may have interesting ideas on how Chandler might proceed and
implement, but using its code is very unlikely.
>>> Second, I guess there has been some serious effort put into selecting
>>> the specific model. So, for example, if you look at some of Hans'
>>> published slides etc., there are some arguments in favor of associating
>>> membars with specific instructions. Do you know reasons why LLVM
>>> shouldn't do this?
>> You mean the papers that don't have to do with garbage collection? :-)
>>
>> Seriously, I think that's the overall purpose for some of this work so
>> that llvm can do a better job in instruction-level parallelism.
> 
> Can I get an answer to the actual question, please?
Being argumentative for the sake of being argumentative isn't going to
motivate me to answer your question. LLVM is a machine IR and its
instructions have to map to something that exists and has reasonable
properties. Asking for acquire/release, which only exists on one
architecture, doesn't fit the pattern (and is arguably not such a great
idea within that particular implementation.)
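
To make the design question under dispute concrete, here is a rough sketch of the two styles: ordering attached to a specific load or store (acquire/release) versus a standalone barrier next to an otherwise unordered access. It is written with C++11 atomics, which were standardized after this thread, and is not taken from the proposal or from atomic_ops; the `Channel` type and the function names are hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical one-slot mailbox: a plain payload guarded by an atomic flag.
struct Channel {
    int payload = 0;
    std::atomic<bool> ready{false};
};

// Style 1: the ordering constraint is part of the memory operation itself.
void publish_attached(Channel& c, int value) {
    c.payload = value;
    c.ready.store(true, std::memory_order_release);   // store-release
}

int consume_attached(Channel& c) {
    while (!c.ready.load(std::memory_order_acquire)) {}  // load-acquire
    return c.payload;
}

// Style 2: relaxed accesses plus separate, standalone fences.
void publish_fenced(Channel& c, int value) {
    c.payload = value;
    std::atomic_thread_fence(std::memory_order_release);  // standalone barrier
    c.ready.store(true, std::memory_order_relaxed);
}

int consume_fenced(Channel& c) {
    while (!c.ready.load(std::memory_order_relaxed)) {}
    std::atomic_thread_fence(std::memory_order_acquire);  // standalone barrier
    return c.payload;
}
```

Both pairs give the consumer a guaranteed view of `payload` once it observes `ready`. How each style maps onto a given target's instructions is exactly the point being argued in this thread, and the sketch takes no position on it.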
> Could we stay on topic, please?
Bite me.


-scooter
