thr3ads.net - llvm dev - [LLVMdev] Why can't atomic loads and stores handle floats? [May 2014]

Philip Reames

2014-May-26 18:49 UTC

[LLVMdev] Why can't atomic loads and stores handle floats?

David provided one good answer.  I'll give another.

The current design pushes complexity into the language frontend for, as
far as I know, no good reason.  I can say from recent experience that
the corner cases around atomics are surprising and result in odd-looking
hacks in the frontend.  To put it differently: why should
marking loads and stores atomic require me to rewrite largish chunks of
code around the load or store?  There's nothing "wrong" per se with that
design, but why complicate a bunch of frontends when a single IR-level
desugaring pass could perform the same logic?
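
To make the frontend-side workaround concrete, here is a minimal C++ sketch of the pattern a frontend effectively has to produce today: an integer atomic access plus a bit cast. The helper names (float_to_bits, bits_to_float, atomic_load_float, atomic_store_float) are hypothetical and chosen only for illustration, and the sketch assumes float is 32 bits wide.

```cpp
#include <atomic>
#include <cstdint>
#include <cstring>

// Assumption of this sketch: float and uint32_t have the same size.
static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Reinterpret a float as its 32-bit pattern (the source-level analogue of
// an IR bitcast from float to i32).
inline std::uint32_t float_to_bits(float value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}

// Reinterpret a 32-bit pattern as a float (bitcast from i32 to float).
inline float bits_to_float(std::uint32_t bits) {
    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}

// Atomic float load written as an integer atomic load followed by a cast.
inline float atomic_load_float(
        const std::atomic<std::uint32_t>& slot,
        std::memory_order order = std::memory_order_seq_cst) {
    return bits_to_float(slot.load(order));
}

// Atomic float store written as a cast followed by an integer atomic store.
inline void atomic_store_float(
        std::atomic<std::uint32_t>& slot, float value,
        std::memory_order order = std::memory_order_seq_cst) {
    slot.store(float_to_bits(value), order);
}
```

Every access site needs this cast wrapper around the actual load or store, which is the extra frontend code the paragraph above complains about.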

Another answer would be that bitcasts make the IR less readable.  They
consume memory.  Unless handled carefully, they inhibit optimizations
(e.g. if you forget to strip casts in a peephole optimization).  When
dealing with large IR files from a language where *every* field access
is atomic "unordered", the first two are particularly important.

p.s.  I'm currently operating under the assumption that there is no
*technical* reason LLVM could not represent atomic loads and stores on
floating point types.  If this is not true, please correct me.

Philip

On 05/24/2014 03:18 PM, Filip Pizlo wrote:
> What is the downside of the currently generated IR?  There ain't
> nothin' wrong with bitcasts, IMO.
>
> -Filip
>
> On May 24, 2014, at 2:17 PM, Philip Reames <listmail at philipreames.com> wrote:
>
>> Looking through the documentation, I discovered that atomic loads and 
>> stores are only supported for integer types.  Can anyone provide some 
>> background on this?  Why is this true?
>>
>> Currently, given code:
>> std::atomic<float> aFloat;
>> void foo() {
>>   float f = atomic_load(&aFloat);
>>   ..
>> }
>>
>> Clang generates code like:
>> %"struct.std::atomic.2" = type { float }
>> @aFloat = global %"struct.std::atomic.2" zeroinitializer, align 4
>>
>> define void @foo() {
>>   %1 = load atomic i32* bitcast (%"struct.std::atomic.2"* @aFloat to i32*) seq_cst, align 4
>>   %2 = bitcast i32 %1 to float
>>   ...
>> }
>>
>> This seems less than ideal.  I would expect that we might have to 
>> desugar floats into integer & cast operations in the backend, but why
>> is this imposed on the frontend?
>>
>> More generally, is there anyone who is knowledgeable and/or working 
>> on atomics and synchronization in LLVM?  I've got a number of 
>> questions w.r.t. semantics and have found a number of what I believe 
>> to be missed optimizations.  I'm happy to file the latter, but I'd
>> like to talk them over with a knowledgeable party first.
>>
>> Philip

Tim Northover

2014-May-26 19:51 UTC

> There's nothing "wrong" per se with that design, but why
> complicate a bunch of frontends when a single IR-level desugaring pass
> could perform the same logic?

I quite like this idea. It could give David his atomic ops where an
integer really can't do the right thing, and isn't just shunting the
burden onto all of the backends. Some restrictions would still be
needed. A "load atomic [1000 x i64]* %addr" is just being cheeky.

The biggest issue I see is modelling the legal loads for a target. For
example AArch64 probably has "legal" monotonic loads for most sane
types, in the sense that they can be implemented in the same way as
non-atomic ones. But there's no "ldar s0, [addr]", and you can't
simply replace an atomic load with a normal load even in the weaker
cases because you have no say in what passes run after your shiny
expansion pass.

With appropriate target hooks, I think it could be made to work.
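
As a rough illustration of what such a target hook might look like, here is a self-contained C++ sketch. Everything in it is hypothetical: the names (TargetAtomicInfo, classifyLoad, AtomicLoadAction and the rest) are invented for this example and are not LLVM's API, and the AArch64-like settings in the usage note simply restate the description above rather than a checked fact about the architecture.

```cpp
// Hypothetical model of a per-target query; not LLVM's actual interface.
enum class Ordering { Unordered, Monotonic, Acquire, SeqCst };
enum class ValueKind { Integer, Float, Aggregate };

enum class AtomicLoadAction {
    Legal,          // keep the atomic load as-is; the backend can select it
    CastToInteger,  // rewrite as an integer atomic load plus a bitcast
    LibCall         // fall back to an __atomic_* runtime call
};

struct AtomicLoadQuery {
    ValueKind kind;
    unsigned bits;      // size of the loaded value in bits
    Ordering ordering;
};

struct TargetAtomicInfo {
    unsigned maxNativeAtomicBits;  // widest access the target performs atomically
    bool hasOrderedFPLoad;         // is there an acquire/seq_cst load into FP registers?

    AtomicLoadAction classifyLoad(const AtomicLoadQuery& q) const {
        // Oversized or aggregate loads are not handled natively in this model.
        if (q.kind == ValueKind::Aggregate || q.bits > maxNativeAtomicBits)
            return AtomicLoadAction::LibCall;
        if (q.kind == ValueKind::Integer)
            return AtomicLoadAction::Legal;
        // Floating point: weak orderings can use the ordinary FP load
        // instruction, but the IR load itself stays atomic so that later
        // passes still see the ordering constraint.
        const bool weak = q.ordering == Ordering::Unordered ||
                          q.ordering == Ordering::Monotonic;
        if (weak || hasOrderedFPLoad)
            return AtomicLoadAction::Legal;
        return AtomicLoadAction::CastToInteger;
    }
};
```

With an AArch64-like description such as TargetAtomicInfo{64, false}, a monotonic float load is classified as Legal, an acquire float load as CastToInteger, and a load of [1000 x i64] as LibCall.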

Cheers.

Tim.

David Chisnall

2014-May-26 20:26 UTC

On 26 May 2014, at 20:51, Tim Northover <t.p.northover at gmail.com> wrote:
> I quite like this idea. It could give David his atomic ops where an
> integer really can't do the right thing, and isn't just shunting the
> burden onto all of the backends. Some restrictions would still be
> needed. A "load atomic [1000 x i64]* %addr" is just being cheeky.

Currently, the frontend will have to lower these to calls to the __atomic
functions, but on some architectures there's no technical reason for this.
Haswell and newer Intel chips *can* implement atomic loads of 1000 x i64: with
the transactional extensions, the limit for loads is very large (the limit for
atomic writes is around 30KB).

As transactional memory becomes more common, large atomicrmw operations become
possible, but LLVM IR can't meaningfully express them.  Currently, two
architectures in LLVM support hardware transactional memory in some form: x86
and BlueGene/Q.

One of the biggest issues I face implementing the back end for our architecture
is LLVM's willingness, both in mapping to IR and then to SelectionDAG, to
throw away information that is not yet meaningful to existing back ends.
Let's try not to make that any worse for future back-end authors.  Lots of
people are trying to use LLVM for custom ASICs, and at EuroLLVM there were a
number of people who had encountered similar problems in exactly this area.

David

Philip Reames

2014-May-27 16:30 UTC

On 05/26/2014 12:51 PM, Tim Northover wrote:
>> There's nothing "wrong" per se with that design, but why
>> complicate a bunch of frontends when a single IR-level desugaring pass
>> could perform the same logic?
> I quite like this idea. It could give David his atomic ops where an
> integer really can't do the right thing, and isn't just shunting the
> burden onto all of the backends. Some restrictions would still be
> needed. A "load atomic [1000 x i64]* %addr" is just being cheeky.
>
> The biggest issue I see is modelling the legal loads for a target. For
> example AArch64 probably has "legal" monotonic loads for most sane
> types, in the sense that they can be implemented in the same way as
> non-atomic ones. But there's no "ldar s0, [addr]", and you can't
> simply replace an atomic load with a normal load even in the weaker
> cases because you have no say in what passes run after your shiny
> expansion pass.
>
> With appropriate target hooks, I think it could be made to work.

I'm willing to take this on.  It's not going to happen immediately, but
cleaning up the code in my frontend is worth the work.

It seems like the logical place for this would either be in CodeGenPrep 
or SelectionDAGBuilder.  Does anyone know of any reason why it would 
need to be done earlier?
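
For readers who want a picture of the rewrite such a pass would perform, here is a toy C++ sketch over a made-up miniature IR. It is not LLVM code: the Inst struct, the Op enum, and desugarAtomicFPLoads are hypothetical names, types are plain strings, and the matching pointer bitcast on the address operand is left out to keep the example short.

```cpp
#include <string>
#include <vector>

// Toy instruction set for illustration only.
enum class Op { AtomicLoad, Bitcast };

struct Inst {
    Op op;
    std::string result;   // name of the value this instruction defines
    std::string type;     // type of the result, e.g. "float" or "i32"
    std::string operand;  // address for loads, source value for bitcasts
};

// Same-width integer type for a floating point type; empty if the type
// needs no rewriting in this toy model.
inline std::string integerTypeFor(const std::string& type) {
    if (type == "float") return "i32";
    if (type == "double") return "i64";
    return "";
}

// Rewrite every atomic FP load into an atomic integer load followed by a
// bitcast back to the original type. Other instructions pass through.
inline std::vector<Inst> desugarAtomicFPLoads(const std::vector<Inst>& in) {
    std::vector<Inst> out;
    for (const Inst& inst : in) {
        const std::string intType =
            inst.op == Op::AtomicLoad ? integerTypeFor(inst.type) : "";
        if (intType.empty()) {
            out.push_back(inst);
            continue;
        }
        const std::string tmp = inst.result + ".int";
        out.push_back({Op::AtomicLoad, tmp, intType, inst.operand});
        // The bitcast keeps the original result name, so later users of the
        // loaded value need no changes.
        out.push_back({Op::Bitcast, inst.result, inst.type, tmp});
    }
    return out;
}
```

Given a single atomic load of type float from @aFloat defining %f, the output is an atomic i32 load defining %f.int followed by a bitcast defining %f, which mirrors the shape of the IR that Clang emits by hand today.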

I'm going to ignore the generalized transactional-memory use cases for 
the moment.  I want to stick to the subset of features which have fairly 
wide support across platforms.  Honestly, the transactional memory bits 
feel like they should be solved differently anyways.

Philip
