Paweł Batko via llvm-dev
2017-Nov-20 21:17 UTC
[llvm-dev] Meaning of loads/stores marked both atomic and volatile
Hi Tim,

On 20 November 2017 at 16:41, Tim Northover <t.p.northover at gmail.com> wrote:
> There are only a couple of valid uses for volatile these days

Do you mean volatile used alone or also the combination 'atomic volatile'?

I think that 'atomic volatile' is very useful. Consider the following pseudo-code examples, where all loads and stores are atomic (with some memory ordering constraints) but not volatile.

Example 1.

    // shared variable
    int i = 0;

    // call this method from 10 threads
    void foo(){
        int i = rand() % 2;
        int j = i;
        while(i == j){
            printf("In the loop\n");
        }
    }

I claim that the loop can be optimized to an infinite loop by a compiler, because apparently j == i at all times in a single-threaded program. If the loads and stores (particularly the read in the loop predicate) were also marked as volatile, that would not be possible. Is this correct?

Example 2.

    // shared variable
    int i = 0;

    void signalHandler(){
        i = 1;
    }

    void main(){
        while(i == 0){
            printf("In the loop\n");
        }
    }

Here I also claim that the loop can be optimized into an infinite loop if volatile is not used. Is this correct?

--
Paweł Batko
Tim Northover via llvm-dev
2017-Nov-21 07:53 UTC
[llvm-dev] Meaning of loads/stores marked both atomic and volatile
On 20 November 2017 at 21:17, Paweł Batko <pawel.batko at gmail.com> wrote:
>> There are only a couple of valid uses for volatile these days
> Do you mean volatile used alone or also the combination 'atomic volatile'?

Volatile alone.

> Example 1.
>
>     // shared variable
>     int i = 0;
>
>     // call this method from 10 threads
>     void foo(){
>         int i = rand() % 2;
>         int j = i;
>         while(i == j){
>             printf("In the loop\n");
>         }
>     }
>
> I claim that the loop can be optimized to an infinite loop by a
> compiler, because apparently j == i at all times in a single threaded
> program.

The global variable i is shadowed by the local there, and I can't be sure exactly what you intended, so I won't comment on it directly.

But in general terms, atomic LLVM operations with at least "monotonic" ordering forbid unrestricted store-forwarding within a thread (which I think would be the first step in eliminating the loop). See https://llvm.org/docs/LangRef.html#atomic-memory-ordering-constraints where it's explicitly called out: "If an address is written monotonic-ally by one thread, and other threads monotonic-ally read that address repeatedly, the other threads must eventually see the write."

> Example 2.
>
>     // shared variable
>     int i = 0;
>
>     void signalHandler(){
>         i = 1;
>     }
>
>     void main(){
>         while(i == 0){
>             printf("In the loop\n");
>         }
>     }
>
> Here I also claim that the loop can be optimized into an infinite loop
> if volatile is not used.

This is an interesting one. Monotonic atomic is again sufficient to synchronize with another thread (or signal handler, I'd argue). But if this is a signal handler within a thread, then that is actually one of the other valid uses of volatile in C (nearly; it has to be a sig_atomic_t too). In LLVM IR I think you'd use an atomic with syncscope("singlethread") for that instead.

Cheers.

Tim.
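The single-thread signal-handler case mentioned above (volatile plus sig_atomic_t in C) can be sketched as follows. This is a minimal illustration of Example 2 and not code from the thread: the names `flag`, `signal_handler` and `spin_until_signalled` are hypothetical, the printf in the loop body is dropped, and the signal is raised from inside the loop (after `raise_after` iterations) only so that the sketch terminates deterministically. In the original example the signal would arrive asynchronously.

```c
#include <signal.h>

/* Flag written by the handler and read by the interrupted code in the
 * same thread. All names in this sketch are illustrative. */
static volatile sig_atomic_t flag = 0;

static void signal_handler(int signo) {
    (void)signo;
    flag = 1;   /* the only thing the handler does */
}

/* Spins until the handler has set the flag and returns the number of
 * loop iterations, or -1 if the handler could not be installed.
 * Assumption of this sketch: the loop raises the signal itself once it
 * has run `raise_after` iterations. With raise_after <= 0 it would wait
 * for an externally delivered SIGINT instead. */
int spin_until_signalled(int raise_after) {
    flag = 0;
    if (signal(SIGINT, signal_handler) == SIG_ERR) {
        return -1;
    }
    int spins = 0;
    while (flag == 0) {          /* volatile: flag is re-read every time */
        spins++;
        if (spins == raise_after) {
            raise(SIGINT);       /* handler runs before raise() returns */
        }
    }
    return spins;
}
```

Because `flag` is volatile, the compiler has to perform the read in the loop condition on every iteration, so the write made by the handler is observed and the loop exits.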
Paweł Batko via llvm-dev
2017-Nov-21 10:27 UTC
[llvm-dev] Meaning of loads/stores marked both atomic and volatile
On 21 November 2017 at 08:53, Tim Northover <t.p.northover at gmail.com> wrote:
> The global variable i is shadowed by the local there and I can't be
> sure exactly what you intended so I won't comment on it directly.

My mistake. I intended it to be a write to the shared variable i. Let me fix it.

Example 1.

    // shared variable
    int i = 0;

    // call this method from 10 threads
    void foo(){
        i = rand() % 2; // was: int i = rand() % 2;
        int j = i;
        while(i == j){
            printf("In the loop\n");
        }
    }

> But in general terms atomic LLVM operations with at least "monotonic"
> ordering forbid unrestricted store-forwarding within a thread

> "If an address is written
> monotonic-ally by one thread, and other threads monotonic-ally read
> that address repeatedly, the other threads must eventually see the
> write."

OK, let's say that in Example 1 monotonic atomic prevents the loop optimization because the compiler assumes the existence of other threads. (Does the compiler effectively assume the existence of other threads the moment one starts using loads/stores that are at least monotonic atomic?)

>> Example 2.
>>
>>     // shared variable
>>     int i = 0;
>>
>>     void signalHandler(){
>>         i = 1;
>>     }
>>
>>     void main(){
>>         while(i == 0){
>>             printf("In the loop\n");
>>         }
>>     }
>
> This is an interesting one. Monotonic atomic is again sufficient to
> synchronize with another thread (or signal handler I'd argue).

In Example 2, let's consider the signal handler in a single-thread situation. If monotonic atomic prevents the loop optimization in Example 1, then I say it does the same in Example 2. That is because the compiler cannot know that the 'signalHandler' function is a signal handler, so it must assume it might be executed in another thread.

--
Paweł Batko
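For reference, the corrected Example 1 can be written out with explicit C11 atomics. C11's `memory_order_relaxed` is the counterpart of LLVM's "monotonic" ordering, so this is roughly the source-level form of the accesses being discussed. It is a sketch under assumptions, not code from the thread: the name `shared_i` and the `max_spins` bound are hypothetical additions (the bound only exists so the function returns when no other thread ever writes a different value), and the printf is omitted.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Shared variable from Example 1, renamed to avoid the shadowing issue. */
static atomic_int shared_i = 0;

/* Corrected Example 1 with every access to the shared variable spelled
 * out as a relaxed ("monotonic" in LLVM terms) atomic operation.
 * Returns the number of loop iterations performed. `max_spins` is an
 * assumption of this sketch, not part of the original example. */
int foo(int max_spins) {
    /* i = rand() % 2; */
    atomic_store_explicit(&shared_i, rand() % 2, memory_order_relaxed);

    /* int j = i;  (may already see another thread's write) */
    int j = atomic_load_explicit(&shared_i, memory_order_relaxed);

    int spins = 0;
    /* while (i == j): the atomic load is performed on every iteration,
     * so a write by another thread can end the loop. */
    while (atomic_load_explicit(&shared_i, memory_order_relaxed) == j) {
        if (++spins >= max_spins) {
            break;  /* give up; nobody changed the value */
        }
    }
    return spins;
}
```

Called from a single thread, nothing ever changes `shared_i`, so `foo(5)` runs the loop until the bound is reached. Called concurrently from several threads, a thread leaves the loop as soon as it observes a value different from its own `j`.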