On Fri, Nov 25, 2016 at 5:17 PM, Peter Zijlstra <peterz at infradead.org> wrote:>> > What are use cases for such primitive that won't be OK with "read once >> > _and_ atomically"? >> >> I have none to hand. > > Whatever triggers the __builtin_memcpy() paths, and even the size==8 > paths on 32bit. > > You could put a WARN in there to easily find them. > > The advantage of introducing the SINGLE_{LOAD,STORE}() helpers is that > they compiletime validate this the size is 'right' and can runtime check > alignment constraints. > > IE, they are strictly stronger than {READ,WRITE}_ONCE().Uh, so, READ/WRITE_ONCE are non-atomic now. I missed that. If READ/WRITE_ONCE are non-atomic, half of kernel is broken. All these loads of flags, ringbuffer positions, pointers, etc are broken. What about restoring READ/WRITE_ONCE as atomic, and introducing separate primitives for _non_ atomic loads/stores? It seems to me that there is just a dozen of cases that don't need atomicity and where performance is any important (though, some of these should probably try to write to shared memory less frequently and save hundreds of cycles, rather than try to save few cycles on local instructions). I've compiled kernel with restored size checks in READ/WRITE/ACCESS_ONCE and the following places seem to expect that access is actually atomic (while it is not). But if we don't guarantee that word-sized READ/WRITE_ONCE are atomic, then I am sure we can find a hundred more of broken places. arch/x86/entry/vdso/vdso32/../vclock_gettime.c:297:18: note: in expansion of macro ?ACCESS_ONCE? time_t result = ACCESS_ONCE(gtod->wall_time_sec); kernel/events/ring_buffer.c:160:10: error: call to ?__compiletime_assert_160? declared with attribute error: Need native word sized stores/loads for atomicity. tail = READ_ONCE(rb->user_page->data_tail); kernel/events/core.c:5145:16: error: call to ?__compiletime_assert_5145? declared with attribute error: Need native word sized stores/loads for atomicity. aux_offset = ACCESS_ONCE(rb->user_page->aux_offset); ^ kernel/events/core.c:5146:14: error: call to ?__compiletime_assert_5146? declared with attribute error: Need native word sized stores/loads for atomicity. aux_size = ACCESS_ONCE(rb->user_page->aux_size); drivers/cpufreq/cpufreq_governor.c:283:8: error: call to ?__compiletime_assert_283? declared with attribute error: Need native word sized stores/loads for atomicity. lst = READ_ONCE(policy_dbs->last_sample_time); ^ drivers/cpufreq/cpufreq_governor.c:301:7: error: call to ?__compiletime_assert_301? declared with attribute error: Need native word sized stores/loads for atomicity. if (unlikely(lst != READ_ONCE(policy_dbs->last_sample_time))) { net/core/gen_estimator.c:136:3: error: call to ?__compiletime_assert_136? declared with attribute error: Need native word sized stores/loads for atomicity. WRITE_ONCE(e->rate_est->bps, (e->avbps + 0xF) >> 5); ^ net/core/gen_estimator.c:142:3: error: call to ?__compiletime_assert_142? declared with attribute error: Need native word sized stores/loads for atomicity. WRITE_ONCE(e->rate_est->pps, (e->avpps + 0xF) >> 5); fs/proc_namespace.c:28:10: error: call to ?__compiletime_assert_28? declared with attribute error: Need native word sized stores/loads for atomicity. event = ACCESS_ONCE(ns->event); drivers/md/dm-stats.c:700:32: error: call to ?__compiletime_assert_700? declared with attribute error: Need native word sized stores/loads for atomicity. shared->tmp.sectors[READ] += ACCESS_ONCE(p->sectors[READ]); ^ drivers/md/dm-stats.c:701:33: error: call to ?__compiletime_assert_701? declared with attribute error: Need native word sized stores/loads for atomicity. shared->tmp.sectors[WRITE] += ACCESS_ONCE(p->sectors[WRITE]); ^
On Fri, Nov 25, 2016 at 06:28:53PM +0100, Dmitry Vyukov wrote:> On Fri, Nov 25, 2016 at 5:17 PM, Peter Zijlstra <peterz at infradead.org> wrote: > >> > What are use cases for such primitive that won't be OK with "read once > >> > _and_ atomically"? > >> > >> I have none to hand. > > > > Whatever triggers the __builtin_memcpy() paths, and even the size==8 > > paths on 32bit. > > > > You could put a WARN in there to easily find them. > > > > The advantage of introducing the SINGLE_{LOAD,STORE}() helpers is that > > they compiletime validate this the size is 'right' and can runtime check > > alignment constraints. > > > > IE, they are strictly stronger than {READ,WRITE}_ONCE(). > > Uh, so, READ/WRITE_ONCE are non-atomic now. I missed that.Yes, but *only* for types larger than word size. That has *always* been the case. It's still assumed that *_ONCE are single-copy-atomic for word size (or smaller). Some architectures may also provide that guarnatee for accesses larger than word size (e.g. 32-bit ARM w/ LPAE). ... It's just that as things stand we can't put checks in *_ONCE() for the access size, since they're *also* used for larger accesses that don't need atomicity.> If READ/WRITE_ONCE are non-atomic, half of kernel is broken. All these > loads of flags, ringbuffer positions, pointers, etc are broken.Most of these will be fine, as above.> What about restoring READ/WRITE_ONCE as atomic, and introducing > separate primitives for _non_ atomic loads/stores?Having a separate *_ONCE_TEARABLE() would certainly limit the number of things we have to fix up, and would also make it clear that atomicity is not expected. ... but we might have to go with SINGLE_*() if we can't convince Linus. Thanks, Mark.
On Fri, Nov 25, 2016 at 9:28 AM, Dmitry Vyukov <dvyukov at google.com> wrote: On Fri, Nov 25, 2016 at 5:17 PM, Peter Zijlstra <peterz at infradead.org> wrote:>> >> IE, they are strictly stronger than {READ,WRITE}_ONCE().No, they are strictly bullshit. Stop this idiocy. We went through this once already.> Uh, so, READ/WRITE_ONCE are non-atomic now. I missed that.No. READ/WRITE_ONCE() are atomic *WHEN*THAT*IS*POSSIBLE*. So for something that fits in a register, it will read it in one atomic access. For something that fits in a register and is _possible_ to write atomically, it will do so. But sometimes it's not going to be atomic. We do not for a moment try to make multi-word accesses be atomic. Not even if you could try to use some magic cmpxchg16b thing. It's not "atomic" in that sense: it will be doing multiple accesses. Similarly, if you try to write a 8- or 16-bit word on alpha with WRITE_ONCE() or you try to do other things, you have what's coming to you. And they just force some "copy to stable storage" when it isn't (ie a "memcpy()" is not necessarily a single access and might be done as multiple overlapping reads, but the end result is stable). So trying to make anything else out of them is f*cking stupid. READ_ONCE() and friends do the right thing. Trying to limit them is *wrong*, because the restrictions would simply make them less useful. And trying to make up something new is pointless and stupid. So leave this code alone. Don't add some stupid "SINGLE_LOAD()" crap. That's just moronic. READ_ONCE() is that, and so much more. Linus
On Fri, Nov 25, 2016 at 09:52:50AM -0800, Linus Torvalds wrote:> READ/WRITE_ONCE() are atomic *WHEN*THAT*IS*POSSIBLE*.> But sometimes it's not going to be atomic.That's the problem. Common code may rely on something being atomic when that's only true on a subset of platforms. On others, it's silently "fixed" into something that isn't atomic, and we get no diagnostic. The bug lurks beneath the surface. Thanks, Mark.