Christian Borntraeger
2016-Nov-25 16:49 UTC
[PATCH 0/3] virtio/vringh: kill off ACCESS_ONCE()
On 11/25/2016 05:17 PM, Peter Zijlstra wrote:> On Fri, Nov 25, 2016 at 04:10:04PM +0000, Mark Rutland wrote: >> On Fri, Nov 25, 2016 at 04:21:39PM +0100, Dmitry Vyukov wrote: > >>> What are use cases for such primitive that won't be OK with "read once >>> _and_ atomically"? >> >> I have none to hand. > > Whatever triggers the __builtin_memcpy() paths, and even the size==8 > paths on 32bit. > > You could put a WARN in there to easily find them.There were several cases that I found during writing the *ONCE stuff. For example there are some 32bit ppc variants with 64bit PTEs. Some for others (I think sparc). And the mm/ code is perfectly fine with these PTE accesses being done NOT atomic.> > The advantage of introducing the SINGLE_{LOAD,STORE}() helpers is that > they compiletime validate this the size is 'right' and can runtime check > alignment constraints. > > IE, they are strictly stronger than {READ,WRITE}_ONCE(). >
On Fri, Nov 25, 2016 at 05:49:45PM +0100, Christian Borntraeger wrote:> On 11/25/2016 05:17 PM, Peter Zijlstra wrote: > > On Fri, Nov 25, 2016 at 04:10:04PM +0000, Mark Rutland wrote: > >> On Fri, Nov 25, 2016 at 04:21:39PM +0100, Dmitry Vyukov wrote: > > > >>> What are use cases for such primitive that won't be OK with "read once > >>> _and_ atomically"? > >> > >> I have none to hand. > > > > Whatever triggers the __builtin_memcpy() paths, and even the size==8 > > paths on 32bit. > > > > You could put a WARN in there to easily find them. > > There were several cases that I found during writing the *ONCE stuff. > For example there are some 32bit ppc variants with 64bit PTEs. Some for > others (I think sparc).We have similar on 32-bit ARM w/ LPAE. LPAE implies that a naturally aligned 64-bit access is single-copy atomic, which is what makes that ok.> And the mm/ code is perfectly fine with these PTE accesses being done > NOT atomic.That strikes me as surprising. Is there some mutual exclusion that prevents writes from occuring wherever a READ_ONCE() happens to a PTE? Otherwise, how is tearing not a problem? Does it have some pattern like the lockref cmpxchg? Thanks, Mark.
On Fri, Nov 25, 2016 at 05:28:01PM +0000, Mark Rutland wrote:> On Fri, Nov 25, 2016 at 05:49:45PM +0100, Christian Borntraeger wrote: > > On 11/25/2016 05:17 PM, Peter Zijlstra wrote:> > There were several cases that I found during writing the *ONCE stuff. > > For example there are some 32bit ppc variants with 64bit PTEs. Some for > > others (I think sparc). > > We have similar on 32-bit ARM w/ LPAE. LPAE implies that a naturally > aligned 64-bit access is single-copy atomic, which is what makes that > ok. > > > And the mm/ code is perfectly fine with these PTE accesses being done > > NOT atomic. > > That strikes me as surprising. Is there some mutual exclusion that > prevents writes from occuring wherever a READ_ONCE() happens to a PTE? > > Otherwise, how is tearing not a problem? Does it have some pattern like > the lockref cmpxchg?On x86 PAE we play silly games, see arch/x86/mm/gup.c:gup_get_ptr(). Those two loads really should be READ_ONCE()/LOAD_SINGLE().
Christian Borntraeger
2016-Nov-25 18:46 UTC
[PATCH 0/3] virtio/vringh: kill off ACCESS_ONCE()
On 11/25/2016 06:28 PM, Mark Rutland wrote:> On Fri, Nov 25, 2016 at 05:49:45PM +0100, Christian Borntraeger wrote: >> On 11/25/2016 05:17 PM, Peter Zijlstra wrote: >>> On Fri, Nov 25, 2016 at 04:10:04PM +0000, Mark Rutland wrote: >>>> On Fri, Nov 25, 2016 at 04:21:39PM +0100, Dmitry Vyukov wrote: >>> >>>>> What are use cases for such primitive that won't be OK with "read once >>>>> _and_ atomically"? >>>> >>>> I have none to hand. >>> >>> Whatever triggers the __builtin_memcpy() paths, and even the size==8 >>> paths on 32bit. >>> >>> You could put a WARN in there to easily find them. >> >> There were several cases that I found during writing the *ONCE stuff. >> For example there are some 32bit ppc variants with 64bit PTEs. Some for >> others (I think sparc). > > We have similar on 32-bit ARM w/ LPAE. LPAE implies that a naturally > aligned 64-bit access is single-copy atomic, which is what makes that > ok. > >> And the mm/ code is perfectly fine with these PTE accesses being done >> NOT atomic. > > That strikes me as surprising. Is there some mutual exclusion that > prevents writes from occuring wherever a READ_ONCE() happens to a PTE?See for example mm/memory.c handle_pte_fault. ---snip---- /* * some architectures can have larger ptes than wordsize, * e.g.ppc44x-defconfig has CONFIG_PTE_64BIT=y and * CONFIG_32BIT=y, so READ_ONCE or ACCESS_ONCE cannot guarantee * atomic accesses. The code below just needs a consistent * view for the ifs and we later double check anyway with the * ptl lock held. So here a barrier will do. */ ---snip--- The trick is that the code only does a specific check, but all other accesses are under the pte lock.
Michael S. Tsirkin
2016-Nov-25 21:08 UTC
[PATCH 0/3] virtio/vringh: kill off ACCESS_ONCE()
On Fri, Nov 25, 2016 at 05:49:45PM +0100, Christian Borntraeger wrote:> On 11/25/2016 05:17 PM, Peter Zijlstra wrote: > > On Fri, Nov 25, 2016 at 04:10:04PM +0000, Mark Rutland wrote: > >> On Fri, Nov 25, 2016 at 04:21:39PM +0100, Dmitry Vyukov wrote: > > > >>> What are use cases for such primitive that won't be OK with "read once > >>> _and_ atomically"? > >> > >> I have none to hand. > > > > Whatever triggers the __builtin_memcpy() paths, and even the size==8 > > paths on 32bit. > > > > You could put a WARN in there to easily find them. > > There were several cases that I found during writing the *ONCE stuff. > For example there are some 32bit ppc variants with 64bit PTEs. Some for > others (I think sparc). And the mm/ code is perfectly fine with these > PTE accesses being done NOT atomic.In that case do we even need _ONCE at all? Are there assumptions these are two 32 bit reads?> > > > > The advantage of introducing the SINGLE_{LOAD,STORE}() helpers is that > > they compiletime validate this the size is 'right' and can runtime check > > alignment constraints. > > > > IE, they are strictly stronger than {READ,WRITE}_ONCE(). > >
Christian Borntraeger
2016-Nov-25 21:45 UTC
[PATCH 0/3] virtio/vringh: kill off ACCESS_ONCE()
On 11/25/2016 10:08 PM, Michael S. Tsirkin wrote:> On Fri, Nov 25, 2016 at 05:49:45PM +0100, Christian Borntraeger wrote: >> On 11/25/2016 05:17 PM, Peter Zijlstra wrote: >>> On Fri, Nov 25, 2016 at 04:10:04PM +0000, Mark Rutland wrote: >>>> On Fri, Nov 25, 2016 at 04:21:39PM +0100, Dmitry Vyukov wrote: >>> >>>>> What are use cases for such primitive that won't be OK with "read once >>>>> _and_ atomically"? >>>> >>>> I have none to hand. >>> >>> Whatever triggers the __builtin_memcpy() paths, and even the size==8 >>> paths on 32bit. >>> >>> You could put a WARN in there to easily find them. >> >> There were several cases that I found during writing the *ONCE stuff. >> For example there are some 32bit ppc variants with 64bit PTEs. Some for >> others (I think sparc). And the mm/ code is perfectly fine with these >> PTE accesses being done NOT atomic. > > In that case do we even need _ONCE at all?Yes. For example look at gup_pmd_range. Here several checks are made on the pmd. It is important the the check for pmd_none is made on the same value than the check for pmd_trans_huge, but it is not important that the value is still up to date. And there are really cases where we cannot read the thing atomically, e.g. on m68k and sparc(32bit) pmd_t is defined as array of longs. Another problem is that a compiler can implement the following code as 2 memory reads (e.g. if you have compare instructions that work on memory) instead of a memory read and 2 compares int check(unsigned long *value_p) { unsigned long value = *value_p; if (condition_a(value)) return 1; if (condition_b(value)) return 2; return 3; } With READ_ONCE you forbid that. In past times you would have used barrier() after the assignment to achieve the same goal.> Are there assumptions these are two 32 bit reads?It depends on the code. Some places (e.g. in gup) assumes that the access via READ_ONCE is atomic (which it is for sane compilers as long as the pointer is <= word size). In some others places just one bit is tested.> > >> >>> >>> The advantage of introducing the SINGLE_{LOAD,STORE}() helpers is that >>> they compiletime validate this the size is 'right' and can runtime check >>> alignment constraints. >>> >>> IE, they are strictly stronger than {READ,WRITE}_ONCE(). >>> >