Jason Gunthorpe
2019-Aug-04 00:14 UTC
[PATCH V2 7/9] vhost: do not use RCU to synchronize MMU notifier with worker
On Sat, Aug 03, 2019 at 05:36:13PM -0400, Michael S. Tsirkin wrote:
> On Fri, Aug 02, 2019 at 02:24:18PM -0300, Jason Gunthorpe wrote:
> > On Fri, Aug 02, 2019 at 10:27:21AM -0400, Michael S. Tsirkin wrote:
> > > On Fri, Aug 02, 2019 at 09:46:13AM -0300, Jason Gunthorpe wrote:
> > > > On Fri, Aug 02, 2019 at 05:40:07PM +0800, Jason Wang wrote:
> > > > > > This must be a proper barrier, like a spinlock, mutex, or
> > > > > > synchronize_rcu.
> > > > >
> > > > >
> > > > > I start with synchronize_rcu() but both you and Michael raise some
> > > > > concern.
> > > >
> > > > I've also idly wondered if calling synchronize_rcu() under the various
> > > > mm locks is a deadlock situation.
> > > >
> > > > > Then I try spinlock and mutex:
> > > > >
> > > > > 1) spinlock: add lots of overhead on datapath, this leads 0 performance
> > > > > improvement.
> > > >
> > > > I think the topic here is correctness not performance improvement
> > >
> > > The topic is whether we should revert
> > > commit 7f466032dc9 ("vhost: access vq metadata through kernel virtual address")
> > >
> > > or keep it in. The only reason to keep it is performance.
> >
> > Yikes, I'm not sure you can ever win against copy_from_user using
> > mmu_notifiers?
>
> Ever since copy_from_user started playing with flags (for SMAP) and
> added speculation barriers there's a chance we can win by accessing
> memory through the kernel address.

You think copy_to_user will be more expensive than the minimum two
atomics required to synchronize with another thread?

> > Also, why can't this just permanently GUP the pages? In fact, where
> > does it put_page them anyhow? Worrying that 7f466 adds a get_user page
> > but does not add a put_page??

You didn't answer this.. Why not just use GUP?

Jason
Michael S. Tsirkin
2019-Aug-04 08:07 UTC
[PATCH V2 7/9] vhost: do not use RCU to synchronize MMU notifier with worker
On Sat, Aug 03, 2019 at 09:14:00PM -0300, Jason Gunthorpe wrote:
> On Sat, Aug 03, 2019 at 05:36:13PM -0400, Michael S. Tsirkin wrote:
> > On Fri, Aug 02, 2019 at 02:24:18PM -0300, Jason Gunthorpe wrote:
> > > On Fri, Aug 02, 2019 at 10:27:21AM -0400, Michael S. Tsirkin wrote:
> > > > On Fri, Aug 02, 2019 at 09:46:13AM -0300, Jason Gunthorpe wrote:
> > > > > On Fri, Aug 02, 2019 at 05:40:07PM +0800, Jason Wang wrote:
> > > > > > > This must be a proper barrier, like a spinlock, mutex, or
> > > > > > > synchronize_rcu.
> > > > > >
> > > > > >
> > > > > > I start with synchronize_rcu() but both you and Michael raise some
> > > > > > concern.
> > > > >
> > > > > I've also idly wondered if calling synchronize_rcu() under the various
> > > > > mm locks is a deadlock situation.
> > > > >
> > > > > > Then I try spinlock and mutex:
> > > > > >
> > > > > > 1) spinlock: add lots of overhead on datapath, this leads 0 performance
> > > > > > improvement.
> > > > >
> > > > > I think the topic here is correctness not performance improvement
> > > >
> > > > The topic is whether we should revert
> > > > commit 7f466032dc9 ("vhost: access vq metadata through kernel virtual address")
> > > >
> > > > or keep it in. The only reason to keep it is performance.
> > >
> > > Yikes, I'm not sure you can ever win against copy_from_user using
> > > mmu_notifiers?
> >
> > Ever since copy_from_user started playing with flags (for SMAP) and
> > added speculation barriers there's a chance we can win by accessing
> > memory through the kernel address.
>
> You think copy_to_user will be more expensive than the minimum two
> atomics required to synchronize with another thread?

I frankly don't know. With SMAP you flip flags twice, and with spectre
you flush the pipeline. Is that cheaper or more expensive than an atomic
operation? Testing is the only way to tell.

> > > Also, why can't this just permanently GUP the pages? In fact, where
> > > does it put_page them anyhow? Worrying that 7f466 adds a get_user page
> > > but does not add a put_page??
>
> You didn't answer this.. Why not just use GUP?
>
> Jason

Sorry I misunderstood the question. Permanent GUP breaks lots of
functionality we need such as THP and numa balancing.

release_pages is used instead of put_page.

--
MST
Jason Wang
2019-Aug-05 04:39 UTC
[PATCH V2 7/9] vhost: do not use RCU to synchronize MMU notifier with worker
On 2019/8/4 4:07 PM, Michael S. Tsirkin wrote:
> On Sat, Aug 03, 2019 at 09:14:00PM -0300, Jason Gunthorpe wrote:
>> On Sat, Aug 03, 2019 at 05:36:13PM -0400, Michael S. Tsirkin wrote:
>>> On Fri, Aug 02, 2019 at 02:24:18PM -0300, Jason Gunthorpe wrote:
>>>> On Fri, Aug 02, 2019 at 10:27:21AM -0400, Michael S. Tsirkin wrote:
>>>>> On Fri, Aug 02, 2019 at 09:46:13AM -0300, Jason Gunthorpe wrote:
>>>>>> On Fri, Aug 02, 2019 at 05:40:07PM +0800, Jason Wang wrote:
>>>>>>>> This must be a proper barrier, like a spinlock, mutex, or
>>>>>>>> synchronize_rcu.
>>>>>>>
>>>>>>> I start with synchronize_rcu() but both you and Michael raise some
>>>>>>> concern.
>>>>>> I've also idly wondered if calling synchronize_rcu() under the various
>>>>>> mm locks is a deadlock situation.
>>>>>>
>>>>>>> Then I try spinlock and mutex:
>>>>>>>
>>>>>>> 1) spinlock: add lots of overhead on datapath, this leads 0 performance
>>>>>>> improvement.
>>>>>> I think the topic here is correctness not performance improvement
>>>>> The topic is whether we should revert
>>>>> commit 7f466032dc9 ("vhost: access vq metadata through kernel virtual address")
>>>>>
>>>>> or keep it in. The only reason to keep it is performance.
>>>> Yikes, I'm not sure you can ever win against copy_from_user using
>>>> mmu_notifiers?
>>> Ever since copy_from_user started playing with flags (for SMAP) and
>>> added speculation barriers there's a chance we can win by accessing
>>> memory through the kernel address.
>> You think copy_to_user will be more expensive than the minimum two
>> atomics required to synchronize with another thread?
> I frankly don't know. With SMAP you flip flags twice, and with spectre
> you flush the pipeline. Is that cheaper or more expensive than an atomic
> operation? Testing is the only way to tell.

Let me test, I only did test on a non SMAP machine. Switching to
spinlock kills all performance improvement.

Thanks

>
>>>> Also, why can't this just permanently GUP the pages? In fact, where
>>>> does it put_page them anyhow? Worrying that 7f466 adds a get_user page
>>>> but does not add a put_page??
>> You didn't answer this.. Why not just use GUP?
>>
>> Jason
> Sorry I misunderstood the question. Permanent GUP breaks lots of
> functionality we need such as THP and numa balancing.
>
> release_pages is used instead of put_page.
Jason Gunthorpe
2019-Aug-06 11:53 UTC
[PATCH V2 7/9] vhost: do not use RCU to synchronize MMU notifier with worker
On Sun, Aug 04, 2019 at 04:07:17AM -0400, Michael S. Tsirkin wrote:
> > > > Also, why can't this just permanently GUP the pages? In fact, where
> > > > does it put_page them anyhow? Worrying that 7f466 adds a get_user page
> > > > but does not add a put_page??
> >
> > You didn't answer this.. Why not just use GUP?
> >
> > Jason
>
> Sorry I misunderstood the question. Permanent GUP breaks lots of
> functionality we need such as THP and numa balancing.

Really? It doesn't look like that many pages are involved..

Jason
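
[Editorial illustration, not part of any message above.] For readers weighing the "minimum two atomics" cost raised in this thread, here is a rough userspace C11 sketch of one way a worker and an invalidation path could synchronize without RCU or a lock. It is not code from the vhost patch: it assumes a reader count plus an invalidation flag, and every name in it (meta_map, meta_map_enter, meta_map_exit, meta_map_invalidate) is hypothetical. It relies on the default sequentially consistent ordering of the C11 atomic operations.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical structure; names are illustrative, not from vhost. */
struct meta_map {
    atomic_int readers;      /* workers currently using the mapping */
    atomic_bool invalidated; /* set by the invalidation side */
};

/* Atomic #1: announce the reader, then re-check that the map is valid. */
static bool meta_map_enter(struct meta_map *m)
{
    atomic_fetch_add(&m->readers, 1);
    if (atomic_load(&m->invalidated)) {
        /* Lost the race with invalidation: back out, use the slow path. */
        atomic_fetch_sub(&m->readers, 1);
        return false;
    }
    return true;
}

/* Atomic #2: drop the reader reference once the access is finished. */
static void meta_map_exit(struct meta_map *m)
{
    atomic_fetch_sub(&m->readers, 1);
}

/* Invalidation side: forbid new readers, then wait out the current ones. */
static void meta_map_invalidate(struct meta_map *m)
{
    atomic_store(&m->invalidated, true);
    while (atomic_load(&m->readers) != 0)
        ; /* busy-wait for brevity; real code would sleep instead */
}
```

On the fast path a worker pays one atomic read-modify-write in meta_map_enter() and one in meta_map_exit(), which is the per-access cost being compared against copy_from_user(). Whether that is cheaper on a SMAP machine is exactly the open question in the thread and would need measuring.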