Jason Gunthorpe
2019-Aug-02 12:46 UTC
[PATCH V2 7/9] vhost: do not use RCU to synchronize MMU notifier with worker
On Fri, Aug 02, 2019 at 05:40:07PM +0800, Jason Wang wrote:
> > This must be a proper barrier, like a spinlock, mutex, or
> > synchronize_rcu.
>
> I start with synchronize_rcu() but both you and Michael raise some
> concern.

I've also idly wondered if calling synchronize_rcu() under the various
mm locks is a deadlock situation.

> Then I try spinlock and mutex:
>
> 1) spinlock: add lots of overhead on datapath, this leads 0 performance
> improvement.

I think the topic here is correctness not performance improvement

> 2) SRCU: full memory barrier requires on srcu_read_lock(), which still leads
> little performance improvement
>
> 3) mutex: a possible issue is need to wait for the page to be swapped in (is
> this unacceptable ?), another issue is that we need hold vq lock during
> range overlap check.

I have a feeling that mmu notifiers cannot safely become dependent on
progress of swap without causing deadlock. You probably should avoid
this.

> > And, again, you can't re-invent a spinlock with open coding and get
> > something better.
>
> So the question is if waiting for swap is considered to be unsuitable for
> MMU notifiers. If not, it would simplify codes. If so, we still need to
> figure out a possible solution.
>
> Btw, I come up another idea, that is to disable preemption when vhost thread
> need to access the memory. Then register preempt notifier and if vhost
> thread is preempted, we're sure no one will access the memory and can do the
> cleanup.

I think you should use the spinlock so at least the code is obviously
functionally correct and worry about designing some properly justified
performance change after.

Jason
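
For illustration, a minimal sketch of the spinlock approach being suggested here, where the MMU notifier and the worker serialize on the same lock around the mapped metadata (the struct and field names are hypothetical, not taken from the actual patch):

#include <linux/spinlock.h>
#include <linux/string.h>
#include <linux/types.h>

/* Hypothetical per-virtqueue metadata mapping. */
struct vq_meta {
        spinlock_t lock;   /* serializes worker access vs. invalidation */
        void *kaddr;       /* kernel mapping of the metadata, or NULL */
};

/* Worker side: only dereference the mapping while holding the lock. */
static bool meta_copy(struct vq_meta *m, void *dst, size_t len)
{
        bool ok = false;

        spin_lock(&m->lock);
        if (m->kaddr) {
                memcpy(dst, m->kaddr, len);
                ok = true;
        }
        spin_unlock(&m->lock);

        return ok;         /* false: caller falls back to copy_from_user() */
}

/* MMU notifier side: tear the mapping down under the same lock. */
static void meta_invalidate(struct vq_meta *m)
{
        spin_lock(&m->lock);
        m->kaddr = NULL;   /* the worker can no longer see the stale mapping */
        spin_unlock(&m->lock);
}

This is "obviously correct" in the sense argued above: once meta_invalidate() returns, no worker can still be touching the old mapping, and neither side ever sleeps while holding the lock.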
Michael S. Tsirkin
2019-Aug-02 14:27 UTC
[PATCH V2 7/9] vhost: do not use RCU to synchronize MMU notifier with worker
On Fri, Aug 02, 2019 at 09:46:13AM -0300, Jason Gunthorpe wrote:
> On Fri, Aug 02, 2019 at 05:40:07PM +0800, Jason Wang wrote:
> > > This must be a proper barrier, like a spinlock, mutex, or
> > > synchronize_rcu.
> >
> > I start with synchronize_rcu() but both you and Michael raise some
> > concern.
>
> I've also idly wondered if calling synchronize_rcu() under the various
> mm locks is a deadlock situation.
>
> > Then I try spinlock and mutex:
> >
> > 1) spinlock: add lots of overhead on datapath, this leads 0 performance
> > improvement.
>
> I think the topic here is correctness not performance improvement

The topic is whether we should revert
commit 7f466032dc9 ("vhost: access vq metadata through kernel virtual address")
or keep it in. The only reason to keep it is performance.

Now, as long as all this code is disabled anyway, we can experiment a
bit.

I personally feel we would be best served by having two code paths:

- Access to VM memory directly mapped into kernel
- Access to userspace

Having it all cleanly split will allow a bunch of optimizations; for
example, for years now we planned to be able to process an incoming short
packet directly on the softirq path, or an outgoing one directly within
eventfd.

--
MST
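
As a rough sketch of the two-path split being described, one accessor trying the kernel mapping first and falling back to plain userspace access (the names are made up; the real vhost accessors look different):

#include <linux/uaccess.h>
#include <linux/errno.h>
#include <linux/types.h>

/* Hypothetical handle carrying both views of the same guest memory. */
struct vq_access {
        void *kaddr;         /* non-NULL: VM memory directly mapped into the kernel */
        void __user *uaddr;  /* userspace address used by the fallback path */
};

static int vq_get_avail_idx(struct vq_access *a, u16 *idx)
{
        /* Path 1: direct kernel mapping, usable from softirq/eventfd context. */
        if (a->kaddr) {
                *idx = *(u16 *)a->kaddr;
                return 0;
        }

        /* Path 2: plain userspace access, which may fault and sleep. */
        return copy_from_user(idx, a->uaddr, sizeof(*idx)) ? -EFAULT : 0;
}

The direct path is what would make handling a short packet straight from softirq or eventfd context thinkable, since it never sleeps.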
Jason Gunthorpe
2019-Aug-02 17:24 UTC
[PATCH V2 7/9] vhost: do not use RCU to synchronize MMU notifier with worker
On Fri, Aug 02, 2019 at 10:27:21AM -0400, Michael S. Tsirkin wrote:
> On Fri, Aug 02, 2019 at 09:46:13AM -0300, Jason Gunthorpe wrote:
> > On Fri, Aug 02, 2019 at 05:40:07PM +0800, Jason Wang wrote:
> > > > This must be a proper barrier, like a spinlock, mutex, or
> > > > synchronize_rcu.
> > >
> > > I start with synchronize_rcu() but both you and Michael raise some
> > > concern.
> >
> > I've also idly wondered if calling synchronize_rcu() under the various
> > mm locks is a deadlock situation.
> >
> > > Then I try spinlock and mutex:
> > >
> > > 1) spinlock: add lots of overhead on datapath, this leads 0 performance
> > > improvement.
> >
> > I think the topic here is correctness not performance improvement
>
> The topic is whether we should revert
> commit 7f466032dc9 ("vhost: access vq metadata through kernel virtual address")
> or keep it in. The only reason to keep it is performance.

Yikes, I'm not sure you can ever win against copy_from_user using
mmu_notifiers? The synchronization requirements are likely always more
expensive unless large and scattered copies are being done.

The rcu is about the only simple approach that could be less expensive,
and that gets back to the question of whether you can block an
invalidate_range_start in synchronize_rcu or not.

So, frankly, I'd revert it until someone can prove the rcu solution is
OK.

BTW, how do you get copy_from_user to work outside a syscall?

Also, why can't this just permanently GUP the pages? In fact, where does
it put_page them anyhow? It is worrying that 7f466 adds a get_user_pages
but does not add a put_page.

Jason
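
A minimal sketch of the permanent-GUP alternative being asked about, pinning the metadata pages once and dropping the references on teardown (the helpers are hypothetical, and the GUP flavour/signature shown is only roughly the one from that era):

#include <linux/mm.h>
#include <linux/errno.h>

static int meta_pin(unsigned long uaddr, int npages, struct page **pages)
{
        int pinned;

        /*
         * Take a reference on each page up front; the pages then stay in
         * place for as long as we hold the references, so no MMU notifier
         * is needed for correctness.
         */
        pinned = get_user_pages_fast(uaddr, npages, FOLL_WRITE, pages);
        if (pinned < npages) {
                while (pinned > 0)
                        put_page(pages[--pinned]);
                return -EFAULT;
        }

        return 0;
}

static void meta_unpin(struct page **pages, int npages)
{
        int i;

        /* The pairing the mail is asking about: every get needs a put. */
        for (i = 0; i < npages; i++)
                put_page(pages[i]);
}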
Jason Wang
2019-Aug-05 04:20 UTC
[PATCH V2 7/9] vhost: do not use RCU to synchronize MMU notifier with worker
On 2019/8/2 下午8:46, Jason Gunthorpe wrote:
> On Fri, Aug 02, 2019 at 05:40:07PM +0800, Jason Wang wrote:
>>> This must be a proper barrier, like a spinlock, mutex, or
>>> synchronize_rcu.
>>
>> I start with synchronize_rcu() but both you and Michael raise some
>> concern.
> I've also idly wondered if calling synchronize_rcu() under the various
> mm locks is a deadlock situation.

Maybe, that's why I suggest using vhost_work_flush(), which is much more
lightweight and can achieve the same function. It can guarantee all
previous work has been processed after vhost_work_flush() returns.

>
>> Then I try spinlock and mutex:
>>
>> 1) spinlock: add lots of overhead on datapath, this leads 0 performance
>> improvement.
> I think the topic here is correctness not performance improvement

But the whole series is to speed up vhost.

>
>> 2) SRCU: full memory barrier requires on srcu_read_lock(), which still leads
>> little performance improvement
>
>> 3) mutex: a possible issue is need to wait for the page to be swapped in (is
>> this unacceptable ?), another issue is that we need hold vq lock during
>> range overlap check.
> I have a feeling that mmu notifiers cannot safely become dependent on
> progress of swap without causing deadlock. You probably should avoid
> this.

Yes, so that's why I try to synchronize the critical region by myself.

>>> And, again, you can't re-invent a spinlock with open coding and get
>>> something better.
>>
>> So the question is if waiting for swap is considered to be unsuitable for
>> MMU notifiers. If not, it would simplify codes. If so, we still need to
>> figure out a possible solution.
>>
>> Btw, I come up another idea, that is to disable preemption when vhost thread
>> need to access the memory. Then register preempt notifier and if vhost
>> thread is preempted, we're sure no one will access the memory and can do the
>> cleanup.
> I think you should use the spinlock so at least the code is obviously
> functionally correct and worry about designing some properly justified
> performance change after.
>
> Jason

A spinlock is correct but makes the whole series meaningless considering
it won't bring any performance improvement.

Thanks
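
For illustration, the flush-based ordering being described would look roughly like this (vhost_work_flush() is the real vhost helper; the surrounding structure and names are made up for the sketch):

#include <linux/compiler.h>
#include "vhost.h"   /* drivers/vhost/vhost.h: vhost_dev, vhost_work, vhost_work_flush() */

/* Hypothetical per-virtqueue mapping state. */
struct meta_map {
        struct vhost_dev *dev;
        struct vhost_work work;   /* the vq handling work */
        bool invalidated;         /* checked by the worker before using kaddr */
        void *kaddr;              /* kernel mapping, or NULL */
};

/* Called from the MMU notifier invalidate path. */
static void meta_invalidate(struct meta_map *m)
{
        /* Stop new accesses: the worker tests this before touching kaddr. */
        WRITE_ONCE(m->invalidated, true);

        /*
         * vhost_work_flush() only returns once all work queued before the
         * call has finished running, so no worker iteration can still be
         * dereferencing the old mapping after this point.
         */
        vhost_work_flush(m->dev, &m->work);

        m->kaddr = NULL;
}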
Jason Wang
2019-Aug-05 04:36 UTC
[PATCH V2 7/9] vhost: do not use RCU to synchronize MMU notifier with worker
On 2019/8/2 下午10:27, Michael S. Tsirkin wrote:
> On Fri, Aug 02, 2019 at 09:46:13AM -0300, Jason Gunthorpe wrote:
>> On Fri, Aug 02, 2019 at 05:40:07PM +0800, Jason Wang wrote:
>>>> This must be a proper barrier, like a spinlock, mutex, or
>>>> synchronize_rcu.
>>>
>>> I start with synchronize_rcu() but both you and Michael raise some
>>> concern.
>> I've also idly wondered if calling synchronize_rcu() under the various
>> mm locks is a deadlock situation.
>>
>>> Then I try spinlock and mutex:
>>>
>>> 1) spinlock: add lots of overhead on datapath, this leads 0 performance
>>> improvement.
>> I think the topic here is correctness not performance improvement
> The topic is whether we should revert
> commit 7f466032dc9 ("vhost: access vq metadata through kernel virtual address")
> or keep it in. The only reason to keep it is performance.

Maybe it's time to introduce the config option?

>
> Now as long as all this code is disabled anyway, we can experiment a
> bit.
>
> I personally feel we would be best served by having two code paths:
>
> - Access to VM memory directly mapped into kernel
> - Access to userspace
>
> Having it all cleanly split will allow a bunch of optimizations, for
> example for years now we planned to be able to process an incoming short
> packet directly on softirq path, or an outgoing one directly within
> eventfd.

It's not hard considering we already have our own accessors. But the
question is (as asked in another thread): do you want permanent GUP or
to still use MMU notifiers?

Thanks
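
A sketch of what the config-option split could look like on the C side (CONFIG_VHOST_MAP_METADATA is a made-up Kconfig symbol, not an existing one):

#include <linux/uaccess.h>
#include <linux/types.h>

#ifdef CONFIG_VHOST_MAP_METADATA
/* Direct kernel-mapping accessor, backed by GUP or MMU notifiers. */
static inline int vq_read_u16(void *kaddr, void __user *uaddr, u16 *val)
{
        *val = *(u16 *)kaddr;
        return 0;
}
#else
/* Plain userspace accessor: always correct, may fault and sleep. */
static inline int vq_read_u16(void *kaddr, void __user *uaddr, u16 *val)
{
        return get_user(*val, (u16 __user *)uaddr);
}
#endif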
Jason Gunthorpe
2019-Aug-06 12:04 UTC
[PATCH V2 7/9] vhost: do not use RCU to synchronize MMU notifier with worker
On Mon, Aug 05, 2019 at 12:20:45PM +0800, Jason Wang wrote:
>
> On 2019/8/2 下午8:46, Jason Gunthorpe wrote:
> > On Fri, Aug 02, 2019 at 05:40:07PM +0800, Jason Wang wrote:
> > > > This must be a proper barrier, like a spinlock, mutex, or
> > > > synchronize_rcu.
> > >
> > > I start with synchronize_rcu() but both you and Michael raise some
> > > concern.
> > I've also idly wondered if calling synchronize_rcu() under the various
> > mm locks is a deadlock situation.
>
> Maybe, that's why I suggest using vhost_work_flush(), which is much more
> lightweight and can achieve the same function. It can guarantee all
> previous work has been processed after vhost_work_flush() returns.

If things are already running in a work, then yes, you can piggyback
on the existing spinlocks inside the workqueue and be OK.

However, if that work is doing any copy_from_user, then the flush
becomes dependent on swap and it won't work again...

> > > 1) spinlock: add lots of overhead on datapath, this leads 0 performance
> > > improvement.
> > I think the topic here is correctness not performance improvement
>
> But the whole series is to speed up vhost.

So? Starting with a whole bunch of crazy, possibly broken, locking
and claiming a performance win is not reasonable.

> A spinlock is correct but makes the whole series meaningless considering
> it won't bring any performance improvement.

You can't invent a faster spinlock by opencoding some wild scheme.
There is nothing special about the usage here; it needs a blocking
lock, plain and simple.

Jason
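
To make the swap dependency concrete, the problematic chain looks roughly like this (hypothetical names; error handling elided):

#include <linux/mutex.h>
#include <linux/uaccess.h>
#include <linux/types.h>

/* Hypothetical state shared between the worker and the notifier. */
struct vq_state {
        struct mutex lock;
        void *kaddr;
        void *scratch;
        void __user *uaddr;
        size_t len;
};

/* Worker context. */
static void worker_iteration(struct vq_state *s)
{
        unsigned long uncopied;

        mutex_lock(&s->lock);
        /*
         * copy_from_user() may fault and wait for a page to be swapped in.
         * While it sleeps, we are still holding s->lock ...
         */
        uncopied = copy_from_user(s->scratch, s->uaddr, s->len);
        /* (real code would handle uncopied != 0; elided in this sketch) */
        mutex_unlock(&s->lock);
}

/* MMU notifier context, which can be reached from memory reclaim. */
static void notifier_invalidate(struct vq_state *s)
{
        /*
         * ... so if reclaim needs this notifier to finish before the fault
         * above can be satisfied, blocking on s->lock here deadlocks. The
         * same reasoning applies to a flush that waits for the worker.
         */
        mutex_lock(&s->lock);
        s->kaddr = NULL;
        mutex_unlock(&s->lock);
}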