Jason Wang
2021-Jan-28 04:31 UTC
[RFC v3 01/11] eventfd: track eventfd_signal() recursion depth separately in different cases
On 2021/1/28 ??11:52, Yongji Xie wrote:> On Thu, Jan 28, 2021 at 11:05 AM Jason Wang <jasowang at redhat.com> wrote: >> >> On 2021/1/27 ??5:11, Yongji Xie wrote: >>> On Wed, Jan 27, 2021 at 11:38 AM Jason Wang <jasowang at redhat.com> wrote: >>>> On 2021/1/20 ??2:52, Yongji Xie wrote: >>>>> On Wed, Jan 20, 2021 at 12:24 PM Jason Wang <jasowang at redhat.com> wrote: >>>>>> On 2021/1/19 ??12:59, Xie Yongji wrote: >>>>>>> Now we have a global percpu counter to limit the recursion depth >>>>>>> of eventfd_signal(). This can avoid deadlock or stack overflow. >>>>>>> But in stack overflow case, it should be OK to increase the >>>>>>> recursion depth if needed. So we add a percpu counter in eventfd_ctx >>>>>>> to limit the recursion depth for deadlock case. Then it could be >>>>>>> fine to increase the global percpu counter later. >>>>>> I wonder whether or not it's worth to introduce percpu for each eventfd. >>>>>> >>>>>> How about simply check if eventfd_signal_count() is greater than 2? >>>>>> >>>>> It can't avoid deadlock in this way. >>>> I may miss something but the count is to avoid recursive eventfd call. >>>> So for VDUSE what we suffers is e.g the interrupt injection path: >>>> >>>> userspace write IRQFD -> vq->cb() -> another IRQFD. >>>> >>>> It looks like increasing EVENTFD_WAKEUP_DEPTH should be sufficient? >>>> >>> Actually I mean the deadlock described in commit f0b493e ("io_uring: >>> prevent potential eventfd recursion on poll"). It can break this bug >>> fix if we just increase EVENTFD_WAKEUP_DEPTH. >> >> Ok, so can wait do something similar in that commit? (using async stuffs >> like wq). >> > We can do that. But it will reduce the performance. Because the > eventfd recursion will be triggered every time kvm kick eventfd in > vhost-vdpa cases: > > KVM write KICKFD -> ops->kick_vq -> VDUSE write KICKFD > > Thanks, > YongjiRight, I think in the future we need to find a way to let KVM to wakeup VDUSE directly. Havn't had a deep thought but it might work like irq bypass manager. Thanks