Michael S. Tsirkin
2021-Feb-10 09:14 UTC
[PATCH net] virtio-net: suppress bad irq warning for tx napi
On Tue, Feb 09, 2021 at 10:00:22AM -0800, Wei Wang wrote:
> On Tue, Feb 9, 2021 at 6:58 AM Willem de Bruijn
> <willemdebruijn.kernel at gmail.com> wrote:
> >
> > > >>> I have no preference. Just curious, especially if it complicates the patch.
> > > >>>
> > > >> My understanding is that it's probably ok for net. But we probably need
> > > >> to document the assumptions to make sure they are not abused in other drivers.
> > > >>
> > > >> Introducing new parameters for find_vqs() could help to eliminate the subtle
> > > >> stuff, but I agree it looks like overkill.
> > > >>
> > > >> (Btw, I forget the numbers, but I wonder how much difference it makes if we
> > > >> simply remove the free_old_xmit_skbs() call from the rx NAPI path?)
> > > > The committed patchset did not record those numbers, but I found them
> > > > in an earlier iteration:
> > > >
> > > > [PATCH net-next 0/3] virtio-net tx napi
> > > > https://lists.openwall.net/netdev/2017/04/02/55
> > > >
> > > > It did seem to significantly reduce compute cycles ("Gcyc") at the
> > > > time. For instance:
> > > >
> > > > TCP_RR Latency (us):
> > > > 1x:
> > > >   p50        24   24   21
> > > >   p99        27   27   27
> > > >   Gcycles   299  432  308
> > > >
> > > > I'm concerned that removing it now may cause a regression report in a
> > > > few months. That is a higher risk than the spurious interrupt warning
> > > > that was only reported after years of use.
> > >
> > >
> > > Right.
> > >
> > > So if Michael is fine with this approach, I'm ok with it. But we
> > > probably need to add a TODO to invent interrupt handlers that can be
> > > used for more than one virtqueue. When MSI-X is enabled, the interrupt
> > > handler (vring_interrupt()) assumes the interrupt is used by a single
> > > virtqueue.
> >
> > Thanks.
> >
> > The approach of scheduling tx napi from virtnet_poll_cleantx() instead of
> > cleaning directly in that rx napi function was not effective at
> > suppressing the warning, I understand.
>
> Correct. I tried the approach of scheduling tx napi instead of directly
> doing free_old_xmit_skbs() in virtnet_poll_cleantx(). But the warning
> still happens.

Two questions here: is the device using packed or split vqs?
And is event index enabled?

I think one issue is that at the moment, with split vqs and event index, we
don't actually disable events at all.

static void virtqueue_disable_cb_split(struct virtqueue *_vq)
{
	struct vring_virtqueue *vq = to_vvq(_vq);

	if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) {
		vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
		if (!vq->event)
			vq->split.vring.avail->flags =
				cpu_to_virtio16(_vq->vdev,
						vq->split.avail_flags_shadow);
	}
}

Can you try your napi patch + disabling event index?

-- 
MST
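For reference (nothing here beyond what the virtio spec and include/uapi/linux/virtio_ring.h already define): when VIRTIO_RING_F_EVENT_IDX is negotiated, which is what vq->event records, a split-ring device does not look at VRING_AVAIL_F_NO_INTERRUPT at all; it decides whether to interrupt from the driver-written used_event slot, using the comparison mirrored in the standalone sketch below. That is why the shadow-flag update in virtqueue_disable_cb_split() above never reaches the device.

/* Standalone sketch of the vring_need_event() check from
 * include/uapi/linux/virtio_ring.h: with event index, the device
 * interrupts only when its used index steps over used_event, i.e.
 * when event_idx lies in [old, new_idx) modulo 2^16.
 */
#include <stdint.h>
#include <stdio.h>

static int vring_need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old)
{
	return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old);
}

int main(void)
{
	uint16_t old = 10, new_idx = 12;   /* used index advanced 10 -> 12 */

	/* used_event inside the window just advanced through: interrupt. */
	printf("%d\n", vring_need_event(11, new_idx, old));           /* 1 */

	/* used_event parked far ahead: no interrupt, flags never consulted. */
	printf("%d\n", vring_need_event(old + 0x8000, new_idx, old)); /* 0 */
	return 0;
}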
Jason Wang
2021-Feb-18 05:39 UTC
[PATCH net] virtio-net: suppress bad irq warning for tx napi
On 2021/2/10 5:14 PM, Michael S. Tsirkin wrote:
> On Tue, Feb 09, 2021 at 10:00:22AM -0800, Wei Wang wrote:
>> On Tue, Feb 9, 2021 at 6:58 AM Willem de Bruijn
>> <willemdebruijn.kernel at gmail.com> wrote:
>>>>>>> I have no preference. Just curious, especially if it complicates the patch.
>>>>>>>
>>>>>> My understanding is that it's probably ok for net. But we probably need
>>>>>> to document the assumptions to make sure they are not abused in other drivers.
>>>>>>
>>>>>> Introducing new parameters for find_vqs() could help to eliminate the subtle
>>>>>> stuff, but I agree it looks like overkill.
>>>>>>
>>>>>> (Btw, I forget the numbers, but I wonder how much difference it makes if we
>>>>>> simply remove the free_old_xmit_skbs() call from the rx NAPI path?)
>>>>> The committed patchset did not record those numbers, but I found them
>>>>> in an earlier iteration:
>>>>>
>>>>> [PATCH net-next 0/3] virtio-net tx napi
>>>>> https://lists.openwall.net/netdev/2017/04/02/55
>>>>>
>>>>> It did seem to significantly reduce compute cycles ("Gcyc") at the
>>>>> time. For instance:
>>>>>
>>>>> TCP_RR Latency (us):
>>>>> 1x:
>>>>>   p50        24   24   21
>>>>>   p99        27   27   27
>>>>>   Gcycles   299  432  308
>>>>>
>>>>> I'm concerned that removing it now may cause a regression report in a
>>>>> few months. That is a higher risk than the spurious interrupt warning
>>>>> that was only reported after years of use.
>>>>
>>>> Right.
>>>>
>>>> So if Michael is fine with this approach, I'm ok with it. But we
>>>> probably need to add a TODO to invent interrupt handlers that can be
>>>> used for more than one virtqueue. When MSI-X is enabled, the interrupt
>>>> handler (vring_interrupt()) assumes the interrupt is used by a single
>>>> virtqueue.
>>> Thanks.
>>>
>>> The approach of scheduling tx napi from virtnet_poll_cleantx() instead of
>>> cleaning directly in that rx napi function was not effective at
>>> suppressing the warning, I understand.
>> Correct. I tried the approach of scheduling tx napi instead of directly
>> doing free_old_xmit_skbs() in virtnet_poll_cleantx(). But the warning
>> still happens.
> Two questions here: is the device using packed or split vqs?
> And is event index enabled?
>
> I think one issue is that at the moment, with split vqs and event index, we
> don't actually disable events at all.


Do we really have a way to disable that? (We don't have a flag for it like
the packed virtqueue does.) Or do you mean the trick [1] from when I posted
the tx interrupt RFC?

Thanks

[1] https://lkml.org/lkml/2015/2/9/113


> static void virtqueue_disable_cb_split(struct virtqueue *_vq)
> {
> 	struct vring_virtqueue *vq = to_vvq(_vq);
>
> 	if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) {
> 		vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
> 		if (!vq->event)
> 			vq->split.vring.avail->flags =
> 				cpu_to_virtio16(_vq->vdev,
> 						vq->split.avail_flags_shadow);
> 	}
> }
>
> Can you try your napi patch + disabling event index?
>
>
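As for whether split + event index can be "disabled" at all: one conceivable shape of such a trick, sketched below purely as an illustration (this is not the mainline code at the time of this thread, and it is not claimed to be the trick in [1]), is to park used_event far away from the current used index so that the device-side vring_need_event() check keeps failing. The helper name is made up; the fields and macros (vq->event, vq->last_used_idx, vring_used_event(), cpu_to_virtio16()) are the ones drivers/virtio/virtio_ring.c already uses.

/* Hypothetical sketch only: suppress split-ring callbacks under
 * EVENT_IDX by moving used_event half a 16-bit wrap away from the
 * last used index the driver has seen, so vring_need_event() on the
 * device side stays false for a long stretch of completions.
 */
static void virtqueue_disable_cb_split_event_sketch(struct virtqueue *_vq)
{
	struct vring_virtqueue *vq = to_vvq(_vq);

	if (vq->event)
		/* vring_used_event() names the avail->ring[num] slot that
		 * holds used_event on a split ring. */
		vring_used_event(&vq->split.vring) =
			cpu_to_virtio16(_vq->vdev,
					vq->last_used_idx + 0x8000);
}

The obvious caveat is that the used index eventually wraps around to the parked value, so a real implementation would have to refresh it, or re-enable callbacks, before roughly 0x8000 further completions have accumulated.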