On 2019/10/24 11:26, Liu, Yong wrote:
>
>> -----Original Message-----
>> From: Jason Wang [mailto:jasowang at redhat.com]
>> Sent: Tuesday, October 22, 2019 9:06 PM
>> To: Liu, Yong <yong.liu at intel.com>; mst at redhat.com; Bie, Tiwei
>> <tiwei.bie at intel.com>
>> Cc: virtualization at lists.linux-foundation.org
>> Subject: Re: [PATCH] virtio_ring: fix packed ring event may missing
>>
>> On 2019/10/22 2:48, Liu, Yong wrote:
>>> Hi Jason,
>>> My answers are inline.
>>>
>>>> -----Original Message-----
>>>> From: Jason Wang [mailto:jasowang at redhat.com]
>>>> Sent: Tuesday, October 22, 2019 10:45 AM
>>>> To: Liu, Yong <yong.liu at intel.com>; mst at redhat.com; Bie, Tiwei
>>>> <tiwei.bie at intel.com>
>>>> Cc: virtualization at lists.linux-foundation.org
>>>> Subject: Re: [PATCH] virtio_ring: fix packed ring event may missing
>>>>
>>>> On 2019/10/22 1:10, Marvin Liu wrote:
>>>>> When the callback is delayed, virtio expects that vhost will kick
>>>>> when rolling over the event offset. A recheck is needed because the
>>>>> used index may pass the event offset between the status check and
>>>>> the driver's event update.
>>>>>
>>>>> However, the flags may not have been modified if descriptors are
>>>>> chained or the in_order feature was negotiated, so the flags at the
>>>>> event offset may not be valid for checking the descriptor's status.
>>>>> Fix it by using the last used index instead. The Tx queue will be
>>>>> stopped if there are not enough freed buffers after the recheck.
>>>>>
>>>>> Signed-off-by: Marvin Liu <yong.liu at intel.com>
>>>>>
>>>>> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
>>>>> index bdc08244a648..a8041e451e9e 100644
>>>>> --- a/drivers/virtio/virtio_ring.c
>>>>> +++ b/drivers/virtio/virtio_ring.c
>>>>> @@ -1499,9 +1499,6 @@ static bool virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq)
>>>>>  		 * counter first before updating event flags.
>>>>>  		 */
>>>>>  		virtio_wmb(vq->weak_barriers);
>>>>> -	} else {
>>>>> -		used_idx = vq->last_used_idx;
>>>>> -		wrap_counter = vq->packed.used_wrap_counter;
>>>>>  	}
>>>>>
>>>>>  	if (vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DISABLE) {
>>>>> @@ -1518,7 +1515,9 @@ static bool virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq)
>>>>>  	 */
>>>>>  	virtio_mb(vq->weak_barriers);
>>>>>
>>>>> -	if (is_used_desc_packed(vq, used_idx, wrap_counter)) {
>>>>> +	if (is_used_desc_packed(vq,
>>>>> +				vq->last_used_idx,
>>>>> +				vq->packed.used_wrap_counter)) {
>>>>>  		END_USE(vq);
>>>>>  		return false;
>>>>>  	}
>>>> Hi Marvin:
>>>>
>>>> Two questions:
>>>>
>>>> 1) Do we support IN_ORDER in the kernel driver?
>>>>
>>> Not supported for now. But the issue is still possible if indirect
>>> descriptors are disabled and descriptors are chained. Since packed
>>> ring descriptor status has to be checked one by one, picking an
>>> arbitrary position can cause the issue.
>>
>> Right, then it's better to mention IN_ORDER as a future feature.
>>
>>>> 2) Should we check IN_ORDER in this case? Otherwise we may end up
>>>> with an interrupt storm when IN_ORDER is not negotiated.
>>> The interrupt count will not increase here; the event offset value is
>>> calculated as before. This just rechecks whether the newly used
>>> descriptors are enough for the next round of xmit. If the backend is
>>> slow, most likely the Tx queue will sleep for a while until the used
>>> index goes past the event offset.
>>
>> Ok, but what if the backend is almost as fast as the guest driver?
>> E.g. in virtio-net we had:
>>
>> 	if (sq->vq->num_free < 2+MAX_SKB_FRAGS) {
>> 		netif_stop_subqueue(dev, qnum);
>> 		if (!use_napi &&
>> 		    unlikely(!virtqueue_enable_cb_delayed(sq->vq))) {
>> 			/* More just got used, free them then recheck. */
>> 			free_old_xmit_skbs(sq, false);
>> 			if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) {
>> 				netif_start_subqueue(dev, qnum);
>> 				virtqueue_disable_cb(sq->vq);
>> 			}
>> 		}
>> 	}
>>
>> I worry that we may end up toggling the queue state in the case where
>> sq->vq->num_free is near 2 + MAX_SKB_FRAGS.
>>
> Yes, in this worst case each packet adds two extra event flags writes.
> Since the backend only reads this value, the cost won't be too much.

For the driver, it means extra overhead: atomics, less batching, stats
updating, etc. For the backend, the cacheline will bounce between two
CPUs.

> Even if we track down chained descriptors and figure out whether the
> descriptor at the event offset is used, there is still a possibility
> that the flags are invalid. One case is that the backend can buffer
> multiple descriptors by not updating the first one. We cannot guarantee
> that a later descriptor's flags are usable until we check from the
> first one.

In this case, since we've stopped the tx queue, there are no new buffers
being added. It doesn't matter whether we get notified when 3/4 or all
of the descriptors have been used.

Thanks

> Regards,
> Marvin
>
>> It looks to me the correct thing to implement is to calculate the head
>> descriptor of a chain that sits at 3/4.
>>
>> Thanks
>>
>>> Thanks,
>>> Marvin
>>>
>>>> Thanks
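For context, the recheck that the patch changes is built on the packed-ring
status test: a descriptor counts as used when its AVAIL and USED flag bits
are equal and match the driver's used wrap counter. With chained
descriptors the device may not write flags for every descriptor in a
buffer, which is why testing an arbitrary ring position (the old
event-offset check) can read stale flags. A sketch of the helper, modeled
on drivers/virtio/virtio_ring.c around the time of this thread (not a
verbatim copy):

/*
 * Sketch of the packed-ring status test, modeled on
 * drivers/virtio/virtio_ring.c circa v5.4. A descriptor is used when
 * its AVAIL and USED bits agree and match the driver's wrap counter.
 */
static inline bool is_used_desc_packed(const struct vring_virtqueue *vq,
				       u16 idx, bool used_wrap_counter)
{
	bool avail, used;
	u16 flags;

	flags = le16_to_cpu(vq->packed.vring.desc[idx].flags);
	avail = !!(flags & (1 << VRING_PACKED_DESC_F_AVAIL));
	used = !!(flags & (1 << VRING_PACKED_DESC_F_USED));

	return avail == used && used == used_wrap_counter;
}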
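Putting the patch in context: virtqueue_enable_cb_delayed_packed() still
advertises an event offset 3/4 of the way through the in-flight buffers,
but after the patch the final recheck always tests last_used_idx, whose
flags the device is guaranteed to write. A condensed sketch of the
resulting function, modeled on the kernel source (START_USE/END_USE and
some details omitted):

/*
 * Condensed sketch of virtqueue_enable_cb_delayed_packed() after the
 * patch, modeled on drivers/virtio/virtio_ring.c; not a verbatim copy.
 */
static bool virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq)
{
	struct vring_virtqueue *vq = to_vvq(_vq);
	u16 used_idx, wrap_counter, bufs;

	if (vq->event) {
		/* Ask for an event once ~3/4 of in-flight buffers are used. */
		bufs = (vq->packed.vring.num - vq->vq.num_free) * 3 / 4;
		wrap_counter = vq->packed.used_wrap_counter;

		used_idx = vq->last_used_idx + bufs;
		if (used_idx >= vq->packed.vring.num) {
			used_idx -= vq->packed.vring.num;
			wrap_counter ^= 1;
		}

		vq->packed.vring.driver->off_wrap = cpu_to_le16(used_idx |
			(wrap_counter << VRING_PACKED_EVENT_F_WRAP_CTR));

		/* Publish offset/wrap before flipping the event flags. */
		virtio_wmb(vq->weak_barriers);
	}

	if (vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DISABLE) {
		vq->packed.event_flags_shadow = vq->event ?
				VRING_PACKED_EVENT_FLAG_DESC :
				VRING_PACKED_EVENT_FLAG_ENABLE;
		vq->packed.vring.driver->flags =
				cpu_to_le16(vq->packed.event_flags_shadow);
	}

	/* Update the event suppression structure before rechecking. */
	virtio_mb(vq->weak_barriers);

	/* Recheck at last_used_idx: a descriptor the device always writes. */
	if (is_used_desc_packed(vq, vq->last_used_idx,
				vq->packed.used_wrap_counter))
		return false;	/* more used buffers pending; caller repolls */

	return true;
}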
Michael S. Tsirkin
2019-Oct-27 09:54 UTC
[PATCH] virtio_ring: fix packed ring event may missing
On Thu, Oct 24, 2019 at 11:50:51AM +0800, Jason Wang wrote:
> On 2019/10/24 11:26, Liu, Yong wrote:
> [... full quote of the exchange above trimmed ...]
>
> > > I worry that we may end up toggling the queue state in the case
> > > where sq->vq->num_free is near 2 + MAX_SKB_FRAGS.
> > >
> > Yes, in this worst case each packet adds two extra event flags
> > writes. Since the backend only reads this value, the cost won't be
> > too much.
>
> For the driver, it means extra overhead: atomics, less batching, stats
> updating, etc. For the backend, the cacheline will bounce between two
> CPUs.
>
> > Even if we track down chained descriptors and figure out whether the
> > descriptor at the event offset is used, there is still a possibility
> > that the flags are invalid. One case is that the backend can buffer
> > multiple descriptors by not updating the first one. We cannot
> > guarantee that a later descriptor's flags are usable until we check
> > from the first one.
>
> In this case, since we've stopped the tx queue, there are no new
> buffers being added. It doesn't matter whether we get notified when
> 3/4 or all of the descriptors have been used.
>
> Thanks

Well - checking the next descriptor will likely result in moving the
event index forward, which will conceivably reduce the number of
interrupts. So it's hard to predict which is better. I'll apply the
patch for now as it's simple and safe. If someone has the time to work
on tuning all this, that would be great.

> > Regards,
> > Marvin
> >
> > > It looks to me the correct thing to implement is to calculate the
> > > head descriptor of a chain that sits at 3/4.
> > >
> > > Thanks
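Jason's alternative (left unimplemented in this thread) would place the
event offset on the head descriptor of whichever chain crosses the 3/4
mark, so the flags at the event offset are always ones the device writes.
A purely hypothetical sketch, not from the kernel tree; the helper name
and the per-buffer chain-length bookkeeping are invented for illustration
(the kernel driver does keep comparable per-buffer state):

/*
 * Hypothetical sketch only. Assumes the driver records, in completion
 * order, the descriptor count of each outstanding buffer. Returns the
 * ring index of the head of the chain that crosses the threshold, so
 * the event offset always points at a descriptor whose flags the
 * device will write.
 */
static u16 event_off_at_chain_head(u16 last_used_idx, u16 ring_size,
				   const u16 *chain_len, unsigned int nbufs,
				   u16 threshold)
{
	u16 dist = 0;
	unsigned int i;

	/* Accumulate whole chains that fit below the 3/4 threshold. */
	for (i = 0; i < nbufs && dist + chain_len[i] <= threshold; i++)
		dist += chain_len[i];

	/* Head of the chain that crosses (or sits at) the threshold. */
	return (last_used_idx + dist) % ring_size;
}

Whether that extra bookkeeping would beat the simple last_used_idx
recheck is exactly the tuning question Michael leaves open above.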