thr3ads.net - Linux Virtualization - [PATCH RFC 1/2] virtio-net: bql support [Jan 2019]


Jason Wang

2019-Jan-07 03:51 UTC

[PATCH RFC 1/2] virtio-net: bql support

On 2019/1/7 11:17, Michael S. Tsirkin wrote:
> On Mon, Jan 07, 2019 at 10:14:37AM +0800, Jason Wang wrote:
>> On 2019/1/2 9:59, Michael S. Tsirkin wrote:
>>> On Wed, Jan 02, 2019 at 11:28:43AM +0800, Jason Wang wrote:
>>>> On 2018/12/31 2:45, Michael S. Tsirkin wrote:
>>>>> On Thu, Dec 27, 2018 at 06:00:36PM +0800, Jason Wang wrote:
>>>>>> On 2018/12/26 11:19, Michael S. Tsirkin wrote:
>>>>>>> On Thu, Dec 06, 2018 at 04:17:36PM +0800, Jason Wang wrote:
>>>>>>>> On 2018/12/6 6:54, Michael S. Tsirkin wrote:
>>>>>>>>> When use_napi is set, let's enable BQL.  Note: some of the issues are
>>>>>>>>> similar to wifi.  It's worth considering whether something similar to
>>>>>>>>> commit 36148c2bbfbe ("mac80211: Adjust TSQ pacing shift") might be
>>>>>>>>> beneficial.
>>>>>>>> I played with a similar patch several days ago. The tricky part is the mode
>>>>>>>> switching between napi and no napi. We should make sure that when a packet is
>>>>>>>> sent and tracked by BQL, it is consumed by BQL as well. I did it by
>>>>>>>> tracking it through skb->cb, and dealt with the freeze by resetting the BQL
>>>>>>>> status. Patch attached.
>>>>>>>>
>>>>>>>> But when testing with vhost-net, I don't see very stable performance,
>>>>>>> So how about increasing TSQ pacing shift then?
>>>>>> I can test this. But changing a default TCP value is much more than a
>>>>>> virtio-net specific thing.
>>>>> Well same logic as wifi applies. Unpredictable latencies related
>>>>> to radio in one case, to host scheduler in the other.
>>>>>
>>>>>>>> it was
>>>>>>>> probably because we batch the used ring updating so tx interrupt may come
>>>>>>>> randomly. We probably need to implement a time bounded coalescing mechanism
>>>>>>>> which could be configured from userspace.
>>>>>>> I don't think it's reasonable to expect userspace to be that smart ...
>>>>>>> Why do we need time bounded? used ring is always updated when ring
>>>>>>> becomes empty.
>>>>>> We don't add used, which means BQL may not see the consumed packet in time.
>>>>>> And the delay varies based on the workload since we count packets not bytes
>>>>>> or time before doing the batched updating.
>>>>>>
>>>>>> Thanks
>>>>> Sorry I still don't get it.
>>>>> When nothing is outstanding then we do update the used.
>>>>> So if BQL stops userspace from sending packets then
>>>>> we get an interrupt and packets start flowing again.
>>>> Yes, but how about the cases of multiple flows. That's where I see unstable
>>>> results.
>>>>
>>>>
>>>>> It might be suboptimal, we might need to tune it but I doubt running
>>>>> timers is a solution, timer interrupts cause VM exits.
>>>> Probably not a timer but a time counter (or even a byte counter) in vhost to
>>>> add used and signal guest if it exceeds a value instead of waiting for the
>>>> number of packets.
>>>>
>>>>
>>>> Thanks
>>> Well we already have VHOST_NET_WEIGHT - is it too big then?
>>
>> I'm not sure, it might be too big.
>>
>>
>>> And maybe we should expose the "MORE" flag in the descriptor -
>>> do you think that will help?
>>>
>> I don't know. But how can a "more" flag help here?
>>
>> Thanks
> It sounds like we should be a bit more aggressive in updating used ring.
> But if we just do it naively we will harm performance for sure as that
> is how we are doing batching right now.

I agree but the problem is to balance the PPS and throughput. More 
batching helps for PPS but may damage TCP throughput.

>   Instead we could make guest
> control batching using the more flag - if that's not set we write out
> the used ring.

It's under the control of the guest, so I'm afraid we still need some
additional guards (e.g. time/byte counters) on the host.

Thanks
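
[Editor's sketch] The host-side guard discussed above (add used and signal the guest once a packet, byte or time counter exceeds a value) could be modelled roughly as follows. This is a toy user-space model, not vhost code: the class name `UsedRingBatcher`, its fields and the threshold values are all hypothetical and chosen only for illustration. Note that the time bound is a counter checked when work is processed, not a timer, in line with the concern about timer interrupts causing VM exits.

```python
from dataclasses import dataclass, field


@dataclass
class UsedRingBatcher:
    """Toy model of a host that batches used-ring updates (not vhost code)."""
    max_packets: int = 64        # hypothetical packet-count threshold
    max_bytes: int = 64 * 1024   # hypothetical byte threshold
    max_delay_us: int = 50       # hypothetical time bound, in microseconds
    pending: list = field(default_factory=list)   # packet lengths not yet marked used
    first_pending_us: int = 0    # time at which the oldest pending packet was consumed
    flushes: list = field(default_factory=list)   # log of (time_us, packets, bytes)

    def consume(self, now_us: int, length: int) -> bool:
        """Record one transmitted packet; return True if it triggered a flush."""
        if not self.pending:
            self.first_pending_us = now_us
        self.pending.append(length)
        return self.maybe_flush(now_us)

    def maybe_flush(self, now_us: int) -> bool:
        """Write out the batch if any guard (count, bytes, age) is exceeded."""
        if not self.pending:
            return False
        too_many = len(self.pending) >= self.max_packets
        too_big = sum(self.pending) >= self.max_bytes
        too_old = now_us - self.first_pending_us >= self.max_delay_us
        if too_many or too_big or too_old:
            self.flushes.append((now_us, len(self.pending), sum(self.pending)))
            self.pending.clear()
            return True
        return False
```

With only a packet-count guard, a few small packets from one flow could sit unreported for an unbounded time while other flows are idle; the byte and age guards in this sketch bound that delay, at the cost of more frequent used-ring updates.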


Michael S. Tsirkin

2019-Jan-07 04:01 UTC


[PATCH RFC 1/2] virtio-net: bql support

On Mon, Jan 07, 2019 at 11:51:55AM +0800, Jason Wang wrote:
> On 2019/1/7 11:17, Michael S. Tsirkin wrote:
> > It sounds like we should be a bit more aggressive in updating used ring.
> > But if we just do it naively we will harm performance for sure as that
> > is how we are doing batching right now.
> 
> 
> I agree but the problem is to balance the PPS and throughput. More batching
> helps for PPS but may damage TCP throughput.
That is what more flag is supposed to be I think - it is only set if
there's a socket that actually needs the skb freed in order to go on.
> 
> >   Instead we could make guest
> > control batching using the more flag - if that's not set we write out
> > the used ring.
> 
> 
> It's under the control of guest, so I'm afraid we still need some more guard
> (e.g time/bytes counters) on host.
> 
> Thanks
The point is that if the guest does not care about the skb being freed, then
there is no rush on the host side to mark the buffer used.
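
[Editor's sketch] A rough model of the "more" flag idea: the host defers the used-ring update while descriptors carry the flag and writes it out on the first descriptor without it. The function name `flush_points`, the boolean encoding of the flag and the optional `max_pending` cap are assumptions made for illustration; no such flag exists in the descriptor format discussed here, and the cap stands in for the host-side guard requested earlier in the thread.

```python
def flush_points(more_flags, max_pending=None):
    """Return indices of descriptors after which the host writes out the used ring.

    more_flags[i] is True when the guest marked descriptor i with the
    (hypothetical) "more" hint: further packets follow and nobody is
    waiting for this buffer to be freed.
    max_pending, if given, is a host-side cap on deferred descriptors, so a
    guest that always sets the hint cannot defer updates forever.
    """
    points = []
    pending = 0
    for i, more in enumerate(more_flags):
        pending += 1
        guest_wants_flush = not more
        guard_hit = max_pending is not None and pending >= max_pending
        if guest_wants_flush or guard_hit:
            points.append(i)   # used ring is written out after descriptor i
            pending = 0
    return points
```

For example, `flush_points([True, True, False, True, False])` yields `[2, 4]`, while a guest that always sets the hint gets no update at all unless the host applies a cap such as `max_pending=2`.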


Jason Wang

2019-Jan-07 06:31 UTC


[PATCH RFC 1/2] virtio-net: bql support

On 2019/1/7 12:01, Michael S. Tsirkin wrote:
> On Mon, Jan 07, 2019 at 11:51:55AM +0800, Jason Wang wrote:
>> On 2019/1/7 11:17, Michael S. Tsirkin wrote:
>>> It sounds like we should be a bit more aggressive in updating used ring.
>>> But if we just do it naively we will harm performance for sure as that
>>> is how we are doing batching right now.
>>
>> I agree but the problem is to balance the PPS and throughput. More batching
>> helps for PPS but may damage TCP throughput.
> That is what more flag is supposed to be I think - it is only set if
> there's a socket that actually needs the skb freed in order to go on.

I'm not quite sure I get it, but is this something similar to what you want?

https://lists.linuxfoundation.org/pipermail/virtualization/2014-October/027667.html

Which enables tx interrupt for TCP packets, and you want to add used 
more aggressively for those sockets?


Thanks

>>>    Instead we could make guest
>>> control batching using the more flag - if that's not set we write out
>>> the used ring.
>>
>> It's under the control of guest, so I'm afraid we still need some more guard
>> (e.g time/bytes counters) on host.
>>
>> Thanks
> Point is if guest does not care about the skb being freed, then there is no
> rush host side to mark buffer used.
