thr3ads.net - Virtualization - [PATCH net v2 2/2] vhost_net: fix high cpu load when sendmsg fails [Dec 2020]

If this information is useful, please help other people find it:
Share via:

Jason Wang

2020-Dec-22 04:41 UTC

[PATCH net v2 2/2] vhost_net: fix high cpu load when sendmsg fails

On 2020/12/22 ??7:07, Willem de Bruijn wrote:> On Wed, Dec 16, 2020 at 3:20 AM wangyunjian<wangyunjian at
huawei.com>  wrote:
>> From: Yunjian Wang<wangyunjian at huawei.com>
>>
>> Currently we break the loop and wake up the vhost_worker when
>> sendmsg fails. When the worker wakes up again, we'll meet the
>> same error.
> The patch is based on the assumption that such error cases always
> return EAGAIN. Can it not also be ENOMEM, such as from tun_build_skb?
>
>> This will cause high CPU load. To fix this issue,
>> we can skip this description by ignoring the error. When we
>> exceeds sndbuf, the return value of sendmsg is -EAGAIN. In
>> the case we don't skip the description and don't drop packet.
> the -> that
>
> here and above: description -> descriptor
>
> Perhaps slightly revise to more explicitly state that
>
> 1. in the case of persistent failure (i.e., bad packet), the driver
> drops the packet
> 2. in the case of transient failure (e.g,. memory pressure) the driver
> schedules the worker to try again later

If we want to go with this way, we need a better time to wakeup the 
worker. Otherwise it just produces more stress on the cpu that is what 
this patch tries to avoid.

Thanks

>
>

Willem de Bruijn

2020-Dec-22 14:24 UTC

head link

[PATCH net v2 2/2] vhost_net: fix high cpu load when sendmsg fails

On Mon, Dec 21, 2020 at 11:41 PM Jason Wang <jasowang at redhat.com>
wrote:>
>
> On 2020/12/22 ??7:07, Willem de Bruijn wrote:
> > On Wed, Dec 16, 2020 at 3:20 AM wangyunjian<wangyunjian at
huawei.com>  wrote:
> >> From: Yunjian Wang<wangyunjian at huawei.com>
> >>
> >> Currently we break the loop and wake up the vhost_worker when
> >> sendmsg fails. When the worker wakes up again, we'll meet the
> >> same error.
> > The patch is based on the assumption that such error cases always
> > return EAGAIN. Can it not also be ENOMEM, such as from tun_build_skb?
> >
> >> This will cause high CPU load. To fix this issue,
> >> we can skip this description by ignoring the error. When we
> >> exceeds sndbuf, the return value of sendmsg is -EAGAIN. In
> >> the case we don't skip the description and don't drop
packet.
> > the -> that
> >
> > here and above: description -> descriptor
> >
> > Perhaps slightly revise to more explicitly state that
> >
> > 1. in the case of persistent failure (i.e., bad packet), the driver
> > drops the packet
> > 2. in the case of transient failure (e.g,. memory pressure) the driver
> > schedules the worker to try again later
>
>
> If we want to go with this way, we need a better time to wakeup the
> worker. Otherwise it just produces more stress on the cpu that is what
> this patch tries to avoid.
Perhaps I misunderstood the purpose of the patch: is it to drop
everything, regardless of transient or persistent failure, until the
ring runs out of descriptors?

I can understand both a blocking and drop strategy during memory
pressure. But partial drop strategy until exceeding ring capacity
seems like a peculiar hybrid?

Virtualization - Dec 2020 - [PATCH net v2 2/2] vhost_net: fix high cpu load when sendmsg fails

[PATCH net v2 2/2] vhost_net: fix high cpu load when sendmsg fails

[PATCH net v2 2/2] vhost_net: fix high cpu load when sendmsg fails