On Tue, Oct 24, 2023 at 11:17 AM Liang Chen <liangchen.linux at gmail.com> wrote:
>
> The current vhost code uses 'copy_from_user' to copy descriptors from
> userspace to vhost. We attempted to 'vmap' the descriptor table to
> reduce the overhead associated with 'copy_from_user' during descriptor
> access, because it can be accessed quite frequently. This change
> resulted in a moderate performance improvement (approximately 3%) in
> our pktgen test, as shown below. Additionally, the latency in the
> 'vhost_get_vq_desc' function shows a noticeable decrease.
>
> current code:
>          IFACE     rxpck/s     txpck/s      rxkB/s      txkB/s     rxcmp/s     txcmp/s    rxmcst/s     %ifutil
> Average: vnet0        0.31  1330807.03        0.02    77976.98        0.00        0.00        0.00        6.39
> # /usr/share/bcc/tools/funclatency -d 10 vhost_get_vq_desc
> avg = 145 nsecs, total: 1455980341 nsecs, count: 9985224
>
> vmap:
>          IFACE     rxpck/s     txpck/s      rxkB/s      txkB/s     rxcmp/s     txcmp/s    rxmcst/s     %ifutil
> Average: vnet0        0.07  1371870.49        0.00    80383.04        0.00        0.00        0.00        6.58
> # /usr/share/bcc/tools/funclatency -d 10 vhost_get_vq_desc
> avg = 122 nsecs, total: 1286983929 nsecs, count: 10478134
>
> We are uncertain if there are any aspects we may have overlooked and
> would appreciate any advice before we submit an actual patch.
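
For reference, the quoted figures can be cross-checked with a few lines of Python. This is only a sketch: the dictionary keys and helper names are made up for illustration, and the numbers are copied verbatim from the interface statistics and the funclatency output quoted above.

```python
# Figures copied from the quoted benchmark output (txpck/s column and
# the funclatency "total" / "count" values for vhost_get_vq_desc).
baseline = {"txpck_s": 1330807.03, "total_ns": 1455980341, "count": 9985224}
vmap = {"txpck_s": 1371870.49, "total_ns": 1286983929, "count": 10478134}


def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


def avg_latency_ns(run):
    """Average latency per call: total time divided by call count."""
    return run["total_ns"] / run["count"]


throughput_gain = pct_change(baseline["txpck_s"], vmap["txpck_s"])
latency_change = pct_change(avg_latency_ns(baseline), avg_latency_ns(vmap))

print(f"throughput change: {throughput_gain:+.2f}%")
print(f"avg latency: {avg_latency_ns(baseline):.1f} ns -> "
      f"{avg_latency_ns(vmap):.1f} ns ({latency_change:+.1f}%)")
```

The throughput change comes out at a little over 3%, matching the "approximately 3%" in the report, and the average vhost_get_vq_desc latency drops by roughly 16%. The per-call averages agree with the 145 and 122 nsecs reported by funclatency once truncated to whole nanoseconds.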

So the idea is to use a shadow page table instead of the userspace one
to avoid things like speculation barriers or SMAP.

I've tried this in the past:

commit 7f466032dc9e5a61217f22ea34b2df932786bbfc (HEAD)
Author: Jason Wang <jasowang at redhat.com>
Date:   Fri May 24 04:12:18 2019 -0400

    vhost: access vq metadata through kernel virtual address

    It was noticed that the copy_to/from_user() friends that was used to
    access virtqueue metadata tends to be very expensive for dataplane
    implementation like vhost since it involves lots of software checks,
    speculation barriers, hardware feature toggling (e.g. SMAP). The
    extra cost will be more obvious when transferring small packets since
    the time spent on metadata accessing become more significant.
    ...

Note that it used a direct map instead of a vmap, as Andrea
suggested. But it led to several problems that were tricky to
fix [1] (such as the use of MMU notifiers for synchronization), so it
was eventually reverted.

I'm not saying it's a dead end. But we need to find a way to solve the
issues or use something different. I'm happy to offer help.
1) Avoid using SMAP for the vhost kthread, for example by using a sched
notifier; I'm not sure whether this is possible
2) A new iov iterator that doesn't do SMAP at all; this looks
dangerous and Al might not like it
3) (Re)using HMM
...
You may want to check the archives for more information. We've had a
lot of discussions about this.
Thanks
[1] https://lore.kernel.org/lkml/20190731084655.7024-1-jasowang@redhat.com/
>
>
> Thanks,
> Liang
>