The current vhost code uses 'copy_from_user' to copy descriptors from
userspace into vhost. Because the descriptor table is accessed very
frequently, we tried mapping it with 'vmap' to avoid the
'copy_from_user' overhead on each descriptor access. This change gave a
moderate throughput improvement (approximately 3% in txpck/s) in our
pktgen test, as shown below. The average latency of 'vhost_get_vq_desc'
also dropped noticeably, from 145 ns to 122 ns.
current code:
          IFACE  rxpck/s     txpck/s  rxkB/s    txkB/s  rxcmp/s  txcmp/s  rxmcst/s  %ifutil
Average:  vnet0     0.31  1330807.03    0.02  77976.98     0.00     0.00      0.00     6.39
# /usr/share/bcc/tools/funclatency -d 10 vhost_get_vq_desc
avg = 145 nsecs, total: 1455980341 nsecs, count: 9985224
kmap:
          IFACE  rxpck/s     txpck/s  rxkB/s    txkB/s  rxcmp/s  txcmp/s  rxmcst/s  %ifutil
Average:  vnet0     0.07  1371870.49    0.00  80383.04     0.00     0.00      0.00     6.58
# /usr/share/bcc/tools/funclatency -d 10 vhost_get_vq_desc
avg = 122 nsecs, total: 1286983929 nsecs, count: 10478134
We are not sure whether we have overlooked anything, and would
appreciate any advice before we submit an actual patch.
Thanks,
Liang