Michael S. Tsirkin
2022-Jan-06 12:28 UTC
[PATCH v3 0/3] virtio support cache indirect desc
On Fri, Oct 29, 2021 at 02:28:11PM +0800, Xuan Zhuo wrote:
> If the VIRTIO_RING_F_INDIRECT_DESC negotiation succeeds, and the number
> of sgs used for sending packets is greater than 1, we must constantly
> call __kmalloc/kfree to allocate/release desc.

So where is this going? I really like the performance boost. My concern
is that if the guest spans NUMA nodes and the handler switches from one
node to another, this will keep reusing the cache from the old node.
A bunch of ways were suggested to address this, but even just making
the cache per NUMA node would help.

> In the case of extremely fast packet delivery, the overhead cannot be
> ignored:
>
>   27.46%  [kernel]  [k] virtqueue_add
>   16.66%  [kernel]  [k] detach_buf_split
>   16.51%  [kernel]  [k] virtnet_xsk_xmit
>   14.04%  [kernel]  [k] virtqueue_add_outbuf
>    5.18%  [kernel]  [k] __kmalloc
>    4.08%  [kernel]  [k] kfree
>    2.80%  [kernel]  [k] virtqueue_get_buf_ctx
>    2.22%  [kernel]  [k] xsk_tx_peek_desc
>    2.08%  [kernel]  [k] memset_erms
>    0.83%  [kernel]  [k] virtqueue_kick_prepare
>    0.76%  [kernel]  [k] virtnet_xsk_run
>    0.62%  [kernel]  [k] __free_old_xmit_ptr
>    0.60%  [kernel]  [k] vring_map_one_sg
>    0.53%  [kernel]  [k] native_apic_mem_write
>    0.46%  [kernel]  [k] sg_next
>    0.43%  [kernel]  [k] sg_init_table
>    0.41%  [kernel]  [k] kmalloc_slab
>
> This patch adds a cache function to virtio to cache these allocated indirect
> desc instead of constantly allocating and releasing desc.
>
> v3:
>   pre-allocate per buffer indirect descriptors array
>
> v2:
>   use struct list_head to cache the desc
>
> *** BLURB HERE ***
>
> Xuan Zhuo (3):
>   virtio: cache indirect desc for split
>   virtio: cache indirect desc for packed
>   virtio-net: enable virtio desc cache
>
>  drivers/net/virtio_net.c     |  11 +++
>  drivers/virtio/virtio.c      |   6 ++
>  drivers/virtio/virtio_ring.c | 131 ++++++++++++++++++++++++++++++-----
>  include/linux/virtio.h       |  14 ++++
>  4 files changed, 145 insertions(+), 17 deletions(-)
>
> --
> 2.31.0
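For readers following the thread, here is a minimal userspace sketch of the idea under discussion: keep freed indirect descriptor tables on a free list instead of calling the allocator for every packet, with one free list per NUMA node so a table is only ever reused on the node it was allocated for. This is not the code from the patch series. The names desc_table, desc_cache, desc_cache_get/put/destroy and the SKETCH_* limits are all hypothetical, the node id is simply passed in by the caller, and malloc() stands in for what a kernel version would presumably do with kmalloc_node().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_MAX_NODES 4	/* hypothetical number of NUMA nodes */
#define SKETCH_MAX_SG    8	/* hypothetical entries per cached table */

/* Same field layout as a split-ring descriptor. */
struct vring_desc {
	uint64_t addr;
	uint32_t len;
	uint16_t flags;
	uint16_t next;
};

/* Hypothetical wrapper: one indirect table plus its cache bookkeeping. */
struct desc_table {
	int node;			/* node this table was allocated for */
	struct desc_table *next_free;	/* link while sitting in the cache */
	struct vring_desc desc[SKETCH_MAX_SG];
};

/* Hypothetical cache: one LIFO free list per node. */
struct desc_cache {
	struct desc_table *free_list[SKETCH_MAX_NODES];
};

/* Take a table for @node, reusing a cached one if that node has any. */
static struct desc_table *desc_cache_get(struct desc_cache *c, int node)
{
	struct desc_table *t;

	assert(node >= 0 && node < SKETCH_MAX_NODES);

	t = c->free_list[node];
	if (t) {
		c->free_list[node] = t->next_free;
		t->next_free = NULL;
		return t;
	}

	/* Cache miss: fall back to the allocator (malloc as a stand-in). */
	t = malloc(sizeof(*t));
	if (t) {
		t->node = node;
		t->next_free = NULL;
	}
	return t;
}

/* Return a table to the free list of the node it was allocated for. */
static void desc_cache_put(struct desc_cache *c, struct desc_table *t)
{
	t->next_free = c->free_list[t->node];
	c->free_list[t->node] = t;
}

/* Release everything still held in the cache. */
static void desc_cache_destroy(struct desc_cache *c)
{
	for (int n = 0; n < SKETCH_MAX_NODES; n++) {
		while (c->free_list[n]) {
			struct desc_table *t = c->free_list[n];

			c->free_list[n] = t->next_free;
			free(t);
		}
	}
}
```

A caller would do desc_cache_get(&cache, node) on the add path and desc_cache_put(&cache, table) on the detach path. If the handler moves to another node, it simply starts drawing from that node's list, and tables left on the old node stay there until desc_cache_destroy() runs. The sketch ignores locking and any bound on how many tables are kept.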
On Thu, 6 Jan 2022 07:28:31 -0500, Michael S. Tsirkin <mst at redhat.com> wrote:
> On Fri, Oct 29, 2021 at 02:28:11PM +0800, Xuan Zhuo wrote:
> > If the VIRTIO_RING_F_INDIRECT_DESC negotiation succeeds, and the number
> > of sgs used for sending packets is greater than 1, we must constantly
> > call __kmalloc/kfree to allocate/release desc.
>
> So where is this going? I really like the performance boost. My concern
> is that if the guest spans NUMA nodes and the handler switches from one
> node to another, this will keep reusing the cache from the old node.
> A bunch of ways were suggested to address this, but even just making
> the cache per NUMA node would help.
>

In fact, this is the problem I encountered when implementing XDP socket
support in virtio-net. Now that virtqueue reset [0] has been merged into
the virtio spec, I am completing this series of work. My plan is:

1. virtio support advance DMA
2. linux kernel/qemu support virtqueue reset
3. virtio-net support AF_XDP
4. virtio support cache indirect desc

[0]: https://github.com/oasis-tcs/virtio-spec/issues/124

Thanks.

> > In the case of extremely fast packet delivery, the overhead cannot be
> > ignored:
> >
> >   27.46%  [kernel]  [k] virtqueue_add
> >   16.66%  [kernel]  [k] detach_buf_split
> >   16.51%  [kernel]  [k] virtnet_xsk_xmit
> >   14.04%  [kernel]  [k] virtqueue_add_outbuf
> >    5.18%  [kernel]  [k] __kmalloc
> >    4.08%  [kernel]  [k] kfree
> >    2.80%  [kernel]  [k] virtqueue_get_buf_ctx
> >    2.22%  [kernel]  [k] xsk_tx_peek_desc
> >    2.08%  [kernel]  [k] memset_erms
> >    0.83%  [kernel]  [k] virtqueue_kick_prepare
> >    0.76%  [kernel]  [k] virtnet_xsk_run
> >    0.62%  [kernel]  [k] __free_old_xmit_ptr
> >    0.60%  [kernel]  [k] vring_map_one_sg
> >    0.53%  [kernel]  [k] native_apic_mem_write
> >    0.46%  [kernel]  [k] sg_next
> >    0.43%  [kernel]  [k] sg_init_table
> >    0.41%  [kernel]  [k] kmalloc_slab
> >
> > This patch adds a cache function to virtio to cache these allocated indirect
> > desc instead of constantly allocating and releasing desc.
> >
> > v3:
> >   pre-allocate per buffer indirect descriptors array
> >
> > v2:
> >   use struct list_head to cache the desc
> >
> > *** BLURB HERE ***
> >
> > Xuan Zhuo (3):
> >   virtio: cache indirect desc for split
> >   virtio: cache indirect desc for packed
> >   virtio-net: enable virtio desc cache
> >
> >  drivers/net/virtio_net.c     |  11 +++
> >  drivers/virtio/virtio.c      |   6 ++
> >  drivers/virtio/virtio_ring.c | 131 ++++++++++++++++++++++++++++++-----
> >  include/linux/virtio.h       |  14 ++++
> >  4 files changed, 145 insertions(+), 17 deletions(-)
> >
> > --
> > 2.31.0