Michael S. Tsirkin
2021-Nov-08 13:49 UTC
[PATCH v4 0/3] virtio support cache indirect desc
On Mon, Nov 08, 2021 at 07:49:48PM +0800, Xuan Zhuo wrote:
> If the VIRTIO_RING_F_INDIRECT_DESC negotiation succeeds, and the number
> of sgs used for sending packets is greater than 1, we must constantly
> call __kmalloc/kfree to allocate/release descs.
>
> In the case of extremely fast packet delivery, the overhead cannot be
> ignored:
>
>   27.46% [kernel] [k] virtqueue_add
>   16.66% [kernel] [k] detach_buf_split
>   16.51% [kernel] [k] virtnet_xsk_xmit
>   14.04% [kernel] [k] virtqueue_add_outbuf
>    5.18% [kernel] [k] __kmalloc
>    4.08% [kernel] [k] kfree
>    2.80% [kernel] [k] virtqueue_get_buf_ctx
>    2.22% [kernel] [k] xsk_tx_peek_desc
>    2.08% [kernel] [k] memset_erms
>    0.83% [kernel] [k] virtqueue_kick_prepare
>    0.76% [kernel] [k] virtnet_xsk_run
>    0.62% [kernel] [k] __free_old_xmit_ptr
>    0.60% [kernel] [k] vring_map_one_sg
>    0.53% [kernel] [k] native_apic_mem_write
>    0.46% [kernel] [k] sg_next
>    0.43% [kernel] [k] sg_init_table
>    0.41% [kernel] [k] kmalloc_slab
>
> This patch set adds a cache to virtio that keeps these allocated indirect
> descs instead of constantly allocating and releasing them.

Hmm, a bunch of comments got ignored. See e.g.
https://lore.kernel.org/r/20211027043851-mutt-send-email-mst%40kernel.org
If they aren't relevant, add code comments or commit log text explaining
the design choice, please.

> v4:
>   1. Only allow desc cache when VIRTIO_RING_F_INDIRECT_DESC negotiation is successful
>   2. The desc cache threshold can be set for each virtqueue
>
> v3:
>   pre-allocate per buffer indirect descriptors array
>
> v2:
>   use struct list_head to cache the desc
>
> Xuan Zhuo (3):
>   virtio: cache indirect desc for split
>   virtio: cache indirect desc for packed
>   virtio-net: enable virtio desc cache
>
>  drivers/net/virtio_net.c     |  12 ++-
>  drivers/virtio/virtio_ring.c | 152 +++++++++++++++++++++++++++++++----
>  include/linux/virtio.h       |  17 ++++
>  3 files changed, 163 insertions(+), 18 deletions(-)
>
> --
> 2.31.0
On Mon, 8 Nov 2021 08:49:27 -0500, Michael S. Tsirkin <mst at redhat.com> wrote:
> Hmm, a bunch of comments got ignored. See e.g.
> https://lore.kernel.org/r/20211027043851-mutt-send-email-mst%40kernel.org
> If they aren't relevant, add code comments or commit log text explaining
> the design choice, please.

I thought I had responded to the related questions; I am guessing some
emails may have been lost. I have collected the following six questions.
If any are missing, please let me know.

1. use list_head

   In the earliest version, I used pointers directly. You suggested that I
   use llist_head, but llist_head uses atomic operations, and there is no
   concurrency to protect against here, so I used list_head instead. In
   fact, I did not allocate any extra space for the list_head: a free desc
   array is large enough to hold one, so its first bytes are reused as the
   queue item.

   use as desc array: | vring_desc | vring_desc | vring_desc | vring_desc |
   use as queue item: | list_head ........................................|

2.
   > > + if (vq->use_desc_cache && total_sg <= VIRT_QUEUE_CACHE_DESC_NUM) {
   > > +     if (vq->desc_cache_chain) {
   > > +         desc = vq->desc_cache_chain;
   > > +         vq->desc_cache_chain = (void *)desc->addr;
   > > +         goto got;
   > > +     }
   > > +     n = VIRT_QUEUE_CACHE_DESC_NUM;
   >
   > Hmm. This will allocate more entries than actually used. Why do it?

   This is because the size of each cache item is fixed. The logic has been
   modified in the latest code, so I think this problem no longer exists.

3.
   > What bothers me here is what happens if cache gets
   > filled on one numa node, then used on another?

   I am wondering how the cross-NUMA case arises here in the first place:
   the virtio desc queue has the same cross-NUMA problem, so do we really
   need to handle that scenario? The indirect descs are used together with
   the virtio descs, so it is enough for them to be on the same NUMA node
   as the virtio desc queue. We could therefore allocate the indirect desc
   cache at the same time as the virtio desc queue.

4.
   > So e.g. for rx, we are wasting memory since indirect isn't used.

   In the current version, the desc cache is set up per queue, so if a
   queue does not use the desc cache, we simply do not set it up. For
   example, in virtio-net only the tx queues, and the rx queues in
   big-packet mode, enable the desc cache.

5.
   > Would a better API be a cache size in bytes? This controls how much
   > memory is spent after all.

   My design is to set a threshold: when total_sg is greater than this
   threshold, we fall back to kmalloc/kfree; when total_sg is less than or
   equal to it, we use the pre-allocated cache.

6. kmem_cache_*

   I have tested these; the performance is not as good as with the method
   used in this patch.

Thanks.