From: Rob Landley <rlandley at parallels.com>

Going indirect for only two buffers isn't likely to be a performance win
because the kmalloc/kfree overhead for the indirect block can't be cheaper
than one extra linked list traversal.

Properly "tuning" the threshold would probably be workload-specific.
(One big downside of not going indirect is extra pressure on the table
entries, and table size varies.)  But I think that in the general case,
2 is a defensible minimum?

Signed-off-by: Rob Landley <rlandley at parallels.com>
---

 drivers/virtio/virtio_ring.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index b0043fb..2b69441 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -173,7 +173,7 @@ int virtqueue_add_buf_gfp(struct virtqueue *_vq,
 	/* If the host supports indirect descriptor tables, and we have multiple
 	 * buffers, then go indirect. FIXME: tune this threshold */
-	if (vq->indirect && (out + in) > 1 && vq->num_free) {
+	if (vq->indirect && (out + in) > 2 && vq->num_free) {
 		head = vring_add_indirect(vq, sg, out, in, gfp);
 		if (likely(head >= 0))
 			goto add_head;

On Sat, 23 Apr 2011 18:13:34 -0500, Rob Landley <rlandley at parallels.com> wrote:
> From: Rob Landley <rlandley at parallels.com>
>
> Going indirect for only two buffers isn't likely to be a performance win
> because the kmalloc/kfree overhead for the indirect block can't be cheaper
> than one extra linked list traversal.

Unfortunately it's not completely clear.  QEMU sets fairly small rings, and
the virtio-net driver uses 2 descriptors minimum.  The effect can be a real
bottleneck for small packets.

Now, virtio-net could often stuff the virtio_net_hdr in the space before the
packet data (saving a descriptor) but I think that will need a feature bit
since qemu (incorrectly) used to insist on a separate descriptor for that
header.

> Properly "tuning" the threshold would probably be workload-specific.
> (One big downside of not going indirect is extra pressure on the table
> entries, and table size varies.)  But I think that in the general case,
> 2 is a defensible minimum?

I'd be tempted to say that once we fill the ring, we should drop the
threshold.

Michael?

Thanks,
Rusty.
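
The trade-off discussed in this thread can be sketched as a small userspace model. This is only an illustration, not the kernel code: `struct ring_model`, `ring_was_full`, `should_go_indirect()` and `ring_slots_used()` are hypothetical names, and the two threshold constants are assumptions taken from the patch (2) and the pre-patch check (1). It shows how a threshold that is lowered once the ring has filled, as suggested in the reply, changes how many ring slots a two-buffer request consumes.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the few virtqueue fields the check reads. */
struct ring_model {
	bool indirect;          /* host offered indirect descriptor tables */
	unsigned int num_free;  /* free descriptors left in the ring */
	bool ring_was_full;     /* assumption: set once an add found the ring full */
};

/* Assumed values: 2 is the threshold from the patch, 1 the pre-patch one. */
enum { THRESHOLD_DEFAULT = 2, THRESHOLD_UNDER_PRESSURE = 1 };

/* Same shape as the patched condition, with the threshold lowered once the
 * ring has filled at least once. */
bool should_go_indirect(const struct ring_model *vq,
			unsigned int out, unsigned int in)
{
	unsigned int threshold = vq->ring_was_full ? THRESHOLD_UNDER_PRESSURE
						   : THRESHOLD_DEFAULT;

	return vq->indirect && (out + in) > threshold && vq->num_free > 0;
}

/* Ring slots a request would consume: one for an indirect table, otherwise
 * one per buffer.  The model does not check whether the direct case fits. */
unsigned int ring_slots_used(const struct ring_model *vq,
			     unsigned int out, unsigned int in)
{
	return should_go_indirect(vq, out, in) ? 1 : out + in;
}
```

In this model a request with one out and one in buffer takes two ring slots while the ring has never filled, and one slot (plus the allocation of an indirect table, which the model does not represent) after `ring_was_full` is set. Where and when such a flag would be set or cleared in the real driver is left open here.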