Ming Lei
2014-Jul-01 01:36 UTC
[PATCH v3 0/2] block: virtio-blk: support multi vq per virtio-blk
Hi Jens and Rusty,

On Thu, Jun 26, 2014 at 8:04 PM, Ming Lei <ming.lei at canonical.com> wrote:
> On Thu, Jun 26, 2014 at 5:41 PM, Ming Lei <ming.lei at canonical.com> wrote:
>> Hi,
>>
>> These patches try to support multiple virtual queues (multi-vq) in one
>> virtio-blk device, and map each virtual queue (vq) to blk-mq's
>> hardware queue.
>>
>> With this approach, both scalability and performance of the virtio-blk
>> device can be improved.
>>
>> To verify the improvement, I implemented virtio-blk multi-vq over
>> qemu's dataplane feature; both handling host notification
>> from each vq and processing host I/O are still kept in the per-device
>> iothread context. The change is based on the qemu v2.0.0 release, and
>> can be accessed from the tree below:
>>
>> git://kernel.ubuntu.com/ming/qemu.git #v2.0.0-virtblk-mq.1
>>
>> To enable the multi-vq feature, 'num_queues=N' needs to be added to
>> '-device virtio-blk-pci ...' on the qemu command line, and I suggest
>> passing 'vectors=N+1' to keep one MSI irq vector per vq. The feature
>> depends on x-data-plane.
>>
>> Fio (libaio, randread, iodepth=64, bs=4K, jobs=N) is run inside the VM to
>> verify the improvement.
>>
>> I just created a small quad-core VM and ran fio inside the VM, with
>> num_queues of the virtio-blk device set to 2, but the
>> improvement still looks obvious. The host is a 2-socket, 8-core
>> (16-thread) server.
>>
>> 1), about scalability
>> - jobs = 2, throughput: +33%
>> - jobs = 4, throughput: +100%
>>
>> 2), about top throughput: +39%
>>
>> So in my test, even for a quad-core VM, if the virtqueue number
>> is increased from 1 to 2, both scalability and performance
>> improve a lot.
>>
>> In the above qemu implementation of the virtio-blk-mq device, only one
>> IOthread handles requests from all vqs, and the above throughput
>> data is already very close to the same fio test on the host side with
>> a single job. So more improvement should be observed once more
>> IOthreads are used for handling requests from multiple vqs.
>>
>> TODO:
>> - adjust vq's irq smp_affinity according to blk-mq hw queue's cpumask
>>
>> V3:
>> - fix use-after-free on vq->name reported by Michael
>>
>> V2: (suggestions from Michael and Dave Chinner)
>> - allocate virtqueues' pointers dynamically
>> - make sure the per-queue spinlock isn't kept in same cache line
>> - make each queue's name different
>>
>> V1:
>> - remove RFC since no one objects
>> - add '__u8 unused' for pending as suggested by Rusty
>> - use virtio_cread_feature() directly, suggested by Rusty
>
> Sorry, please add Jens' reviewed-by.
>
> Reviewed-by: Jens Axboe <axboe at kernel.dk>

I would appreciate it very much if one of you could queue these two
patches into your tree so that userspace work can be kicked off,
since Michael has acked both patches and all comments have
been addressed already.

Thanks,
--
Ming Lei
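The test setup described in the cover letter above (a virtio-blk-pci device with 'num_queues=N' and 'vectors=N+1', exercised by fio with libaio, randread, iodepth=64, bs=4K, jobs=N) could be assembled roughly as in the following sketch. It is only an illustration under stated assumptions: the helper names, the drive id `drive0`, the guest device path `/dev/vda`, the `x-data-plane=on` spelling, and fio's `--direct=1` are not taken from the thread, and `num_queues` is the property name used by the qemu tree referenced above.

```python
def virtio_blk_mq_device_arg(num_queues, drive_id="drive0"):
    """Build the value for qemu's '-device' option (hypothetical helper).

    'num_queues' and 'vectors=N+1' follow the cover letter; the drive id
    and the 'x-data-plane=on' spelling are assumptions for illustration.
    """
    if num_queues < 1:
        raise ValueError("num_queues must be at least 1")
    vectors = num_queues + 1  # 'vectors=N+1', as suggested in the cover letter
    return (
        f"virtio-blk-pci,drive={drive_id},"
        f"num_queues={num_queues},vectors={vectors},x-data-plane=on"
    )


def fio_args(jobs, filename="/dev/vda"):
    """Build a fio command line matching the parameters in the cover letter.

    The device path and '--direct=1' are assumptions, not from the thread.
    """
    return [
        "fio",
        "--name=randread",
        f"--filename={filename}",
        "--ioengine=libaio",
        "--rw=randread",
        "--iodepth=64",
        "--bs=4k",
        f"--numjobs={jobs}",
        "--direct=1",
    ]


if __name__ == "__main__":
    # Two virtqueues, as in the quad-core VM test described above.
    print("-device", virtio_blk_mq_device_arg(2))
    print(" ".join(fio_args(4)))
```

The device string would be passed to qemu after '-device', and the fio command would be run inside the guest once per job count being compared.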
Jens Axboe
2014-Jul-01 03:01 UTC
[PATCH v3 0/2] block: virtio-blk: support multi vq per virtio-blk
On 2014-06-30 19:36, Ming Lei wrote:
> Hi Jens and Rusty,
>
> On Thu, Jun 26, 2014 at 8:04 PM, Ming Lei <ming.lei at canonical.com> wrote:
>> On Thu, Jun 26, 2014 at 5:41 PM, Ming Lei <ming.lei at canonical.com> wrote:
>>> Hi,
>>>
>>> These patches try to support multiple virtual queues (multi-vq) in one
>>> virtio-blk device, and map each virtual queue (vq) to blk-mq's
>>> hardware queue.
>>>
>>> With this approach, both scalability and performance of the virtio-blk
>>> device can be improved.
>>>
>>> To verify the improvement, I implemented virtio-blk multi-vq over
>>> qemu's dataplane feature; both handling host notification
>>> from each vq and processing host I/O are still kept in the per-device
>>> iothread context. The change is based on the qemu v2.0.0 release, and
>>> can be accessed from the tree below:
>>>
>>> git://kernel.ubuntu.com/ming/qemu.git #v2.0.0-virtblk-mq.1
>>>
>>> To enable the multi-vq feature, 'num_queues=N' needs to be added to
>>> '-device virtio-blk-pci ...' on the qemu command line, and I suggest
>>> passing 'vectors=N+1' to keep one MSI irq vector per vq. The feature
>>> depends on x-data-plane.
>>>
>>> Fio (libaio, randread, iodepth=64, bs=4K, jobs=N) is run inside the VM to
>>> verify the improvement.
>>>
>>> I just created a small quad-core VM and ran fio inside the VM, with
>>> num_queues of the virtio-blk device set to 2, but the
>>> improvement still looks obvious. The host is a 2-socket, 8-core
>>> (16-thread) server.
>>>
>>> 1), about scalability
>>> - jobs = 2, throughput: +33%
>>> - jobs = 4, throughput: +100%
>>>
>>> 2), about top throughput: +39%
>>>
>>> So in my test, even for a quad-core VM, if the virtqueue number
>>> is increased from 1 to 2, both scalability and performance
>>> improve a lot.
>>>
>>> In the above qemu implementation of the virtio-blk-mq device, only one
>>> IOthread handles requests from all vqs, and the above throughput
>>> data is already very close to the same fio test on the host side with
>>> a single job. So more improvement should be observed once more
>>> IOthreads are used for handling requests from multiple vqs.
>>>
>>> TODO:
>>> - adjust vq's irq smp_affinity according to blk-mq hw queue's cpumask
>>>
>>> V3:
>>> - fix use-after-free on vq->name reported by Michael
>>>
>>> V2: (suggestions from Michael and Dave Chinner)
>>> - allocate virtqueues' pointers dynamically
>>> - make sure the per-queue spinlock isn't kept in same cache line
>>> - make each queue's name different
>>>
>>> V1:
>>> - remove RFC since no one objects
>>> - add '__u8 unused' for pending as suggested by Rusty
>>> - use virtio_cread_feature() directly, suggested by Rusty
>>
>> Sorry, please add Jens' reviewed-by.
>>
>> Reviewed-by: Jens Axboe <axboe at kernel.dk>
>
> I would appreciate it very much if one of you could queue these two
> patches into your tree so that userspace work can be kicked off,
> since Michael has acked both patches and all comments have
> been addressed already.

Given that Michael also acked it and Rusty is on his sabbatical, I'll
queue it up for 3.17.

--
Jens Axboe
Christoph Hellwig
2014-Jul-01 08:13 UTC
[PATCH v3 0/2] block: virtio-blk: support multi vq per virtio-blk
On Mon, Jun 30, 2014 at 09:01:07PM -0600, Jens Axboe wrote:
>> I appreciate very much that one of you may queue these two
>> patches into your tree so that userspace work can be kicked off,
>> since Michael has acked both patches and all comments have
>> been addressed already.
>
> Given that Michael also acked it and Rusty is on his sabbatical, I'll queue
> it up for 3.17.

So Rusty is offline? Who is taking care of module/moduleparam patches in
the meantime?