On Fri, Oct 22, 2021 at 12:18:59AM -0500, Mike Christie wrote:
> The following patches apply over Linus's tree and this patchset
>
> https://lore.kernel.org/all/20211007214448.6282-1-michael.christie at oracle.com/
>
> which allows us to check the vhost owner thread's RLIMITs:

Unfortunately that patchset in turn triggers kbuild warnings. I was
hoping you would address them; I don't think merging that patchset is
possible before the kbuild issues are addressed.

It also doesn't have many acks, so I'm a bit apprehensive about
merging core changes like this through the vhost tree. Try to CC more
widely/ping people?

> It looks like that patchset has been ok'd by all the major parties
> and just needs a small cleanup to apply to Jens's and Paul's trees,
> so I wanted to post my threading patches based on it for review.
>
> The following patches allow us to support multiple vhost workers per
> device. I ended up just doing Stefan's original idea, where userspace
> has the kernel create a worker and we pass back the pid. Compared to
> the workqueue and userspace-thread approaches, this has the benefit
> that we only have one(ish) code path in the kernel during setup to
> detect old tools. The main IO paths and the device/vq setup/teardown
> paths all use common code.
>
> I've also included a patch for qemu so you can get an idea of how it
> works. If we are ok with the kernel code, then I'll break that up
> into a patchset and send it to qemu-devel for review.
>
> Results:
> --------
>
> fio jobs        1      2      4      8      12     16
> -----------------------------------------------------------
> 1 worker        84k    492k   510k   -      -      -
> worker per vq   184k   380k   744k   1422k  2256k  2434k
>
> Notes:
>
> 0. This used a simple fio command:
>
> fio --filename=/dev/sdb --direct=1 --rw=randrw --bs=4k \
>     --ioengine=libaio --iodepth=128 --numjobs=$JOBS_ABOVE
>
> and I used a VM with 16 vCPUs and 16 virtqueues.
>
> 1. The patches were tested with emulate_pr=0 and these patches:
>
> https://lore.kernel.org/all/yq1tuhge4bg.fsf at ca-mkp.ca.oracle.com/t/
>
> which are in mkp's scsi branches for the next kernel. They fix the
> perf issues where IOPs dropped at 12 vqs/jobs.
>
> 2. Because we have a hard limit of 1024 cmds, if num jobs * iodepth
> was greater than 1024, I decreased the iodepth. So 12 jobs used an
> iodepth of 85 (12 * 85 = 1020) and 16 jobs used 64 (16 * 64 = 1024).
>
> 3. The perf issue above at 2 jobs occurs because when we only have 1
> worker we execute more cmds per vhost_work, since all vqs funnel into
> one worker. This results in fewer context switches and better
> batching without having to tweak any settings. I'm working on patches
> to add back batching during lio completion and to do polling on the
> submission side.
>
> We will still want the threading patches, because if we batch at the
> fio level and also use the vhost threading patches, we see a big
> boost like below. So hopefully doing it in the kernel will allow apps
> to just work without having to be smart like fio.
>
> fio using io_uring and batching with the iodepth_batch* settings:
>
> fio jobs        1      2      4      8      12     16
> -----------------------------------------------------------
> 1 worker        494k   520k   -      -      -      -
> worker per vq   496k   878k   1542k  2436k  2304k  2590k
>
> V3:
> - Fully convert the vhost code to use vq-based APIs instead of
>   leaving it half per dev and half per vq.
> - Rebase against the kernel worker API.
> - Drop delayed worker creation. We always create the default worker
>   at VHOST_SET_OWNER time. Userspace can create and bind workers
>   after that.
>
> v2:
> - Change the loop in which we take a refcount on the worker.
> - Replace pid == -1 with a define.
> - Fix tabbing/spacing coding style issues.
> - Use a hash instead of a list to look up workers.
> - Drop the patch that added an ioctl cmd to get a vq's worker's pid.
>   I saw we might do a generic netlink interface instead.
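To make the create-a-worker-and-get-its-pid flow from the cover letter
concrete, here is a minimal userspace sketch. The ioctl name, the ioctl
number, the VHOST_VRING_NEW_WORKER define, and the struct layout below
are placeholders inferred from the description, not necessarily the
interface the patchset actually defines:

/*
 * Hedged sketch: bind a virtqueue of a vhost device to its own kernel
 * worker thread.  vhost_fd is an open vhost device fd on which
 * VHOST_SET_OWNER has already been called.
 */
#include <sys/ioctl.h>
#include <linux/ioctl.h>

struct vhost_vring_worker {
	unsigned int index;	/* virtqueue index to bind */
	int pid;		/* in: worker to bind or "new"; out: worker pid */
};

/* Ask the kernel to spawn a new worker instead of reusing one. */
#define VHOST_VRING_NEW_WORKER	-1

/* 0xAF is VHOST_VIRTIO from linux/vhost.h; 0x15 is a made-up number. */
#define VHOST_SET_VRING_WORKER	_IOWR(0xAF, 0x15, struct vhost_vring_worker)

static int bind_vq_to_new_worker(int vhost_fd, unsigned int vq_index)
{
	struct vhost_vring_worker w = {
		.index = vq_index,
		.pid = VHOST_VRING_NEW_WORKER,
	};

	if (ioctl(vhost_fd, VHOST_SET_VRING_WORKER, &w) < 0) {
		/* Old kernels reject the ioctl, so tools can fall back
		 * to the single default worker created at
		 * VHOST_SET_OWNER time. */
		return -1;
	}
	/* w.pid now holds the worker's pid; passing that pid back in
	 * for another vq index would share one worker between vqs. */
	return w.pid;
}

Returning the pid is what lets userspace apply per-thread policy
(affinity, priority) and lets the worker be accounted against the
owner's RLIMITs, which is the point of the prerequisite patchset.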
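The cover letter doesn't spell out which iodepth_batch* values were
used for the second table. A plausible io_uring invocation along the
lines described, with illustrative batch values, would be:

fio --filename=/dev/sdb --direct=1 --rw=randrw --bs=4k \
    --ioengine=io_uring --iodepth=128 --numjobs=$JOBS_ABOVE \
    --iodepth_batch_submit=16 --iodepth_batch_complete_min=16 \
    --iodepth_batch_complete_max=128

These knobs make fio submit and reap completions in batches rather
than one I/O at a time, which is the application-side batching the
in-kernel work above aims to make unnecessary.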
michael.christie at oracle.com
2021-Oct-22 15:54 UTC
[PATCH V3 00/11] vhost: multiple worker support
Ccing Christian for the kernel worker API merging stuff.

On 10/22/21 4:48 AM, Michael S. Tsirkin wrote:
> On Fri, Oct 22, 2021 at 12:18:59AM -0500, Mike Christie wrote:
>> The following patches apply over Linus's tree and this patchset
>>
>> https://lore.kernel.org/all/20211007214448.6282-1-michael.christie at oracle.com/
>>
>> which allows us to check the vhost owner thread's RLIMITs:
>
> Unfortunately that patchset in turn triggers kbuild warnings.

Yeah, that's the Jens/Paul issue I mentioned. I have to remove the old
create_io_thread code and resolve issues with their trees. Paul's tree
has a conflict with Jens's tree, and then my patch has an issue with
Paul's patches.

So Christian and I thought we would re-push the patchset through
Christian after that has settled in 5.16-rc1, and then shoot for 5.17
so it has time to bake in next.

> I was hoping you would address them; I don't think merging that
> patchset is possible before the kbuild issues are addressed.
>
> It also doesn't have many acks, so I'm a bit apprehensive about
> merging core changes like this through the vhost tree.

Ok. Just to make sure we are on the same page: Christian was going to
push the kernel worker API changes.

> Try to CC more widely/ping people?
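For context on the create_io_thread code mentioned above, here is a
rough kernel-side sketch of how a vhost worker can be spawned with the
pre-existing create_io_thread() helper from kernel/fork.c, which the
kernel worker API series generalizes. The vhost_worker struct members
and vhost_worker_fn below are illustrative, not the patchset's code:

#include <linux/sched/task.h>
#include <linux/numa.h>
#include <linux/err.h>

struct vhost_worker {
	struct task_struct *task;
	/* ... work list, flags, etc. ... */
};

int vhost_worker_fn(void *data);	/* worker main loop, defined elsewhere */

static int vhost_worker_create(struct vhost_worker *worker)
{
	struct task_struct *task;

	/* Clones a thread from the current (owner) process, so it
	 * shares the owner's mm and counts against its thread limits. */
	task = create_io_thread(vhost_worker_fn, worker, NUMA_NO_NODE);
	if (IS_ERR(task))
		return PTR_ERR(task);

	worker->task = task;
	/* io threads are created stopped; kick the thread off. */
	wake_up_new_task(task);
	return 0;
}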