Jason Wang
2020-Dec-08 02:30 UTC
[RFC PATCH 5/8] vhost: allow userspace to bind vqs to CPUs
On 2020/12/8 2:31, Mike Christie wrote:
> On 12/6/20 10:27 PM, Jason Wang wrote:
>>
>> On 2020/12/5 12:32, Mike Christie wrote:
>>> On 12/4/20 2:09 AM, Jason Wang wrote:
>>>>
>>>> On 2020/12/4 3:56, Mike Christie wrote:
>>>>> +static long vhost_vring_set_cpu(struct vhost_dev *d, struct
>>>>> vhost_virtqueue *vq,
>>>>> +                void __user *argp)
>>>>> +{
>>>>> +    struct vhost_vring_state s;
>>>>> +    int ret = 0;
>>>>> +
>>>>> +    if (vq->private_data)
>>>>> +        return -EBUSY;
>>>>> +
>>>>> +    if (copy_from_user(&s, argp, sizeof s))
>>>>> +        return -EFAULT;
>>>>> +
>>>>> +    if (s.num == -1) {
>>>>> +        vq->cpu = s.num;
>>>>> +        return 0;
>>>>> +    }
>>>>> +
>>>>> +    if (s.num >= nr_cpu_ids)
>>>>> +        return -EINVAL;
>>>>> +
>>>>> +    if (!d->ops || !d->ops->get_workqueue)
>>>>> +        return -EINVAL;
>>>>> +
>>>>> +    if (!d->wq)
>>>>> +        d->wq = d->ops->get_workqueue();
>>>>> +    if (!d->wq)
>>>>> +        return -EINVAL;
>>>>> +
>>>>> +    vq->cpu = s.num;
>>>>> +    return ret;
>>>>> +}
>>>>
>>>>
>>>> So one question here. Who is in charge of doing this set_cpu? Note
>>>> that sched_setaffinity(2) requires CAP_SYS_NICE to work, so I
>>>> wonder whether or not it's legal for unprivileged Qemu to do this.
>>>
>>>
>>> I was having qemu do it when it's setting up the vqs since it had
>>> the info there already.
>>>
>>> Is it normally the tool that makes calls into qemu that does the
>>> operations that require CAP_SYS_NICE?
>>
>>
>> My understanding is that it only matters for scheduling. And this patch
>> wants to change the affinity, which should check that capability.
>>
>>
>>> If so, then I see the interface needs to be changed.
>>
>>
>> Actually, if I read this patch correctly it requires e.g. qemu to make
>> the decision instead of the management layer. This may cause some
>> trouble for e.g. the libvirt emulatorpin[1] implementation.
>>
>
> Let me make sure I understood you.
>
> I thought qemu would just have a new property, and users would pass
> that in like they do for the number of queues setting. Then qemu would
> pass that to the kernel. The primary user I have to support at work
> does not use libvirt based tools so I thought that was a common point
> that would work for everyone.


I think we need to talk with the libvirt guys to see if it works for
them. My understanding is that scheduling should be their
responsibility, not qemu's.


>
> For my work use requirement, your emulatorpin and CAP_SYS_NICE comment
> then that means we want an interface that something other than qemu
> can use right? So the tools would call directly into the kernel and
> not go through qemu right?


Yes, usually qemu runs without any privilege. So could it be e.g. a
sysfs interface or something else?

Thanks

>
>
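
For illustration only, here is a rough userspace sketch of the kind of pinning a management tool does today with sched_setaffinity(2), which is the model being compared against above. It is not part of the patch and not libvirt's code. The helper names (first_allowed_cpu, pin_to_cpu, is_pinned_to) are made up for this example. Per the man page, a caller may change the affinity of a thread it owns; changing a thread owned by another user needs CAP_SYS_NICE, which is why an unprivileged process cannot do this for arbitrary threads.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <sys/types.h>

/* Hypothetical helper: lowest CPU in pid's current affinity mask, or -1. */
int first_allowed_cpu(pid_t pid)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    if (sched_getaffinity(pid, sizeof(set), &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}

/*
 * Hypothetical helper: restrict pid (0 means the caller) to a single CPU.
 * Returns 0 on success, -1 with errno set on failure. The kernel returns
 * EPERM if the caller neither owns the target nor has CAP_SYS_NICE.
 */
int pin_to_cpu(pid_t pid, int cpu)
{
    cpu_set_t set;

    if (cpu < 0 || cpu >= CPU_SETSIZE) {
        errno = EINVAL;
        return -1;
    }
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(pid, sizeof(set), &set);
}

/* Hypothetical helper: 1 if pid may run on cpu and on no other CPU. */
int is_pinned_to(pid_t pid, int cpu)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    if (sched_getaffinity(pid, sizeof(set), &set) != 0)
        return 0;
    return CPU_COUNT(&set) == 1 && CPU_ISSET(cpu, &set);
}
```

A tool would call pin_to_cpu(tid, cpu) with the target thread ID. With pid 0 it pins the caller itself, which needs no extra capability.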