Jason Wang
2022-Nov-18 07:23 UTC
[PATCH 3/6] vduse: Add sysfs interface for irq affinity setup
On Thu, Nov 17, 2022 at 4:54 PM Yongji Xie <xieyongji at bytedance.com> wrote:
>
> On Thu, Nov 17, 2022 at 2:07 PM Jason Wang <jasowang at redhat.com> wrote:
> >
> > On Thu, Nov 17, 2022 at 1:48 PM Yongji Xie <xieyongji at bytedance.com> wrote:
> > >
> > > On Thu, Nov 17, 2022 at 11:37 AM Jason Wang <jasowang at redhat.com> wrote:
> > > >
> > > > On Wed, Nov 16, 2022 at 3:46 PM Yongji Xie <xieyongji at bytedance.com> wrote:
> > > > >
> > > > > On Wed, Nov 16, 2022 at 3:11 PM Jason Wang <jasowang at redhat.com> wrote:
> > > > > >
> > > > > > On Tue, Nov 15, 2022 at 10:49 AM Yongji Xie <xieyongji at bytedance.com> wrote:
> > > > > > >
> > > > > > > On Mon, Nov 14, 2022 at 4:55 PM Jason Wang <jasowang at redhat.com> wrote:
> > > > > > > >
> > > > > > > > On Mon, Nov 14, 2022 at 4:20 PM Yongji Xie <xieyongji at bytedance.com> wrote:
> > > > > > > > >
> > > > > > > > > On Mon, Nov 14, 2022 at 3:58 PM Jason Wang <jasowang at redhat.com> wrote:
> > > > > > > > > >
> > > > > > > > > > On Mon, Nov 14, 2022 at 3:16 PM Xie Yongji <xieyongji at bytedance.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > Add sysfs interface for each vduse virtqueue to setup
> > > > > > > > > > > irq affinity. This would be useful for performance
> > > > > > > > > > > tuning, e.g., mitigate the virtqueue lock contention
> > > > > > > > > > > in virtio block driver.
> > > > > > > > > >
> > > > > > > > > > Do we have any performance numbers for this?
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Almost 50% improvement (600k iops -> 900k iops) in the high iops
> > > > > > > > > workloads. I have mentioned it in the cover-letter.
> > > > > > > >
> > > > > > > > For some reason, I miss that.
> > > > > > > >
> > > > > > > > I also wonder if we can do this automatically, then there's no need to
> > > > > > > > play with sysfs which is kind of a burden for the management layer.
> > > > > > > >
> > > > > > >
> > > > > > > This is hard to do since vduse doesn't know which cpu should be bound
> > > > > > > for a certain virtqueue.
> > > > > >
> > > > > > Probably via the kick_vq()? It probably won't work when notification
> > > > > > is disabled. But we need to think a little bit more about this.
> > > > >
> > > > > Yes, another problem is that this way can only work when the cpu and
> > > > > virtqueue are 1:1 mapping. It's still hard to decide which cpu to bind
> > > > > in the N:1 mapping case.
> > > >
> > > > This is the same situation as what you propose here. I think it would
> > > > be better to use cpumask instead of cpu id here.
> > > >
> > >
> > > If so, we need to know which cpu to bind for one virtqueue. Do you
> > > mean using the cpu who kicks the virtqueue?
> >
> > I meant you're using:
> >
> >     int irq_affinity;
> >
> > This seems to assume that the callback can only be delivered to a
> > specific cpu. It would make more sense to use cpumask_t. This may have
> > broader use cases.
> >
>
> Yes, I see. I meant we need to know how to choose the cpu to run the
> irq callback if we use cpumask_t, e.g., round-robin or choosing the
> cpu who kicked the virtqueue before.
>
> > >
> > > > >
> > > > > So I think it could be an optimization, but the sysfs interface is still needed.
> > > > >
> > > > > > Requiring management software to do ad-hoc running just for VDUSE
> > > > > > seems not easy.
> > > > > >
> > > > >
> > > > > I'm not sure. In the kubernetes environment, something like a CSI/CNI
> > > > > plugin can do it.
> > > >
> > > > Only works when the process is bound to a specific cpu. If a process
> > > > is migrated to another CPU, it would be hard to track.
> > > >
> > >
> > > OK, I see. Seems like there's no good way to handle this case.
> >
> > Yes, using cpumask_t might improve things a little bit.
> >
> > > Maybe
> > > it's better to leave it as it is.
> >
> > It would be better to think of an automatic method to do this as
> > affinity managed irq used by virtio-pci (not sure how hard it is
> > though).
> >
>
> Do you mean making use of .set_vq_affinity and .get_vq_affinity callbacks?

This works for net but not block. I know little about block, but it
looks like block is using the affinity descriptor to allow blk-mq to do
proper irq steering. Maybe we can do something similar.

Thanks

>
> Thanks,
> Yongji
>
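
To make the cpumask_t idea and the round-robin option mentioned above a bit more concrete, here is a minimal userspace sketch. It is not VDUSE or kernel code: toy_cpumask, struct toy_vq and toy_vq_next_cpu() are hypothetical names made up for illustration, and a 64-bit word stands in for the kernel's cpumask_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's cpumask_t:
 * bit n set means the irq callback may run on CPU n (0..63 only). */
typedef uint64_t toy_cpumask;

/* Hypothetical per-virtqueue state, replacing a single "int irq_affinity". */
struct toy_vq {
    toy_cpumask irq_affinity; /* set of CPUs allowed to run the callback */
    int last_cpu;             /* CPU picked last time, -1 if none yet */
};

/* Round-robin: return the next allowed CPU after vq->last_cpu,
 * wrapping around at 64. Returns -1 if the mask is empty. */
int toy_vq_next_cpu(struct toy_vq *vq)
{
    assert(vq != NULL);

    if (vq->irq_affinity == 0)
        return -1;

    for (int i = 1; i <= 64; i++) {
        int cpu = (vq->last_cpu + i) % 64;

        if (vq->irq_affinity & (UINT64_C(1) << cpu)) {
            vq->last_cpu = cpu;
            return cpu;
        }
    }
    return -1; /* not reached for a non-empty mask */
}
```

With a mask that has only one bit set, this degenerates to the single-cpu behaviour of the int field, so a mask is a superset of the current proposal. The other option raised in the thread, picking the cpu that kicked the virtqueue, would replace the loop with a check of whether that cpu is in the mask.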