On Thu, Apr 28, 2022 at 10:43 AM Halil Pasic <pasic at linux.ibm.com> wrote:
>
> On Wed, 27 Apr 2022 11:27:03 +0200
> Cornelia Huck <cohuck at redhat.com> wrote:
>
> > On Tue, Apr 26 2022, "Michael S. Tsirkin" <mst at redhat.com> wrote:
> >
> > > On Tue, Apr 26, 2022 at 05:47:17PM +0200, Cornelia Huck wrote:
> > >> On Mon, Apr 25 2022, "Michael S. Tsirkin" <mst at redhat.com> wrote:
> > >>
> > >> > On Mon, Apr 25, 2022 at 11:53:24PM -0400, Michael S. Tsirkin wrote:
> > >> >> On Tue, Apr 26, 2022 at 11:42:45AM +0800, Jason Wang wrote:
> > >> >> >
> > >> >> > On 2022/4/26 11:38, Michael S. Tsirkin wrote:
> > >> >> > > On Mon, Apr 25, 2022 at 11:35:41PM -0400, Michael S. Tsirkin wrote:
> > >> >> > > > On Tue, Apr 26, 2022 at 04:29:11AM +0200, Halil Pasic wrote:
> > >> >> > > > > On Mon, 25 Apr 2022 09:59:55 -0400
> > >> >> > > > > "Michael S. Tsirkin" <mst at redhat.com> wrote:
> > >> >> > > > >
> > >> >> > > > > > On Mon, Apr 25, 2022 at 10:54:24AM +0200, Cornelia Huck wrote:
> > >> >> > > > > > > On Mon, Apr 25 2022, "Michael S. Tsirkin" <mst at redhat.com> wrote:
> > >> >> > > > > > > > On Mon, Apr 25, 2022 at 10:44:15AM +0800, Jason Wang wrote:
> > >> >> > > > > > > > > This patch tries to implement the synchronize_cbs() for ccw. For the
> > >> >> > > > > > > > > vring_interrupt() that is called via virtio_airq_handler(), the
> > >> >> > > > > > > > > synchronization is simply done via the airq_info's lock. For the
> > >> >> > > > > > > > > vring_interrupt() that is called via virtio_ccw_int_handler(), a per
> > >> >> > > > > > > > > device spinlock for irq is introduced and used in the synchronization
> > >> >> > > > > > > > > method.
> > >> >> > > > > > > > >
> > >> >> > > > > > > > > Cc: Thomas Gleixner <tglx at linutronix.de>
> > >> >> > > > > > > > > Cc: Peter Zijlstra <peterz at infradead.org>
> > >> >> > > > > > > > > Cc: "Paul E. McKenney" <paulmck at kernel.org>
> > >> >> > > > > > > > > Cc: Marc Zyngier <maz at kernel.org>
> > >> >> > > > > > > > > Cc: Halil Pasic <pasic at linux.ibm.com>
> > >> >> > > > > > > > > Cc: Cornelia Huck <cohuck at redhat.com>
> > >> >> > > > > > > > > Signed-off-by: Jason Wang <jasowang at redhat.com>
> > >> >> > > > > > > >
> > >> >> > > > > > > > This is the only one that is giving me pause. Halil, Cornelia,
> > >> >> > > > > > > > should we be concerned about the performance impact here?
> > >> >> > > > > > > > Any chance it can be tested?
> > >> >> > > > > > > We can have a bunch of devices using the same airq structure, and the
> > >> >> > > > > > > sync cb creates a choke point, same as registering/unregistering.
> > >> >> > > > > > BTW can callbacks for multiple VQs run on multiple CPUs at the moment?
> > >> >> > > > > I'm not sure I understand the question.
> > >> >> > > > >
> > >> >> > > > > I do think we can have multiple CPUs that are executing some portion of
> > >> >> > > > > virtio_ccw_int_handler(). So I guess the answer is yes. Connie what do you think?
> > >> >> > > > >
> > >> >> > > > > On the other hand we could also end up serializing synchronize_cbs()
> > >> >> > > > > calls for different devices if they happen to use the same airq_info. But
> > >> >> > > > > this probably was not your question
> > >> >> > > >
> > >> >> > > > I am less concerned about synchronize_cbs being slow and more about
> > >> >> > > > the slowdown in interrupt processing itself.
> > >> >> > > >
> > >> >> > > > > > this patch serializes them on a spinlock.
> > >> >> > > > > >
> > >> >> > > > > Those could then pile up on the newly introduced spinlock.
> > >>
> > >> How bad would that be in practice? IIUC, we hit on the spinlock when
> > >> - doing synchronize_cbs (should be rare)
> > >> - processing queue interrupts for devices using per-device indicators
> > >>   (which is the non-preferred path, which I would basically only expect
> > >>   when running on an ancient or non-standard hypervisor)
> > >
> > > this one is my concern. I am worried serializing everything on a single lock
> > > will drastically regress performance here.
> >
> > Yeah, that case could get much worse. OTOH, how likely is it that any
> > setup that runs a recent kernel will actually end up with devices using
> > per-device indicators? Anything running under a QEMU released in the
> > last couple of years is unlikely to not use airqs, I think. Halil, do
> > you think that the classic indicator setup would be more common on any
> > non-QEMU hypervisors?
> >
>
> I really don't know. My opinion is that two-stage indicators are kind
> of recommended for anybody who cares about notification performance.
>
> > IOW, how much effort is it worth spending on optimizing this case? We
> > certainly should explore any simple solutions, but I don't think we need
> > to twist ourselves into pretzels to solve it.
> >
>
> Frankly, I would be fine with an rwlock based solution as proposed by
> Jason. My rationale is: we recommend two-stage indicators, and the
> two-stage indicators are already encumbered by an rwlock on the interrupt
> path. Yes, the coalescence of adapter interrupts is architecturally
> different, and so it is with GISA (without GISA, I'm not even sure), so
> this rwlock ends up being worse than the one for two-stage. But my feeling
> is that it should be fine.
> On the other hand, I don't feel comfortable
> with plain spinlock, and I am curious about a more advanced solution.

Yes, I'm trying to use (S)RCU, let's see if it works.

> But my guess is that rwlock + some testing for the legacy indicator case
> just to double check if there is a heavy regression despite our
> expectations to see none should do the trick.

I suggest this, rwlock (for the non-airq case) seems better than spinlock,
but in the worst case it will cause cache line bouncing. But I wonder if
it's noticeable (anyhow it has been used for airq).

Thanks

>
> Regards,
> Halil
>
> > >
> > >
> > >> - configuration change interrupts (should be rare)
> > >> - during setup, reset, etc. (should not be a concern)
> > >
Michael S. Tsirkin
2022-Apr-28 05:24 UTC
[PATCH V3 6/9] virtio-ccw: implement synchronize_cbs()
On Thu, Apr 28, 2022 at 11:04:41AM +0800, Jason Wang wrote:
> > But my guess is that rwlock + some testing for the legacy indicator case
> > just to double check if there is a heavy regression despite our
> > expectations to see none should do the trick.
>
> I suggest this, rwlock (for the non-airq case) seems better than spinlock,
> but in the worst case it will cause cache line bouncing. But I wonder if
> it's noticeable (anyhow it has been used for airq).
>
> Thanks

Which existing rwlock does airq use right now? Can we take it to sync?

--
MST