Michael S. Tsirkin
2015-Mar-04 10:25 UTC
virtio balloon: do not call blocking ops when !TASK_RUNNING
On Wed, Mar 04, 2015 at 04:44:54PM +1030, Rusty Russell wrote:> "Michael S. Tsirkin" <mst at redhat.com> writes: > > On Mon, Mar 02, 2015 at 10:37:26AM +1030, Rusty Russell wrote: > >> Thomas Huth <thuth at linux.vnet.ibm.com> writes: > >> > On Thu, 26 Feb 2015 11:50:42 +1030 > >> > Rusty Russell <rusty at rustcorp.com.au> wrote: > >> > > >> >> Thomas Huth <thuth at linux.vnet.ibm.com> writes: > >> >> > Hi all, > >> >> > > >> >> > with the recent kernel 3.19, I get a kernel warning when I start my > >> >> > KVM guest on s390 with virtio balloon enabled: > >> >> > >> >> The deeper problem is that virtio_ccw_get_config just silently fails on > >> >> OOM. > >> >> > >> >> Neither get_config nor set_config are expected to fail. > >> > > >> > AFAIK this is currently not a problem. According to > >> > http://lwn.net/Articles/627419/ these kmalloc calls never > >> > fail because they allocate less than a page. > >> > >> I strongly suggest you unlearn that fact. > >> The fix for this is in two parts: > >> > >> 1) Annotate using sched_annotate_sleep() and add a comment: we may spin > >> a few times in low memory situations, but this isn't a high > >> performance path. > >> > >> 2) Handle get_config (and other) failure in some more elegant way. > >> > >> Cheers, > >> Rusty. > > > > I agree, but I'd like to point out that even without kmalloc, > > on s390 get_config is blocking - it's waiting > > for a hardware interrupt. > > > > And it makes sense: config is not data path, I don't think > > we should spin there. > > > > So I think besides these two parts, we still need my two patches: > > virtio-balloon: do not call blocking ops when !TASK_RUNNING > > I prefer to annotate, over trying to fix this. > > Because it's not important. We might spin a few times, but it's very > unlikely, and it's certainly not performance critical. > > Thanks, > Rusty. > > Subject: virtio_balloon: annotate possible sleep waiting for event. > > CCW (s390) does this. > > Reported-by: Thomas Huth <thuth at linux.vnet.ibm.com> > Signed-off-by: Rusty Russell <rusty at rustcorp.com.au> > > diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c > index 0413157f3b49..3f4d5acdbde0 100644 > --- a/drivers/virtio/virtio_balloon.c > +++ b/drivers/virtio/virtio_balloon.c > @@ -340,6 +340,15 @@ static int balloon(void *_vballoon) > s64 diff; > > try_to_freeze(); > + > + /* > + * Reading the config on the ccw backend involves an > + * allocation, so we may actually sleep and have an > + * extra iteration. It's extremely unlikely,Hmm, this part of the comment seems wrong to me. Reading the config on the ccw backend always sleeps because it's interrupt driven. This is the relevant code: static int ccw_io_helper(struct virtio_ccw_device *vcdev, struct ccw1 *ccw, __u32 intparm) { int ret; unsigned long flags; int flag = intparm & VIRTIO_CCW_INTPARM_MASK; do { spin_lock_irqsave(get_ccwdev_lock(vcdev->cdev), flags); ret = ccw_device_start(vcdev->cdev, ccw, intparm, 0, 0); if (!ret) { if (!vcdev->curr_io) vcdev->err = 0; vcdev->curr_io |= flag; } spin_unlock_irqrestore(get_ccwdev_lock(vcdev->cdev), flags); cpu_relax(); } while (ret == -EBUSY); wait_event(vcdev->wait_q, doing_io(vcdev, flag) == 0); return ret ? ret : vcdev->err; }> and this > + * isn't a fast path in any sense. > + */ > + sched_annotate_sleep(); > + > wait_event_interruptible(vb->config_change, > (diff = towards_target(vb)) != 0 > || vb->need_stats_updateSo the wait_event_interruptible always calls wait_event which then becomes a busy wait on s390, which is not nice. So I suspect http://mid.gmane.org/1424874878-17155-1-git-send-email-mst at redhat.com is better. What do you think? -- MST
Cornelia Huck
2015-Mar-06 11:56 UTC
virtio balloon: do not call blocking ops when !TASK_RUNNING
On Wed, 4 Mar 2015 11:25:56 +0100 "Michael S. Tsirkin" <mst at redhat.com> wrote:> On Wed, Mar 04, 2015 at 04:44:54PM +1030, Rusty Russell wrote: > > "Michael S. Tsirkin" <mst at redhat.com> writes: > > > On Mon, Mar 02, 2015 at 10:37:26AM +1030, Rusty Russell wrote: > > >> Thomas Huth <thuth at linux.vnet.ibm.com> writes: > > >> > On Thu, 26 Feb 2015 11:50:42 +1030 > > >> > Rusty Russell <rusty at rustcorp.com.au> wrote: > > >> > > > >> >> Thomas Huth <thuth at linux.vnet.ibm.com> writes: > > >> >> > Hi all, > > >> >> > > > >> >> > with the recent kernel 3.19, I get a kernel warning when I start my > > >> >> > KVM guest on s390 with virtio balloon enabled: > > >> >> > > >> >> The deeper problem is that virtio_ccw_get_config just silently fails on > > >> >> OOM. > > >> >> > > >> >> Neither get_config nor set_config are expected to fail. > > >> > > > >> > AFAIK this is currently not a problem. According to > > >> > http://lwn.net/Articles/627419/ these kmalloc calls never > > >> > fail because they allocate less than a page. > > >> > > >> I strongly suggest you unlearn that fact. > > >> The fix for this is in two parts: > > >> > > >> 1) Annotate using sched_annotate_sleep() and add a comment: we may spin > > >> a few times in low memory situations, but this isn't a high > > >> performance path. > > >> > > >> 2) Handle get_config (and other) failure in some more elegant way. > > >> > > >> Cheers, > > >> Rusty. > > > > > > I agree, but I'd like to point out that even without kmalloc, > > > on s390 get_config is blocking - it's waiting > > > for a hardware interrupt. > > > > > > And it makes sense: config is not data path, I don't think > > > we should spin there. > > > > > > So I think besides these two parts, we still need my two patches: > > > virtio-balloon: do not call blocking ops when !TASK_RUNNING > > > > I prefer to annotate, over trying to fix this. > > > > Because it's not important. We might spin a few times, but it's very > > unlikely, and it's certainly not performance critical. > > > > Thanks, > > Rusty. > > > > Subject: virtio_balloon: annotate possible sleep waiting for event. > > > > CCW (s390) does this. > > > > Reported-by: Thomas Huth <thuth at linux.vnet.ibm.com> > > Signed-off-by: Rusty Russell <rusty at rustcorp.com.au> > > > > diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c > > index 0413157f3b49..3f4d5acdbde0 100644 > > --- a/drivers/virtio/virtio_balloon.c > > +++ b/drivers/virtio/virtio_balloon.c > > @@ -340,6 +340,15 @@ static int balloon(void *_vballoon) > > s64 diff; > > > > try_to_freeze(); > > + > > + /* > > + * Reading the config on the ccw backend involves an > > + * allocation, so we may actually sleep and have an > > + * extra iteration. It's extremely unlikely, > > Hmm, this part of the comment seems wrong to me. > Reading the config on the ccw backend always sleeps > because it's interrupt driven.(...)> So I suspect > http://mid.gmane.org/1424874878-17155-1-git-send-email-mst at redhat.com > is better. > > What do you think?I'd prefer to fix this as well. While the I/O request completes instantly on current qemu (the ssch backend handles the start function immediately, not asynchronously as on real hardware), this (a) is an implementation detail that may change and (b) doesn't account for the need to deliver the interrupt to the guest - which might take non-zero time.
Rusty Russell
2015-Mar-10 01:26 UTC
virtio balloon: do not call blocking ops when !TASK_RUNNING
Cornelia Huck <cornelia.huck at de.ibm.com> writes:> On Wed, 4 Mar 2015 11:25:56 +0100 > "Michael S. Tsirkin" <mst at redhat.com> wrote: > >> On Wed, Mar 04, 2015 at 04:44:54PM +1030, Rusty Russell wrote: >> > "Michael S. Tsirkin" <mst at redhat.com> writes: >> > > On Mon, Mar 02, 2015 at 10:37:26AM +1030, Rusty Russell wrote: >> > >> Thomas Huth <thuth at linux.vnet.ibm.com> writes: >> > >> > On Thu, 26 Feb 2015 11:50:42 +1030 >> > >> > Rusty Russell <rusty at rustcorp.com.au> wrote: >> > >> > >> > >> >> Thomas Huth <thuth at linux.vnet.ibm.com> writes: >> > >> >> > Hi all, >> > >> >> > >> > >> >> > with the recent kernel 3.19, I get a kernel warning when I start my >> > >> >> > KVM guest on s390 with virtio balloon enabled: >> > >> >> >> > >> >> The deeper problem is that virtio_ccw_get_config just silently fails on >> > >> >> OOM. >> > >> >> >> > >> >> Neither get_config nor set_config are expected to fail. >> > >> > >> > >> > AFAIK this is currently not a problem. According to >> > >> > http://lwn.net/Articles/627419/ these kmalloc calls never >> > >> > fail because they allocate less than a page. >> > >> >> > >> I strongly suggest you unlearn that fact. >> > >> The fix for this is in two parts: >> > >> >> > >> 1) Annotate using sched_annotate_sleep() and add a comment: we may spin >> > >> a few times in low memory situations, but this isn't a high >> > >> performance path. >> > >> >> > >> 2) Handle get_config (and other) failure in some more elegant way. >> > >> >> > >> Cheers, >> > >> Rusty. >> > > >> > > I agree, but I'd like to point out that even without kmalloc, >> > > on s390 get_config is blocking - it's waiting >> > > for a hardware interrupt. >> > > >> > > And it makes sense: config is not data path, I don't think >> > > we should spin there. >> > > >> > > So I think besides these two parts, we still need my two patches: >> > > virtio-balloon: do not call blocking ops when !TASK_RUNNING >> > >> > I prefer to annotate, over trying to fix this. >> > >> > Because it's not important. We might spin a few times, but it's very >> > unlikely, and it's certainly not performance critical. >> > >> > Thanks, >> > Rusty. >> > >> > Subject: virtio_balloon: annotate possible sleep waiting for event. >> > >> > CCW (s390) does this. >> > >> > Reported-by: Thomas Huth <thuth at linux.vnet.ibm.com> >> > Signed-off-by: Rusty Russell <rusty at rustcorp.com.au> >> > >> > diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c >> > index 0413157f3b49..3f4d5acdbde0 100644 >> > --- a/drivers/virtio/virtio_balloon.c >> > +++ b/drivers/virtio/virtio_balloon.c >> > @@ -340,6 +340,15 @@ static int balloon(void *_vballoon) >> > s64 diff; >> > >> > try_to_freeze(); >> > + >> > + /* >> > + * Reading the config on the ccw backend involves an >> > + * allocation, so we may actually sleep and have an >> > + * extra iteration. It's extremely unlikely, >> >> Hmm, this part of the comment seems wrong to me. >> Reading the config on the ccw backend always sleeps >> because it's interrupt driven. > > (...) > >> So I suspect >> http://mid.gmane.org/1424874878-17155-1-git-send-email-mst at redhat.com >> is better. >> >> What do you think? > > I'd prefer to fix this as well. While the I/O request completes > instantly on current qemu (the ssch backend handles the start function > immediately, not asynchronously as on real hardware), this (a) is an > implementation detail that may change and (b) doesn't account for the > need to deliver the interrupt to the guest - which might take non-zero > time.Ah, I see. My mistake. I've thrown out my patch, applied that one. Thanks, Rusty.
Possibly Parallel Threads
- virtio balloon: do not call blocking ops when !TASK_RUNNING
- virtio balloon: do not call blocking ops when !TASK_RUNNING
- virtio balloon: do not call blocking ops when !TASK_RUNNING
- virtio balloon: do not call blocking ops when !TASK_RUNNING
- virtio balloon: do not call blocking ops when !TASK_RUNNING