Michael S. Tsirkin
2015-Mar-02 11:46 UTC
virtio balloon: do not call blocking ops when !TASK_RUNNING
On Mon, Mar 02, 2015 at 12:31:06PM +0100, Cornelia Huck wrote:
> On Mon, 2 Mar 2015 12:13:58 +0100
> "Michael S. Tsirkin" <mst at redhat.com> wrote:
>
> > On Mon, Mar 02, 2015 at 10:37:26AM +1030, Rusty Russell wrote:
> > > Thomas Huth <thuth at linux.vnet.ibm.com> writes:
> > > > On Thu, 26 Feb 2015 11:50:42 +1030
> > > > Rusty Russell <rusty at rustcorp.com.au> wrote:
> > > >
> > > >> Thomas Huth <thuth at linux.vnet.ibm.com> writes:
> > > >> > Hi all,
> > > >> >
> > > >> > with the recent kernel 3.19, I get a kernel warning when I start my
> > > >> > KVM guest on s390 with virtio balloon enabled:
> > > >>
> > > >> The deeper problem is that virtio_ccw_get_config just silently fails on
> > > >> OOM.
> > > >>
> > > >> Neither get_config nor set_config are expected to fail.
> > > >
> > > > AFAIK this is currently not a problem. According to
> > > > http://lwn.net/Articles/627419/ these kmalloc calls never
> > > > fail because they allocate less than a page.
> > >
> > > I strongly suggest you unlearn that fact.
> > > The fix for this is in two parts:
> > >
> > > 1) Annotate using sched_annotate_sleep() and add a comment: we may spin
> > >    a few times in low memory situations, but this isn't a high
> > >    performance path.
> > >
> > > 2) Handle get_config (and other) failure in some more elegant way.
>
> Do you mean we need to enable the caller to deal with get_config
> failures (and the transport to relay those failures)? I agree with that.

We can certainly tweak code to bypass need to kmalloc
on get_config.

Why is it doing these allocs? What's wrong with using
vcdev->config directly?

> > >
> > > Cheers,
> > > Rusty.
> >
> > I agree, but I'd like to point out that even without kmalloc,
> > on s390 get_config is blocking - it's waiting
> > for a hardware interrupt.
> >
> > And it makes sense: config is not data path, I don't think
> > we should spin there.
> >
> > So I think besides these two parts, we still need my two patches:
> > virtio-balloon: do not call blocking ops when !TASK_RUNNING
> > virtio_console: avoid config access from irq
> > in 4.0.
> >
> > agree?
>
> I agree that we need those fixes as well.
Cornelia Huck
2015-Mar-02 12:11 UTC
virtio balloon: do not call blocking ops when !TASK_RUNNING
On Mon, 2 Mar 2015 12:46:57 +0100
"Michael S. Tsirkin" <mst at redhat.com> wrote:

> On Mon, Mar 02, 2015 at 12:31:06PM +0100, Cornelia Huck wrote:
> > On Mon, 2 Mar 2015 12:13:58 +0100
> > "Michael S. Tsirkin" <mst at redhat.com> wrote:
> >
> > > On Mon, Mar 02, 2015 at 10:37:26AM +1030, Rusty Russell wrote:
> > > > Thomas Huth <thuth at linux.vnet.ibm.com> writes:
> > > > > On Thu, 26 Feb 2015 11:50:42 +1030
> > > > > Rusty Russell <rusty at rustcorp.com.au> wrote:
> > > > >
> > > > >> Thomas Huth <thuth at linux.vnet.ibm.com> writes:
> > > > >> > Hi all,
> > > > >> >
> > > > >> > with the recent kernel 3.19, I get a kernel warning when I start my
> > > > >> > KVM guest on s390 with virtio balloon enabled:
> > > > >>
> > > > >> The deeper problem is that virtio_ccw_get_config just silently fails on
> > > > >> OOM.
> > > > >>
> > > > >> Neither get_config nor set_config are expected to fail.
> > > > >
> > > > > AFAIK this is currently not a problem. According to
> > > > > http://lwn.net/Articles/627419/ these kmalloc calls never
> > > > > fail because they allocate less than a page.
> > > >
> > > > I strongly suggest you unlearn that fact.
> > > > The fix for this is in two parts:
> > > >
> > > > 1) Annotate using sched_annotate_sleep() and add a comment: we may spin
> > > >    a few times in low memory situations, but this isn't a high
> > > >    performance path.
> > > >
> > > > 2) Handle get_config (and other) failure in some more elegant way.
> >
> > Do you mean we need to enable the caller to deal with get_config
> > failures (and the transport to relay those failures)? I agree with that.
>
> We can certainly tweak code to bypass need to kmalloc
> on get_config.
>
> Why is it doing these allocs? What's wrong with using
> vcdev->config directly?

We'd need to make sure that vcdev->config is allocated with GFP_DMA, as
we need it to be under 2G. And we need to be more careful wrt
serialization, especially if we want to reuse the ccw structure as
well, for example. Nothing complicated, I'd just need some free time to
do it :)

The more likely reason for get_config to fail is a device hotunplug,
however. We'll get a separate notification about that (via machine
check + channel report), but it would be nice if we could stop poking
the device immediately, as there's no use trying to do something with
it anymore.
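
[Editorial sketch] To make the "stop poking the device" idea concrete, here is a minimal user-space illustration in C. It is not virtio-ccw code: `struct sketch_dev`, `sketch_dev_mark_gone()` and `sketch_get_config()` are hypothetical names, and the "device" is just a cached byte array. It only shows the shape of a get_config that reports failure to its caller and refuses further I/O once an unplug notification has been seen.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, much-simplified device state, for illustration only. */
struct sketch_dev {
    bool gone;                 /* set once the unplug notification arrives */
    unsigned char config[16];  /* stand-in for the device's config space */
    unsigned int io_count;     /* how often the "device" was actually touched */
};

/* Would be called from the unplug notification path. */
static void sketch_dev_mark_gone(struct sketch_dev *dev)
{
    dev->gone = true;
}

/*
 * Returns 0 on success, -ENODEV if the device is gone,
 * -EINVAL if the requested range is outside the config space.
 */
static int sketch_get_config(struct sketch_dev *dev, size_t offset,
                             void *buf, size_t len)
{
    if (dev->gone)
        return -ENODEV;        /* do not touch the device any more */
    if (offset > sizeof(dev->config) || len > sizeof(dev->config) - offset)
        return -EINVAL;

    dev->io_count++;           /* stand-in for the real I/O */
    memcpy(buf, dev->config + offset, len);
    return 0;
}
```

A caller would check the return value instead of assuming the buffer was filled; after `sketch_dev_mark_gone()` every further call returns `-ENODEV` without touching the device.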
Michael S. Tsirkin
2015-Mar-02 12:19 UTC
virtio balloon: do not call blocking ops when !TASK_RUNNING
On Mon, Mar 02, 2015 at 01:11:02PM +0100, Cornelia Huck wrote:
> On Mon, 2 Mar 2015 12:46:57 +0100
> "Michael S. Tsirkin" <mst at redhat.com> wrote:
>
> > On Mon, Mar 02, 2015 at 12:31:06PM +0100, Cornelia Huck wrote:
> > > On Mon, 2 Mar 2015 12:13:58 +0100
> > > "Michael S. Tsirkin" <mst at redhat.com> wrote:
> > >
> > > > On Mon, Mar 02, 2015 at 10:37:26AM +1030, Rusty Russell wrote:
> > > > > Thomas Huth <thuth at linux.vnet.ibm.com> writes:
> > > > > > On Thu, 26 Feb 2015 11:50:42 +1030
> > > > > > Rusty Russell <rusty at rustcorp.com.au> wrote:
> > > > > >
> > > > > >> Thomas Huth <thuth at linux.vnet.ibm.com> writes:
> > > > > >> > Hi all,
> > > > > >> >
> > > > > >> > with the recent kernel 3.19, I get a kernel warning when I start my
> > > > > >> > KVM guest on s390 with virtio balloon enabled:
> > > > > >>
> > > > > >> The deeper problem is that virtio_ccw_get_config just silently fails on
> > > > > >> OOM.
> > > > > >>
> > > > > >> Neither get_config nor set_config are expected to fail.
> > > > > >
> > > > > > AFAIK this is currently not a problem. According to
> > > > > > http://lwn.net/Articles/627419/ these kmalloc calls never
> > > > > > fail because they allocate less than a page.
> > > > >
> > > > > I strongly suggest you unlearn that fact.
> > > > > The fix for this is in two parts:
> > > > >
> > > > > 1) Annotate using sched_annotate_sleep() and add a comment: we may spin
> > > > >    a few times in low memory situations, but this isn't a high
> > > > >    performance path.
> > > > >
> > > > > 2) Handle get_config (and other) failure in some more elegant way.
> > >
> > > Do you mean we need to enable the caller to deal with get_config
> > > failures (and the transport to relay those failures)? I agree with that.
> >
> > We can certainly tweak code to bypass need to kmalloc
> > on get_config.
> >
> > Why is it doing these allocs? What's wrong with using
> > vcdev->config directly?
>
> We'd need to make sure that vcdev->config is allocated with GFP_DMA, as
> we need it to be under 2G. And we need to be more careful wrt
> serialization, especially if we want to reuse the ccw structure as
> well, for example. Nothing complicated, I'd just need some free time to
> do it :)
>
> The more likely reason for get_config to fail is a device hotunplug,
> however. We'll get a separate notification about that (via machine
> check + channel report), but it would be nice if we could stop poking
> the device immediately, as there's no use trying to do something with
> it anymore.

Normally, hotunplug requires guest cooperation.
IOW unplug request should send guest interrupt,
then block until guest confirms it's not using the
device anymore.
virtio pci already handles that fine, can't ccw
do something similar?

-- 
MST
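
[Editorial sketch] The cooperative unplug described above (request, guest interrupt, wait for guest confirmation) can be pictured as a three-state handshake. The C below is a toy model under that assumption only; the state names and functions are hypothetical and do not reflect how virtio pci or virtio-ccw actually implement hot-unplug.

```c
#include <stdbool.h>

/* Hypothetical states for a cooperative unplug handshake. */
enum unplug_state {
    DEV_IN_USE,           /* guest driver is actively using the device */
    DEV_UNPLUG_REQUESTED, /* host asked; guest has been interrupted */
    DEV_RELEASED          /* guest confirmed it stopped using the device */
};

struct unplug_sketch {
    enum unplug_state state;
};

/* Host side: ask for removal. Returns true if the request was accepted. */
static bool host_request_unplug(struct unplug_sketch *d)
{
    if (d->state != DEV_IN_USE)
        return false;
    d->state = DEV_UNPLUG_REQUESTED;
    return true;
}

/* Guest side: acknowledge once the driver has quiesced the device. */
static bool guest_confirm_unplug(struct unplug_sketch *d)
{
    if (d->state != DEV_UNPLUG_REQUESTED)
        return false;
    d->state = DEV_RELEASED;
    return true;
}

/* Host side: the device may be torn down only after the guest let go. */
static bool host_may_remove(const struct unplug_sketch *d)
{
    return d->state == DEV_RELEASED;
}
```

In this model the host never removes a device the guest might still be accessing, which is the property that would make a failing get_config on unplug unnecessary.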
Michael S. Tsirkin
2015-Mar-02 20:39 UTC
virtio balloon: do not call blocking ops when !TASK_RUNNING
On Mon, Mar 02, 2015 at 01:11:02PM +0100, Cornelia Huck wrote:
> On Mon, 2 Mar 2015 12:46:57 +0100
> "Michael S. Tsirkin" <mst at redhat.com> wrote:
>
> > On Mon, Mar 02, 2015 at 12:31:06PM +0100, Cornelia Huck wrote:
> > > On Mon, 2 Mar 2015 12:13:58 +0100
> > > "Michael S. Tsirkin" <mst at redhat.com> wrote:
> > >
> > > > On Mon, Mar 02, 2015 at 10:37:26AM +1030, Rusty Russell wrote:
> > > > > Thomas Huth <thuth at linux.vnet.ibm.com> writes:
> > > > > > On Thu, 26 Feb 2015 11:50:42 +1030
> > > > > > Rusty Russell <rusty at rustcorp.com.au> wrote:
> > > > > >
> > > > > >> Thomas Huth <thuth at linux.vnet.ibm.com> writes:
> > > > > >> > Hi all,
> > > > > >> >
> > > > > >> > with the recent kernel 3.19, I get a kernel warning when I start my
> > > > > >> > KVM guest on s390 with virtio balloon enabled:
> > > > > >>
> > > > > >> The deeper problem is that virtio_ccw_get_config just silently fails on
> > > > > >> OOM.
> > > > > >>
> > > > > >> Neither get_config nor set_config are expected to fail.
> > > > > >
> > > > > > AFAIK this is currently not a problem. According to
> > > > > > http://lwn.net/Articles/627419/ these kmalloc calls never
> > > > > > fail because they allocate less than a page.
> > > > >
> > > > > I strongly suggest you unlearn that fact.
> > > > > The fix for this is in two parts:
> > > > >
> > > > > 1) Annotate using sched_annotate_sleep() and add a comment: we may spin
> > > > >    a few times in low memory situations, but this isn't a high
> > > > >    performance path.
> > > > >
> > > > > 2) Handle get_config (and other) failure in some more elegant way.
> > >
> > > Do you mean we need to enable the caller to deal with get_config
> > > failures (and the transport to relay those failures)? I agree with that.
> >
> > We can certainly tweak code to bypass need to kmalloc
> > on get_config.
> >
> > Why is it doing these allocs? What's wrong with using
> > vcdev->config directly?
>
> We'd need to make sure that vcdev->config is allocated with GFP_DMA, as
> we need it to be under 2G.

I see - and that's expensive when there are many devices?
One simple solution is to have a global buffer that everyone reuses.
It'll need a lock, naturally.

> And we need to be more careful wrt
> serialization,

Why does passing in vcdev->config mean we need to be more careful than
when we allocate a buffer and then memcpy into vcdev->config?

> especially if we want to reuse the ccw structure as
> well, for example. Nothing complicated, I'd just need some free time to
> do it :)
>
> The more likely reason for get_config to fail is a device hotunplug,
> however. We'll get a separate notification about that (via machine
> check + channel report), but it would be nice if we could stop poking
> the device immediately, as there's no use trying to do something with
> it anymore.

-- 
MST
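
[Editorial sketch] The "global buffer that everyone reuses" suggestion might look roughly like the following user-space analogy in C. Everything here is an assumption made for illustration: `CONFIG_BUF_SIZE`, `config_bounce`, `fake_device_read_config()` and `get_config_shared()` are hypothetical names, a pthread mutex stands in for a kernel lock, and ordinary static storage stands in for memory that in the kernel would be allocated once with GFP_DMA.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define CONFIG_BUF_SIZE 256  /* hypothetical upper bound on config size */

/* One shared bounce buffer, set up once instead of per call. */
static unsigned char config_bounce[CONFIG_BUF_SIZE];
static pthread_mutex_t config_bounce_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for the I/O that makes the device write its
 * config space into the bounce buffer: byte i gets the value i + 1. */
static void fake_device_read_config(unsigned char *dst, size_t len)
{
    for (size_t i = 0; i < len; i++)
        dst[i] = (unsigned char)(i + 1);
}

/* Copy `len` bytes of config starting at `offset` into `out`.
 * Nothing is allocated here, so there is no OOM failure path. */
static void get_config_shared(size_t offset, void *out, size_t len)
{
    assert(offset <= CONFIG_BUF_SIZE && len <= CONFIG_BUF_SIZE - offset);

    pthread_mutex_lock(&config_bounce_lock);   /* serialize all users */
    fake_device_read_config(config_bounce, offset + len);
    memcpy(out, config_bounce + offset, len);
    pthread_mutex_unlock(&config_bounce_lock);
}
```

The trade-off is that all devices now serialize on one lock, and a lock that can sleep is itself a blocking operation, so callers would still have to be in a context where blocking is allowed.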