Stefano Stabellini
2011-Jul-25 10:54 UTC
[Xen-devel] regression on Xen: 23573 breaks PV on HVM guests
Hi all,
I would just let you know that I found a regression in the hypervisor:
CS 23573 causes PV on HVM guests to hang during boot.

The offending commit is the following:

# HG changeset patch
# User Jan Beulich <jbeulich@novell.com>
# Date 1308825163 -3600
# Node ID 584c2e5e03d96f912cdfe90f8e9f910d5d661706
# Parent 4e9562c1ce4ecaac40011556c16712aefd47afa6
replace d->nr_pirqs sized arrays with radix tree

With this it is questionable whether retaining struct domain's
nr_pirqs is actually necessary - the value now only serves for bounds
checking, and this boundary could easily be nr_irqs.

Note that ia64, the build of which is broken currently anyway, is only
being partially fixed up.

v2: adjustments for split setup/teardown of translation data

v3: re-sync with radix tree implementation changes

Signed-off-by: Jan Beulich <jbeulich@novell.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
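For readers unfamiliar with the idea behind the changeset title, the move from an array sized by the maximum pirq number to a radix tree can be sketched as below. This is only an illustration and not Xen's radix tree implementation: the names `pirq_tree`, `pirq_tree_insert` and `pirq_tree_lookup`, the fixed two-level depth and the 6-bit fan-out are all assumptions made for the example.

```c
#include <stdlib.h>

/* Toy fixed-depth radix tree; NOT Xen's implementation. All names are hypothetical. */
#define RADIX_BITS   6
#define RADIX_SIZE   (1u << RADIX_BITS)
#define RADIX_MASK   (RADIX_SIZE - 1)
#define RADIX_LEVELS 2                     /* covers keys 0..4095 */

struct radix_node {
    void *slot[RADIX_SIZE];                /* child nodes, or items at the leaf level */
};

struct pirq_tree {
    struct radix_node *root;               /* NULL until the first insert */
};

/* Return the item stored under key, or NULL if nothing was inserted there. */
void *pirq_tree_lookup(const struct pirq_tree *t, unsigned int key)
{
    const struct radix_node *n = t->root;

    if (key >> (RADIX_BITS * RADIX_LEVELS))
        return NULL;                       /* key out of range for this toy tree */

    /* Walk the interior levels, stopping early on a missing child. */
    for (int level = RADIX_LEVELS - 1; n && level > 0; level--)
        n = n->slot[(key >> (level * RADIX_BITS)) & RADIX_MASK];

    return n ? n->slot[key & RADIX_MASK] : NULL;
}

/* Store item under key, allocating nodes on demand. Returns 0 on success, -1 on error. */
int pirq_tree_insert(struct pirq_tree *t, unsigned int key, void *item)
{
    struct radix_node *n;

    if (key >> (RADIX_BITS * RADIX_LEVELS))
        return -1;

    if (!t->root && !(t->root = calloc(1, sizeof(*t->root))))
        return -1;
    n = t->root;

    for (int level = RADIX_LEVELS - 1; level > 0; level--) {
        unsigned int idx = (key >> (level * RADIX_BITS)) & RADIX_MASK;

        if (!n->slot[idx] && !(n->slot[idx] = calloc(1, sizeof(*n))))
            return -1;
        n = n->slot[idx];
    }

    n->slot[key & RADIX_MASK] = item;
    return 0;
}
```

The point of such a structure is that memory is allocated only for the key ranges actually in use, and a lookup for a key that was never inserted returns NULL instead of indexing a preallocated array entry.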
Jan Beulich
2011-Jul-25 11:17 UTC
[Xen-devel] Re: regression on Xen: 23573 breaks PV on HVM guests
>>> On 25.07.11 at 12:54, Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote:
> Hi all,
> I would just let you know that I found a regression in the hypervisor:
> CS 23573 causes PV on HVM guests to hang during boot.

Any details (e.g. state of the guest, messages from the hypervisor)
that might help finding out what the problem is? I'm not aware that I
intentionally changed anything behavior-wise in the pv-on-hvm specific
code.

Jan

> The offending commit is the following:
>
>
> # HG changeset patch
> # User Jan Beulich <jbeulich@novell.com>
> # Date 1308825163 -3600
> # Node ID 584c2e5e03d96f912cdfe90f8e9f910d5d661706
> # Parent 4e9562c1ce4ecaac40011556c16712aefd47afa6
> replace d->nr_pirqs sized arrays with radix tree
>
> With this it is questionable whether retaining struct domain's
> nr_pirqs is actually necessary - the value now only serves for bounds
> checking, and this boundary could easily be nr_irqs.
>
> Note that ia64, the build of which is broken currently anyway, is only
> being partially fixed up.
>
> v2: adjustments for split setup/teardown of translation data
>
> v3: re-sync with radix tree implementation changes
>
> Signed-off-by: Jan Beulich <jbeulich@novell.com>
Stefano Stabellini
2011-Jul-25 11:34 UTC
[Xen-devel] Re: regression on Xen: 23573 breaks PV on HVM guests
On Mon, 25 Jul 2011, Jan Beulich wrote:
> >>> On 25.07.11 at 12:54, Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote:
> > Hi all,
> > I would just let you know that I found a regression in the hypervisor:
> > CS 23573 causes PV on HVM guests to hang during boot.
>
> Any details (e.g. state of the guest, messages from the hypervisor)
> that might help finding out what the problem is? I'm not aware that I
> intentionally changed anything behavior-wise in the pv-on-hvm specific
> code.

I think that 23573 introduced a problem similar to the one solved by
23550, that is hvm_domain_use_pirq returns a subtly wrong answer.

In any case it is really easy to reproduce: download a recent upstream
kernel tree (3.0.0 is fine), enable CONFIG_XEN, make sure you have:

CONFIG_XEN=y
CONFIG_PCI_XEN=y
CONFIG_XEN_BLKDEV_FRONTEND=y
CONFIG_XEN_NETDEV_FRONTEND=y
CONFIG_XEN_XENBUS_FRONTEND=y
CONFIG_XEN_PLATFORM_PCI=y

that's all. Try to boot the kernel in an HVM guest and it will hang.
Jan Beulich
2011-Jul-25 12:25 UTC
[Xen-devel] Re: regression on Xen: 23573 breaks PV on HVM guests
>>> On 25.07.11 at 13:34, Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote:
> On Mon, 25 Jul 2011, Jan Beulich wrote:
>> >>> On 25.07.11 at 12:54, Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote:
>> > Hi all,
>> > I would just let you know that I found a regression in the hypervisor:
>> > CS 23573 causes PV on HVM guests to hang during boot.
>>
>> Any details (e.g. state of the guest, messages from the hypervisor)
>> that might help finding out what the problem is? I'm not aware that I
>> intentionally changed anything behavior-wise in the pv-on-hvm specific
>> code.
>
> I think that 23573 introduced a problem similar to the one solved by
> 23550, that is hvm_domain_use_pirq returns a subtly wrong answer.

Hmm, indeed, seems like I failed to remove the check of the assigned
event channel when I merged my patch with the changes from 23550.

Could you give the below a try?

Jan

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -1991,6 +1991,5 @@ int unmap_domain_pirq_emuirq(struct doma
 bool_t hvm_domain_use_pirq(const struct domain *d, const struct pirq *pirq)
 {
     return is_hvm_domain(d) && pirq &&
-           pirq->arch.hvm.emuirq != IRQ_UNBOUND &&
-           pirq->evtchn != 0;
+           pirq->arch.hvm.emuirq != IRQ_UNBOUND;
 }

> In any case it is really easy to reproduce: download a recent upstream
> kernel tree (3.0.0 is fine), enable CONFIG_XEN, make sure you have:
>
> CONFIG_XEN=y
> CONFIG_PCI_XEN=y
> CONFIG_XEN_BLKDEV_FRONTEND=y
> CONFIG_XEN_NETDEV_FRONTEND=y
> CONFIG_XEN_XENBUS_FRONTEND=y
> CONFIG_XEN_PLATFORM_PCI=y
>
> that's all. Try to boot the kernel in an HVM guest and it will hang.
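The behavioral difference made by the patch above can be sketched in isolation as follows. This is a minimal standalone model, not Xen code: `struct domain` and `struct pirq` are hypothetical, heavily simplified stand-ins, the field `emuirq` stands in for `pirq->arch.hvm.emuirq`, `is_hvm` stands in for `is_hvm_domain(d)`, and the value of `IRQ_UNBOUND` is assumed for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed sentinel for "no emulated IRQ bound"; the real definition is in Xen's headers. */
#define IRQ_UNBOUND (-1)

/* Hypothetical, simplified stand-ins for Xen's structures. */
struct domain {
    bool is_hvm;      /* stands in for is_hvm_domain(d) */
};

struct pirq {
    int evtchn;       /* 0 means no event channel is bound */
    int emuirq;       /* stands in for pirq->arch.hvm.emuirq */
};

/* The predicate as it reads before the patch: also requires a bound event channel. */
bool use_pirq_before(const struct domain *d, const struct pirq *pirq)
{
    return d->is_hvm && pirq &&
           pirq->emuirq != IRQ_UNBOUND &&
           pirq->evtchn != 0;
}

/* The predicate as it reads after the patch: the event channel check is gone. */
bool use_pirq_after(const struct domain *d, const struct pirq *pirq)
{
    return d->is_hvm && pirq &&
           pirq->emuirq != IRQ_UNBOUND;
}
```

In this model the two versions disagree only for a pirq that has an emulated IRQ mapped but no event channel bound: the old predicate answers false for it, the new one true.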
Stefano Stabellini
2011-Jul-25 16:45 UTC
[Xen-devel] Re: regression on Xen: 23573 breaks PV on HVM guests
On Mon, 25 Jul 2011, Jan Beulich wrote:
> >>> On 25.07.11 at 13:34, Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote:
> > On Mon, 25 Jul 2011, Jan Beulich wrote:
> >> >>> On 25.07.11 at 12:54, Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote:
> >> > Hi all,
> >> > I would just let you know that I found a regression in the hypervisor:
> >> > CS 23573 causes PV on HVM guests to hang during boot.
> >>
> >> Any details (e.g. state of the guest, messages from the hypervisor)
> >> that might help finding out what the problem is? I'm not aware that I
> >> intentionally changed anything behavior-wise in the pv-on-hvm specific
> >> code.
> >
> > I think that 23573 introduced a problem similar to the one solved by
> > 23550, that is hvm_domain_use_pirq returns a subtly wrong answer.
>
> Hmm, indeed, seems like I failed to remove the check of the assigned
> event channel when I merged my patch with the changes from 23550.
>
> Could you give the below a try?
>
> Jan
>
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -1991,6 +1991,5 @@ int unmap_domain_pirq_emuirq(struct doma
>  bool_t hvm_domain_use_pirq(const struct domain *d, const struct pirq *pirq)
>  {
>      return is_hvm_domain(d) && pirq &&
> -           pirq->arch.hvm.emuirq != IRQ_UNBOUND &&
> -           pirq->evtchn != 0;
> +           pirq->arch.hvm.emuirq != IRQ_UNBOUND;
>  }
>

Even though it is certainly a good change, it is not enough to fix the
issue.