Hey, Three patches that were discovered with v3.6. The first one: [PATCH 1/3] xen/pv-on-hvm kexec: add quirk for Xen 3.4 and shutdown should have been found earlier, but it seems there is only a small selection of provides run Xen 3.4 and without the 23839:42a45baf037d in xen-unstable.hg fix. It is easy to reproduce with OVM 2.2 product. The other two are a bit embarrassing. We had a similar issue with rdmsr_safe pvops call in v3.5 where it was not implemented for Xen PV (but rdmsr was). The fix was to make rdmsr_safe a macro around rdmsr and a lookout for other calls that were not implemented. Well, v3.6 has gone by and now I find three other culprits: [PATCH 2/3] xen/bootup: allow {read|write}_cr8 pvops call. [PATCH 3/3] xen/bootup: allow read_tscp call for Xen PV guests.
Konrad Rzeszutek Wilk
2012-Oct-10 17:40 UTC
[PATCH 1/3] xen/pv-on-hvm kexec: add quirk for Xen 3.4 and shutdown watches.
The commit 254d1a3f02ebc10ccc6e4903394d8d3f484f715e, titled "xen/pv-on-hvm kexec: shutdown watches from old kernel" assumes that the XenBus backend can deal with reading of values from: "control/platform-feature-xs_reset_watches": ... a patch for xenstored is required so that it accepts the XS_RESET_WATCHES request from a client (see changeset 23839:42a45baf037d in xen-unstable.hg). Without the patch for xenstored the registration of watches will fail and some features of a PVonHVM guest are not available. The guest is still able to boot, but repeated kexec boots will fail." Sadly this is not true when using a Xen 3.4 hypervisor and booting a PVHVM guest. We end up hanging at: err = xenbus_scanf(XBT_NIL, "control", "platform-feature-xs_reset_watches", "%d", &supported); This can easily be seen with guests hanging at xenbus_init: NX (Execute Disable) protection: active SMBIOS 2.4 present. DMI: Xen HVM domU, BIOS 3.4.0 05/13/2011 Hypervisor detected: Xen HVM Xen version 3.4. Xen Platform PCI: I/O protocol version 1 ... snip .. calling xenbus_init+0x0/0x27e @ 1 Reverting the commit or using the attached patch fixes the issue. This fix checks whether the hypervisor is older than 4.0 and if so does not try to perform the read. Fixes-Oracle-Bug: 14708233 CC: stable@vger.kernel.org CC: Olaf Hering <olaf@aepfle.de> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- drivers/xen/xenbus/xenbus_xs.c | 15 +++++++++++++++ 1 files changed, 15 insertions(+), 0 deletions(-) diff --git a/drivers/xen/xenbus/xenbus_xs.c b/drivers/xen/xenbus/xenbus_xs.c index bce15cf..91f513b 100644 --- a/drivers/xen/xenbus/xenbus_xs.c +++ b/drivers/xen/xenbus/xenbus_xs.c @@ -47,6 +47,7 @@ #include <xen/xenbus.h> #include <xen/xen.h> #include "xenbus_comms.h" +#include <asm/xen/hypervisor.h> struct xs_stored_msg { struct list_head list; @@ -617,7 +618,18 @@ static struct xenbus_watch *find_watch(const char *token) return NULL; } +static bool xen_strict_xenbus_quirk() +{ + uint32_t eax, ebx, ecx, edx, base; + + base = xen_cpuid_base(); + cpuid(base + 1, &eax, &ebx, &ecx, &edx); + + if ((eax >> 16) < 4) + return true; + return false; +} static void xs_reset_watches(void) { int err, supported = 0; @@ -625,6 +637,9 @@ static void xs_reset_watches(void) if (!xen_hvm_domain()) return; + if (xen_strict_xenbus_quirk()) + return; + err = xenbus_scanf(XBT_NIL, "control", "platform-feature-xs_reset_watches", "%d", &supported); if (err != 1 || !supported) -- 1.7.7.6
Konrad Rzeszutek Wilk
2012-Oct-10 17:40 UTC
[PATCH 2/3] xen/bootup: allow {read|write}_cr8 pvops call.
We actually do not do anything about it. Just return a default value of zero and if the kernel tries to write anything but 0 we BUG_ON. This fixes the case when an user tries to suspend the machine and it blows up in save_processor_state b/c ''read_cr8'' is set to NULL and we get: kernel BUG at /home/konrad/ssd/linux/arch/x86/include/asm/paravirt.h:100! invalid opcode: 0000 [#1] SMP Pid: 2687, comm: init.late Tainted: G O 3.6.0upstream-00002-gac264ac-dirty #4 Bochs Bochs RIP: e030:[<ffffffff814d5f42>] [<ffffffff814d5f42>] save_processor_state+0x212/0x270 .. snip.. Call Trace: [<ffffffff810733bf>] do_suspend_lowlevel+0xf/0xac [<ffffffff8107330c>] ? x86_acpi_suspend_lowlevel+0x10c/0x150 [<ffffffff81342ee2>] acpi_suspend_enter+0x57/0xd5 Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- arch/x86/xen/enlighten.c | 16 +++++++++++++++- 1 files changed, 15 insertions(+), 1 deletions(-) diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c index 1fbe75a..e1e4636 100644 --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@ -984,7 +984,16 @@ static void xen_write_cr4(unsigned long cr4) native_write_cr4(cr4); } - +#ifdef CONFIG_X86_64 +static inline unsigned long xen_read_cr8(void) +{ + return 0; +} +static inline void xen_write_cr8(unsigned long val) +{ + BUG_ON(val); +} +#endif static int xen_write_msr_safe(unsigned int msr, unsigned low, unsigned high) { int ret; @@ -1153,6 +1162,11 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = { .read_cr4_safe = native_read_cr4_safe, .write_cr4 = xen_write_cr4, +#ifdef CONFIG_X86_64 + .read_cr8 = xen_read_cr8, + .write_cr8 = xen_write_cr8, +#endif + .wbinvd = native_wbinvd, .read_msr = native_read_msr_safe, -- 1.7.7.6
Konrad Rzeszutek Wilk
2012-Oct-10 17:40 UTC
[PATCH 3/3] xen/bootup: allow read_tscp call for Xen PV guests.
The hypervisor will trap it. However without this patch, we would crash as the .read_tscp is set to NULL. This patch fixes it and sets it to the native_read_tscp call. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- arch/x86/xen/enlighten.c | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c index e1e4636..c1461de 100644 --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@ -1175,6 +1175,8 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = { .read_tsc = native_read_tsc, .read_pmc = native_read_pmc, + .read_tscp = native_read_tscp, + .iret = xen_iret, .irq_enable_sysexit = xen_sysexit, #ifdef CONFIG_X86_64 -- 1.7.7.6
Olaf Hering
2012-Oct-10 18:09 UTC
Re: [PATCH 1/3] xen/pv-on-hvm kexec: add quirk for Xen 3.4 and shutdown watches.
On Wed, Oct 10, Konrad Rzeszutek Wilk wrote:> The commit 254d1a3f02ebc10ccc6e4903394d8d3f484f715e, titled > "xen/pv-on-hvm kexec: shutdown watches from old kernel" assumes that the > XenBus backend can deal with reading of values from: > "control/platform-feature-xs_reset_watches": > > ... a patch for xenstored is required so that it > accepts the XS_RESET_WATCHES request from a client (see changeset > 23839:42a45baf037d in xen-unstable.hg). Without the patch for xenstored > the registration of watches will fail and some features of a PVonHVM > guest are not available. The guest is still able to boot, but repeated > kexec boots will fail." > > Sadly this is not true when using a Xen 3.4 hypervisor and booting a PVHVM > guest. We end up hanging at:This is sad. So far I have not seen reports like that with a sles11sp1 or sles11sp2 guest. Thats either because noone uses Xen 3.4+sles11 (or recent openSuSE) HVM guests, or because it happens to work with that combination. If the patch solves the issue: Acked-by: Olaf Hering <olaf@aepfle.de> Olaf
Ian Campbell
2012-Oct-15 16:05 UTC
Re: [Xen-devel] [PATCH 1/3] xen/pv-on-hvm kexec: add quirk for Xen 3.4 and shutdown watches.
On Wed, 2012-10-10 at 18:40 +0100, Konrad Rzeszutek Wilk wrote:> The commit 254d1a3f02ebc10ccc6e4903394d8d3f484f715e, titled > "xen/pv-on-hvm kexec: shutdown watches from old kernel" assumes that the > XenBus backend can deal with reading of values from: > "control/platform-feature-xs_reset_watches": > > ... a patch for xenstored is required so that it > accepts the XS_RESET_WATCHES request from a client (see changeset > 23839:42a45baf037d in xen-unstable.hg). Without the patch for xenstored > the registration of watches will fail and some features of a PVonHVM > guest are not available. The guest is still able to boot, but repeated > kexec boots will fail." > > Sadly this is not true when using a Xen 3.4 hypervisor and booting a PVHVM > guest. We end up hanging at: > > err = xenbus_scanf(XBT_NIL, "control", > "platform-feature-xs_reset_watches", "%d", &supported); > > This can easily be seen with guests hanging at xenbus_init: > > NX (Execute Disable) protection: active > SMBIOS 2.4 present. > DMI: Xen HVM domU, BIOS 3.4.0 05/13/2011 > Hypervisor detected: Xen HVM > Xen version 3.4. > Xen Platform PCI: I/O protocol version 1 > ... snip .. > calling xenbus_init+0x0/0x27e @ 1 > > Reverting the commit or using the attached patch fixes the issue. This fix > checks whether the hypervisor is older than 4.0 and if so does not try to > perform the read. > > Fixes-Oracle-Bug: 14708233 > CC: stable@vger.kernel.org > CC: Olaf Hering <olaf@aepfle.de> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > --- > drivers/xen/xenbus/xenbus_xs.c | 15 +++++++++++++++ > 1 files changed, 15 insertions(+), 0 deletions(-) > > diff --git a/drivers/xen/xenbus/xenbus_xs.c b/drivers/xen/xenbus/xenbus_xs.c > index bce15cf..91f513b 100644 > --- a/drivers/xen/xenbus/xenbus_xs.c > +++ b/drivers/xen/xenbus/xenbus_xs.c > @@ -47,6 +47,7 @@ > #include <xen/xenbus.h> > #include <xen/xen.h> > #include "xenbus_comms.h" > +#include <asm/xen/hypervisor.h> > > struct xs_stored_msg { > struct list_head list; > @@ -617,7 +618,18 @@ static struct xenbus_watch *find_watch(const char *token) > > return NULL; > } > +static bool xen_strict_xenbus_quirk() > +{ > + uint32_t eax, ebx, ecx, edx, base; > + > + base = xen_cpuid_base(); > + cpuid(base + 1, &eax, &ebx, &ecx, &edx);This breaks on ARM because this is an x86 specific function. Can we ifdef it or properly wrap it in an arch interface please.> + > + if ((eax >> 16) < 4) > + return true; > + return false; > > +} > static void xs_reset_watches(void) > { > int err, supported = 0; > @@ -625,6 +637,9 @@ static void xs_reset_watches(void) > if (!xen_hvm_domain()) > return; > > + if (xen_strict_xenbus_quirk()) > + return; > + > err = xenbus_scanf(XBT_NIL, "control", > "platform-feature-xs_reset_watches", "%d", &supported); > if (err != 1 || !supported)
Konrad Rzeszutek Wilk
2012-Oct-15 16:09 UTC
Re: [Xen-devel] [PATCH 1/3] xen/pv-on-hvm kexec: add quirk for Xen 3.4 and shutdown watches.
On Mon, Oct 15, 2012 at 05:14:18PM +0100, Ian Campbell wrote:> On Mon, 2012-10-15 at 17:05 +0100, Ian Campbell wrote: > > > > > +static bool xen_strict_xenbus_quirk() > > > +{ > > > + uint32_t eax, ebx, ecx, edx, base; > > > + > > > + base = xen_cpuid_base(); > > > + cpuid(base + 1, &eax, &ebx, &ecx, &edx); > > > > This breaks on ARM because this is an x86 specific function. Can we > > ifdef it or properly wrap it in an arch interface please. > > Quick-n-dirty fix.applied. Thx> > 8<---------------------------- > > >From fe19515d8f7bed49c4474f475e6a8cbbdc5eb3f2 Mon Sep 17 00:00:00 2001 > From: Ian Campbell <ian.campbell@citrix.com> > Date: Mon, 15 Oct 2012 17:06:47 +0100 > Subject: [PATCH] xen: fix build on ARM > > xen_strict_xenbus_quirk is x86 specific. > > Signed-off-by: Ian Campbell <ian.campbell@citrix.com> > --- > drivers/xen/xenbus/xenbus_xs.c | 6 ++++++ > 1 files changed, 6 insertions(+), 0 deletions(-) > > diff --git a/drivers/xen/xenbus/xenbus_xs.c b/drivers/xen/xenbus/xenbus_xs.c > index 48220e1..b46ad11 100644 > --- a/drivers/xen/xenbus/xenbus_xs.c > +++ b/drivers/xen/xenbus/xenbus_xs.c > @@ -619,6 +619,8 @@ static struct xenbus_watch *find_watch(const char *token) > > return NULL; > } > + > +#ifdef CONFIG_X86 > /* > * Certain older XenBus toolstack cannot handle reading values that are > * not populated. Some Xen 3.4 installation are incapable of doing this > @@ -637,6 +639,10 @@ static bool xen_strict_xenbus_quirk() > return false; > > } > +#else > +static bool xen_strict_xenbus_quirk(void) { return false; } > +#endif > + > static void xs_reset_watches(void) > { > int err, supported = 0; > -- > 1.7.2.5 > >
Ian Campbell
2012-Oct-15 16:14 UTC
Re: [Xen-devel] [PATCH 1/3] xen/pv-on-hvm kexec: add quirk for Xen 3.4 and shutdown watches.
On Mon, 2012-10-15 at 17:05 +0100, Ian Campbell wrote:> > > +static bool xen_strict_xenbus_quirk() > > +{ > > + uint32_t eax, ebx, ecx, edx, base; > > + > > + base = xen_cpuid_base(); > > + cpuid(base + 1, &eax, &ebx, &ecx, &edx); > > This breaks on ARM because this is an x86 specific function. Can we > ifdef it or properly wrap it in an arch interface please.Quick-n-dirty fix. 8<---------------------------- From fe19515d8f7bed49c4474f475e6a8cbbdc5eb3f2 Mon Sep 17 00:00:00 2001 From: Ian Campbell <ian.campbell@citrix.com> Date: Mon, 15 Oct 2012 17:06:47 +0100 Subject: [PATCH] xen: fix build on ARM xen_strict_xenbus_quirk is x86 specific. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> --- drivers/xen/xenbus/xenbus_xs.c | 6 ++++++ 1 files changed, 6 insertions(+), 0 deletions(-) diff --git a/drivers/xen/xenbus/xenbus_xs.c b/drivers/xen/xenbus/xenbus_xs.c index 48220e1..b46ad11 100644 --- a/drivers/xen/xenbus/xenbus_xs.c +++ b/drivers/xen/xenbus/xenbus_xs.c @@ -619,6 +619,8 @@ static struct xenbus_watch *find_watch(const char *token) return NULL; } + +#ifdef CONFIG_X86 /* * Certain older XenBus toolstack cannot handle reading values that are * not populated. Some Xen 3.4 installation are incapable of doing this @@ -637,6 +639,10 @@ static bool xen_strict_xenbus_quirk() return false; } +#else +static bool xen_strict_xenbus_quirk(void) { return false; } +#endif + static void xs_reset_watches(void) { int err, supported = 0; -- 1.7.2.5