Olaf Hering
2011-Apr-05 10:14 UTC
[Xen-devel] [PATCH 0 of 2] linux-2.6.18: kdump for pv-on-hvm guests
The following two patches add kdump support for PV-on-HVM guests.

In the event of a crash, the PV drivers are still in connected state.
In this state a reconnect by the kdump kernel is not possible. The
connection for each connected device has to be closed first to allow a
reconnect.

The bus reset is only done when the kernel was booted with the
'reset_devices' cmdline option, which is automatically added to the
kdump kernel cmdline. This option was introduced in 2.6.19-rc1.

Two additional changes are needed:

The kdump script needs to omit the 'irqpoll' option for the crash
kernel. When booted with irqpoll, an interrupt flood will occur after a
while.

The kexec-tools package needs to check for a real PV environment.
Doing a stat() on /proc/xen is not enough; /proc/xen/capabilities
should be used instead.

Olaf

--
 Documentation/kernel-parameters.txt |    3 +
 drivers/xen/xenbus/xenbus_comms.c   |    4 +
 drivers/xen/xenbus/xenbus_probe.c   |   96 ++++++++++++++++++++++++++++++++++++
 include/linux/init.h                |    1
 init/main.c                         |   20 +++++++
 5 files changed, 123 insertions(+), 1 deletion(-)

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
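The kexec-tools point in the cover letter can be sketched in userspace C. This is only an illustration under stated assumptions: the helper name xen_capabilities_present() and its directory parameter are made up here and are not taken from kexec-tools, and the sketch checks nothing beyond the presence of the capabilities file.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper, not the kexec-tools code: report whether a
 * "capabilities" file exists below the given Xen proc directory.
 * The directory alone (for example /proc/xen) is not treated as
 * proof of a real PV environment.
 */
static int xen_capabilities_present(const char *xen_proc_dir)
{
	char path[256];
	int n;

	assert(xen_proc_dir != NULL);

	n = snprintf(path, sizeof(path), "%s/capabilities", xen_proc_dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return 0;	/* path did not fit: report "not present" */

	return access(path, F_OK) == 0;
}
```

A caller would pass "/proc/xen". The directory is a parameter only so that the check can be exercised against a scratch directory.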
Olaf Hering
2011-Apr-05 10:14 UTC
[Xen-devel] [PATCH 1 of 2] kdump pv-on-hvm: introduce "reset_devices" command line option
# HG changeset patch
# User Olaf Hering <olaf@aepfle.de>
# Date 1301997980 -7200
# Node ID 6cc1c07f0e74cd481d1cde0f17bec2631a767891
# Parent  0bee20f8e418d32ab5828eb57c7542ca27ce425d
kdump pv-on-hvm: introduce "reset_devices" command line option

upstream commit 7e96287ddc4f42081e18248b6167041c0908004c

[PATCH] kdump: introduce "reset_devices" command line option

Resetting the devices during driver initialization can be a costly
operation in terms of time (especially scsi devices). This option can be
used by drivers to know that user forcibly wants the devices to be reset
during initialization.

This option can be useful while kernel is booting in unreliable
environment. For ex. during kdump boot where devices are in unknown
random state and BIOS execution has been skipped.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

---
 Documentation/kernel-parameters.txt |    3 +++
 include/linux/init.h                |    1 +
 init/main.c                         |   20 ++++++++++++++++++++
 3 files changed, 24 insertions(+)

diff -r 0bee20f8e418 -r 6cc1c07f0e74 Documentation/kernel-parameters.txt
--- a/Documentation/kernel-parameters.txt	Sat Apr 02 15:54:29 2011 +0100
+++ b/Documentation/kernel-parameters.txt	Tue Apr 05 12:06:20 2011 +0200
@@ -1392,6 +1392,9 @@
 
 	reserve=	[KNL,BUGS] Force the kernel to ignore some iomem area
 
+	reset_devices	[KNL] Force drivers to reset the underlying device
+			during initialization.
+
 	resume=		[SWSUSP]
 			Specify the partition device for software suspend
 
diff -r 0bee20f8e418 -r 6cc1c07f0e74 include/linux/init.h
--- a/include/linux/init.h	Sat Apr 02 15:54:29 2011 +0100
+++ b/include/linux/init.h	Tue Apr 05 12:06:20 2011 +0200
@@ -68,6 +68,7 @@
 
 /* Defined in init/main.c */
 extern char saved_command_line[];
+extern unsigned int reset_devices;
 
 /* used by init/main.c */
 extern void setup_arch(char **);
diff -r 0bee20f8e418 -r 6cc1c07f0e74 init/main.c
--- a/init/main.c	Sat Apr 02 15:54:29 2011 +0100
+++ b/init/main.c	Tue Apr 05 12:06:20 2011 +0200
@@ -128,6 +128,18 @@
 static unsigned int max_cpus = NR_CPUS;
 
 /*
+ * If set, this is an indication to the drivers that reset the underlying
+ * device before going ahead with the initialization otherwise driver might
+ * rely on the BIOS and skip the reset operation.
+ *
+ * This is useful if kernel is booting in an unreliable environment.
+ * For ex. kdump situaiton where previous kernel has crashed, BIOS has been
+ * skipped and devices will be in unknown state.
+ */
+unsigned int reset_devices;
+EXPORT_SYMBOL(reset_devices);
+
+/*
  * Setup routine for controlling SMP activation
  *
  * Command-line option of "nosmp" or "maxcpus=0" will disable SMP
@@ -153,6 +165,14 @@
 
 __setup("maxcpus=", maxcpus);
 
+static int __init set_reset_devices(char *str)
+{
+	reset_devices = 1;
+	return 1;
+}
+
+__setup("reset_devices", set_reset_devices);
+
 static char * argv_init[MAX_INIT_ARGS+2] = { "init", NULL, };
 char * envp_init[MAX_INIT_ENVS+2] = { "HOME=/", "TERM=linux", NULL, };
 static const char *panic_later, *panic_param;
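For readers unfamiliar with boolean boot options, the effect of this patch can be approximated in userspace: the flag becomes 1 as soon as the bare word reset_devices appears on the command line. The sketch below is an analogue, not kernel code. The function name parse_reset_devices() is hypothetical, and matching on exact space-separated tokens is a simplification chosen for this sketch; it does not claim to mirror the matching rules of the kernel's __setup() machinery.

```c
#include <assert.h>
#include <string.h>

/*
 * Userspace illustration only (hypothetical helper): return 1 if the
 * space-separated command line contains the token "reset_devices",
 * 0 otherwise. In the kernel this work is done by __setup() and
 * set_reset_devices().
 */
static unsigned int parse_reset_devices(const char *cmdline)
{
	static const char opt[] = "reset_devices";
	const size_t len = sizeof(opt) - 1;
	const char *p = cmdline;

	assert(cmdline != NULL);

	while (*p) {
		size_t tok;

		/* Skip separators between options. */
		while (*p == ' ')
			p++;

		/* Length of the current option up to the next space. */
		tok = strcspn(p, " ");
		if (tok == len && strncmp(p, opt, len) == 0)
			return 1;

		p += tok;
	}
	return 0;
}
```

A driver would then test the resulting flag during initialization, in the same way the second patch tests reset_devices before resetting the xenbus devices.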
Olaf Hering
2011-Apr-05 10:14 UTC
[Xen-devel] [PATCH 2 of 2] kdump pv-on-hvm: reset PV devices in crash kernel
# HG changeset patch
# User Olaf Hering <olaf@aepfle.de>
# Date 1301997987 -7200
# Node ID 0bbf646804818f582e91dc88038d1c79f76ba36e
# Parent  6cc1c07f0e74cd481d1cde0f17bec2631a767891
kdump pv-on-hvm: reset PV devices in crash kernel

After triggering a crash dump in a HVM guest, the PV backend drivers
will remain in connected state. When the kdump kernel starts the PV
drivers will skip such devices. As a result, no root device is found
and the vmcore can't be saved.

With this change all frontend devices with state XenbusStateConnected
will be reset by changing the state file to Closing/Closed/Initializing.
This will trigger a disconnect in the backend drivers. Now the frontend
drivers will find the backend drivers in state Initwait and can connect.

Signed-off-by: Olaf Hering <olaf@aepfle.de>

---
 drivers/xen/xenbus/xenbus_comms.c |    4 +
 drivers/xen/xenbus/xenbus_probe.c |   96 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 99 insertions(+), 1 deletion(-)

diff -r 6cc1c07f0e74 -r 0bbf64680481 drivers/xen/xenbus/xenbus_comms.c
--- a/drivers/xen/xenbus/xenbus_comms.c	Tue Apr 05 12:06:20 2011 +0200
+++ b/drivers/xen/xenbus/xenbus_comms.c	Tue Apr 05 12:06:27 2011 +0200
@@ -234,7 +234,9 @@
 		printk(KERN_WARNING "XENBUS response ring is not quiescent "
 		       "(%08x:%08x): fixing up\n",
 		       intf->rsp_cons, intf->rsp_prod);
-		intf->rsp_cons = intf->rsp_prod;
+		/* breaks kdump */
+		if (!reset_devices)
+			intf->rsp_cons = intf->rsp_prod;
 	}
 
 	if (xenbus_irq)
diff -r 6cc1c07f0e74 -r 0bbf64680481 drivers/xen/xenbus/xenbus_probe.c
--- a/drivers/xen/xenbus/xenbus_probe.c	Tue Apr 05 12:06:20 2011 +0200
+++ b/drivers/xen/xenbus/xenbus_probe.c	Tue Apr 05 12:06:27 2011 +0200
@@ -856,11 +856,107 @@
 }
 EXPORT_SYMBOL_GPL(unregister_xenstore_notifier);
 
+#ifdef CONFIG_CRASH_DUMP
+static DECLARE_WAIT_QUEUE_HEAD(be_state_wq);
+static int be_state;
+
+static void xenbus_reset_state_changed(struct xenbus_watch *w, const char **v, unsigned int l)
+{
+	xenbus_scanf(XBT_NIL, v[XS_WATCH_PATH], "", "%i", &be_state);
+	printk(KERN_INFO "XENBUS: %s %s\n", v[XS_WATCH_PATH], xenbus_strstate(be_state));
+	wake_up(&be_state_wq);
+}
+
+static int xenbus_reset_check_final(int *st)
+{
+	return *st == XenbusStateInitialising || *st == XenbusStateInitWait;
+}
+
+static void xenbus_reset_frontend_state(char *backend, char *frontend)
+{
+	struct xenbus_watch watch;
+
+	memset(&watch, 0, sizeof(watch));
+	watch.node = kasprintf(GFP_NOIO | __GFP_HIGH, "%s/state", backend);
+	if (!watch.node)
+		return;
+
+	watch.callback = xenbus_reset_state_changed;
+	be_state = XenbusStateUnknown;
+
+	printk(KERN_INFO "XENBUS: triggering reconnect on %s\n", backend);
+	register_xenbus_watch(&watch);
+
+	xenbus_printf(XBT_NIL, frontend, "state", "%d", XenbusStateClosing);
+	wait_event_interruptible(be_state_wq, be_state == XenbusStateClosing);
+
+	xenbus_printf(XBT_NIL, frontend, "state", "%d", XenbusStateClosed);
+	wait_event_interruptible(be_state_wq, be_state == XenbusStateClosed);
+
+	xenbus_printf(XBT_NIL, frontend, "state", "%d", XenbusStateInitialising);
+	wait_event_interruptible(be_state_wq, xenbus_reset_check_final(&be_state));
+
+	unregister_xenbus_watch(&watch);
+	printk(KERN_INFO "XENBUS: reconnect done on %s\n", backend);
+	kfree(watch.node);
+}
+
+static void xenbus_reset_check_state(char *class, char *dev)
+{
+	int state, err;
+	char *backend, *frontend;
+
+	frontend = kasprintf(GFP_NOIO | __GFP_HIGH, "device/%s/%s", class, dev);
+	if (!frontend)
+		return;
+
+	err = xenbus_scanf(XBT_NIL, frontend, "state", "%i", &state);
+	/* frontend connected? */
+	if (err == 1 && state == XenbusStateConnected) {
+		backend = xenbus_read(XBT_NIL, frontend, "backend", NULL);
+		if (!backend || IS_ERR(backend))
+			goto out;
+		err = xenbus_scanf(XBT_NIL, backend, "state", "%i", &state);
+		/* backend connected? */
+		if (err == 1 && state == XenbusStateConnected)
+			xenbus_reset_frontend_state(backend, frontend);
+		kfree(backend);
+	}
+out:
+	kfree(frontend);
+}
+
+static void xenbus_reset_state(void)
+{
+	char **devclass, **dev;
+	int devclass_n, dev_n;
+	int i, j;
+
+	devclass = xenbus_directory(XBT_NIL, "device", "", &devclass_n);
+	if (IS_ERR(devclass))
+		return;
+
+	for (i = 0; i < devclass_n; i++) {
+		dev = xenbus_directory(XBT_NIL, "device", devclass[i], &dev_n);
+		if (IS_ERR(dev))
+			continue;
+		for (j = 0; j < dev_n; j++)
+			xenbus_reset_check_state(devclass[i], dev[j]);
+		kfree(dev);
+	}
+	kfree(devclass);
+}
+#endif
 
 void xenbus_probe(void *unused)
 {
 	BUG_ON(!is_xenstored_ready());
 
+#ifdef CONFIG_CRASH_DUMP
+	/* reset devices in XenbusStateConnected state */
+	if (!is_initial_xendomain() && reset_devices)
+		xenbus_reset_state();
+#endif
 	/* Enumerate devices in xenstore and watch for changes. */
 	xenbus_probe_devices(&xenbus_frontend);
 	register_xenbus_watch(&fe_watch);
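The Closing/Closed/Initialising sequence described in this patch can be modelled in plain userspace C. This is a toy sketch, not the kernel code: the toy_* names are hypothetical, the backend here reacts instantly and deterministically (the patch instead waits on a xenstore watch), and real backend drivers may differ in detail. The numeric state values follow Xen's public io/xenbus.h.

```c
#include <assert.h>

/* State numbers as defined in Xen's public io/xenbus.h. */
enum xenbus_state {
	XenbusStateUnknown      = 0,
	XenbusStateInitialising = 1,
	XenbusStateInitWait     = 2,
	XenbusStateInitialised  = 3,
	XenbusStateConnected    = 4,
	XenbusStateClosing      = 5,
	XenbusStateClosed       = 6
};

/*
 * Hypothetical toy backend: the state it moves to after the frontend
 * writes a new state. This is an assumption made for the sketch.
 */
static enum xenbus_state toy_backend_react(enum xenbus_state frontend)
{
	switch (frontend) {
	case XenbusStateClosing:
		return XenbusStateClosing;
	case XenbusStateClosed:
		return XenbusStateClosed;
	case XenbusStateInitialising:
		return XenbusStateInitWait;
	default:
		return XenbusStateUnknown;
	}
}

/* Same acceptance test as xenbus_reset_check_final() in the patch. */
static int toy_reset_is_final(enum xenbus_state backend)
{
	return backend == XenbusStateInitialising ||
	       backend == XenbusStateInitWait;
}

/*
 * Walk a connected frontend through Closing, Closed, Initialising and
 * return the backend state that is reached. Pairs that are not both
 * connected are left alone, as in xenbus_reset_check_state().
 */
static enum xenbus_state toy_reset_frontend(enum xenbus_state frontend,
					    enum xenbus_state backend)
{
	static const enum xenbus_state steps[] = {
		XenbusStateClosing,
		XenbusStateClosed,
		XenbusStateInitialising
	};
	unsigned int i;

	if (frontend != XenbusStateConnected ||
	    backend != XenbusStateConnected)
		return backend;

	for (i = 0; i < sizeof(steps) / sizeof(steps[0]); i++)
		backend = toy_backend_react(steps[i]);

	/* In this toy model the backend always ends in a final state. */
	assert(toy_reset_is_final(backend));
	return backend;
}
```

With both ends connected, the toy model ends with the backend in InitWait, which is the state the commit message says the frontend drivers need to find before they can connect.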
Konrad Rzeszutek Wilk
2011-Apr-05 13:54 UTC
Re: [Xen-devel] [PATCH 2 of 2] kdump pv-on-hvm: reset PV devices in crash kernel
On Tue, Apr 05, 2011 at 12:14:29PM +0200, Olaf Hering wrote:
> # HG changeset patch
> # User Olaf Hering <olaf@aepfle.de>
> # Date 1301997987 -7200
> # Node ID 0bbf646804818f582e91dc88038d1c79f76ba36e
> # Parent 6cc1c07f0e74cd481d1cde0f17bec2631a767891
> kdump pv-on-hvm: reset PV devices in crash kernel
>
> After triggering a crash dump in a HVM guest, the PV backend drivers
> will remain in connected state. When the kdump kernel starts the PV
> drivers will skip such devices. As a result, no root device is found
> and the vmcore can't be saved.

So this patch is for 2.6.18 but not for upstream?

> With this change all frontend devices with state XenbusStateConnected
> will be reset by changing the state file to Closing/Closed/Initializing.
> This will trigger a disconnect in the backend drivers. Now the frontend
> drivers will find the backend drivers in state Initwait and can connect.
>
> Signed-off-by: Olaf Hering <olaf@aepfle.de>

[...]
Olaf Hering
2011-Apr-05 13:56 UTC
Re: [Xen-devel] [PATCH 2 of 2] kdump pv-on-hvm: reset PV devices in crash kernel
On Tue, Apr 05, Konrad Rzeszutek Wilk wrote:

> So this patch is for 2.6.18 but not for upstream?

Yes, not yet.

Olaf