Olaf Hering
2012-Mar-14 10:52 UTC
how to properly reset event channel in PVonHVM after kexec boot
What is the best way to reset all registered event channels after a kexec boot in a PVonHVM guest? Right now repeated kexec boots will fail after 1300 iterations because no more event channels can be registered.

EVTCHNOP_reset cannot be used because it expects the domid, which the guest does not know.

My attempt to close ports is shown below; it works in my guest. The patch prints this output:

initial before kexec:
[ 1.129469] xen-platform-pci 0000:00:03.0: I/O protocol version 1
[ 1.233591] xen_reset_event_channels: 1 p 0 s 0 v 0 i 0 d 0 0/0
[ 1.238435] xen_reset_event_channels: 2 p 1 s 2 v 0 i 0 d 0 0/0
[ 1.243227] xen_reset_event_channels: 3 p 2 s 2 v 1 i 0 d 0 0/0
[ 1.247953] xen_reset_event_channels: 4 p 3 s 2 v 2 i 0 d 0 0/0
[ 1.252695] xen_reset_event_channels: 5 p 4 s 2 v 3 i 0 d 0 0/0
[ 1.256625] xen_reset_event_channels: 6 p 5 s 2 v 0 i 0 d 0 0/0
[ 1.259299] xen_reset_event_channels: 7 p 6 s 1 v 0 i 0 d 0 0/0
[ 1.261870] xen_reset_event_channels: close 7 p 6 s 1 v 0 i 0 d 0 0/0
[ 1.264412] xen_reset_event_channels: 1 p 7 s 0 v 0 i 0 d 0 0/0
[ 1.267555] xen_reset_event_channels: 2 p 8 s 0 v 0 i 0 d 0 0/0
[ 1.270264] xen_reset_event_channels: 3 p 9 s 0 v 0 i 0 d 0 0/0
[ 1.272518] xen_reset_event_channels: 4 p 10 s 0 v 0 i 0 d 0 0/0
[ 1.274753] xen_reset_event_channels: 5 p 11 s 0 v 0 i 0 d 0 0/0
[ 1.277322] xen_reset_event_channels: 6 p 12 s 0 v 0 i 0 d 0 0/0
[ 1.280539] xen_reset_event_channels: 7 p 13 s 0 v 0 i 0 d 0 0/0
[ 1.283734] xen_reset_event_channels: 8 p 14 s 0 v 0 i 0 d 0 0/0
[ 1.286982] xen_reset_event_channels: 9 p 15 s 0 v 0 i 0 d 0 0/0
[ 1.290188] xen_reset_event_channels: 10 p 16 s 0 v 0 i 0 d 0 0/0
[ 1.293461] xen_reset_event_channels: 11 p 17 s 0 v 0 i 0 d 0 0/0
[ 1.296732] xen_reset_event_channels: 12 p 18 s 0 v 0 i 0 d 0 0/0
[ 1.299958] xen_reset_event_channels: 13 p 19 s 0 v 0 i 0 d 0 0/0
[ 1.303239] xen_reset_event_channels: c 14 20
[ 1.307028] suspend: event channel 6

after first kexec:
[ 1.126930] xen-platform-pci 0000:00:03.0: I/O protocol version 1
[ 1.131203] xen_reset_event_channels: 1 p 0 s 0 v 0 i 0 d 0 0/0
[ 1.133496] xen_reset_event_channels: 2 p 1 s 2 v 0 i 0 d 0 0/0
[ 1.135716] xen_reset_event_channels: 3 p 2 s 2 v 1 i 0 d 0 0/0
[ 1.137983] xen_reset_event_channels: 4 p 3 s 2 v 2 i 0 d 0 0/0
[ 1.140260] xen_reset_event_channels: 5 p 4 s 2 v 3 i 0 d 0 0/0
[ 1.142482] xen_reset_event_channels: 6 p 5 s 2 v 0 i 0 d 0 0/0
[ 1.144769] xen_reset_event_channels: 7 p 6 s 1 v 0 i 0 d 0 0/0
[ 1.146987] xen_reset_event_channels: close 7 p 6 s 1 v 0 i 0 d 0 0/0
[ 1.149488] xen_reset_event_channels: 1 p 7 s 1 v 0 i 0 d 0 0/0
[ 1.151706] xen_reset_event_channels: close 1 p 7 s 1 v 0 i 0 d 0 0/0
[ 1.154192] xen_reset_event_channels: 1 p 8 s 1 v 0 i 0 d 0 0/0
[ 1.156466] xen_reset_event_channels: close 1 p 8 s 1 v 0 i 0 d 0 0/0
[ 1.158880] xen_reset_event_channels: 1 p 9 s 0 v 0 i 0 d 0 0/0
[ 1.161165] xen_reset_event_channels: 2 p 10 s 0 v 0 i 0 d 0 0/0
[ 1.163388] xen_reset_event_channels: 3 p 11 s 0 v 0 i 0 d 0 0/0
[ 1.165659] xen_reset_event_channels: 4 p 12 s 0 v 0 i 0 d 0 0/0
[ 1.167882] xen_reset_event_channels: 5 p 13 s 0 v 0 i 0 d 0 0/0
[ 1.170173] xen_reset_event_channels: 6 p 14 s 0 v 0 i 0 d 0 0/0
[ 1.172621] xen_reset_event_channels: 7 p 15 s 0 v 0 i 0 d 0 0/0
[ 1.175762] xen_reset_event_channels: 8 p 16 s 0 v 0 i 0 d 0 0/0
[ 1.178161] xen_reset_event_channels: 9 p 17 s 0 v 0 i 0 d 0 0/0
[ 1.180470] xen_reset_event_channels: 10 p 18 s 0 v 0 i 0 d 0 0/0
[ 1.182752] xen_reset_event_channels: 11 p 19 s 0 v 0 i 0 d 0 0/0
[ 1.185124] xen_reset_event_channels: 12 p 20 s 0 v 0 i 0 d 0 0/0
[ 1.187402] xen_reset_event_channels: 13 p 21 s 0 v 0 i 0 d 0 0/0
[ 1.189736] xen_reset_event_channels: c 14 22
[ 1.191915] XENBUS: frontend device/vbd/768 Closed
[ 1.193901] XENBUS: backend /local/domain/0/backend/vbd/19/768 Closed
[ 1.196402] XENBUS: triggering reconnect on /local/domain/0/backend/vbd/19/768
[ 1.199330] XENBUS: backend /local/domain/0/backend/vbd/19/768/state Closed
[ 1.202927] XENBUS: backend /local/domain/0/backend/vbd/19/768/state InitWait
[ 1.205689] XENBUS: reconnect done on /local/domain/0/backend/vbd/19/768
[ 1.208306] XENBUS: frontend device/vbd/5632 Closed
[ 1.210277] XENBUS: backend /local/domain/0/backend/vbd/19/5632 Closed
[ 1.212783] XENBUS: triggering reconnect on /local/domain/0/backend/vbd/19/5632
[ 1.215788] XENBUS: backend /local/domain/0/backend/vbd/19/5632/state Closed
[ 1.220149] XENBUS: backend /local/domain/0/backend/vbd/19/5632/state InitWait
[ 1.222940] XENBUS: reconnect done on /local/domain/0/backend/vbd/19/5632
[ 1.225673] XENBUS: frontend device/vif/0 Closed
[ 1.227548] XENBUS: backend /local/domain/0/backend/vif/19/0 Closed
[ 1.229975] XENBUS: triggering reconnect on /local/domain/0/backend/vif/19/0
[ 1.232903] XENBUS: backend /local/domain/0/backend/vif/19/0/state Closed
[ 1.236079] XENBUS: backend /local/domain/0/backend/vif/19/0/state InitWait
[ 1.239787] XENBUS: reconnect done on /local/domain/0/backend/vif/19/0
[ 1.243467] suspend: event channel 6

---
Index: xen-4.1.2-testing/unmodified_drivers/linux-2.6/platform-pci/evtchn.c
===================================================================
--- xen-4.1.2-testing.orig/unmodified_drivers/linux-2.6/platform-pci/evtchn.c
+++ xen-4.1.2-testing/unmodified_drivers/linux-2.6/platform-pci/evtchn.c
@@ -341,6 +341,38 @@ void irq_resume(void)
 		irq_evtchn[irq].evtchn = 0;
 }
 
+/* BITS_PER_LONG is used in Xen */
+#define MAX_EVTCHNS (64 * 64)
+
+static void xen_reset_event_channels(void)
+{
+	struct evtchn_status status = { .dom = DOMID_SELF };
+	struct evtchn_close close;
+	int port, rc, count = 0;
+
+	for (port = 0; port <= MAX_EVTCHNS; port++) {
+		status.port = port;
+		if (count++ > 12)
+			break;
+		rc = HYPERVISOR_event_channel_op(EVTCHNOP_status, &status);
+		printk("%s: %d p %d s %x v %x i %x d %x %d/%x\n", __func__, count, port, status.status, status.vcpu, status.u.pirq, status.u.unbound.dom, rc, rc);
+		if (rc < 0)
+			continue;
+		switch (status.status) {
+		case EVTCHNSTAT_closed:
+		case EVTCHNSTAT_interdomain:
+			break;
+		default:
+			close.port = port;
+			rc = HYPERVISOR_event_channel_op(EVTCHNOP_close, &close);
+			printk("%s: close %d p %d s %x v %x i %x d %x %d/%x\n", __func__, count, port, status.status, status.vcpu, status.u.pirq, status.u.unbound.dom, rc, rc);
+			count = 0;
+			break;
+		}
+	}
+	printk("%s: c %d %d\n", __func__, count, port);
+}
+
 int xen_irq_init(struct pci_dev *pdev)
 {
 	int irq;
@@ -348,6 +380,8 @@ int xen_irq_init(struct pci_dev *pdev)
 	for (irq = 0; irq < ARRAY_SIZE(irq_evtchn); irq++)
 		spin_lock_init(&irq_evtchn[irq].lock);
 
+	xen_reset_event_channels();
+
 	return request_irq(pdev->irq, evtchn_interrupt,
 #if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,22)
 			   SA_SHIRQ | SA_SAMPLE_RANDOM | SA_INTERRUPT,
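The scan-and-close idea in the patch above can also be tried outside the kernel. The following is only a minimal simulation, not driver code: the hypercalls are replaced by lookups in an in-memory table, and `mock_ports`, `mock_status`, `mock_close`, `MOCK_MAX_PORTS` and `reset_local_event_channels` are hypothetical names invented for this sketch. The status values are assumed to match those in Xen's public event_channel.h. Unlike the patch, the sketch walks the whole (small) port range instead of stopping after a run of closed ports.

```c
#include <assert.h>
#include <stddef.h>

/* Channel states; values assumed to match Xen's public event_channel.h. */
enum {
    EVTCHNSTAT_closed      = 0,
    EVTCHNSTAT_unbound     = 1,
    EVTCHNSTAT_interdomain = 2,
    EVTCHNSTAT_pirq        = 3,
    EVTCHNSTAT_virq        = 4,
    EVTCHNSTAT_ipi         = 5
};

#define MOCK_MAX_PORTS 32   /* hypothetical; far smaller than the real limit */

/* Hypothetical in-memory stand-in for the hypervisor's per-port state. */
static int mock_ports[MOCK_MAX_PORTS];

/* Mock of EVTCHNOP_status: returns 0 and fills *status, or -1 for a bad port. */
static int mock_status(int port, int *status)
{
    assert(status != NULL);
    if (port < 0 || port >= MOCK_MAX_PORTS)
        return -1;
    *status = mock_ports[port];
    return 0;
}

/* Mock of EVTCHNOP_close: marks the port closed, or returns -1 for a bad port. */
static int mock_close(int port)
{
    if (port < 0 || port >= MOCK_MAX_PORTS)
        return -1;
    mock_ports[port] = EVTCHNSTAT_closed;
    return 0;
}

/*
 * Close every port that is neither closed nor interdomain.
 * Returns the number of ports that were closed.
 */
static int reset_local_event_channels(void)
{
    int port, status, closed = 0;

    for (port = 0; port < MOCK_MAX_PORTS; port++) {
        if (mock_status(port, &status) < 0)
            continue;
        if (status == EVTCHNSTAT_closed || status == EVTCHNSTAT_interdomain)
            continue;
        if (mock_close(port) == 0)
            closed++;
    }
    return closed;
}
```

Seeding `mock_ports` with the layout from the second log (ports 1 to 5 interdomain, ports 6 to 8 unbound) and calling `reset_local_event_channels()` closes the three unbound ports and leaves the interdomain ones untouched.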
Ian Campbell
2012-Mar-14 10:56 UTC
Re: how to properly reset event channel in PVonHVM after kexec boot
On Wed, 2012-03-14 at 10:52 +0000, Olaf Hering wrote:
> What is the best way to reset all registered event channels after a
> kexec boot in a PVonHVM guest? Right now repeated kexec boots will fail
> after 1300 iterations because no more event channels can be registered.
>
> EVTCHNOP_reset cannot be used because it expects the domid, which the guest
> does not know.

It appears that it accepts DOMID_SELF. It calls rcu_lock_target_domain_by_id, which handles it explicitly.

Ian.
Olaf Hering
2012-Mar-14 11:05 UTC
Re: how to properly reset event channel in PVonHVM after kexec boot
On Wed, Mar 14, Ian Campbell wrote:
> On Wed, 2012-03-14 at 10:52 +0000, Olaf Hering wrote:
> > What is the best way to reset all registered event channels after a
> > kexec boot in a PVonHVM guest? Right now repeated kexec boots will fail
> > after 1300 iterations because no more event channels can be registered.
> >
> > EVTCHNOP_reset cannot be used because it expects the domid, which the guest
> > does not know.
>
> It appears that it accepts DOMID_SELF. It calls
> rcu_lock_target_domain_by_id, which handles it explicitly.

Oh, I will see how it works then. It was not obvious to me that DOMID_SELF is handled.

From reading the code it appears that it could also reset EVTCHNSTAT_interdomain, which would hang the guest. That's why my patch skips this state.

Olaf
Olaf Hering
2012-Mar-14 18:01 UTC
Re: how to properly reset event channel in PVonHVM after kexec boot
On Wed, Mar 14, Olaf Hering wrote:
> From reading the code it appears that it could also reset
> EVTCHNSTAT_interdomain, which would hang the guest. That's why my patch
> skips this state.

Yes, EVTCHNOP_reset will shut down everything, so it cannot be used for this purpose.

Olaf
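To make that conclusion concrete, here is a toy model of a full reset. It is a sketch based only on the behaviour described in this thread (every port gets closed, whatever its state), not on the hypervisor's implementation; `model_full_reset` and the `CH_*` labels are hypothetical names used for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state labels for this model only. */
enum { CH_CLOSED = 0, CH_UNBOUND = 1, CH_INTERDOMAIN = 2 };

/*
 * Model of a full reset as described in the thread: every port is closed,
 * whatever its state. Returns how many interdomain channels were torn down.
 */
static size_t model_full_reset(int *ports, size_t n)
{
    size_t i, lost = 0;

    assert(ports != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (ports[i] == CH_INTERDOMAIN)
            lost++;
        ports[i] = CH_CLOSED;
    }
    return lost;
}
```

With the port layout from the log after the first kexec (five interdomain ports, three unbound ones), the model reports five interdomain channels lost, which is the outcome the per-port loop in the patch avoids by skipping EVTCHNSTAT_interdomain.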