I''m running PVOPs 2.6.38 on Xen 4.0.2 RC3 and while booting a guest I lose interrupts for the PS/2 trackpad. The trackpad stops functioning because the device is waiting for service. I added a work around that calls i8042_interupt form a timer if it hasn''t been called in 1s and it started working again. I added some code to Xen to count IRQ 12 and compared that to the IRQ 12 count in //proc/interrupts (I stopped PS/2 activity and waited for PS/2 interrupt activity to stop before taking the counts). I lose one interrupt in Dom0 every time the trackpad freezes. (XEN) IRQ 12 count 21048 12: 21047 0 xen-pirq-ioapic-edge i8042 <--- lost an interrupt in dom0 ... (XEN) IRQ 12 count 48540 12: 48537 0 xen-pirq-ioapic-edge i8042 <--- lost 3 interrupts in dom0 I looked at the point at which the trackpad gets it''s last interrupt in a trace and the other major activity at that time is the event channel that services the Qemu vcpu io_req code. This 2.6.38 tree has a merge of Stafano''s 2.6.39 fixes in drivers/xen/events.c. Anyone have any ideas or suggestions? -Tom Goetz --- Tom Goetz tcgoetz@gmail.com _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On May 19, 2011, at 5:45 PM, Thomas Goetz wrote:> I''m running PVOPs 2.6.38 on Xen 4.0.2 RC3 and while booting a guest I lose interrupts for the PS/2 trackpad. The trackpad stops functioning because the device is waiting for service. I added a work around that calls i8042_interupt form a timer if it hasn''t been called in 1s and it started working again. I added some code to Xen to count IRQ 12 and compared that to the IRQ 12 count in //proc/interrupts (I stopped PS/2 activity and waited for PS/2 interrupt activity to stop before taking the counts). I lose one interrupt in Dom0 every time the trackpad freezes. > > > (XEN) IRQ 12 count 21048 > 12: 21047 0 xen-pirq-ioapic-edge i8042 <--- lost an interrupt in dom0 > ... > > (XEN) IRQ 12 count 48540 > 12: 48537 0 xen-pirq-ioapic-edge i8042 <--- lost 3 interrupts in dom0 > > > I looked at the point at which the trackpad gets it''s last interrupt in a trace and the other major activity at that time is the event channel that services the Qemu vcpu io_req code. > > This 2.6.38 tree has a merge of Stafano''s 2.6.39 fixes in drivers/xen/events.c. > > Anyone have any ideas or suggestions? >More data. The number of missing interrupts is equal to the number of times __do_IRQ_guest called send_guest_pirq and incremented already_pending. The number of IRQ 12 interrupts reported by /proc/interrupts is the same as the count of times __xen_evtchn_do_upcall called generic_handle_irq_desc for IRQ 12. So the issue has to be between send_guest_pirq in Xen and __xen_evtchn_do_upcall in dom0. -Tom Goetz _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On Fri, May 20, 2011 at 11:53:54AM -0400, Thomas Goetz wrote:> > On May 19, 2011, at 5:45 PM, Thomas Goetz wrote: > > > I''m running PVOPs 2.6.38 on Xen 4.0.2 RC3 and while booting a guest I lose interrupts for the PS/2 trackpad. The trackpad stops functioning because the device is waiting for service. I added a work around that calls i8042_interupt form a timer if it hasn''t been called in 1s and it started working again. I added some code to Xen to count IRQ 12 and compared that to the IRQ 12 count in //proc/interrupts (I stopped PS/2 activity and waited for PS/2 interrupt activity to stop before taking the counts). I lose one interrupt in Dom0 every time the trackpad freezes. > > > > > > (XEN) IRQ 12 count 21048 > > 12: 21047 0 xen-pirq-ioapic-edge i8042 <--- lost an interrupt in dom0 > > ... > > > > (XEN) IRQ 12 count 48540 > > 12: 48537 0 xen-pirq-ioapic-edge i8042 <--- lost 3 interrupts in dom0 > > > > > > I looked at the point at which the trackpad gets it''s last interrupt in a trace and the other major activity at that time is the event channel that services the Qemu vcpu io_req code. > > > > This 2.6.38 tree has a merge of Stafano''s 2.6.39 fixes in drivers/xen/events.c. > > > > Anyone have any ideas or suggestions? > > > > > More data. The number of missing interrupts is equal to the number of times __do_IRQ_guest called send_guest_pirq and incremented already_pending. The number of IRQ 12 interrupts reported by /proc/interrupts is the same as the count of times __xen_evtchn_do_upcall called generic_handle_irq_desc for IRQ 12. So the issue has to be between send_guest_pirq in Xen and __xen_evtchn_do_upcall in dom0.So extremly hairy code. Not sure if there was any work done in the send_guest_pirq, but I do know that __xen_evtchn_do_upcall had a fair bit of IRQ fair round-robin code added in. The git commits were 3b7bcdf xen: events: Remove redundant clear of l2i at end of round-robin loop 24b51c2 xen: events: Make round-robin scan fairer by snapshotting each l2 word once only ada6814 xen: events: Clean up round-robin evtchn scan. f1f4a32 xen: events: Make last processed event channel a per-cpu variable. ab7f863 xen: events: Process event channels notifications in round-robin order. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On May 20, 2011, at 1:50 PM, Konrad Rzeszutek Wilk wrote:> On Fri, May 20, 2011 at 11:53:54AM -0400, Thomas Goetz wrote: >> >> On May 19, 2011, at 5:45 PM, Thomas Goetz wrote: >> >>> I''m running PVOPs 2.6.38 on Xen 4.0.2 RC3 and while booting a guest I lose interrupts for the PS/2 trackpad. The trackpad stops functioning because the device is waiting for service. I added a work around that calls i8042_interupt form a timer if it hasn''t been called in 1s and it started working again. I added some code to Xen to count IRQ 12 and compared that to the IRQ 12 count in //proc/interrupts (I stopped PS/2 activity and waited for PS/2 interrupt activity to stop before taking the counts). I lose one interrupt in Dom0 every time the trackpad freezes. >>> >>> >>> (XEN) IRQ 12 count 21048 >>> 12: 21047 0 xen-pirq-ioapic-edge i8042 <--- lost an interrupt in dom0 >>> ... >>> >>> (XEN) IRQ 12 count 48540 >>> 12: 48537 0 xen-pirq-ioapic-edge i8042 <--- lost 3 interrupts in dom0 >>> >>> >>> I looked at the point at which the trackpad gets it''s last interrupt in a trace and the other major activity at that time is the event channel that services the Qemu vcpu io_req code. >>> >>> This 2.6.38 tree has a merge of Stafano''s 2.6.39 fixes in drivers/xen/events.c. >>> >>> Anyone have any ideas or suggestions? >>> >> >> >> More data. The number of missing interrupts is equal to the number of times __do_IRQ_guest called send_guest_pirq and incremented already_pending. The number of IRQ 12 interrupts reported by /proc/interrupts is the same as the count of times __xen_evtchn_do_upcall called generic_handle_irq_desc for IRQ 12. So the issue has to be between send_guest_pirq in Xen and __xen_evtchn_do_upcall in dom0. > > So extremly hairy code. Not sure if there was any work done in the send_guest_pirq, but I do > know that __xen_evtchn_do_upcall had a fair bit of IRQ fair round-robin code added in. > > The git commits were > > 3b7bcdf xen: events: Remove redundant clear of l2i at end of round-robin loop > 24b51c2 xen: events: Make round-robin scan fairer by snapshotting each l2 word once only > ada6814 xen: events: Clean up round-robin evtchn scan. > f1f4a32 xen: events: Make last processed event channel a per-cpu variable. > ab7f863 xen: events: Process event channels notifications in round-robin order.The PS/2 port has a one character buffer. It will only ever send one interrupt until it has been serviced. When __do_IRQ_guest calls send_guest_pirq and sees that it is already pending, what part of the between the bottom of __do_IRQ_guest and _irq_guest_eoi results in the pending interrupt being issued to the guest? I haven''t found that and it looks like Xen is merging the ACKTYPE_NONE edge interrupt resulting in the deice never being serviced when it''s buffer is full and never interrupting again. --- Tom Goetz tcgoetz@gmail.com _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
>>> On 20.05.11 at 20:06, Thomas Goetz <tcgoetz@gmail.com> wrote:> On May 20, 2011, at 1:50 PM, Konrad Rzeszutek Wilk wrote: > >> On Fri, May 20, 2011 at 11:53:54AM -0400, Thomas Goetz wrote: >>> >>> On May 19, 2011, at 5:45 PM, Thomas Goetz wrote: >>> >>>> I''m running PVOPs 2.6.38 on Xen 4.0.2 RC3 and while booting a guest I lose > interrupts for the PS/2 trackpad. The trackpad stops functioning because the > device is waiting for service. I added a work around that calls > i8042_interupt form a timer if it hasn''t been called in 1s and it started > working again. I added some code to Xen to count IRQ 12 and compared that to > the IRQ 12 count in //proc/interrupts (I stopped PS/2 activity and waited for > PS/2 interrupt activity to stop before taking the counts). I lose one > interrupt in Dom0 every time the trackpad freezes. >>>> >>>> >>>> (XEN) IRQ 12 count 21048 >>>> 12: 21047 0 xen-pirq-ioapic-edge i8042 <--- lost an interrupt in > dom0 >>>> ... >>>> >>>> (XEN) IRQ 12 count 48540 >>>> 12: 48537 0 xen-pirq-ioapic-edge i8042 <--- lost 3 interrupts in > dom0 >>>> >>>> >>>> I looked at the point at which the trackpad gets it''s last interrupt in a > trace and the other major activity at that time is the event channel that > services the Qemu vcpu io_req code. >>>> >>>> This 2.6.38 tree has a merge of Stafano''s 2.6.39 fixes in > drivers/xen/events.c. >>>> >>>> Anyone have any ideas or suggestions? >>>> >>> >>> >>> More data. The number of missing interrupts is equal to the number of times > __do_IRQ_guest called send_guest_pirq and incremented already_pending. The > number of IRQ 12 interrupts reported by /proc/interrupts is the same as the > count of times __xen_evtchn_do_upcall called generic_handle_irq_desc for IRQ > 12. So the issue has to be between send_guest_pirq in Xen and > __xen_evtchn_do_upcall in dom0. >> >> So extremly hairy code. Not sure if there was any work done in the > send_guest_pirq, but I do >> know that __xen_evtchn_do_upcall had a fair bit of IRQ fair round-robin code > added in. >> >> The git commits were >> >> 3b7bcdf xen: events: Remove redundant clear of l2i at end of round-robin loop >> 24b51c2 xen: events: Make round-robin scan fairer by snapshotting each l2 > word once only >> ada6814 xen: events: Clean up round-robin evtchn scan. >> f1f4a32 xen: events: Make last processed event channel a per-cpu variable. >> ab7f863 xen: events: Process event channels notifications in round-robin > order. > > The PS/2 port has a one character buffer. It will only ever send one > interrupt until it has been serviced. When __do_IRQ_guest calls > send_guest_pirq and sees that it is already pending, what part of the between > the bottom of __do_IRQ_guest and _irq_guest_eoi results in the pending > interrupt being issued to the guest? I haven''t found that and it looks like > Xen is merging the ACKTYPE_NONE edge interrupt resulting in the deice never > being serviced when it''s buffer is full and never interrupting again.It would be a bug of the 8042 if it sent a second interrupt when the first one wasn''t serviced (status and data port read) yet. It''s the nature of edge triggered interrupts that secondary instances are lost when the primary instance doesn''t get handled (in time). Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On May 23, 2011, at 4:26 AM, Jan Beulich wrote:>>>> >> >> The PS/2 port has a one character buffer. It will only ever send one >> interrupt until it has been serviced. When __do_IRQ_guest calls >> send_guest_pirq and sees that it is already pending, what part of the between >> the bottom of __do_IRQ_guest and _irq_guest_eoi results in the pending >> interrupt being issued to the guest? I haven''t found that and it looks like >> Xen is merging the ACKTYPE_NONE edge interrupt resulting in the deice never >> being serviced when it''s buffer is full and never interrupting again. > > It would be a bug of the 8042 if it sent a second interrupt when the > first one wasn''t serviced (status and data port read) yet. It''s the > nature of edge triggered interrupts that secondary instances are > lost when the primary instance doesn''t get handled (in time). > > Jan >My assumption is that at the point that the i8042 driver reads the data register a new interrupt happens. There is gap in time between when the data register is read and when the event channel pending state is cleared. Since the hypervisor ACKed the previous real interrupt before delivering it to the guest, there is nothing to stop the i8042 device from interrupting immediately after the data register is read. If it interrupt before the event channel pending state is cleared, then it will not be delivered to the guest and the EOI mechanism will be set up, but I haven''t found anything in that that will set up a delayed delivery of the second interrupt. In this situation the i8042 device has every reason to believe the second interrupt will be delivered. The previous interrupt was received and handled. Nothing is masked. Am I missing something? --- Tom Goetz tcgoetz@gmail.com _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
>>> On 23.05.11 at 14:09, Thomas Goetz <tcgoetz@gmail.com> wrote: > My assumption is that at the point that the i8042 driver reads the data > register a new interrupt happens. There is gap in time between when the data > register is read and when the event channel pending state is cleared. Since > the hypervisor ACKed the previous real interrupt before delivering it to the > guest, there is nothing to stop the i8042 device from interrupting > immediately after the data register is read. If it interrupt before the event > channel pending state is cleared, then it will not be delivered to the guestThat would be a bug in the control flow then. In our (non-pvops) kernel we make sure to clear the pending bit *before* calling handle_irq() (and after masking the event channel), which clearly is a requirement at least for edge triggered interrupts (not necessarily before calling handle_irq(), but before calling the device driver''s handler, i.e. in chip->ack()). Jan> and the EOI mechanism will be set up, but I haven''t found anything in that > that will set up a delayed delivery of the second interrupt. > > In this situation the i8042 device has every reason to believe the second > interrupt will be delivered. The previous interrupt was received and handled. > Nothing is masked. > > Am I missing something? > > --- > Tom Goetz > tcgoetz@gmail.com_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On Mon, 23 May 2011, Thomas Goetz wrote:> My assumption is that at the point that the i8042 driver reads the data register a new interrupt happens. There is gap in > time between when the data register is read and when the event channel pending state is cleared. Since the hypervisor > ACKed the previous real interrupt before delivering it to the guest, there is nothing to stop the i8042 device from > interrupting immediately after the data register is read. If it interrupt before the event channel pending state is > cleared, then it will not be delivered to the guest and the EOI mechanism will be set up, but I haven''t found anything in > that that will set up a delayed delivery of the second interrupt. > > In this situation the i8042 device has every reason to believe the second interrupt will be delivered. The previous > interrupt was received and handled. Nothing is masked. > > Am I missing something?I am assuming you have the latest version of my fixes to drivers/xen/events.c The problem you are describing shouldn''t happen because the interrupt handler returned by request_irq to i8042 is handle_edge_irq that calls chip->irq_ack() before handle_irq_event(). We implement irq_ack with a clear_evtchn() so by the time i8042_interrupt is called the event channel should have already been cleared. If a second interrupt is asserted right after i8042_interrupt reads the data port, handle_edge_irq is called again and this time because another interrupt of the same kind is already being handled, it will call mask_ack_irq(). On Xen this translates to mask_evtchn() and clear_evtchn(), so once again the event channel pending bit should be cleared. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On May 23, 2011, at 9:45 AM, Stefano Stabellini wrote:> On Mon, 23 May 2011, Thomas Goetz wrote: >> My assumption is that at the point that the i8042 driver reads the data register a new interrupt happens. There is gap in >> time between when the data register is read and when the event channel pending state is cleared. Since the hypervisor >> ACKed the previous real interrupt before delivering it to the guest, there is nothing to stop the i8042 device from >> interrupting immediately after the data register is read. If it interrupt before the event channel pending state is >> cleared, then it will not be delivered to the guest and the EOI mechanism will be set up, but I haven''t found anything in >> that that will set up a delayed delivery of the second interrupt. >> >> In this situation the i8042 device has every reason to believe the second interrupt will be delivered. The previous >> interrupt was received and handled. Nothing is masked. >> >> Am I missing something? > > > I am assuming you have the latest version of my fixes to > drivers/xen/events.cI''ll have a version ported from your 2.6.39 tree to my 2.6.38 tree. I''ll update my copy of your tree and make sure it''s up to date.> > The problem you are describing shouldn''t happen because the interrupt > handler returned by request_irq to i8042 is handle_edge_irq that calls > chip->irq_ack() before handle_irq_event().I checked on which method it is using and it''s using handle_fasteoi_irq. In fatc all of the IRQs under 16 are despite most being edge. Log snippet below. I''m looking into why pirq_needs_eoi is returning the wrong answer now.> We implement irq_ack with a clear_evtchn() so by the time > i8042_interrupt is called the event channel should have already been > cleared. > > If a second interrupt is asserted right after i8042_interrupt reads the > data port, handle_edge_irq is called again and this time because another > interrupt of the same kind is already being handled, it will call > mask_ack_irq(). > On Xen this translates to mask_evtchn() and clear_evtchn(), so once > again the event channel pending bit should be cleared.May 23 16:52:00 jcf kernel: [ 35.075367] [<ffffffff8143fb71>] ? i8042_interrupt+0x221/0x410 May 23 16:52:00 jcf kernel: [ 35.075373] [<ffffffff81292f8b>] ? radix_tree_lookup+0xb/0x10 May 23 16:52:00 jcf kernel: [ 35.075379] [<ffffffff810ce014>] ? handle_IRQ_event+0x54/0x180 May 23 16:52:00 jcf kernel: [ 35.075384] [<ffffffff810d08b4>] ? handle_fasteoi_irq+0x84/0x110 May 23 16:52:00 jcf kernel: [ 35.075389] [<ffffffff81324077>] ? __xen_evtchn_do_upcall+0x1a7/0x270 May 23 16:52:00 jcf kernel: [ 35.075395] [<ffffffff81007c6f>] ? xen_restore_fl_direct_end+0x0/0x1 May 23 16:52:00 jcf kernel: [ 35.075399] [<ffffffff81325def>] ? xen_evtchn_do_upcall+0x2f/0x50 May 23 16:52:00 jcf kernel: [ 35.075404] [<ffffffff8100ceae>] ? xen_do_hypervisor_callback+0x1e/0x30 May 23 16:52:00 jcf kernel: [ 35.075407] <EOI> [<ffffffff810012eb>] ? hypercall_page+0x2eb/0x1000 [ 0.000000] xen_map_pirq_gsi: irq 1 handle_fasteoi_irq [ 0.000000] xen_map_pirq_gsi: irq 2 handle_fasteoi_irq [ 0.000000] xen_map_pirq_gsi: irq 3 handle_fasteoi_irq [ 0.000000] xen_map_pirq_gsi: irq 4 handle_fasteoi_irq [ 0.000000] xen_map_pirq_gsi: irq 5 handle_fasteoi_irq [ 0.000000] xen_map_pirq_gsi: irq 6 handle_fasteoi_irq [ 0.000000] xen_map_pirq_gsi: irq 7 handle_fasteoi_irq [ 0.000000] xen_map_pirq_gsi: irq 8 handle_fasteoi_irq [ 0.000000] xen_map_pirq_gsi: returning irq 9 for gsi 9 [ 0.000000] xen_map_pirq_gsi: irq 10 handle_fasteoi_irq [ 0.000000] xen_map_pirq_gsi: irq 11 handle_fasteoi_irq [ 0.000000] xen_map_pirq_gsi: irq 12 handle_fasteoi_irq [ 0.000000] xen_map_pirq_gsi: irq 13 handle_fasteoi_irq [ 0.000000] xen_map_pirq_gsi: irq 14 handle_fasteoi_irq [ 0.000000] xen_map_pirq_gsi: irq 15 handle_fasteoi_irq 1: 8 0 xen-pirq-ioapic-edge i8042 3: 1 0 xen-pirq-ioapic-edge 4: 1 0 xen-pirq-ioapic-edge 5: 1 0 xen-pirq-ioapic-edge 7: 1 0 xen-pirq-ioapic-edge 8: 0 0 xen-pirq-ioapic-edge rtc0 9: 1129 0 xen-pirq-ioapic-level acpi 10: 1 0 xen-pirq-ioapic-edge 11: 1 0 xen-pirq-ioapic-edge 12: 4032 0 xen-pirq-ioapic-edge i8042 14: 153 0 xen-pirq-ioapic-edge ata_piix 15: 0 0 xen-pirq-ioapic-edge ata_piix --- Tom Goetz tcgoetz@gmail.com _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On May 23, 2011, at 1:16 PM, Thomas Goetz wrote:> > On May 23, 2011, at 9:45 AM, Stefano Stabellini wrote: > >> On Mon, 23 May 2011, Thomas Goetz wrote: >>> My assumption is that at the point that the i8042 driver reads the data register a new interrupt happens. There is gap in >>> time between when the data register is read and when the event channel pending state is cleared. Since the hypervisor >>> ACKed the previous real interrupt before delivering it to the guest, there is nothing to stop the i8042 device from >>> interrupting immediately after the data register is read. If it interrupt before the event channel pending state is >>> cleared, then it will not be delivered to the guest and the EOI mechanism will be set up, but I haven''t found anything in >>> that that will set up a delayed delivery of the second interrupt. >>> >>> In this situation the i8042 device has every reason to believe the second interrupt will be delivered. The previous >>> interrupt was received and handled. Nothing is masked. >>> >>> Am I missing something? >> >> >> I am assuming you have the latest version of my fixes to >> drivers/xen/events.c > > I''ll have a version ported from your 2.6.39 tree to my 2.6.38 tree. I''ll update my copy of your tree and make sure it''s up to date. > >> >> The problem you are describing shouldn''t happen because the interrupt >> handler returned by request_irq to i8042 is handle_edge_irq that calls >> chip->irq_ack() before handle_irq_event(). > > I checked on which method it is using and it''s using handle_fasteoi_irq. In fatc all of the IRQs under 16 are despite most being edge. Log snippet below. I''m looking into why pirq_needs_eoi is returning the wrong answer now.pirq_needs_eoi checks info->u.pirq.flags & PIRQ_NEEDS_EOI. PIRQ_NEEDS_EOI is only set by pirq_query_unmask which sets it based on the hypercall PHYSDEVOP_irq_status_query which in Xen 4.0.1 and Xen unstable always returns an EOI is needed. Stefano, I don''t see any changes in your 2.6.39 tree that would effect this. Relevant code snippets included below: if (pirq_needs_eoi(irq)) { printk(KERN_ERR "%s: irq %d handle_fasteoi_irq\n", __FUNCTION__, irq); set_irq_chip_and_handler_name(irq, &xen_pirq_chip, handle_fasteoi_irq, name); } else { printk(KERN_ERR "%s: irq %d handle_edge_irq\n", __FUNCTION__, irq); set_irq_chip_and_handler_name(irq, &xen_pirq_chip, handle_edge_irq, name); } static bool pirq_needs_eoi(unsigned irq) { struct irq_info *info = info_for_irq(irq); BUG_ON(info->type != IRQT_PIRQ); return info->u.pirq.flags & PIRQ_NEEDS_EOI; } static void pirq_query_unmask(int irq) { struct physdev_irq_status_query irq_status; struct irq_info *info = info_for_irq(irq); BUG_ON(info->type != IRQT_PIRQ); irq_status.irq = pirq_from_irq(irq); if (HYPERVISOR_physdev_op(PHYSDEVOP_irq_status_query, &irq_status)) irq_status.flags = 0; printk(KERN_ERR "%s: irq %d needs eoi %d\n", __FUNCTION__, irq, (irq_status.flags & XENIRQSTAT_needs_eoi) == XENIRQSTAT_needs_eoi); info->u.pirq.flags &= ~PIRQ_NEEDS_EOI; if (irq_status.flags & XENIRQSTAT_needs_eoi) info->u.pirq.flags |= PIRQ_NEEDS_EOI; } case PHYSDEVOP_irq_status_query: { struct physdev_irq_status_query irq_status_query; ret = -EFAULT; if ( copy_from_guest(&irq_status_query, arg, 1) != 0 ) break; irq = irq_status_query.irq; ret = -EINVAL; if ( (irq < 0) || (irq >= v->domain->nr_pirqs) ) break; irq_status_query.flags = 0; /* * Even edge-triggered or message-based IRQs can need masking from * time to time. If teh guest is not dynamically checking for this * via the new pirq_eoi_map mechanism, it must conservatively always * execute the EOI hypercall. In practice, this only really makes a * difference for maskable MSI sources, and if those are supported * then dom0 is probably modern anyway. */ irq_status_query.flags |= XENIRQSTAT_needs_eoi; if ( pirq_shared(v->domain, irq) ) irq_status_query.flags |= XENIRQSTAT_shared; ret = copy_to_guest(arg, &irq_status_query, 1) ? -EFAULT : 0; break; } --- Tom Goetz tcgoetz@gmail.com _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On May 23, 2011, at 1:28 PM, Thomas Goetz wrote:> > On May 23, 2011, at 1:16 PM, Thomas Goetz wrote: > >> >> On May 23, 2011, at 9:45 AM, Stefano Stabellini wrote: >> >>> On Mon, 23 May 2011, Thomas Goetz wrote: >>>> My assumption is that at the point that the i8042 driver reads the data register a new interrupt happens. There is gap in >>>> time between when the data register is read and when the event channel pending state is cleared. Since the hypervisor >>>> ACKed the previous real interrupt before delivering it to the guest, there is nothing to stop the i8042 device from >>>> interrupting immediately after the data register is read. If it interrupt before the event channel pending state is >>>> cleared, then it will not be delivered to the guest and the EOI mechanism will be set up, but I haven''t found anything in >>>> that that will set up a delayed delivery of the second interrupt. >>>> >>>> In this situation the i8042 device has every reason to believe the second interrupt will be delivered. The previous >>>> interrupt was received and handled. Nothing is masked. >>>> >>>> Am I missing something? >>> >>> >>> I am assuming you have the latest version of my fixes to >>> drivers/xen/events.c >> >> I''ll have a version ported from your 2.6.39 tree to my 2.6.38 tree. I''ll update my copy of your tree and make sure it''s up to date. >> >>> >>> The problem you are describing shouldn''t happen because the interrupt >>> handler returned by request_irq to i8042 is handle_edge_irq that calls >>> chip->irq_ack() before handle_irq_event(). >> >> I checked on which method it is using and it''s using handle_fasteoi_irq. In fatc all of the IRQs under 16 are despite most being edge. Log snippet below. I''m looking into why pirq_needs_eoi is returning the wrong answer now. > > > pirq_needs_eoi checks info->u.pirq.flags & PIRQ_NEEDS_EOI. PIRQ_NEEDS_EOI is only set by pirq_query_unmask which sets it based on the hypercall PHYSDEVOP_irq_status_query which in Xen 4.0.1 and Xen unstable always returns an EOI is needed. Stefano, I don''t see any changes in your 2.6.39 tree that would effect this.I''m running with the attached workaround and I''m the PS/2 issue is gone. drivers/xen/events.c :: xen_map_pirq_gsi if ((strcmp(name, "ioapic-edge") != 0) && pirq_needs_eoi(irq)) { set_irq_chip_and_handler_name(irq, &xen_pirq_chip, handle_fasteoi_irq, name); } else { set_irq_chip_and_handler_name(irq, &xen_pirq_chip, handle_edge_irq, name); } --- Tom Goetz tcgoetz@gmail.com _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
>>> On 23.05.11 at 19:28, Thomas Goetz <tcgoetz@gmail.com> wrote:> On May 23, 2011, at 1:16 PM, Thomas Goetz wrote: > >> >> On May 23, 2011, at 9:45 AM, Stefano Stabellini wrote: >> >>> On Mon, 23 May 2011, Thomas Goetz wrote: >>>> My assumption is that at the point that the i8042 driver reads the data > register a new interrupt happens. There is gap in >>>> time between when the data register is read and when the event channel > pending state is cleared. Since the hypervisor >>>> ACKed the previous real interrupt before delivering it to the guest, there > is nothing to stop the i8042 device from >>>> interrupting immediately after the data register is read. If it interrupt > before the event channel pending state is >>>> cleared, then it will not be delivered to the guest and the EOI mechanism > will be set up, but I haven''t found anything in >>>> that that will set up a delayed delivery of the second interrupt. >>>> >>>> In this situation the i8042 device has every reason to believe the second > interrupt will be delivered. The previous >>>> interrupt was received and handled. Nothing is masked. >>>> >>>> Am I missing something? >>> >>> >>> I am assuming you have the latest version of my fixes to >>> drivers/xen/events.c >> >> I''ll have a version ported from your 2.6.39 tree to my 2.6.38 tree. I''ll > update my copy of your tree and make sure it''s up to date. >> >>> >>> The problem you are describing shouldn''t happen because the interrupt >>> handler returned by request_irq to i8042 is handle_edge_irq that calls >>> chip->irq_ack() before handle_irq_event(). >> >> I checked on which method it is using and it''s using handle_fasteoi_irq. In > fatc all of the IRQs under 16 are despite most being edge. Log snippet below. > I''m looking into why pirq_needs_eoi is returning the wrong answer now. > > > pirq_needs_eoi checks info->u.pirq.flags & PIRQ_NEEDS_EOI. PIRQ_NEEDS_EOI is > only set by pirq_query_unmask which sets it based on the hypercall > PHYSDEVOP_irq_status_query which in Xen 4.0.1 and Xen unstable always returns > an EOI is needed. Stefano, I don''t see any changes in your 2.6.39 tree that > would effect this. > > Relevant code snippets included below: > > if (pirq_needs_eoi(irq)) { > printk(KERN_ERR "%s: irq %d handle_fasteoi_irq\n", > __FUNCTION__, irq); > set_irq_chip_and_handler_name(irq, &xen_pirq_chip, > handle_fasteoi_irq, name); > } else { > printk(KERN_ERR "%s: irq %d handle_edge_irq\n", > __FUNCTION__, irq); > set_irq_chip_and_handler_name(irq, &xen_pirq_chip, > handle_edge_irq, name); > }Now this, imo, is a very good reason to not use handle_edge_irq() at all, and instead use the prior control flow (masking and clearing the event channel up front in do_upcall()) with only fasteoi (leaving aside per-CPU ones). Jan> static bool pirq_needs_eoi(unsigned irq) > { > struct irq_info *info = info_for_irq(irq); > > BUG_ON(info->type != IRQT_PIRQ); > > return info->u.pirq.flags & PIRQ_NEEDS_EOI; > } > > static void pirq_query_unmask(int irq) > { > struct physdev_irq_status_query irq_status; > struct irq_info *info = info_for_irq(irq); > > BUG_ON(info->type != IRQT_PIRQ); > > irq_status.irq = pirq_from_irq(irq); > if (HYPERVISOR_physdev_op(PHYSDEVOP_irq_status_query, &irq_status)) > irq_status.flags = 0; > > printk(KERN_ERR "%s: irq %d needs eoi %d\n", __FUNCTION__, irq, > (irq_status.flags & XENIRQSTAT_needs_eoi) == XENIRQSTAT_needs_eoi); > > info->u.pirq.flags &= ~PIRQ_NEEDS_EOI; > if (irq_status.flags & XENIRQSTAT_needs_eoi) > info->u.pirq.flags |= PIRQ_NEEDS_EOI; > } > > case PHYSDEVOP_irq_status_query: { > struct physdev_irq_status_query irq_status_query; > ret = -EFAULT; > if ( copy_from_guest(&irq_status_query, arg, 1) != 0 ) > break; > irq = irq_status_query.irq; > ret = -EINVAL; > if ( (irq < 0) || (irq >= v->domain->nr_pirqs) ) > break; > irq_status_query.flags = 0; > /* > * Even edge-triggered or message-based IRQs can need masking from > * time to time. If teh guest is not dynamically checking for this > * via the new pirq_eoi_map mechanism, it must conservatively always > * execute the EOI hypercall. In practice, this only really makes a > * difference for maskable MSI sources, and if those are supported > * then dom0 is probably modern anyway. > */ > irq_status_query.flags |= XENIRQSTAT_needs_eoi; > if ( pirq_shared(v->domain, irq) ) > irq_status_query.flags |= XENIRQSTAT_shared; > ret = copy_to_guest(arg, &irq_status_query, 1) ? -EFAULT : 0; > break; > } > > > --- > Tom Goetz > tcgoetz@gmail.com_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On Tue, 24 May 2011, Jan Beulich wrote:> > Relevant code snippets included below: > > > > if (pirq_needs_eoi(irq)) { > > printk(KERN_ERR "%s: irq %d handle_fasteoi_irq\n", > > __FUNCTION__, irq); > > set_irq_chip_and_handler_name(irq, &xen_pirq_chip, > > handle_fasteoi_irq, name); > > } else { > > printk(KERN_ERR "%s: irq %d handle_edge_irq\n", > > __FUNCTION__, irq); > > set_irq_chip_and_handler_name(irq, &xen_pirq_chip, > > handle_edge_irq, name); > > } > > Now this, imo, is a very good reason to not use handle_edge_irq() > at all, and instead use the prior control flow (masking and clearing > the event channel up front in do_upcall()) with only fasteoi (leaving > aside per-CPU ones). >Actually I think it is a good reason to fix pirq_needs_eoi that shouldn''t return unconditionally yes if dom0 doesn''t support pirq_eoi_map. The comment in Xen says: /* * Even edge-triggered or message-based IRQs can need masking from * time to time. If teh guest is not dynamically checking for this * via the new pirq_eoi_map mechanism, it must conservatively always * execute the EOI hypercall. In practice, this only really makes a * difference for maskable MSI sources, and if those are supported * then dom0 is probably modern anyway. */ Considering that I would rather avoid supporting pirq_eoi_map and we are talking about edge triggered interrupts, do you think it would be safe for me to send a patch to xen to change this behaviour? Shouldn''t we set XENIRQSTAT_needs_eoi only for level triggered interrupts (and maybe maskable MSI sources)? _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
>>> On 24.05.11 at 13:04, Stefano Stabellini <stefano.stabellini@eu.citrix.com>wrote:> On Tue, 24 May 2011, Jan Beulich wrote: >> > Relevant code snippets included below: >> > >> > if (pirq_needs_eoi(irq)) { >> > printk(KERN_ERR "%s: irq %d handle_fasteoi_irq\n", >> > __FUNCTION__, irq); >> > set_irq_chip_and_handler_name(irq, &xen_pirq_chip, >> > handle_fasteoi_irq, name); >> > } else { >> > printk(KERN_ERR "%s: irq %d handle_edge_irq\n", >> > __FUNCTION__, irq); >> > set_irq_chip_and_handler_name(irq, &xen_pirq_chip, >> > handle_edge_irq, name); >> > } >> >> Now this, imo, is a very good reason to not use handle_edge_irq() >> at all, and instead use the prior control flow (masking and clearing >> the event channel up front in do_upcall()) with only fasteoi (leaving >> aside per-CPU ones). >> > > Actually I think it is a good reason to fix pirq_needs_eoi that shouldn''t > return unconditionally yes if dom0 doesn''t support pirq_eoi_map. > The comment in Xen says: > > /* > * Even edge-triggered or message-based IRQs can need masking from > * time to time. If teh guest is not dynamically checking for this > * via the new pirq_eoi_map mechanism, it must conservatively always > * execute the EOI hypercall. In practice, this only really makes a > * difference for maskable MSI sources, and if those are supported > * then dom0 is probably modern anyway. > */ > > Considering that I would rather avoid supporting pirq_eoi_map and we are > talking about edge triggered interrupts, do you think it would be safe > for me to send a patch to xen to change this behaviour? > Shouldn''t we set XENIRQSTAT_needs_eoi only for level triggered > interrupts (and maybe maskable MSI sources)?Only if you can prove that the very first part of that comment is incorrect (in including "edge-triggered" and ignoring whether MSI sources are maskable). And your Linux side code would then still be incorrect for maskable MSIs (you''d continue to handle them as fasteoi with no up front clearing/masking while that is necessary as Thomas'' report made clear). What''s so wrong with pirq_eoi_map that you''re trying to avoid it by all means? Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On Tue, May 24, 2011 at 01:24:24PM +0100, Jan Beulich wrote:> >>> On 24.05.11 at 13:04, Stefano Stabellini <stefano.stabellini@eu.citrix.com> > wrote: > > On Tue, 24 May 2011, Jan Beulich wrote: > >> > Relevant code snippets included below: > >> > > >> > if (pirq_needs_eoi(irq)) { > >> > printk(KERN_ERR "%s: irq %d handle_fasteoi_irq\n", > >> > __FUNCTION__, irq); > >> > set_irq_chip_and_handler_name(irq, &xen_pirq_chip, > >> > handle_fasteoi_irq, name); > >> > } else { > >> > printk(KERN_ERR "%s: irq %d handle_edge_irq\n", > >> > __FUNCTION__, irq); > >> > set_irq_chip_and_handler_name(irq, &xen_pirq_chip, > >> > handle_edge_irq, name); > >> > } > >> > >> Now this, imo, is a very good reason to not use handle_edge_irq() > >> at all, and instead use the prior control flow (masking and clearing > >> the event channel up front in do_upcall()) with only fasteoi (leaving > >> aside per-CPU ones). > >> > > > > Actually I think it is a good reason to fix pirq_needs_eoi that shouldn''t > > return unconditionally yes if dom0 doesn''t support pirq_eoi_map. > > The comment in Xen says: > > > > /* > > * Even edge-triggered or message-based IRQs can need masking from > > * time to time. If teh guest is not dynamically checking for this > > * via the new pirq_eoi_map mechanism, it must conservatively always > > * execute the EOI hypercall. In practice, this only really makes a > > * difference for maskable MSI sources, and if those are supported > > * then dom0 is probably modern anyway. > > */ > > > > Considering that I would rather avoid supporting pirq_eoi_map and we are > > talking about edge triggered interrupts, do you think it would be safe > > for me to send a patch to xen to change this behaviour? > > Shouldn''t we set XENIRQSTAT_needs_eoi only for level triggered > > interrupts (and maybe maskable MSI sources)? > > Only if you can prove that the very first part of that comment is > incorrect (in including "edge-triggered" and ignoring whether MSI > sources are maskable). And your Linux side code would then still > be incorrect for maskable MSIs (you''d continue to handle them > as fasteoi with no up front clearing/masking while that is necessary > as Thomas'' report made clear).I believe we handle MSI''s as ''handle_edge_chip'' (xen_bind_pirq_msi_to_irq) irregardless of the above hypercall. You wouldn''t have a nice chart of the right type of events to do for different types of interrupts, would you?> > What''s so wrong with pirq_eoi_map that you''re trying to avoid it > by all means?My recollection is that we had a hard time trying to work it in with the tglr''x rewrite of the IRQ code and not enough understanding of this (at least on my side). Any help here would be appreciated. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On Tue, 24 May 2011, Jan Beulich wrote:> > Actually I think it is a good reason to fix pirq_needs_eoi that shouldn''t > > return unconditionally yes if dom0 doesn''t support pirq_eoi_map. > > The comment in Xen says: > > > > /* > > * Even edge-triggered or message-based IRQs can need masking from > > * time to time. If teh guest is not dynamically checking for this > > * via the new pirq_eoi_map mechanism, it must conservatively always > > * execute the EOI hypercall. In practice, this only really makes a > > * difference for maskable MSI sources, and if those are supported > > * then dom0 is probably modern anyway. > > */ > > > > Considering that I would rather avoid supporting pirq_eoi_map and we are > > talking about edge triggered interrupts, do you think it would be safe > > for me to send a patch to xen to change this behaviour? > > Shouldn''t we set XENIRQSTAT_needs_eoi only for level triggered > > interrupts (and maybe maskable MSI sources)? > > Only if you can prove that the very first part of that comment is > incorrect (in including "edge-triggered" and ignoring whether MSI > sources are maskable). And your Linux side code would then still > be incorrect for maskable MSIs (you''d continue to handle them > as fasteoi with no up front clearing/masking while that is necessary > as Thomas'' report made clear). > > What''s so wrong with pirq_eoi_map that you''re trying to avoid it > by all means?The main issue is that if pirq_eoi_map is enabled PHYSDEVOP_eoi automatically unmask the event channel. There isn''t even a way to specify if we want the unmask to be done or not, it just does it. I also think that it is a violation of the interface, see this comment from xen/include/public/xen.h: * Event channels are addressed by a "port index". Each channel is * associated with two bits of information: * 1. PENDING -- notifies the domain that there is a pending notification * to be processed. This bit is cleared by the guest. * 2. MASK -- if this bit is clear then a 0->1 transition of PENDING * will cause an asynchronous upcall to be scheduled. This bit is only --> * updated by the guest. It is read-only within Xen. If a channel * becomes pending while the channel is masked then the ''edge'' is lost * (i.e., when the channel is unmasked, the guest must manually handle * pending notifications as no upcall will be scheduled by Xen). _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On Mon, 23 May 2011, Thomas Goetz wrote:> I''m running with the attached workaround and I''m the PS/2 issue is gone. > > drivers/xen/events.c :: xen_map_pirq_gsi > > if ((strcmp(name, "ioapic-edge") != 0) && pirq_needs_eoi(irq)) { > set_irq_chip_and_handler_name(irq, &xen_pirq_chip, > handle_fasteoi_irq, name); > } else { > set_irq_chip_and_handler_name(irq, &xen_pirq_chip, > handle_edge_irq, name); > }I had the same idea while reading your previous email. I think the following patch is better than strcmp name: --- Use the trigger info we already have to choose the irq handler Do not use pirq_needs_eoi to decide which irq handler to use because Xen always returns true if the guest does not support pirq_eoi_map. Use the trigger information we already have from MP-tables and ACPI. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> diff --git a/drivers/xen/events.c b/drivers/xen/events.c index 7f676f8..8418398 100644 --- a/drivers/xen/events.c +++ b/drivers/xen/events.c @@ -627,6 +627,9 @@ int xen_allocate_pirq_gsi(unsigned gsi) * * Note: We don''t assign an event channel until the irq actually started * up. Return an existing irq if we''ve already got one for the gsi. + * + * Shareable implies level triggered, not shareable implies edge + * triggered here. */ int xen_bind_pirq_gsi_to_irq(unsigned gsi, unsigned pirq, int shareable, char *name) @@ -665,16 +668,13 @@ int xen_bind_pirq_gsi_to_irq(unsigned gsi, pirq_query_unmask(irq); /* We try to use the handler with the appropriate semantic for the - * type of interrupt: if the interrupt doesn''t need an eoi - * (pirq_needs_eoi returns false), we treat it like an edge - * triggered interrupt so we use handle_edge_irq. - * As a matter of fact this only happens when the corresponding - * physical interrupt is edge triggered or an msi. + * type of interrupt: if the interrupt is an edge triggered + * interrupt we use handle_edge_irq. * - * On the other hand if the interrupt needs an eoi (pirq_needs_eoi - * returns true) we treat it like a level triggered interrupt so we - * use handle_fasteoi_irq like the native code does for this kind of + * On the other hand if the interrupt is level triggered we use + * handle_fasteoi_irq like the native code does for this kind of * interrupts. + * * Depending on the Xen version, pirq_needs_eoi might return true * not only for level triggered interrupts but for edge triggered * interrupts too. In any case Xen always honors the eoi mechanism, @@ -682,7 +682,7 @@ int xen_bind_pirq_gsi_to_irq(unsigned gsi, * hasn''t received an eoi yet. Therefore using the fasteoi handler * is the right choice either way. */ - if (pirq_needs_eoi(irq)) + if (shareable) irq_set_chip_and_handler_name(irq, &xen_pirq_chip, handle_fasteoi_irq, name); else _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On May 24, 2011, at 9:53 AM, Stefano Stabellini wrote:> On Mon, 23 May 2011, Thomas Goetz wrote: >> I''m running with the attached workaround and I''m the PS/2 issue is gone. >> >> drivers/xen/events.c :: xen_map_pirq_gsi >> >> if ((strcmp(name, "ioapic-edge") != 0) && pirq_needs_eoi(irq)) { >> set_irq_chip_and_handler_name(irq, &xen_pirq_chip, >> handle_fasteoi_irq, name); >> } else { >> set_irq_chip_and_handler_name(irq, &xen_pirq_chip, >> handle_edge_irq, name); >> } > > I had the same idea while reading your previous email. > I think the following patch is better than strcmp name: > > --- > > Use the trigger info we already have to choose the irq handler > > Do not use pirq_needs_eoi to decide which irq handler to use because Xen > always returns true if the guest does not support pirq_eoi_map. > Use the trigger information we already have from MP-tables and ACPI. > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > diff --git a/drivers/xen/events.c b/drivers/xen/events.c > index 7f676f8..8418398 100644 > --- a/drivers/xen/events.c > +++ b/drivers/xen/events.c > @@ -627,6 +627,9 @@ int xen_allocate_pirq_gsi(unsigned gsi) > * > * Note: We don''t assign an event channel until the irq actually started > * up. Return an existing irq if we''ve already got one for the gsi. > + * > + * Shareable implies level triggered, not shareable implies edge > + * triggered here. > */ > int xen_bind_pirq_gsi_to_irq(unsigned gsi, > unsigned pirq, int shareable, char *name) > @@ -665,16 +668,13 @@ int xen_bind_pirq_gsi_to_irq(unsigned gsi, > > pirq_query_unmask(irq); > /* We try to use the handler with the appropriate semantic for the > - * type of interrupt: if the interrupt doesn''t need an eoi > - * (pirq_needs_eoi returns false), we treat it like an edge > - * triggered interrupt so we use handle_edge_irq. > - * As a matter of fact this only happens when the corresponding > - * physical interrupt is edge triggered or an msi. > + * type of interrupt: if the interrupt is an edge triggered > + * interrupt we use handle_edge_irq. > * > - * On the other hand if the interrupt needs an eoi (pirq_needs_eoi > - * returns true) we treat it like a level triggered interrupt so we > - * use handle_fasteoi_irq like the native code does for this kind of > + * On the other hand if the interrupt is level triggered we use > + * handle_fasteoi_irq like the native code does for this kind of > * interrupts. > + * > * Depending on the Xen version, pirq_needs_eoi might return true > * not only for level triggered interrupts but for edge triggered > * interrupts too. In any case Xen always honors the eoi mechanism, > @@ -682,7 +682,7 @@ int xen_bind_pirq_gsi_to_irq(unsigned gsi, > * hasn''t received an eoi yet. Therefore using the fasteoi handler > * is the right choice either way. > */ > - if (pirq_needs_eoi(irq)) > + if (shareable) > irq_set_chip_and_handler_name(irq, &xen_pirq_chip, > handle_fasteoi_irq, name); > else >I tested this patch and it worked fine for me. It will go into nightly testing tonight and I''ll let you know if we hit any issues. --- Tom Goetz tcgoetz@gmail.com _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
>>> On 24.05.11 at 15:52, Stefano Stabellini <stefano.stabellini@eu.citrix.com>wrote:> On Tue, 24 May 2011, Jan Beulich wrote: >> > Actually I think it is a good reason to fix pirq_needs_eoi that shouldn''t >> > return unconditionally yes if dom0 doesn''t support pirq_eoi_map. >> > The comment in Xen says: >> > >> > /* >> > * Even edge-triggered or message-based IRQs can need masking from >> > * time to time. If teh guest is not dynamically checking for this >> > * via the new pirq_eoi_map mechanism, it must conservatively always >> > * execute the EOI hypercall. In practice, this only really makes a >> > * difference for maskable MSI sources, and if those are supported >> > * then dom0 is probably modern anyway. >> > */ >> > >> > Considering that I would rather avoid supporting pirq_eoi_map and we are >> > talking about edge triggered interrupts, do you think it would be safe >> > for me to send a patch to xen to change this behaviour? >> > Shouldn''t we set XENIRQSTAT_needs_eoi only for level triggered >> > interrupts (and maybe maskable MSI sources)? >> >> Only if you can prove that the very first part of that comment is >> incorrect (in including "edge-triggered" and ignoring whether MSI >> sources are maskable). And your Linux side code would then still >> be incorrect for maskable MSIs (you''d continue to handle them >> as fasteoi with no up front clearing/masking while that is necessary >> as Thomas'' report made clear). >> >> What''s so wrong with pirq_eoi_map that you''re trying to avoid it >> by all means? > > The main issue is that if pirq_eoi_map is enabled PHYSDEVOP_eoi > automatically unmask the event channel. > There isn''t even a way to specify if we want the unmask to be done or > not, it just does it.I can''t think of situations where this would be a problem. It certainly never has been in our kernels.> I also think that it is a violation of the interface, see this comment > from xen/include/public/xen.h: > > * Event channels are addressed by a "port index". Each channel is > * associated with two bits of information: > * 1. PENDING -- notifies the domain that there is a pending notification > * to be processed. This bit is cleared by the guest. > * 2. MASK -- if this bit is clear then a 0->1 transition of PENDING > * will cause an asynchronous upcall to be scheduled. This bit is only > --> * updated by the guest. It is read-only within Xen. If a channelYeah, that should have been updated when the new feature got introduced. But I''m sure you know how things go wrt documentation (especially when a comment like this sits far away from any code touched during the implementation of something new)... Anyway - if a kernel is using the new feature, it clearly ought to be aware that the bitmap then no longer is read-only to the hypervisor. Jan> * becomes pending while the channel is masked then the ''edge'' is lost > * (i.e., when the channel is unmasked, the guest must manually handle > * pending notifications as no upcall will be scheduled by Xen)._______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
>>> On 24.05.11 at 14:58, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote: > On Tue, May 24, 2011 at 01:24:24PM +0100, Jan Beulich wrote: >> >>> On 24.05.11 at 13:04, Stefano Stabellini <stefano.stabellini@eu.citrix.com> >> wrote: >> > On Tue, 24 May 2011, Jan Beulich wrote: >> >> > Relevant code snippets included below: >> >> > >> >> > if (pirq_needs_eoi(irq)) { >> >> > printk(KERN_ERR "%s: irq %d handle_fasteoi_irq\n", >> >> > __FUNCTION__, irq); >> >> > set_irq_chip_and_handler_name(irq, &xen_pirq_chip, >> >> > handle_fasteoi_irq, name); >> >> > } else { >> >> > printk(KERN_ERR "%s: irq %d handle_edge_irq\n", >> >> > __FUNCTION__, irq); >> >> > set_irq_chip_and_handler_name(irq, &xen_pirq_chip, >> >> > handle_edge_irq, name); >> >> > } >> >> >> >> Now this, imo, is a very good reason to not use handle_edge_irq() >> >> at all, and instead use the prior control flow (masking and clearing >> >> the event channel up front in do_upcall()) with only fasteoi (leaving >> >> aside per-CPU ones). >> >> >> > >> > Actually I think it is a good reason to fix pirq_needs_eoi that shouldn''t >> > return unconditionally yes if dom0 doesn''t support pirq_eoi_map. >> > The comment in Xen says: >> > >> > /* >> > * Even edge-triggered or message-based IRQs can need masking from >> > * time to time. If teh guest is not dynamically checking for this >> > * via the new pirq_eoi_map mechanism, it must conservatively always >> > * execute the EOI hypercall. In practice, this only really makes a >> > * difference for maskable MSI sources, and if those are supported >> > * then dom0 is probably modern anyway. >> > */ >> > >> > Considering that I would rather avoid supporting pirq_eoi_map and we are >> > talking about edge triggered interrupts, do you think it would be safe >> > for me to send a patch to xen to change this behaviour? >> > Shouldn''t we set XENIRQSTAT_needs_eoi only for level triggered >> > interrupts (and maybe maskable MSI sources)? >> >> Only if you can prove that the very first part of that comment is >> incorrect (in including "edge-triggered" and ignoring whether MSI >> sources are maskable). And your Linux side code would then still >> be incorrect for maskable MSIs (you''d continue to handle them >> as fasteoi with no up front clearing/masking while that is necessary >> as Thomas'' report made clear). > > I believe we handle MSI''s as ''handle_edge_chip'' (xen_bind_pirq_msi_to_irq) > irregardless of the above hypercall.So how do you issue the possibly necessary EOI call then?> You wouldn''t have a nice chart of the right type of events to > do for different types of interrupts, would you?No, sorry - all I usually look at are the three more or less different implementations. Jan>> >> What''s so wrong with pirq_eoi_map that you''re trying to avoid it >> by all means? > > My recollection is that we had a hard time trying to work it in with the > tglr''x rewrite of the IRQ code and not enough understanding of this (at > least > on my side). Any help here would be appreciated._______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On Tue, 24 May 2011, Thomas Goetz wrote:> I tested this patch and it worked fine for me. It will go into nightly testing tonight and I''ll let you know if we hit any issues. >thank you very much for your help on this> > >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On Tue, 24 May 2011, Jan Beulich wrote:> So how do you issue the possibly necessary EOI call then?we issue the EOI call from the irq_ack callback _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On Tue, 24 May 2011, Jan Beulich wrote:> > The main issue is that if pirq_eoi_map is enabled PHYSDEVOP_eoi > > automatically unmask the event channel. > > There isn''t even a way to specify if we want the unmask to be done or > > not, it just does it. > > I can''t think of situations where this would be a problem. It certainly > never has been in our kernels.The situation is simple: we want an event channel to stay masked (because for example the corresponding irq has been disabled) but at the same time we need to eoi the pirq. AFACT it is not possible to do this with pirq_eoi_map. An event channel might be masked for a number of reasons, it is wrong to assume that an eoi should enable it again. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel