Scott L. Lykens
2014-Jun-01 12:41 UTC
[asterisk-users] wct4xxp Excessive Interrupts Resulting in Unusable System or Card
Hello all-

I have a Digium TE410P in an HP DL145 G2 dual processor server that generates well over 100,000 interrupts per second (sometimes I've counted 160,000+ per second). This generally results in either the system becoming swamped and unusable, or the kernel disabling the IRQ the TE410P is on, which leaves the spans on that card unusable.

I have confirmed that the card is good by placing it in an IBM server running FreePBX Distro and verifying that it generates only 1,000 interrupts per second there and works properly.

The problem system is running 64-bit Ubuntu 14.04 LTS, with kernels 3.13.0-27-generic and 3.13.0-27-lowlatency. I have compiled and installed DAHDI from source, both 2.9.1.1 and 2.8.0, and I see the same result with the Ubuntu DAHDI package, which is based on 2.5.0. I have entered the BIOS, disabled all the extra devices I can, and reset the configuration data.

Most frequently the interrupt is disabled by the kernel. Booting with the irqpoll option, as suggested by the error message, does not always solve the problem and introduces other problems. See dmesg below:

(the "Not prepped yet!" message repeats *many* times)
[ 16.371739] wct4xxp 0000:81:01.0: Not prepped yet!
[ 16.371743] wct4xxp 0000:81:01.0: Not prepped yet!
[ 16.611991] irq 25: nobody cared (try booting with the "irqpoll" option)
[ 16.615221] CPU: 0 PID: 0 Comm: swapper/0 Tainted: GF O 3.13.0-27-generic #50-Ubuntu
[ 16.615224] Hardware name: HP ProLiant DL145 G2/K85NL, BIOS 2.14 10/20/2005
[ 16.615227] ffff880139ea6a9c ffff88013bc03e68 ffffffff817199c4 ffff880139ea6a00
[ 16.615231] ffff88013bc03e90 ffffffff810c19d2 ffff880139ea6a00 0000000000000019
[ 16.615235] 0000000000000000 ffff88013bc03ed0 ffffffff810c1e6c 000000008101b763
[ 16.615239] Call Trace:
[ 16.615241] <IRQ> [<ffffffff817199c4>] dump_stack+0x45/0x56
[ 16.615253] [<ffffffff810c19d2>] __report_bad_irq+0x32/0xd0
[ 16.615257] [<ffffffff810c1e6c>] note_interrupt+0x1ac/0x200
[ 16.615260] [<ffffffff810bf749>] handle_irq_event_percpu+0xd9/0x1d0
[ 16.615263] [<ffffffff810bf87d>] handle_irq_event+0x3d/0x60
[ 16.615267] [<ffffffff810c29ea>] handle_fasteoi_irq+0x5a/0x100
[ 16.615272] [<ffffffff81015cde>] handle_irq+0x1e/0x30
[ 16.615276] [<ffffffff8172c6cd>] do_IRQ+0x4d/0xc0
[ 16.615281] [<ffffffff81721e6d>] common_interrupt+0x6d/0x6d
[ 16.615283] <EOI> [<ffffffff810d63c1>] ? tick_nohz_idle_enter+0x41/0x70
[ 16.615289] [<ffffffff810d63bd>] ? tick_nohz_idle_enter+0x3d/0x70
[ 16.615292] [<ffffffff810beb48>] cpu_startup_entry+0x88/0x290
[ 16.615297] [<ffffffff81707e97>] rest_init+0x77/0x80
[ 16.615302] [<ffffffff81d35f70>] start_kernel+0x438/0x443
[ 16.615305] [<ffffffff81d35941>] ? repair_env_string+0x5c/0x5c
[ 16.615308] [<ffffffff81d35120>] ? early_idt_handlers+0x120/0x120
[ 16.615312] [<ffffffff81d355ee>] x86_64_start_reservations+0x2a/0x2c
[ 16.615315] [<ffffffff81d35733>] x86_64_start_kernel+0x143/0x152
[ 16.615317] handlers:
[ 16.615987] [<ffffffffa01d3420>] t4_interrupt_gen2 [wct4xxp]
[ 16.615987] Disabling IRQ #25
[ 17.607238] dahdi_echocan_mg2: Registered echo canceler 'MG2'
[ 17.608276] wct4xxp 0000:81:01.0: Span 1 configured for ESF/B8ZS
[ 17.608360] wct4xxp 0000:81:01.0: SPAN 1: Primary Sync Source
[ 17.708056] wct4xxp 0000:81:01.0: RCLK source set to span 1
[ 17.708065] wct4xxp 0000:81:01.0: Recovered timing mode, RCLK set to span 1
[ 17.736138] wct4xxp 0000:81:01.0: Span 2 configured for ESF/B8ZS
[ 17.808065] wct4xxp 0000:81:01.0: RCLK source set to span 1
[ 17.808073] wct4xxp 0000:81:01.0: Recovered timing mode, RCLK set to span 1
[ 17.864134] wct4xxp 0000:81:01.0: Span 3 configured for ESF/B8ZS
[ 17.908049] wct4xxp 0000:81:01.0: RCLK source set to span 1
[ 17.908058] wct4xxp 0000:81:01.0: Recovered timing mode, RCLK set to span 1
[ 17.992139] wct4xxp 0000:81:01.0: Span 4 configured for ESF/B8ZS
[ 18.008106] wct4xxp 0000:81:01.0: RCLK source set to span 1
[ 18.008114] wct4xxp 0000:81:01.0: Recovered timing mode, RCLK set to span 1
[ 20.208172] wct4xxp 0000:81:01.0: Setting yellow alarm span 1
[ 20.208212] wct4xxp 0000:81:01.0: RCLK source set to span 2
[ 20.208216] wct4xxp 0000:81:01.0: System timing mode, RCLK set to span 2
[ 20.308149] wct4xxp 0000:81:01.0: Setting yellow alarm span 2
[ 20.308180] wct4xxp 0000:81:01.0: RCLK source set to span 3
[ 20.308184] wct4xxp 0000:81:01.0: System timing mode, RCLK set to span 3
[ 20.408173] wct4xxp 0000:81:01.0: Setting yellow alarm span 3
[ 20.408200] wct4xxp 0000:81:01.0: RCLK source set to span 4
[ 20.408204] wct4xxp 0000:81:01.0: System timing mode, RCLK set to span 4
[ 25.601523] wct4xxp 0000:81:01.0: Span 1 configured for ESF/B8ZS
[ 25.601587] wct4xxp 0000:81:01.0: SPAN 1: Primary Sync Source
[ 25.601673] wct4xxp 0000:81:01.0: Span 4 configured for ESF/B8ZS
[ 25.608209] wct4xxp 0000:81:01.0: RCLK source set to span 4
[ 25.608215] wct4xxp 0000:81:01.0: System timing mode, RCLK set to span 4

Checking /proc/interrupts reveals that the card generated 100,000 interrupts without being serviced before the kernel disabled the IRQ (and also reveals that the card is apparently on its own IRQ):

maintenance@sip:~$ cat /proc/interrupts
           CPU0       CPU1
  0:         46          0   IO-APIC-edge      timer
  1:         10          0   IO-APIC-edge      i8042
  7:          1          0   IO-APIC-edge
  8:          0          0   IO-APIC-edge      rtc0
  9:          0          0   IO-APIC-fasteoi   acpi
 12:          4          0   IO-APIC-edge      i8042
 14:          0          0   IO-APIC-edge      pata_amd
 15:          0          0   IO-APIC-edge      pata_amd
 16:        304          0   IO-APIC-fasteoi   nouveau
 19:       1221          0   IO-APIC-fasteoi   eth1
 21:       8681          0   IO-APIC-fasteoi   sata_nv
 22:          0          0   IO-APIC-fasteoi   ehci_hcd:usb1
 23:          0          0   IO-APIC-fasteoi   ohci_hcd:usb2
 25:     100000          1   IO-APIC-fasteoi   wct4xxp
NMI:          1          1   Non-maskable interrupts
LOC:      17884      19728   Local timer interrupts
SPU:          0          0   Spurious interrupts
PMI:          1          1   Performance monitoring interrupts
IWI:       1554        815   IRQ work interrupts
RTR:          0          0   APIC ICR read retries
RES:       6566       8577   Rescheduling interrupts
CAL:        220       4521   Function call interrupts
TLB:        638        504   TLB shootdowns
TRM:          0          0   Thermal event interrupts
THR:          0          0   Threshold APIC interrupts
MCE:          0          0   Machine check exceptions
MCP:          1          1   Machine check polls
ERR:          1
MIS:          0

Any ideas on how I can further diagnose and pursue this? Google does not turn up much that is useful on this issue.

Thank you!

--
Scott L. Lykens
Keystone Medical Management Solutions, Inc.
+1 814 325-7500 x501 -- www.kmmsinc.com<http://www.kmmsinc.com>
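
For anyone wanting to reproduce per-second figures like the ones quoted in this post, the following is a minimal sketch (not the method used in the original message, which does not say how the counts were taken). It assumes Python 3 on the host and the usual /proc/interrupts layout of a CPU header row followed by one row per IRQ. The function names parse_interrupts, rates and sample are made up for illustration.

```python
import time


def parse_interrupts(text):
    """Return {irq_label: count summed over all CPUs} from /proc/interrupts text."""
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        counts = []
        # Rows such as ERR/MIS carry fewer numeric fields than there are CPUs,
        # so stop at the first non-numeric field.
        for field in rest.split()[:ncpus]:
            if not field.isdigit():
                break
            counts.append(int(field))
        totals[label.strip()] = sum(counts)
    return totals


def rates(before, after, seconds):
    """Interrupts per second for every IRQ present in both snapshots."""
    return {
        irq: (after[irq] - before[irq]) / seconds
        for irq in after
        if irq in before
    }


def sample(path="/proc/interrupts"):
    """Read and parse one snapshot of the interrupt counters."""
    with open(path) as f:
        return parse_interrupts(f.read())


if __name__ == "__main__":
    interval = 1.0  # seconds between the two snapshots
    first = sample()
    time.sleep(interval)
    second = sample()
    busiest = sorted(rates(first, second, interval).items(), key=lambda kv: -kv[1])
    for irq, per_sec in busiest[:5]:
        print(f"{irq:>4}: {per_sec:.0f} interrupts/s")
```

Run as a script, it prints the five busiest interrupt sources over a one-second window. Comparing the wct4xxp row on the working and the failing server would show whether the difference is really 1,000 versus 100,000+ per second.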