Lev Serebryakov
2017-Nov-09 20:17 UTC
Intel I210 (igb) sometimes consume all CPU on not-so-big traffic — need help!
I still have problems with my E-1220v3 server equipped with an Intel I210 adapter. It cannot be loaded with more than 100 Mbit/s, because it is the connection to the Internet.

But sometimes the four interrupt threads "intr{irqXXX: igb0:que Y}" consume 100% CPU. The interrupt rate is very modest:

% vmstat -i
interrupt                          total       rate
...
irq276: igb0:que 0             851899713       1193
irq277: igb0:que 1             907338150       1271
irq278: igb0:que 2             907538207       1271
irq279: igb0:que 3             768217584       1076
irq280: igb0:link                      2          0
%

But CPU consumption is 90-100% per thread:

PID USERNAME PRI NICE SIZE  RES STATE C   TIME   WCPU COMMAND
 11 root     -92    -   0K 544K CPU2  2 146:22 98.30% intr{irq278: igb0:que 2}
 11 root     -92    -   0K 544K WAIT  0 178:18 81.55% intr{irq276: igb0:que 0}
 11 root     -92    -   0K 544K WAIT  1 135:34 77.77% intr{irq277: igb0:que 1}
 11 root     -92    -   0K 544K CPU3  3 138:57 67.50% intr{irq279: igb0:que 3}

The procstat -ak output looks suspicious:

% sudo procstat -ak | grep igb0:que
11 100056 intr irq276: igb0:que 0 vm_page_scan_contig vm_phys_scan_contig vm_page_reclaim_contig kmem_alloc_contig mbuf_jumbo_alloc keg_alloc_slab keg_fetch_slab zone_fetch_slab zone_import zone_alloc_item uma_zalloc_arg m_getjcl igb_refresh_mbufs igb_rxeof igb_msix_que intr_event_execute_handlers ithread_loop fork_exit
11 100058 intr irq277: igb0:que 1 mi_switch ithread_loop fork_exit fork_trampoline
11 100060 intr irq278: igb0:que 2 mi_switch ithread_loop fork_exit fork_trampoline
11 100062 intr irq279: igb0:que 3 mi_switch ithread_loop fork_exit fork_trampoline
%

--
// Lev Serebryakov
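To put the mismatch between interrupt rate and CPU load into numbers, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted in the message above. One assumption to keep in mind: the rate column of vmstat -i is an average since boot, while WCPU in top reflects recent load, so the result is an order-of-magnitude estimate and not a measurement. The names `queues` and `usec_per_interrupt` are made up for this illustration.

```python
# Figures copied from the vmstat -i and top output in the message above.
# Assumption: the long-term average interrupt rate is close enough to the
# rate during the busy period for a rough estimate.
queues = {
    "igb0:que 0": {"irq_per_s": 1193, "wcpu_pct": 81.55},
    "igb0:que 1": {"irq_per_s": 1271, "wcpu_pct": 77.77},
    "igb0:que 2": {"irq_per_s": 1271, "wcpu_pct": 98.30},
    "igb0:que 3": {"irq_per_s": 1076, "wcpu_pct": 67.50},
}


def usec_per_interrupt(irq_per_s, wcpu_pct):
    """Average CPU time spent per interrupt, in microseconds."""
    cpu_seconds_per_second = wcpu_pct / 100.0
    return cpu_seconds_per_second / irq_per_s * 1_000_000


cost = {name: usec_per_interrupt(**q) for name, q in queues.items()}

for name, usec in cost.items():
    print(f"{name}: roughly {usec:.0f} us of CPU per interrupt")
```

Under these assumptions every queue spends several hundred microseconds of CPU per interrupt. That fits the observation that the time goes into something expensive inside the handler and not into a high interrupt rate.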
Michael Sierchio
2017-Nov-09 20:23 UTC
Re: Intel I210 (igb) sometimes consume all CPU on not-so-big traffic — need help!
Is device polling enabled?

- M

--
"Well," Brahma said, "even after ten thousand explanations, a fool is no wiser, but an intelligent person requires only two thousand five hundred." - The Mahābhārata
Lev Serebryakov
2017-Nov-20 14:36 UTC
Re: Intel I210 (igb) sometimes consume all CPU on not-so-big traffic — need help!
On 09.11.2017 23:17, Lev Serebryakov wrote:

It looks like I know where it spends all its time. I used 'pmcstat' and got a very suspicious flame graph. The problem appears to be on the code path that goes through:

igb_refresh_mbufs
m_getjcl
uma_zalloc_arg
[zone_alloc_item]
zone_import
zone_fetch_slab
keg_fetch_slab
keg_alloc_slab
mbuf_jumbo_alloc
kmem_alloc_contig
vm_page_reclaim_contig
vm_phys_scan_contig
vm_page_scan_contig

zone_alloc_item is optional: it is present about 50% of the time; otherwise the path is one step shorter.

--
// Lev Serebryakov
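As a cross-check, the pmcstat path above can be compared with the procstat -ak stack for irq276 from the first message. The sketch below is only an illustration in Python. It assumes that procstat lists the innermost frame first, which is consistent with the other threads ending in fork_exit fork_trampoline. The helper name `call_path_from` is hypothetical.

```python
# Stack of the irq276 thread as printed by procstat -ak in the first message.
# Assumption: frames are listed innermost (most recent call) first.
procstat_stack = (
    "vm_page_scan_contig vm_phys_scan_contig vm_page_reclaim_contig "
    "kmem_alloc_contig mbuf_jumbo_alloc keg_alloc_slab keg_fetch_slab "
    "zone_fetch_slab zone_import zone_alloc_item uma_zalloc_arg m_getjcl "
    "igb_refresh_mbufs igb_rxeof igb_msix_que intr_event_execute_handlers "
    "ithread_loop fork_exit"
).split()

# Code path reported from the pmcstat flame graph (caller first), including
# the optional zone_alloc_item step.
pmcstat_path = (
    "igb_refresh_mbufs m_getjcl uma_zalloc_arg zone_alloc_item zone_import "
    "zone_fetch_slab keg_fetch_slab keg_alloc_slab mbuf_jumbo_alloc "
    "kmem_alloc_contig vm_page_reclaim_contig vm_phys_scan_contig "
    "vm_page_scan_contig"
).split()


def call_path_from(stack_innermost_first, entry):
    """Return the caller-to-callee path starting at the given function."""
    outer_first = list(reversed(stack_innermost_first))
    return outer_first[outer_first.index(entry):]


path = call_path_from(procstat_stack, "igb_refresh_mbufs")
matches = path == pmcstat_path

print(" -> ".join(path))
print("same path as pmcstat:", matches)
```

Both tools point at the same place: the receive path refilling its ring with jumbo clusters via m_getjcl and ending up in the contiguous-page scan.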