Regarding Harry Schmalzbauer's message of 07.01.2015 06:39 (localtime):
> Hello,
>
> recently I upgraded one server from 9.1 to 10.1. There are two 82576
> (one port of two Intel ET Dual-Port GbE [kawela]), driven by igb(4).
> I've never seen any watchdog timeout with FreeBSD-9.1 but suddenly (with
> 10-stable) I see:
> igb0: Watchdog timeout -- resetting
> igb0: Queue(0) tdh = 2974, hw tdt = 2973
> igb0: TX(0) desc avail = 0,Next TX to Clean = 0
>
> My biggest problem is that lagg(4) doesn't detect the problem with
> igb0. It's configured with 'lagghash l2' and most connections were
> interrupted until I manually did 'ifconfig igb0 down'. Then lagg did its
> job and connectivity was restored via the remaining igb1.
>
> Is there a way to auto-if-down an interface which suffers from watchdog
> timeouts? And any way to really reset it without rebooting the machine?
The igb watchdog timeout happened again :-( about 48 hours after the last
one, with very moderate-to-low average traffic.
This time I could fetch the dev.igb sysctls before igb0 was reset by the
watchdog. They show a strange irq load:
dev.igb.%parent:
dev.igb.0.%desc: Intel(R) PRO/1000 Network Connection version - 2.4.0
dev.igb.0.%driver: igb
dev.igb.0.%location: slot=0 function=0 handle=\_SB_.PCI0.PE60.S1F0
dev.igb.0.%pnpinfo: vendor=0x8086 device=0x10c9 subvendor=0x8086 subdevice=0xa03c class=0x020000
dev.igb.0.%parent: pci7
dev.igb.0.nvm: -1
dev.igb.0.enable_aim: 1
dev.igb.0.fc: 3
dev.igb.0.rx_processing_limit: 100
dev.igb.0.link_irq: 5
dev.igb.0.dropped: 0
dev.igb.0.tx_dma_fail: 0
dev.igb.0.rx_overruns: 0
dev.igb.0.watchdog_timeouts: 1
dev.igb.0.device_control: 1488978497
dev.igb.0.rx_control: 67272738
dev.igb.0.interrupt_mask: 4
dev.igb.0.extended_int_mask: 2147483679
dev.igb.0.tx_buf_alloc: 0
dev.igb.0.rx_buf_alloc: 0
dev.igb.0.fc_high_water: 47488
dev.igb.0.fc_low_water: 47472
dev.igb.0.queue0.interrupt_rate: 8000
dev.igb.0.queue0.txd_head: 0
dev.igb.0.queue0.txd_tail: 468
dev.igb.0.queue0.no_desc_avail: 41
dev.igb.0.queue0.tx_packets: 90807
dev.igb.0.queue0.rxd_head: 0
dev.igb.0.queue0.rxd_tail: 4095
dev.igb.0.queue0.rx_packets: 443307
dev.igb.0.queue0.rx_bytes: 0
dev.igb.0.queue0.lro_queued: 0
dev.igb.0.queue0.lro_flushed: 0
dev.igb.0.queue1.interrupt_rate: 8000
dev.igb.0.queue1.txd_head: 0
dev.igb.0.queue1.txd_tail: 221
dev.igb.0.queue1.no_desc_avail: 0
dev.igb.0.queue1.tx_packets: 300702
dev.igb.0.queue1.rxd_head: 0
dev.igb.0.queue1.rxd_tail: 4095
dev.igb.0.queue1.rx_packets: 734853
dev.igb.0.queue1.rx_bytes: 0
dev.igb.0.queue1.lro_queued: 0
dev.igb.0.queue1.lro_flushed: 0
dev.igb.0.queue2.interrupt_rate: 8000
dev.igb.0.queue2.txd_head: 0
dev.igb.0.queue2.txd_tail: 116
dev.igb.0.queue2.no_desc_avail: 0
dev.igb.0.queue2.tx_packets: 635285
dev.igb.0.queue2.rxd_head: 0
dev.igb.0.queue2.rxd_tail: 4095
dev.igb.0.queue2.rx_packets: 163156
dev.igb.0.queue2.rx_bytes: 0
dev.igb.0.queue2.lro_queued: 0
dev.igb.0.queue2.lro_flushed: 0
dev.igb.0.queue3.interrupt_rate: 8000
dev.igb.0.queue3.txd_head: 0
dev.igb.0.queue3.txd_tail: 199
dev.igb.0.queue3.no_desc_avail: 0
dev.igb.0.queue3.tx_packets: 177701
dev.igb.0.queue3.rxd_head: 0
dev.igb.0.queue3.rxd_tail: 4095
dev.igb.0.queue3.rx_packets: 209749
dev.igb.0.queue3.rx_bytes: 0
dev.igb.0.queue3.lro_queued: 0
dev.igb.0.queue3.lro_flushed: 0
dev.igb.0.mac_stats.excess_coll: 0
dev.igb.0.mac_stats.single_coll: 0
dev.igb.0.mac_stats.multiple_coll: 0
dev.igb.0.mac_stats.late_coll: 0
dev.igb.0.mac_stats.collision_count: 0
dev.igb.0.mac_stats.symbol_errors: 0
dev.igb.0.mac_stats.sequence_errors: 0
dev.igb.0.mac_stats.defer_count: 0
dev.igb.0.mac_stats.missed_packets: 0
dev.igb.0.mac_stats.recv_length_errors: 0
dev.igb.0.mac_stats.recv_no_buff: 0
dev.igb.0.mac_stats.recv_undersize: 0
dev.igb.0.mac_stats.recv_fragmented: 0
dev.igb.0.mac_stats.recv_oversize: 0
dev.igb.0.mac_stats.recv_jabber: 0
dev.igb.0.mac_stats.recv_errs: 0
dev.igb.0.mac_stats.crc_errs: 0
dev.igb.0.mac_stats.alignment_errs: 0
dev.igb.0.mac_stats.tx_no_crs: 0
dev.igb.0.mac_stats.coll_ext_errs: 0
dev.igb.0.mac_stats.xon_recvd: 0
dev.igb.0.mac_stats.xon_txd: 0
dev.igb.0.mac_stats.xoff_recvd: 0
dev.igb.0.mac_stats.xoff_txd: 0
dev.igb.0.mac_stats.unsupported_fc_recvd: 0
dev.igb.0.mac_stats.mgmt_pkts_recvd: 0
dev.igb.0.mac_stats.mgmt_pkts_drop: 0
dev.igb.0.mac_stats.mgmt_pkts_txd: 0
dev.igb.0.mac_stats.total_pkts_recvd: 1707305
dev.igb.0.mac_stats.good_pkts_recvd: 1551183
dev.igb.0.mac_stats.bcast_pkts_recvd: 179491
dev.igb.0.mac_stats.mcast_pkts_recvd: 1868
dev.igb.0.mac_stats.rx_frames_64: 212
dev.igb.0.mac_stats.rx_frames_65_127: 843418
dev.igb.0.mac_stats.rx_frames_128_255: 116516
dev.igb.0.mac_stats.rx_frames_256_511: 81391
dev.igb.0.mac_stats.rx_frames_512_1023: 14010
dev.igb.0.mac_stats.rx_frames_1024_1522: 495636
dev.igb.0.mac_stats.good_octets_recvd: 4228681579
dev.igb.0.mac_stats.total_octets_recvd: 4239899893
dev.igb.0.mac_stats.good_octets_txd: 3039302164
dev.igb.0.mac_stats.total_octets_txd: 3039302164
dev.igb.0.mac_stats.total_pkts_txd: 1424648
dev.igb.0.mac_stats.good_pkts_txd: 1424648
dev.igb.0.mac_stats.bcast_pkts_txd: 412
dev.igb.0.mac_stats.mcast_pkts_txd: 6
dev.igb.0.mac_stats.tx_frames_64: 639519
dev.igb.0.mac_stats.tx_frames_65_127: 253844
dev.igb.0.mac_stats.tx_frames_128_255: 180022
dev.igb.0.mac_stats.tx_frames_256_511: 873
dev.igb.0.mac_stats.tx_frames_512_1023: 292
dev.igb.0.mac_stats.tx_frames_1024_1522: 350098
dev.igb.0.mac_stats.tso_txd: 95280
dev.igb.0.mac_stats.tso_ctx_fail: 0
dev.igb.0.interrupts.asserts: 3323144
dev.igb.0.interrupts.rx_pkt_timer: 1551160
dev.igb.0.interrupts.rx_abs_timer: 0
dev.igb.0.interrupts.tx_pkt_timer: 0
dev.igb.0.interrupts.tx_abs_timer: 1551069
dev.igb.0.interrupts.tx_queue_empty: 1424637
dev.igb.0.interrupts.tx_queue_min_thresh: 0
dev.igb.0.interrupts.rx_desc_min_thresh: 0
dev.igb.0.interrupts.rx_overrun: 0
dev.igb.0.host.breaker_tx_pkt: 0
dev.igb.0.host.host_tx_pkt_discard: 0
dev.igb.0.host.rx_pkt: 23
dev.igb.0.host.breaker_rx_pkts: 0
dev.igb.0.host.breaker_rx_pkt_drop: 0
dev.igb.0.host.tx_good_pkt: 11
dev.igb.0.host.breaker_tx_pkt_drop: 0
dev.igb.0.host.rx_good_bytes: 4228681579
dev.igb.0.host.tx_good_bytes: 3039302164
dev.igb.0.host.length_errors: 0
dev.igb.0.host.serdes_violation_pkt: 0
dev.igb.0.host.header_redir_missed: 0
Also, igb1 was quite busy at that time, but igb1 never hung:
dev.igb.1.queue0.interrupt_rate: 10526
dev.igb.1.queue0.txd_head: 1879
dev.igb.1.queue0.txd_tail: 1879
dev.igb.1.queue0.no_desc_avail: 0
dev.igb.1.queue0.tx_packets: 8694
dev.igb.1.queue0.rxd_head: 1116
dev.igb.1.queue0.rxd_tail: 1115
dev.igb.1.queue0.rx_packets: 181340
dev.igb.1.queue0.rx_bytes: 11819287
dev.igb.1.queue0.lro_queued: 0
dev.igb.1.queue0.lro_flushed: 0
dev.igb.1.queue1.interrupt_rate: 76923
dev.igb.1.queue1.txd_head: 945
dev.igb.1.queue1.txd_tail: 945
dev.igb.1.queue1.no_desc_avail: 0
dev.igb.1.queue1.tx_packets: 9295572
dev.igb.1.queue1.rxd_head: 203
dev.igb.1.queue1.rxd_tail: 202
dev.igb.1.queue1.rx_packets: 18239691
dev.igb.1.queue1.rx_bytes: 23591559819
dev.igb.1.queue1.lro_queued: 0
dev.igb.1.queue1.lro_flushed: 0
dev.igb.1.queue2.interrupt_rate: 43478
dev.igb.1.queue2.txd_head: 4027
dev.igb.1.queue2.txd_tail: 4027
dev.igb.1.queue2.no_desc_avail: 0
dev.igb.1.queue2.tx_packets: 7335
dev.igb.1.queue2.rxd_head: 2158
dev.igb.1.queue2.rxd_tail: 2157
dev.igb.1.queue2.rx_packets: 2153
dev.igb.1.queue2.rx_bytes: 413198
dev.igb.1.queue2.lro_queued: 0
dev.igb.1.queue2.lro_flushed: 0
dev.igb.1.queue3.interrupt_rate: 43478
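One difference between the two dumps is the TX descriptor ring state: on igb0 every queue reports txd_head 0 with a non-zero txd_tail, while on igb1 head and tail match. A minimal Python sketch for spotting this in saved sysctl output follows. The function name stalled_tx_queues is hypothetical and not part of any FreeBSD tool, and a single sample where head and tail differ does not prove a hang, since a busy queue can differ briefly; the assumption here is that the same queue would be checked over several samples.

```python
import re
from collections import defaultdict

# Matches lines such as "dev.igb.0.queue1.txd_head: 0"
LINE_RE = re.compile(
    r"^dev\.igb\.(\d+)\.queue(\d+)\.(txd_head|txd_tail):\s*(\d+)\s*$"
)


def stalled_tx_queues(sysctl_text):
    """Return sorted (unit, queue) pairs whose TX head and tail differ.

    Hypothetical helper: it only compares the two values found in the
    given text and knows nothing about the driver itself.
    """
    rings = defaultdict(dict)
    for line in sysctl_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            unit, queue, field, value = m.groups()
            rings[(int(unit), int(queue))][field] = int(value)
    return sorted(
        key
        for key, ring in rings.items()
        if "txd_head" in ring
        and "txd_tail" in ring
        and ring["txd_head"] != ring["txd_tail"]
    )


if __name__ == "__main__":
    # Values copied from the dumps in this message.
    sample = """\
dev.igb.0.queue0.txd_head: 0
dev.igb.0.queue0.txd_tail: 468
dev.igb.1.queue0.txd_head: 1879
dev.igb.1.queue0.txd_tail: 1879
"""
    print(stalled_tx_queues(sample))
```

Feeding it the complete output of 'sysctl dev.igb' captured at two or three points in time would show whether the same queues stay flagged.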
Should I consider tuning "hw.igb.max_interrupt_rate"?
Any help highly appreciated!
As mentioned initially, I've never had this issue with FreeBSD 9.1
with exactly the same environment/workload.
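Regarding the auto-if-down question quoted above, a possible stopgap is to poll the dev.igb.N.watchdog_timeouts counter and run 'ifconfig igbN down' once it increases, so that lagg fails over to the other port. The following is only a sketch under assumptions: the function names, the polling interval and the max_polls parameter are made up for illustration, it must run as root, and it works around the symptom without addressing the cause of the timeouts.

```python
import subprocess
import time


def read_watchdog_timeouts(unit, run=subprocess.check_output):
    """Read dev.igb.<unit>.watchdog_timeouts using 'sysctl -n'."""
    out = run(["sysctl", "-n", f"dev.igb.{unit}.watchdog_timeouts"])
    return int(out.decode().strip())


def watch(unit, interval=5.0, run=subprocess.check_output,
          call=subprocess.check_call, max_polls=None):
    """Take igb<unit> down the first time its watchdog counter increases.

    Returns True if the interface was taken down, False if max_polls
    was reached first. 'run' and 'call' are injectable so the logic can
    be exercised without touching a real interface.
    """
    baseline = read_watchdog_timeouts(unit, run)
    polls = 0
    while max_polls is None or polls < max_polls:
        time.sleep(interval)
        polls += 1
        if read_watchdog_timeouts(unit, run) > baseline:
            # Same command as the manual workaround described above.
            call(["ifconfig", f"igb{unit}", "down"])
            return True
    return False


if __name__ == "__main__":
    watch(0)
```

Bringing the interface back up is left as a manual step here, since it is unclear whether the port is usable again after the reset.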
Thanks,
-Harry