On Thu, Nov 10, 2011 at 10:22:39AM +0100, Willem Jan Withagen wrote:
> Still running this file server on ZFS, and every now and then em0
> goes down, and is not revivable.... Nothing goes in or out the
> box...
>
> Any suggestions as how to (help) fix this?

CC'ing Jack Vogel of Intel.

We need "pciconf -lvbc" output (-lv by itself isn't sufficient in this
regard).

Also, please do "sysctl dev.em.0.debug=1". The sysctl itself will show
nothing useful in its output; however, "dmesg" shortly after should
contain a bunch of driver-level debugging information that should help
(the output starts with "Interface is ..."). Please provide that too.
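
Something along these lines should gather all of the above in one go.
It is only a sketch: the output file name, the one-second pause and the
50-line tail are arbitrary choices of mine, and it has to be run as root
on the affected box.

```shell
#!/bin/sh
# Sketch: collect the em0 diagnostics requested above.
# Assumptions: OUT path, the 1-second pause and the tail length are
# arbitrary illustrative values, not anything the driver requires.
OUT="${OUT:-/tmp/em0-debug.txt}"

collect() {
    # Full PCI details, including BARs (-b) and capabilities (-c).
    echo "=== pciconf -lvbc ==="
    pciconf -lvbc

    # Trigger the driver's debug dump; the sysctl output itself is not useful.
    echo "=== sysctl dev.em.0.debug=1 ==="
    sysctl dev.em.0.debug=1

    # Give the kernel a moment, then grab the end of the message buffer,
    # where the dump (starting with "Interface is ...") should appear.
    sleep 1
    echo "=== dmesg (tail) ==="
    dmesg | tail -n 50
}

if command -v pciconf >/dev/null 2>&1; then
    collect > "$OUT" 2>&1
    echo "wrote $OUT"
else
    echo "pciconf not found; run this on the FreeBSD host" >&2
fi
```

The resulting file can then simply be attached to a reply.
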
> Nov 10 09:07:41 zfs kernel: em0: Watchdog timeout -- resetting
> Nov 10 09:07:41 zfs kernel: em0: Queue(0) tdh = 187, hw tdt = 189
> Nov 10 09:07:41 zfs kernel: em0: TX(0) desc avail = 1022,Next TX to Clean = 187
> Nov 10 09:11:32 zfs kernel: em0: Watchdog timeout -- resetting
> Nov 10 09:11:32 zfs kernel: em0: Queue(0) tdh = 139, hw tdt = 151
> Nov 10 09:11:32 zfs kernel: em0: TX(0) desc avail = 1012,Next TX to Clean = 139
> Nov 10 09:16:05 zfs kernel: em0: Watchdog timeout -- resetting
> Nov 10 09:16:05 zfs kernel: em0: Queue(0) tdh = 152, hw tdt = 163
> Nov 10 09:16:05 zfs kernel: em0: TX(0) desc avail = 1013,Next TX to Clean = 152
> Nov 10 09:33:10 zfs kernel: em0: Watchdog timeout -- resetting
> Nov 10 09:33:10 zfs kernel: em0: Queue(0) tdh = 161, hw tdt = 176
> Nov 10 09:33:10 zfs kernel: em0: TX(0) desc avail = 1008,Next TX to Clean = 160
> Nov 10 09:53:18 zfs kernel: em0: Watchdog timeout -- resetting
> Nov 10 09:53:18 zfs kernel: em0: Queue(0) tdh = 157, hw tdt = 172
> Nov 10 09:53:18 zfs kernel: em0: TX(0) desc avail = 1009,Next TX to Clean = 157
>
> Device is:
> Nov 10 10:07:27 zfs kernel: em0: <Intel(R) PRO/1000 Network Connection 7.2.3> port 0x1820-0x183f mem 0xdf900000-0xdf91ffff,0xdf924000-0xdf924fff irq 16 at device 25.0 on pci0
> Nov 10 10:07:27 zfs kernel: em0: Using an MSI interrupt
> Nov 10 10:07:27 zfs kernel: em0: [FILTER]
>
> pciconf -lv:
> em0@pci0:0:25:0: class=0x020000 card=0x10bd15d9 chip=0x10bd8086 rev=0x02 hdr=0x00
>     vendor     = 'Intel Corporation'
>     device     = 'Intel 82566DM Gigabit Ethernet Adapter (82566DM)'
>     class      = network
>     subclass   = ethernet
>
> uname:
> 8.2-STABLE FreeBSD 8.2-STABLE #12: Sun Oct 2 13:36:55 CEST 2011 amd64
>
> sysctl -a | grep em.0:
> dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.2.3
> dev.em.0.%driver: em
> dev.em.0.%location: slot=25 function=0 handle=\_SB_.PCI0.LAN_
> dev.em.0.%pnpinfo: vendor=0x8086 device=0x10bd subvendor=0x15d9 subdevice=0x10bd class=0x020000
> dev.em.0.%parent: pci0
> dev.em.0.nvm: -1
> dev.em.0.debug: -1
> dev.em.0.rx_int_delay: 0
> dev.em.0.tx_int_delay: 66
> dev.em.0.rx_abs_int_delay: 66
> dev.em.0.tx_abs_int_delay: 66
> dev.em.0.rx_processing_limit: 100
> dev.em.0.flow_control: 3
> dev.em.0.eee_control: 0
> dev.em.0.link_irq: 0
> dev.em.0.mbuf_alloc_fail: 0
> dev.em.0.cluster_alloc_fail: 0
> dev.em.0.dropped: 0
> dev.em.0.tx_dma_fail: 0
> dev.em.0.rx_overruns: 6
> dev.em.0.watchdog_timeouts: 5
> dev.em.0.device_control: 1074790976
> dev.em.0.rx_control: 67141634
> dev.em.0.fc_high_water: 8192
> dev.em.0.fc_low_water: 6692
> dev.em.0.queue0.txd_head: 78
> dev.em.0.queue0.txd_tail: 78
> dev.em.0.queue0.tx_irq: 0
> dev.em.0.queue0.no_desc_avail: 0
> dev.em.0.queue0.rxd_head: 376
> dev.em.0.queue0.rxd_tail: 375
> dev.em.0.queue0.rx_irq: 0
> dev.em.0.mac_stats.excess_coll: 0
> dev.em.0.mac_stats.single_coll: 0
> dev.em.0.mac_stats.multiple_coll: 0
> dev.em.0.mac_stats.late_coll: 0
> dev.em.0.mac_stats.collision_count: 0
> dev.em.0.mac_stats.symbol_errors: 0
> dev.em.0.mac_stats.sequence_errors: 0
> dev.em.0.mac_stats.defer_count: 0
> dev.em.0.mac_stats.missed_packets: 9
> dev.em.0.mac_stats.recv_no_buff: 0
> dev.em.0.mac_stats.recv_undersize: 0
> dev.em.0.mac_stats.recv_fragmented: 0
> dev.em.0.mac_stats.recv_oversize: 0
> dev.em.0.mac_stats.recv_jabber: 0
> dev.em.0.mac_stats.recv_errs: 1
> dev.em.0.mac_stats.crc_errs: 1
> dev.em.0.mac_stats.alignment_errs: 0
> dev.em.0.mac_stats.coll_ext_errs: 0
> dev.em.0.mac_stats.xon_recvd: 0
> dev.em.0.mac_stats.xon_txd: 0
> dev.em.0.mac_stats.xoff_recvd: 0
> dev.em.0.mac_stats.xoff_txd: 0
> dev.em.0.mac_stats.total_pkts_recvd: 160062850
> dev.em.0.mac_stats.good_pkts_recvd: 160062840
> dev.em.0.mac_stats.bcast_pkts_recvd: 79648
> dev.em.0.mac_stats.mcast_pkts_recvd: 10220
> dev.em.0.mac_stats.rx_frames_64: 0
> dev.em.0.mac_stats.rx_frames_65_127: 0
> dev.em.0.mac_stats.rx_frames_128_255: 0
> dev.em.0.mac_stats.rx_frames_256_511: 0
> dev.em.0.mac_stats.rx_frames_512_1023: 0
> dev.em.0.mac_stats.rx_frames_1024_1522: 0
> dev.em.0.mac_stats.good_octets_recvd: 107143604749
> dev.em.0.mac_stats.good_octets_txd: 129876768158
> dev.em.0.mac_stats.total_pkts_txd: 179010567
> dev.em.0.mac_stats.good_pkts_txd: 179010567
> dev.em.0.mac_stats.bcast_pkts_txd: 14608
> dev.em.0.mac_stats.mcast_pkts_txd: 206
> dev.em.0.mac_stats.tx_frames_64: 0
> dev.em.0.mac_stats.tx_frames_65_127: 0
> dev.em.0.mac_stats.tx_frames_128_255: 0
> dev.em.0.mac_stats.tx_frames_256_511: 0
> dev.em.0.mac_stats.tx_frames_512_1023: 0
> dev.em.0.mac_stats.tx_frames_1024_1522: 0
> dev.em.0.mac_stats.tso_txd: 3691806
> dev.em.0.mac_stats.tso_ctx_fail: 0
> dev.em.0.interrupts.asserts: 130023913
> dev.em.0.interrupts.rx_pkt_timer: 0
> dev.em.0.interrupts.rx_abs_timer: 0
> dev.em.0.interrupts.tx_pkt_timer: 0
> dev.em.0.interrupts.tx_abs_timer: 0
> dev.em.0.interrupts.tx_queue_empty: 0
> dev.em.0.interrupts.tx_queue_min_thresh: 0
> dev.em.0.interrupts.rx_desc_min_thresh: 0
> dev.em.0.interrupts.rx_overrun: 0
> dev.em.0.wake: 0
--
| Jeremy Chadwick jdc at parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, US |
| Making life hard for others since 1977. PGP 4BD6C0CB |