Jack Vogel
2007-Mar-05 22:13 UTC
PATCH : ARP problem with 6.2-STABLE Intel PRO/1000 NIC, latest em driver
On 3/5/07, Mark Costlow <cheeks@swcp.com> wrote:
> On Mon, Mar 05, 2007 at 10:02:26AM -0800, Jack Vogel wrote:
> > On 3/5/07, Jack Vogel <jfvogel@gmail.com> wrote:
> > >On 3/5/07, Mark Costlow <cheeks@swcp.com> wrote:
> > >> On Mon, Mar 05, 2007 at 08:41:01AM -0800, Jack Vogel wrote:
> > >> > >
> > >> > >Maybe more of your dmesg might help as it could show interrupt issues
> > >> > >that perhaps others could help diagnose
> > >> >
> > >> > Yes, agreed, this might be revealing.
> > >>
> > >> Here's the full dmesg.  Thanks for looking at this.
> > >>
> > >> ----------------------------
> > >> Copyright (c) 1992-2007 The FreeBSD Project.
> > >> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> > >>         The Regents of the University of California. All rights reserved.
> > >> FreeBSD is a registered trademark of The FreeBSD Foundation.
> > >> FreeBSD 6.2-STABLE #0: Sun Mar  4 22:40:38 MST 2007
> > >>     root@ame4.swcp.com:/usr/obj/usr/src/sys/GENERIC
> > >> ACPI APIC Table: <PTLTD APIC >
> > >> Timecounter "i8254" frequency 1193182 Hz quality 0
> > >> CPU: Intel(R) Xeon(R) CPU 5130 @ 2.00GHz (2000.08-MHz 686-class CPU)
> > >>   Origin = "GenuineIntel"  Id = 0x6f6  Stepping = 6
> > >>   Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
> > >>   Features2=0x4e33d<SSE3,RSVD2,MON,DS_CPL,VMX,TM2,<b9>,CX16,<b14>,<b15>,<b18>>
> > >>   AMD Features=0x20000000<LM>
> > >>   AMD Features2=0x1<LAHF>
> > >>   Cores per package: 2
> > >> real memory  = 3489005568 (3327 MB)
> > >> avail memory = 3414384640 (3256 MB)
> > >> ioapic0 <Version 2.0> irqs 0-23 on motherboard
> > >> ioapic1 <Version 2.0> irqs 24-47 on motherboard
> > >> kbd1 at kbdmux0
> > >> ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413)
> > >> acpi0: <PTLTD RSDT> on motherboard
> > >> acpi0: Power Button (fixed)
> > >> Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
> > >> acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0
> > >> cpu0: <ACPI CPU> on acpi0
> > >> acpi_throttle0: <ACPI CPU Throttling> on cpu0
> > >> pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
> > >> pci0: <ACPI PCI bus> on pcib0
> > >> pcib1: <ACPI PCI-PCI bridge> at device 2.0 on pci0
> > >> pci1: <ACPI PCI bus> on pcib1
> > >> pcib2: <ACPI PCI-PCI bridge> irq 16 at device 0.0 on pci1
> > >> pci2: <ACPI PCI bus> on pcib2
> > >> pcib3: <ACPI PCI-PCI bridge> irq 16 at device 0.0 on pci2
> > >> pci3: <ACPI PCI bus> on pcib3
> > >> pcib4: <ACPI PCI-PCI bridge> irq 18 at device 2.0 on pci2
> > >> pci4: <ACPI PCI bus> on pcib4
> > >> em0: <Intel(R) PRO/1000 Network Connection Version - 6.2.9> port 0x2000-0x201f mem 0xda000000-0xda01ffff irq 18 at device 0.0 on pci4
> > >> em0: Ethernet address: 00:30:48:8c:71:54
> > >> em1: <Intel(R) PRO/1000 Network Connection Version - 6.2.9> port 0x2020-0x203f mem 0xda020000-0xda03ffff irq 19 at device 0.1 on pci4
> > >> em1: Ethernet address: 00:30:48:8c:71:55
> > >> pcib5: <ACPI PCI-PCI bridge> at device 0.3 on pci1
> > >> pci5: <ACPI PCI bus> on pcib5
> > >> 3ware device driver for 9000 series storage controllers, version: 3.60.02.012
> > >> twa0: <3ware 9000 series Storage Controller> port 0x3000-0x303f mem 0xd8000000-0xd9ffffff,0xda100000-0xda100fff irq 24 at device 1.0 on pci5
> > >> twa0: [GIANT-LOCKED]
> > >> twa0: INFO: (0x15: 0x1300): Controller details:: Model 9550SX-4LP, 4 ports, Firmware FE9X 3.04.01.011, BIOS BE9X 3.04.00.002
> > >> pci0: <base peripheral> at device 8.0 (no driver attached)
> > >> pcib6: <ACPI PCI-PCI bridge> irq 17 at device 28.0 on pci0
> > >> pci6: <ACPI PCI bus> on pcib6
> > >> uhci0: <UHCI (generic) USB controller> port 0x1800-0x181f irq 17 at device 29.0 on pci0
> > >> uhci0: [GIANT-LOCKED]
> > >> usb0: <UHCI (generic) USB controller> on uhci0
> > >> usb0: USB revision 1.0
> > >> uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> > >> uhub0: 2 ports with 2 removable, self powered
> > >> uhci1: <UHCI (generic) USB controller> port 0x1820-0x183f irq 19 at device 29.1 on pci0
> > >> uhci1: [GIANT-LOCKED]
> > >> usb1: <UHCI (generic) USB controller> on uhci1
> > >> usb1: USB revision 1.0
> > >> uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> > >> uhub1: 2 ports with 2 removable, self powered
> > >> uhci2: <UHCI (generic) USB controller> port 0x1840-0x185f irq 18 at device 29.2 on pci0
> > >> uhci2: [GIANT-LOCKED]
> > >> usb2: <UHCI (generic) USB controller> on uhci2
> > >> usb2: USB revision 1.0
> > >> uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> > >> uhub2: 2 ports with 2 removable, self powered
> > >> uhci3: <UHCI (generic) USB controller> port 0x1860-0x187f irq 16 at device 29.3 on pci0
> > >> uhci3: [GIANT-LOCKED]
> > >> usb3: <UHCI (generic) USB controller> on uhci3
> > >> usb3: USB revision 1.0
> > >> uhub3: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> > >> uhub3: 2 ports with 2 removable, self powered
> > >> ehci0: <EHCI (generic) USB 2.0 controller> mem 0xda600000-0xda6003ff irq 17 at device 29.7 on pci0
> > >> ehci0: [GIANT-LOCKED]
> > >> usb4: EHCI version 1.0
> > >> usb4: companion controllers, 2 ports each: usb0 usb1 usb2 usb3
> > >> usb4: <EHCI (generic) USB 2.0 controller> on ehci0
> > >> usb4: USB revision 2.0
> > >> uhub4: Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
> > >> uhub4: 8 ports with 8 removable, self powered
> > >> pcib7: <ACPI PCI-PCI bridge> at device 30.0 on pci0
> > >> pci7: <ACPI PCI bus> on pcib7
> > >> pci7: <display, VGA> at device 1.0 (no driver attached)
> > >> isab0: <PCI-ISA bridge> at device 31.0 on pci0
> > >> isa0: <ISA bus> on isab0
> > >> atapci0: <Intel 63XXESB2 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x1880-0x188f at device 31.1 on pci0
> > >> ata0: <ATA channel 0> on atapci0
> > >> ata1: <ATA channel 1> on atapci0
> > >> pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
> > >> acpi_button0: <Power Button> on acpi0
> > >> atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
> > >> atkbd0: <AT Keyboard> irq 1 on atkbdc0
> > >> kbd0 at atkbd0
> > >> atkbd0: [GIANT-LOCKED]
> > >> psm0: <PS/2 Mouse> irq 12 on atkbdc0
> > >> psm0: [GIANT-LOCKED]
> > >> psm0: model Generic PS/2 mouse, device ID 0
> > >> sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
> > >> sio0: type 16550A
> > >> sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
> > >> sio1: type 16550A
> > >> fdc0: <floppy drive controller> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
> > >> fdc0: [FAST]
> > >> ppc0: <ECP parallel printer port> port 0x378-0x37f,0x778-0x77f irq 7 drq 3 on acpi0
> > >> ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
> > >> ppc0: FIFO with 16/16/9 bytes threshold
> > >> ppbus0: <Parallel port bus> on ppc0
> > >> plip0: <PLIP network interface> on ppbus0
> > >> lpt0: <Printer> on ppbus0
> > >> lpt0: Interrupt-driven port
> > >> ppi0: <Parallel I/O> on ppbus0
> > >> pmtimer0 on isa0
> > >> orm0: <ISA Option ROMs> at iomem 0xc0000-0xcafff,0xcb000-0xcc7ff on isa0
> > >> sc0: <System console> at flags 0x100 on isa0
> > >> sc0: VGA <16 virtual consoles, flags=0x300>
> > >> vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
> > >> rue0: USBKR100 USB 10/100 LAN, rev 1.10/1.00, addr 2
> > >> miibus0: <MII bus> on rue0
> > >> ruephy0: <RealTek RTL8150 internal media interface> on miibus0
> > >> ruephy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
> > >> rue0: Ethernet address: 00:10:60:dd:ed:e9
> > >> rue0: if_start running deferred for Giant
> > >> Timecounter "TSC" frequency 2000078406 Hz quality 800
> > >> Timecounters tick every 1.000 msec
> > >> acd0: CDRW <HL-DT-STCD-RW/DVD-ROM GCC-4244N/B103> at ata0-master UDMA33
> > >> da0 at twa0 bus 0 target 0 lun 0
> > >> da0: <AMCC 9550SX-4LP DISK 3.04> Fixed Direct Access SCSI-3 device
> > >> da0: 100.000MB/s transfers
> > >> da0: 238408MB (488259584 512 byte sectors: 255H 63S/T 30392C)
> > >> da1 at twa0 bus 0 target 1 lun 0
> > >> da1: <AMCC 9550SX-4LP DISK 3.04> Fixed Direct Access SCSI-3 device
> > >> da1: 100.000MB/s transfers
> > >> da1: 238408MB (488259584 512 byte sectors: 255H 63S/T 30392C)
> > >> Trying to mount root from ufs:/dev/da0s1a
> > >> em0: link state changed to UP
> > >> em0: promiscuous mode enabled
> > >> em0: promiscuous mode disabled
> > >> twa0: INFO: (0x04: 0x0029): Verify started: unit=0
> > >> twa0: INFO: (0x04: 0x002B): Verify completed: unit=0
> > >> ----------------------------
> > >>
> > >> This is while booting GENERIC.  I can boot SMP and send that too if you
> > >> suggest.
> > >>
> > >> Here's vmstat -i:
> > >>
> > >> interrupt                          total       rate
> > >> irq1: atkbd0                           2          0
> > >> irq6: fdc0                             3          0
> > >> irq14: ata0                           47          0
> > >> irq16: uhci3                       14836          0
> > >> irq17: uhci0 ehci0                    25          0
> > >> irq18: em0 uhci2                   91850          2
> > >> irq24: twa0                        14828          0
> > >> cpu0: timer                     79015190       1999
> > >> Total                           79136781       2003
> > >>
> > >> Is the fact that em0 and uhci2 are sharing an interrupt significant?
> > >
> > >Possibly, but it should work. Could you try a kernel with that defined out?
> > >Secondly, would it be possible to load the latest snapshot of CURRENT
> > >to see how it behaves? It uses MSI and would indicate if this is an
> > >interrupt thing, I still doubt this however.
> > >
> > >Look over your docs and see if there's some means to disable system
> > >management, I still think it's interfering. I have a group meeting in 10
> > >mins, I'll check on any known issues with this.
> >
> > Don't bother installing CURRENT, just got out of my meeting and I found
> > out what the problem is. There is indeed an issue with management, and
> > it's something our test group isn't set up to test.
> > I will send a patch to try sometime before end of day.

OK, here is the patch, this should fix it...

Cheers,

Jack

--- dist/if_em.c	Sun Jan 21 04:13:37 2007
+++ ./if_em.c	Tue Mar  6 05:13:07 2007
@@ -245,6 +245,8 @@
 static void	em_set_promisc(struct adapter *);
 static void	em_disable_promisc(struct adapter *);
 static void	em_set_multi(struct adapter *);
+static void	em_setup_manageability(struct adapter *);
+static void	em_release_manageability(struct adapter *);
 static void	em_print_hw_stats(struct adapter *);
 static void	em_update_link_status(struct adapter *);
 static int	em_get_buf(int i, struct adapter *, struct mbuf *);
@@ -511,6 +513,9 @@
 	/* Initialize eeprom parameters */
 	em_init_eeprom_params(&adapter->hw);
 
+	/* Determine if we have hardware managability */
+	adapter->em_mng_passthru = em_enable_mng_pass_thru(&adapter->hw);
+
 	tsize = roundup2(adapter->num_tx_desc * sizeof(struct em_tx_desc),
 	    EM_DBA_ALIGN);
 
@@ -1101,6 +1106,9 @@
 #endif
 	}
 
+	/* Configure for OS presence */
+	em_setup_manageability(adapter);
+
 	/* Prepare transmit descriptors and buffers */
 	em_setup_transmit_structures(adapter);
 	em_initialize_transmit_unit(adapter);
@@ -2105,6 +2113,10 @@
 
 	/* Tell the stack that the interface is no longer active */
 	ifp->if_drv_flags &= ~(IFF_DRV_RUNNING | IFF_DRV_OACTIVE);
+
+	/* Enable HW manageability if available */
+	em_release_manageability(adapter);
+
 	em_reset_hw(&adapter->hw);
 }
 
@@ -4163,3 +4175,46 @@
 	    SYSCTL_CHILDREN(device_get_sysctl_tree(adapter->dev)),
 	    OID_AUTO, name, CTLTYPE_INT|CTLFLAG_RW, limit, value, description);
 }
+
+#define E1000_82542_MANC2H E1000_MANC2H  // A shared code workaround
+
+static void
+em_setup_manageability(struct adapter *adapter)
+{
+	if (adapter->em_mng_passthru) {
+		int manc2h = E1000_READ_REG(&adapter->hw, MANC2H);
+		int manc = E1000_READ_REG(&adapter->hw, MANC);
+
+		/* disable hardware interception of ARP */
+		manc &= ~(E1000_MANC_ARP_EN);
+
+		/* enable receiving management packets to the host */
+		if (adapter->hw.mac_type >= em_82571) {
+			manc |= E1000_MANC_EN_MNG2HOST;
+#define E1000_MNG2HOST_PORT_623 (1 << 5)
+#define E1000_MNG2HOST_PORT_664 (1 << 6)
+			manc2h |= E1000_MNG2HOST_PORT_623;
+			manc2h |= E1000_MNG2HOST_PORT_664;
+			E1000_WRITE_REG(&adapter->hw, MANC2H, manc2h);
+		}
+
+		E1000_WRITE_REG(&adapter->hw, MANC, manc);
+	}
+}
+
+static void
+em_release_manageability(struct adapter *adapter)
+{
+	if (adapter->em_mng_passthru) {
+		int manc = E1000_READ_REG(&adapter->hw, MANC);
+
+		/* re-enable hardware interception of ARP */
+		manc |= E1000_MANC_ARP_EN;
+
+		if (adapter->hw.mac_type >= em_82571)
+			manc &= ~E1000_MANC_EN_MNG2HOST;
+
+		E1000_WRITE_REG(&adapter->hw, MANC, manc);
+	}
+}
+
--- dist/if_em.h	Sun Jan 21 04:13:36 2007
+++ ./if_em.h	Tue Mar  6 03:05:15 2007
@@ -334,6 +334,7 @@
 	int		if_flags;
 	struct mtx	mtx;
 	int		em_insert_vlan_header;
+	int		em_mng_passthru;
 #ifdef EM_FAST_INTR
 	struct task	link_task;
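
For anyone who wants to follow the register logic without a kernel build, the sketch below mimics what the two new functions do to MANC and MANC2H, using a plain struct in place of the hardware. It is only an illustration under stated assumptions: the demo_* names, the struct, and the has_mng2host flag (standing in for the mac_type >= em_82571 check) are hypothetical, and the two DEMO_MANC_* bit positions are placeholders rather than values taken from the driver headers. Only the port 623/664 bits (1 << 5 and 1 << 6) come from the patch itself.

```c
#include <stdint.h>

/* Placeholder bit positions, for illustration only. The real
 * E1000_MANC_* values are defined in the driver's shared code. */
#define DEMO_MANC_ARP_EN        (UINT32_C(1) << 13)
#define DEMO_MANC_EN_MNG2HOST   (UINT32_C(1) << 21)

/* These two bit positions are the ones the patch defines. */
#define DEMO_MNG2HOST_PORT_623  (UINT32_C(1) << 5)
#define DEMO_MNG2HOST_PORT_664  (UINT32_C(1) << 6)

/* Hypothetical stand-in for the two hardware registers. */
struct demo_regs {
	uint32_t manc;
	uint32_t manc2h;
};

/* Mirrors em_setup_manageability(): stop the management firmware from
 * intercepting ARP, and on newer parts forward management traffic on
 * ports 623 and 664 to the host. */
void
demo_setup_manageability(struct demo_regs *r, int has_mng2host)
{
	r->manc &= ~DEMO_MANC_ARP_EN;
	if (has_mng2host) {
		r->manc |= DEMO_MANC_EN_MNG2HOST;
		r->manc2h |= DEMO_MNG2HOST_PORT_623 | DEMO_MNG2HOST_PORT_664;
	}
}

/* Mirrors em_release_manageability(): hand ARP back to the management
 * firmware when the interface is stopped. As in the patch, MANC2H is
 * left untouched here. */
void
demo_release_manageability(struct demo_regs *r, int has_mng2host)
{
	r->manc |= DEMO_MANC_ARP_EN;
	if (has_mng2host)
		r->manc &= ~DEMO_MANC_EN_MNG2HOST;
}
```

Starting from a register image with only the ARP bit set, a setup call followed by a release call should leave MANC as it started, which is a quick way to check that the two functions are symmetric.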
Mark Costlow
2007-Mar-06 02:32 UTC
PATCH : ARP problem with 6.2-STABLE Intel PRO/1000 NIC, latest em driver
On Mon, Mar 05, 2007 at 02:13:36PM -0800, Jack Vogel wrote:
[...snip...]
> >>
> >> Don't bother installing CURRENT, just got out of my meeting and I found
> >> out what the problem is. There is indeed an issue with management, and
> >> it's something our test group isn't set up to test. I will send a patch to
> >> try sometime before end of day.
>
> OK, here is the patch, this should fix it...

Hi Jack, the patch didn't seem to have any effect.  When I run
"tcpdump -n arp" after rebooting with this patch, I still see 2-3 ARPs
per minute instead of the expected 100-200 per minute.

I was patching against:

/*$FreeBSD: src/sys/dev/em/if_em.c,v 1.65.2.22 2007/03/01 17:32:27 csjp Exp $*/

Is that correct?

I tried both SMP and non-SMP kernels, with the same results.

Is there anything I can do to gather some additional debug information
from the system while it's running?

I neglected to mention before the specific motherboard model:
Supermicro X7DVL-E.  There is no IPMI card installed, and no IPMI setting
in the BIOS.

Thanks,

Mark
-- 
Mark Costlow    | Southwest Cyberport | Fax:   +1-505-232-7975
cheeks@swcp.com | Web: www.swcp.com   | Voice: +1-505-232-7992

abq-strange.com -- Interesting photos taken in Albuquerque, NM
   Last post: Art Is OK...And Dangerous - 2007-03-02 10:27:17