On 10/11/2016 9:03 PM, Ashish Yadav wrote:
> Please test that if both the server are communicating with each other at
> 1Gbps or not via "iperf" tool.
>
> If above gives result of 1Gbps then it will eliminate the NICs problem then
> you know that it is a problem with cisco switch only.

After they forced the Cisco ports to gigE, I was seeing 200-400Mbps in iperf, which was odd. Both servers were very lightly loaded.

BUT... the switch ports kept going offline on us. Note I have no admin access to the switch; it's managed by IT, so I have to go through channels to get anything. I asked what error codes were causing the ports to go offline but haven't heard back. As of right now, both servers are offline (I can reach their IPMI management controllers and remotely log onto the console just fine, but the ports show no link). When I was in the DC yesterday, I switched ports: same problem. I also swapped the network cable with a different (HP) server, and it had no problems on the same cable+port that's giving these Supermicro servers problems.

I'd chalk it up to a bad NIC, but two identical servers with two NICs each all have this problem, so it's got to be something else, some weirdness with the 82574L as implemented on these SuperMicro X8DTE-F servers running CentOS 6.7?!? In our old DC, these servers ran rock solid for several years without any network issues at all; in that rack I had a Netgear JGS524.

--
john r pierce, recycling bits in santa cruz

On 12.10.2016 06:26, John R Pierce wrote:
> On 10/11/2016 9:03 PM, Ashish Yadav wrote:
>> Please test that if both the server are communicating with each other at
>> 1Gbps or not via "iperf" tool.
>>
>> If above gives result of 1Gbps then it will eliminate the NICs problem
>> then you know that it is a problem with cisco switch only.
>
> after they forced the cisco ports to gigE, I was seeing 200-400Mbps in
> iPerf, which was odd. servers were both very lightly loaded.
>
> BUT... the switch ports kept going offline on us. Note I have no
> admin access to the switch, its managed by IT so I have to go through
> channels to get anything. I asked what error codes were causing the
> ports to go offline but haven't heard back. as of right now, both
> servers are offline, (I can reach their IPMI management controller, and
> remotely log onto the console just fine, but the ports show no link).
> When I was in the DC yesterday, I switched ports, same problem, I also
> switched the network cable with a different (HP) server, it had no
> problems on the same cable+port thats giving these supermicro servers
> problems.
>
> I'd chalk it up to a bad NIC, but two identical servers with two nic's
> each all have this problem, so its got to be something else, some
> weirdness with the 82574L as implemented on these SuperMicro X8DTE-F
> servers running CentOS 6.7 ?!? In our old DC, these servers ran rock
> solid for several years without any network issues at all, in that rack
> I had a Netgear JGS524

A while back there was an issue with this NIC chipset and CentOS, but I'm not sure if it still applies to CentOS 6.7:

https://blog.andreas-haerter.com/2013/02/11/intel-82574l-network-nic-aspm-bug-e1000-linux-rhel-centos-sl-6.3

If this is your problem, then adding "pcie_aspm=off" to the kernel boot parameters should fix it.

Regards,
Dennis

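One way to confirm after a reboot that the workaround actually took effect is to look for the parameter on the running kernel's command line. The following is only a minimal sketch: the helper names `aspm_disabled` and `read_cmdline` are made up for illustration, and it assumes a Linux host where `/proc/cmdline` is readable. It does not prove ASPM was the cause, only that the parameter was passed at boot.

```python
from pathlib import Path


def aspm_disabled(cmdline: str) -> bool:
    """Return True if pcie_aspm=off appears as a kernel boot parameter."""
    # Boot parameters are whitespace-separated tokens.
    return "pcie_aspm=off" in cmdline.split()


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the running kernel's command line; empty string if unavailable."""
    try:
        return Path(path).read_text()
    except OSError:
        return ""


if __name__ == "__main__":
    state = "present" if aspm_disabled(read_cmdline()) else "absent"
    print(f"pcie_aspm=off is {state} on the running kernel's command line")
```

If the parameter shows as absent after editing the boot loader configuration, the edit most likely went to a kernel entry other than the one that was booted.
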
> On Oct 12, 2016, at 12:26 AM, John R Pierce <pierce at hogranch.com> wrote:
>
> the switch ports kept going offline on us.

Not finding anything exactly like this... Closest I could find is CSCuu81949.

Open a Cisco TAC case and upload a Nexus 9000 tech support (`tac-pac`) to investigate further.

Is "port security" enabled on these ports?
Does this port double as a LOM/IPMI port?
What driver is being used (`ethtool -i eth#`)?
Any OS bonding?
Any switch port-channel/vPC?

That NIC chipset is a few years old so it's not like that NIC/OS/switch is a combo that hasn't been tried/tested.

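Since the servers are still reachable over the IPMI console, the driver and link questions above can also be answered from sysfs without network access. This is a rough sketch, not a replacement for `ethtool`: the function names `read_attr` and `link_report` are hypothetical, the interface names `eth0` and `eth1` are assumptions, and it relies on the standard `/sys/class/net/<iface>/` attributes (`operstate`, `carrier`, `speed`) and the `device/driver` symlink.

```python
import os


def read_attr(base, name):
    """Read one sysfs attribute; return None if it is missing or unreadable."""
    try:
        with open(os.path.join(base, name)) as fh:
            return fh.read().strip()
    except OSError:
        # Some attributes (carrier, speed) fail to read while the interface is down.
        return None


def link_report(iface, sysfs_root="/sys/class/net"):
    """Collect driver name and link state for one interface from sysfs."""
    base = os.path.join(sysfs_root, iface)
    driver_link = os.path.join(base, "device", "driver")
    driver = None
    if os.path.exists(driver_link):
        # device/driver is a symlink whose target directory is named after the driver.
        driver = os.path.basename(os.path.realpath(driver_link))
    return {
        "iface": iface,
        "driver": driver,
        "operstate": read_attr(base, "operstate"),
        "carrier": read_attr(base, "carrier"),
        "speed_mbps": read_attr(base, "speed"),
    }


if __name__ == "__main__":
    # Interface names are assumptions; adjust to match the host.
    for name in ("eth0", "eth1"):
        print(link_report(name))
```

Running it in a loop from the console while the switch side is being changed would show whether the link ever comes up and at what speed it negotiates.
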
On 10/13/2016 5:57 PM, Steven Tardy wrote:
>> On Oct 12, 2016, at 12:26 AM, John R Pierce <pierce at hogranch.com> wrote:
>>
>> the switch ports kept going offline on us.
> Not finding anything exactly like this... Closest I could find is CSCuu81949
>
> Open a Cisco TAC case and upload a Nexus 9000 tech support (`tac-pac`) to investigate further.
>
> Is "port security" enabled on these ports?

I don't know; I'm asking LAN operations.

> Does this port double as a LOM/IPMI port?

No, the boards have a dedicated IPMI port, which doesn't go offline.

> What driver is being used `ethtool -i eth#`?

e1000e

> Any OS bonding?

No.

> Any switch port-channel/vPC?

I don't believe so.

> That NIC chipset is a few years old so it's not like that NIC/OS/switch is a combo that hasn't been tried/tested.

Yeah, about as bog-stock normal a config as I've seen.

--
john r pierce, recycling bits in santa cruz