thr3ads.net - CentOS - [CentOS] Failing Network card [Jun 2012]

Gregory P. Ennis

2012-Jun-20 13:34 UTC

[CentOS] Failing Network card

Everyone,

Most of the time I am over my head in trying to troubleshoot problems.
However, after reading manuals, man pages, and getting advice from this
list I have been able to work my way through difficulties, and at the
end, I usually have a better understanding of what 'is going on'.  I can
only hope this method will work on this problem too.

I have been chasing a problem with a pci-e TrendNet(TEG-ECTX) gigabit
card.  After adding the card to a machine with a new Centos 6.2 install
and naming it 'eth4' it works well for 6 to 12 hours and then fails.
The failure is characterized by dropping its connection speed from 1000
to 100 while not allowing any data to flow in or out.  When this happens
a shutdown and reboot does not solve the problem, but shutting down and
then removing the power does solve the problem.  
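One way to record when the link falls from 1000 to 100 is to poll the negotiated speed. The following is only a minimal sketch, not part of the original post. It assumes Python 3.7+ and that the `ethtool` utility is installed; the function names are made up for illustration. It reads the `Speed:` line that `ethtool eth4` prints.

```python
import re
import subprocess


def parse_speed_mbps(ethtool_output):
    """Return the speed in Mb/s found in `ethtool <iface>` output, or None.

    ethtool prints a line such as "Speed: 1000Mb/s"; when the link is down
    it prints "Speed: Unknown!", which yields None here.
    """
    match = re.search(r"^\s*Speed:\s*(\d+)Mb/s", ethtool_output, re.MULTILINE)
    return int(match.group(1)) if match else None


def current_speed_mbps(iface="eth4"):
    """Run ethtool for the interface (usually needs root) and parse the speed."""
    result = subprocess.run(["ethtool", iface], capture_output=True, text=True)
    return parse_speed_mbps(result.stdout)
```

Calling `current_speed_mbps("eth4")` from a periodic job and logging the value with a timestamp would show whether the speed drop happens before or after the kernel warning.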

I wrote a Perl script that uses the eth4 interface to ping another
machine every 60 seconds, so that I could relate the message log entries
to the time of failure. I think the failure of eth4 correlates with the
log entry below. Unfortunately, I am way over my head on this one.  If
any of you can help I would surely appreciate your thoughts.
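The Perl script itself was not posted. As a rough sketch of the same idea in Python (the target address, log file name, and function names are assumptions for illustration only), a monitor could send one ping through eth4 every 60 seconds and append a timestamped result to a file, so the entries can be lined up against /var/log/messages:

```python
import subprocess
import time
from datetime import datetime


def build_ping_command(target, iface="eth4", timeout_s=5):
    """Build an iputils ping command: -I binds to the interface,
    -c 1 sends a single probe, -W is the reply timeout in seconds."""
    return ["ping", "-I", iface, "-c", "1", "-W", str(timeout_s), target]


def format_result(when, target, ok):
    """Format one log line with a syslog-like timestamp."""
    status = "ok" if ok else "FAILED"
    return "%s ping %s %s" % (when.strftime("%b %d %H:%M:%S"), target, status)


def monitor(target, iface="eth4", interval_s=60, log_path="eth4-ping.log"):
    """Ping the target forever, appending one line per probe to log_path."""
    while True:
        # ping exits with status 0 only if a reply was received
        exit_code = subprocess.call(
            build_ping_command(target, iface),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        with open(log_path, "a") as log:
            log.write(format_result(datetime.now(), target, exit_code == 0) + "\n")
        time.sleep(interval_s)
```

Running `monitor("192.0.2.1")` (a placeholder address) would leave a log whose first FAILED line marks roughly when the interface stopped passing traffic.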

Some additional information that may be useful.  The TrendNet card is
the second TrendNet card I have used.  The first card had the same
symptoms, and I deduced the card was bad, and purchased another one. The
symptoms are the same with the second card.  

Before I purchase a third card from a different manufacturer I thought I
would post this to see what some of you think.  This is the first pci-e
card I have used; are there problems with the pci-e interfaces as
opposed to pci?  Do you think the motherboard could be the problem, and
would moving eth4 to a different slot on the motherboard be worthwhile?

Any ideas?

Greg Ennis
P.S.  Here is the relevant log entry from the /var/log/messages file.

Jun 20 03:08:38 Mail kernel: ------------[ cut here ]------------
Jun 20 03:08:38 Mail kernel: WARNING: at net/sched/sch_generic.c:261 dev_watchdog+0x26d/0x280() (Not tainted)
Jun 20 03:08:38 Mail kernel: Hardware name: p7-1220
Jun 20 03:08:38 Mail kernel: NETDEV WATCHDOG: eth4 (r8169): transmit queue 0 timed out
Jun 20 03:08:38 Mail kernel: Modules linked in: ipt_REDIRECT ipt_LOG xt_limit ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat xt_CHECKSUM iptable_mangle bridge autofs4 sunrpc bnx2fc cnic uio fcoe libfcoe libfc 8021q scsi_transport_fc garp stp llc scsi_tgt cpufreq_ondemand powernow_k8 freq_table mperf ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 vhost_net macvtap macvlan tun kvm uinput sg btusb bluetooth rfkill microcode snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc i2c_piix4 r8169 mii ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif usb_storage sdhci_pci sdhci mmc_core ahci radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Jun 20 03:08:38 Mail kernel: Pid: 0, comm: swapper Not tainted 2.6.32-220.23.1.el6.centos.plus.x86_64 #1
Jun 20 03:08:38 Mail kernel: Call Trace:
Jun 20 03:08:38 Mail kernel: <IRQ>  [<ffffffff81069c97>] ? warn_slowpath_common+0x87/0xc0
Jun 20 03:08:38 Mail kernel: [<ffffffff81069d86>] ? warn_slowpath_fmt+0x46/0x50
Jun 20 03:08:38 Mail kernel: [<ffffffff81069d86>] ? warn_slowpath_fmt+0x46/0x50
Jun 20 03:08:38 Mail kernel: [<ffffffff81451c0d>] ? dev_watchdog+0x26d/0x280
Jun 20 03:08:38 Mail kernel: [<ffffffff814519a0>] ? dev_watchdog+0x0/0x280
Jun 20 03:08:38 Mail kernel: [<ffffffff810efbf3>] ? trace_nowake_buffer_unlock_commit+0x43/0x60
Jun 20 03:08:38 Mail kernel: [<ffffffff814519a0>] ? dev_watchdog+0x0/0x280
Jun 20 03:08:38 Mail kernel: [<ffffffff8107cab7>] ? run_timer_softirq+0x197/0x340
Jun 20 03:08:38 Mail kernel: [<ffffffff81072291>] ? __do_softirq+0xc1/0x1d0
Jun 20 03:08:38 Mail kernel: [<ffffffff810958b0>] ? hrtimer_interrupt+0x140/0x250
Jun 20 03:08:38 Mail kernel: [<ffffffff8100c24c>] ? call_softirq+0x1c/0x30
Jun 20 03:08:38 Mail kernel: [<ffffffff8100de85>] ? do_softirq+0x65/0xa0
Jun 20 03:08:38 Mail kernel: [<ffffffff81072075>] ? irq_exit+0x85/0x90
Jun 20 03:08:38 Mail kernel: [<ffffffff814fc550>] ? smp_apic_timer_interrupt+0x70/0x9b
Jun 20 03:08:38 Mail kernel: [<ffffffff8100bc13>] ? apic_timer_interrupt+0x13/0x20
Jun 20 03:08:38 Mail kernel: <EOI>  [<ffffffff812f5f9c>] ? acpi_idle_enter_simple+0x114/0x14b
Jun 20 03:08:38 Mail kernel: [<ffffffff812f5f98>] ? acpi_idle_enter_simple+0x110/0x14b
Jun 20 03:08:38 Mail kernel: [<ffffffff814014a7>] ? cpuidle_idle_call+0xa7/0x140
Jun 20 03:08:38 Mail kernel: [<ffffffff81009e06>] ? cpu_idle+0xb6/0x110
Jun 20 03:08:38 Mail kernel: [<ffffffff814ed686>] ? start_secondary+0x202/0x245
Jun 20 03:08:38 Mail kernel: ---[ end trace 24f15998c117ac8f ]---
Jun 20 03:08:38 Mail kernel: r8169 0000:01:00.0: eth4: link up
Jun 20 03:08:39 Mail abrtd: Directory 'oops-2012-06-20-03:08:39-2420-0' creation detected
Jun 20 03:08:39 Mail abrt-dump-oops: Reported 1 kernel oopses to Abrt
Jun 20 03:08:39 Mail abrtd: Can't open file '/var/spool/abrt/oops-2012-06-20-03:08:39-2420-0/uid': No such file or directory
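To line up events like the one above with the ping failures, the watchdog lines can be pulled out of the log programmatically. This is only a sketch: it assumes each entry occupies a single line in the real /var/log/messages (the mail client wrapped them here), and the regular expression and function name are illustrative rather than any standard tool.

```python
import re

# Matches e.g.:
# "Jun 20 03:08:38 Mail kernel: NETDEV WATCHDOG: eth4 (r8169): transmit queue 0 timed out"
WATCHDOG_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+kernel:\s+"
    r"NETDEV WATCHDOG:\s+(?P<iface>\S+)\s+\((?P<driver>[^)]+)\):\s+"
    r"transmit\s+queue\s+(?P<queue>\d+)\s+timed out"
)


def find_watchdog_timeouts(lines):
    """Return one dict (stamp, host, iface, driver, queue) per watchdog line."""
    events = []
    for line in lines:
        match = WATCHDOG_RE.search(line)
        if match:
            events.append(match.groupdict())
    return events
```

Feeding it the open log file, for example `find_watchdog_timeouts(open("/var/log/messages"))`, gives the timestamps at which the r8169 transmit queue stalled.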

m.roth at 5-cent.us

2012-Jun-20 14:06 UTC

Gregory P. Ennis wrote:
<snip>
> I have been chasing a problem with a pci-e TrendNet(TEG-ECTX) gigabit
> card.  After adding the card to a machine with a new Centos 6.2 install
> and naming it 'eth4' it works well for 6 to 12 hours and then fails.
> The failure is characterized by dropping its connection speed from 1000
> to 100 while not allowing any data to flow in or out.  When this happens
> a shutdown and reboot does not solve the problem, but shutting down and
> then removing the power does solve the problem.
<snip>
> Some additional information that may be useful.  The TrendNet card is
> the second TrendNet card I have used.  The first card had the same
> symptoms, and I deduced the card was bad, and purchased another one. The
> symptoms are the same with the second card.
<snip>

Several questions: do you have another machine on the same network? Does
*it* show the problem, around the same time?

And, finally, did you buy both TrendNet cards from the same vendor? Are
their MACs close? If so, it could be the vendor got a bad batch, either
OEM's fault, or the gorilla who un/loaded it during shipping.

       mark
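As a small illustration of the "are their MACs close?" check, the sketch below compares two addresses: same vendor prefix (the first three octets) and a small numeric distance would suggest the cards came from the same production run. The helper names and the example addresses are made up; they are not the MACs of the cards in this thread.

```python
def mac_to_int(mac):
    """Convert a MAC such as '02:00:00:aa:bb:10' to a 48-bit integer."""
    parts = mac.replace("-", ":").split(":")
    if len(parts) != 6:
        raise ValueError("expected six octets: %r" % mac)
    # Pad each octet to two hex digits so '0:1:...' style input still works
    return int("".join(part.zfill(2) for part in parts), 16)


def compare_macs(mac_a, mac_b):
    """Return (same_vendor_prefix, numeric_distance) for two MAC addresses."""
    a, b = mac_to_int(mac_a), mac_to_int(mac_b)
    same_prefix = (a >> 24) == (b >> 24)  # top 24 bits are the vendor prefix
    return same_prefix, abs(a - b)
```

For instance, `compare_macs("02:00:00:aa:bb:10", "02:00:00:aa:bb:2f")` reports a shared prefix and a distance of 31, which would count as "close".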

Chris Beattie

2012-Jun-20 17:18 UTC

On 6/20/2012 9:34 AM, Gregory P. Ennis wrote:
> I have been chasing a problem with a pci-e TrendNet(TEG-ECTX) gigabit
> card.  After adding the card to a machine with a new Centos 6.2 install
> and naming it 'eth4' it works well for 6 to 12 hours and then fails.

Try moving the network card to a new slot, especially if you can swap 
the network card with another card which is known to work.  Also, try 
swapping the card into a spare server.

If the problem follows the network card, then the card is probably bad.
If a known-good card misbehaves in the slot where you previously had
the network card, then the slot itself may be at fault.

-- 
-Chris

Chuck Munro

2012-Jun-21 16:32 UTC

> Date: Wed, 20 Jun 2012 10:54:33 -0700
> From: John R Pierce <pierce at hogranch.com>
> Subject: Re: [CentOS] Failing Network card
> To: centos at centos.org
> Message-ID: <4FE20E59.20907 at hogranch.com>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> On 06/20/12 8:44 AM, Gregory P. Ennis wrote:
>> 01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
>> RTL8111/8168B PCI Express Gigabit Ethernet controller (rev ff)
>
> pure unmitigated junk.
>
> -- john r pierce N 37, W 122 santa cruz ca mid-left coast

I agree with John's comment.  Realtek chips are junk with unpredictable 
reliability, especially under heavy load.  I have had several problems 
with various versions of the 81xx chips.  When I tossed the cards in the 
garbage and switched to Intel-based NICs, all the problems went away.

Every time I build systems with Realtek network chips on the 
motherboard, I disable them in the BIOS and add Intel NICs instead.

YMMV, but please consider ditching Realtek altogether.

Chuck
