This is a return to an issue I first raised back in June. We had a similar
occurrence in September while I was away and so I am revisiting the entire
matter.
Steve Clark on 6 Jun 16:02 2014 wrote:
> Hi,
>
> We ran into this problem also - the interface would disappear.
> There is newer e1000e driver that fixes it or you could
> add pcie_aspm=off to your kernel command line.
>
> HTH,
> Steve
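As an aside, whether a running host was actually booted with that parameter
can be confirmed by reading /proc/cmdline. What follows is only a minimal
sketch in Python. The function names are my own invention for illustration,
and it assumes the parameter is spelled pcie_aspm=<value>, as in the
suggestion quoted above.

```python
def pcie_aspm_setting(cmdline):
    """Return the value of the last pcie_aspm= token, or None if absent."""
    value = None
    for token in cmdline.split():
        if token.startswith("pcie_aspm="):
            # Keep the last occurrence if the parameter is repeated.
            value = token.split("=", 1)[1]
    return value


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as fh:
        return fh.read()
```

On the host itself, pcie_aspm_setting(read_cmdline()) returns None when the
parameter was not given at boot.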
I have run into other reports of similar occurrences and some of these refer
to this bug report: https://bugzilla.redhat.com/show_bug.cgi?id=632650
However, that report is closed as a duplicate of:
https://bugzilla.redhat.com/show_bug.cgi?id=562273
which is not available for viewing by the great unwashed.
Nonetheless, following the discussion thread in the bug report that I can
view, it appears that this issue was supposedly resolved sometime in late 2012.
From what I can gather, the fix was to disable ASPM L1 for this model of
adaptor in the e1000e driver module.
* Upstream commit d4a4206ebbaf48b55803a7eb34e330530d83a889 - e1000e: Disable
ASPM L1 on 82574
However, when I run lspci -vvv on the host that exhibited the problem I see
this:
. . .
03:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
Subsystem: Super Micro Computer Inc Device 10d3
Physical Slot: 0-2
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 17
Region 0: Memory at feae0000 (32-bit, non-prefetchable) [size=128K]
Region 2: I/O ports at ec00 [size=32]
Region 3: Memory at feadc000 (32-bit, non-prefetchable) [size=16K]
Capabilities: [c8] Power Management version 2
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
Capabilities: [d0] MSI: Enable- Count=1/1 Maskable- 64bit+
Address: 0000000000000000 Data: 0000
Capabilities: [e0] Express (v1) Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend-
LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <128ns, L1 <64us
ClockPM- Surprise- LLActRep- BwNot-
############
LnkCtl: ASPM L0s L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
############
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
Capabilities: [a0] MSI-X: Enable+ Count=5 Masked-
Vector table: BAR=3 offset=00000000
PBA: BAR=3 offset=00002000
Capabilities: [100 v1] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr+ BadTLP+ BadDLLP+ Rollover- Timeout+ NonFatalErr+
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
Capabilities: [140 v1] Device Serial Number 00-25-90-ff-ff-61-74-c1
Kernel driver in use: e1000e
Kernel modules: e1000e
. . .
lsmod
. . .
e1000e 267701 0
. . .
The host is running CentOS-6.5 with all updates applied to date. My question
is: has this issue been addressed in the official e1000e module or not? If
not, does the recommendation to "add pcie_aspm=off to your kernel command
line" still hold?
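For anyone wishing to repeat the check on their own hardware, the LnkCtl
line that I marked in the lspci output above can be picked out
programmatically. This is only a rough sketch in Python. The function names
are hypothetical, and it assumes the "LnkCtl: ASPM <states>;" layout shown in
my output, which other versions of lspci may print differently.

```python
import re
import subprocess

# A device block starts with an address such as "03:00.0", optionally
# preceded by a PCI domain such as "0000:".
DEVICE_RE = re.compile(r"^(?:[0-9a-fA-F]{4}:)?[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]\s")
LNKCTL_RE = re.compile(r"LnkCtl:\s*ASPM\s+([^;]+);")


def aspm_link_states(lspci_text):
    """Map each device header line to the ASPM text on its LnkCtl line."""
    states = {}
    device = None
    for line in lspci_text.splitlines():
        if DEVICE_RE.match(line):
            device = line.strip()
            continue
        found = LNKCTL_RE.search(line)
        if found and device is not None:
            states[device] = found.group(1).strip()
    return states


def l1_enabled(setting):
    """True when the LnkCtl ASPM text lists L1 as enabled."""
    words = setting.split()
    return "L1" in words and "Enabled" in words


def scan_host():
    """Run lspci -vvv on this host (root is needed to see capabilities)."""
    result = subprocess.run(["lspci", "-vvv"], capture_output=True,
                            text=True, check=True)
    return aspm_link_states(result.stdout)
```

Calling scan_host() as root and applying l1_enabled() to the entry for the
82574L would show whether L1 is still active on that link. It reports only
the current link setting and says nothing about which driver version set it.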
--
*** E-Mail is NOT a SECURE channel ***
James B. Byrne mailto:ByrneJB at Harte-Lyne.ca
Harte & Lyne Limited http://www.harte-lyne.ca
9 Brockley Drive vox: +1 905 561 1241
Hamilton, Ontario fax: +1 905 561 0757
Canada L8E 3C3