Hello,
I have a dual-Xeon computer with PCI-X.
I'm using Xen 2.0.5 stable with kernel 2.6.10.
I'm trying to run Infiniband in domain zero. I have an Infiniband HCA, which
uses PCI-X.
The HCA is a multifunction PCI device that also includes a PCI bridge.
With the vanilla 2.6.10 kernel, two PCI devices are detected for the HCA: the
PCI bridge on bus 13 and the HCA itself on bus 14, behind the bridge.
lspci -vv output looks like this:
---------------------- lspci -vv output with vanilla kernel ---------------------------
<few devices here omitted>
0000:12:02.0 Ethernet controller: Intel Corp. 82544GC Gigabit Ethernet
Controller (LOM) (rev 02)
Subsystem: Intel Corp. 82544GC Based Network Connection
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping-
SERR- FastB2B-
Status: Cap+ 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
Latency: 32 (63750ns min), cache line size 08
Interrupt: pin A routed to IRQ 19
Region 0: Memory at ec000000 (64-bit, non-prefetchable) [size=fe8e0000]
Region 2: Memory at eb800000 (64-bit, non-prefetchable) [size=128K]
Region 4: I/O ports at a000 [size=32]
Expansion ROM at 00020000 [disabled]
Capabilities: [dc] Power Management version 2
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 PME-Enable- DSel=0 DScale=1 PME-
Capabilities: [e4] PCI-X non-bridge device.
Command: DPERE- ERO+ RBC=0 OST=0
Status: Bus=0 Dev=0 Func=0 64bit- 133MHz- SCD- USC-, DC=simple, DMMRBC=0,
DMOST=0, DMCRS=0, RSCEM-
Capabilities: [f0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
Address: 0000000000000000 Data: 0000
0000:13:02.0 PCI bridge: Mellanox Technology MT23108 InfiniHost HCA bridge (rev
a1) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping-
SERR+ FastB2B-
Status: Cap+ 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
Latency: 32, cache line size 08
Bus: primary=13, secondary=14, subordinate=14, sec-latency=32
Memory behind bridge: eb000000-eb7fffff
Prefetchable memory behind bridge: 00000000f0000000-00000000fe700000
BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
Capabilities: [70] PCI-X bridge device.
Secondary Status: 64bit+, 133MHz+, SCD-, USC-, SCO-, SRD- Freq=3
Status: Bus=0 Dev=0 Func=0 64bit- 133MHz- SCD- USC-, SCO-, SRD-
: Upstream: Capacity=0, Commitment Limit=0
: Downstream: Capacity=0, Commitment Limit=0
0000:14:00.0 InfiniBand: Mellanox Technology MT23108 InfiniHost HCA (rev a1)
Subsystem: Mellanox Technology MT23108 InfiniHost HCA
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping-
SERR+ FastB2B-
Status: Cap+ 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
Latency: 32, cache line size 08
Interrupt: pin A routed to IRQ 17
Region 0: Memory at eb000000 (64-bit, non-prefetchable)
Region 2: Memory at fe000000 (64-bit, prefetchable) [size=8M]
Region 4: Memory at f0000000 (64-bit, prefetchable) [size=128M]
Capabilities: [40] #11 [001f]
Capabilities: [50] Vital Product Data
Capabilities: [60] Message Signalled Interrupts: 64bit+ Queue=0/5 Enable-
Address: 0000000000000000 Data: 0000
Capabilities: [70] PCI-X non-bridge device.
Command: DPERE- ERO- RBC=0 OST=1
Status: Bus=0 Dev=0 Func=0 64bit- 133MHz- SCD- USC-, DC=simple, DMMRBC=0,
DMOST=0, DMCRS=0, RSCEM-
-------------------------------------------------------------------------------------------------------
My problem is that the Xen kernel (linux-2.6.10-xen0) does not detect the PCI
bridge (0000:13:02.0 on bus 13); it only detects the HCA on the second bus
(bus 14).
(This is also visible in the PCI scan messages during kernel boot.)
---------------------- lspci -vv output with 2.6.10-xen0 kernel ---------------------------
<few devices here omitted>
0000:12:02.0 Ethernet controller: Intel Corp. 82544GC Gigabit Ethernet
Controller (LOM) (rev 02)
Subsystem: Intel Corp. 82544GC Based Network Connection
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping-
SERR- FastB2B-
Status: Cap+ 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
Latency: 32, cache line size 08
Interrupt: pin ? routed to IRQ 19
Region 0: Memory at ec000000 (64-bit, non-prefetchable) [size=fe8e0000]
Region 2: Memory at eb800000 (64-bit, non-prefetchable) [size=128K]
Region 4: I/O ports at a000 [size=32]
Expansion ROM at 00020000 [disabled]
Capabilities: [dc] Power Management version 2
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 PME-Enable- DSel=0 DScale=1 PME-
Capabilities: [e4] PCI-X non-bridge device.
Command: DPERE- ERO+ RBC=0 OST=0
Status: Bus=0 Dev=0 Func=0 64bit- 133MHz- SCD- USC-, DC=simple, DMMRBC=0,
DMOST=0, DMCRS=0, RSCEM-
Capabilities: [f0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
Address: 0000000000000000 Data: 0000
0000:14:00.0 InfiniBand: Mellanox Technology MT23108 InfiniHost HCA (rev a1)
Subsystem: Mellanox Technology MT23108 InfiniHost HCA
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping-
SERR+ FastB2B-
Status: Cap+ 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
Latency: 32, cache line size 08
Interrupt: pin ? routed to IRQ 17
Region 0: Memory at eb000000 (64-bit, non-prefetchable)
Region 2: Memory at fe000000 (64-bit, prefetchable) [size=8M]
Region 4: Memory at f0000000 (64-bit, prefetchable) [size=128M]
Capabilities: [40] #11 [001f]
Capabilities: [50] Vital Product Data
Capabilities: [60] Message Signalled Interrupts: 64bit+ Queue=0/5 Enable-
Address: 0000000000000000 Data: 0000
Capabilities: [70] PCI-X non-bridge device.
Command: DPERE- ERO- RBC=0 OST=1
Status: Bus=0 Dev=0 Func=0 64bit- 133MHz- SCD- USC-, DC=simple, DMMRBC=0,
DMOST=0, DMCRS=0, RSCEM-
-----------------------------------------------------------------------------------------------------------
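To make the comparison between the two listings less error-prone, here is a minimal Python sketch that extracts the device addresses from two lspci listings and reports which ones appear only in the first. The function names (devices, missing) are made up for illustration, and the sample strings contain only the device header lines quoted above, not the full output.

```python
import re

# A device header line starts with domain:bus:device.function, e.g. "0000:13:02.0".
ADDR = re.compile(r"^([0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s+(.*)$")

def devices(lspci_text):
    """Map each PCI address found in an lspci listing to the rest of its header line."""
    found = {}
    for line in lspci_text.splitlines():
        m = ADDR.match(line)
        if m:
            found[m.group(1)] = m.group(2)
    return found

def missing(reference_text, other_text):
    """Return the devices listed in reference_text but absent from other_text."""
    other = devices(other_text)
    return {a: d for a, d in devices(reference_text).items() if a not in other}

# Header lines copied from the two listings in this post (details omitted).
vanilla = """\
0000:12:02.0 Ethernet controller: Intel Corp. 82544GC Gigabit Ethernet
0000:13:02.0 PCI bridge: Mellanox Technology MT23108 InfiniHost HCA bridge (rev
0000:14:00.0 InfiniBand: Mellanox Technology MT23108 InfiniHost HCA (rev a1)
"""
xen0 = """\
0000:12:02.0 Ethernet controller: Intel Corp. 82544GC Gigabit Ethernet
0000:14:00.0 InfiniBand: Mellanox Technology MT23108 InfiniHost HCA (rev a1)
"""

for addr, desc in sorted(missing(vanilla, xen0).items()):
    print(addr, desc)
```

On the header lines above, the only address reported as missing under the xen0 kernel is 0000:13:02.0, the bridge.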
Another difference I see, besides the missing bridge, is that the interrupt pin
is marked with '?' in the Xen kernel:
Interrupt: pin ? routed to IRQ 17
^^^^^
In the vanilla kernel it says something like:
Interrupt: pin A routed to IRQ 17
^^^^^
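As far as I understand, the pin letter comes from the Interrupt Pin register at offset 0x3D of the PCI configuration header (1 = A through 4 = D, 0 = no pin), so the '?' may mean that this register reads as 0 under the Xen kernel. That reading of lspci's behavior is my assumption and I have not verified it. A rough Python sketch for checking the raw value follows; read_config is a hypothetical helper that assumes sysfs exposes the config space at the path shown, and the sample buffer is synthetic, not data read from the HCA.

```python
PCI_INTERRUPT_PIN = 0x3D  # offset of the Interrupt Pin register in the config header

def interrupt_pin(config):
    """Return 'A'..'D' for pin values 1..4, or '?' for any other value (0 = no pin)."""
    pin = config[PCI_INTERRUPT_PIN]
    return "ABCD"[pin - 1] if 1 <= pin <= 4 else "?"

def read_config(address):
    # Hypothetical helper: assumes the kernel exposes config space via sysfs here.
    with open("/sys/bus/pci/devices/%s/config" % address, "rb") as f:
        return f.read(64)

# Synthetic 64-byte header for illustration only.
sample = bytearray(64)
sample[PCI_INTERRUPT_PIN] = 1
print(interrupt_pin(sample))
```

On the real machine one would call interrupt_pin(read_config("0000:14:00.0")) under both kernels and compare the results.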
When I load the HCA driver under the Xen kernel, it fails (it succeeds under
the vanilla kernel).
Is anybody else experiencing such problems?
Is there a fix for this problem?
Are there any known problems with the PCI support in Xen in general?
I read in the mailing list that Xen is hiding PCI bridges. Why is that?
Thanks,
Ro'ee