Nadolski, Ed
2010-Mar-19 18:41 UTC
[Xen-devel] mptscsih gets SCSI I/O errors in HVM with VT-d
Hi,

I am running Xen 4.0.0-rc6 on a Dell T7500 quad-core Xeon with Fedora 12 as dom0. I have an LSI FC949E quad-port Fibre Channel HBA that works fine when I run it from either dom0 or baremetal, but when I try to assign this HBA to an HVM using VT-d, I see a bunch of SCSI abort/reset errors from the mptscsih driver in the HVM whenever I run disk I/Os thru the HBA. The HVM OS is off-the-shelf Fedora 12.

Here are the mpt driver error messages from the HVM:

> mptscsih: ioc3: attempting task abort! (sc=ffff88001d4fa900)
> sd 5:0:0:2: [sdd] CDB: Read(10): 28 00 03 7a d3 20 00 00 c0 00
> mptscsih: ioc3: WARNING - Issuing Reset from mptscsih_IssueTaskMgmt!!
> mptbase: ioc3: Initiating recovery
> mptscsih: ioc3: task abort: SUCCESS (sc=ffff88001d4fa900)
> mptscsih: ioc3: attempting task abort! (sc=ffff88001d4fa500)
> sd 5:0:0:2: [sdd] CDB: Read(10): 28 00 03 7a d3 e0 00 01 00 00
> mptscsih: ioc3: task abort: FAILED (sc=ffff88001d4fa500)
> mptscsih: ioc3: attempting target reset! (sc=ffff88001d4fa900)
> sd 5:0:0:2: [sdd] CDB: Read(10): 28 00 03 7a d3 20 00 00 c0 00
> mptscsih: ioc3: target reset: SUCCESS (sc=ffff88001d4fa900)
> mptscsih: ioc3: attempting task abort! (sc=ffff88001d4fb400)
> sd 5:0:0:1: [sdc] CDB: Read(10): 28 00 01 b9 fd 20 00 00 c0 00
> mptscsih: ioc3: WARNING - Issuing Reset from mptscsih_IssueTaskMgmt!!
> mptbase: ioc3: Initiating recovery
> mptscsih: ioc3: task abort: SUCCESS (sc=ffff88001d4fb400)
> mptscsih: ioc3: attempting task abort! (sc=ffff88001d4fa600)
> sd 5:0:0:1: [sdc] CDB: Read(10): 28 00 01 b9 fd e0 00 01 00 00
> mptscsih: ioc3: task abort: FAILED (sc=ffff88001d4fa600)
> mptscsih: ioc3: attempting target reset! (sc=ffff88001d4fb400)
> sd 5:0:0:1: [sdc] CDB: Read(10): 28 00 01 b9 fd 20 00 00 c0 00
> mptscsih: ioc3: target reset: SUCCESS (sc=ffff88001d4fb400)

Any thoughts on what could cause something like this under HVM but not baremetal?
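For reference, the Read(10) CDBs in the log above can be decoded by hand. The sketch below is not from the original post: `decode_read10` is a hypothetical helper name, and it assumes the standard 10-byte READ(10) layout (opcode 0x28, 32-bit big-endian LBA in bytes 2-5, 16-bit transfer length in bytes 7-8).

```python
import struct


def decode_read10(cdb_hex: str) -> dict:
    """Decode a SCSI READ(10) CDB given as space-separated hex bytes.

    Hypothetical helper for illustration; not part of any driver or tool.
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    # Layout: opcode, flags, LBA (4 bytes, big-endian), group number,
    # transfer length in blocks (2 bytes, big-endian), control.
    _opcode, _flags, lba, _group, blocks, _control = struct.unpack(">BBIBHB", cdb)
    return {"lba": lba, "blocks": blocks}


if __name__ == "__main__":
    # The two [sdd] commands copied from the log above.
    for cdb in ("28 00 03 7a d3 20 00 00 c0 00",
                "28 00 03 7a d3 e0 00 01 00 00"):
        info = decode_read10(cdb)
        print(f"LBA 0x{info['lba']:08x}, {info['blocks']} blocks")
```

Applied to the two [sdd] commands, this gives LBA 0x037ad320 for 192 blocks and LBA 0x037ad3e0 for 256 blocks, so the second read starts exactly where the first one ends. The same holds for the [sdc] pair, which suggests the aborted commands are consecutive sequential reads.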
I will try to instrument the mptscsih driver in the HVM to get a better idea of what kind of I/O errors are occurring.

Interestingly, this Dell T7500 also has an onboard LSI 1068 SAS controller, which works fine when assigned to the HVM. So I wonder if this could have something to do with PCI bridging?

FWIW I've also enclosed below the lspci -vvvxxx for the HBA, both baremetal and in the HVM, tho I don't see anything obvious there.

Thanks,
Ed

### /etc/grub.conf entry for passthru

title Fedora-12 Xen 4.0.0-rc6 (2.6.31.12) iommu=1 xen-pciback.hide=(24:00.0)(24:00.1)(25:00.0)(25:00.1)
    root (hd0,0)
    kernel /xen-4.0.0-rc6.gz iommu=1 acpi_skip_timer_override loglvl=all guest_loglvl=all sync_console console_to_ring com1=115200,8n1 console=com1
    module /vmlinuz-2.6.31.12 ro root=UUID=edbcbc29-f3e4-4985-80c1-3c3b0ce24d17 LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYBOARDTYPE=pc KEYTABLE=us console=hvc0 earlyprintk=xen xen-pciback.hide=(24:00.0)(24:00.1)(25:00.0)(25:00.1)
    module /initramfs-2.6.31.12.img

### lspci for HBA device on baremetal Fedora 12:

# lspci
...
24:00.0 Fibre Channel: LSI Logic / Symbios Logic FC949ES Fibre Channel Adapter (rev 02)
24:00.1 Fibre Channel: LSI Logic / Symbios Logic FC949ES Fibre Channel Adapter (rev 02)
25:00.0 Fibre Channel: LSI Logic / Symbios Logic FC949ES Fibre Channel Adapter (rev 02)
25:00.1 Fibre Channel: LSI Logic / Symbios Logic FC949ES Fibre Channel Adapter (rev 02)

# lspci -vvvxxx -s 25:00.1
25:00.1 Fibre Channel: LSI Logic / Symbios Logic FC949ES Fibre Channel Adapter (rev 02)
    Subsystem: LSI Logic / Symbios Logic Device 1070
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin B routed to IRQ 61
    Region 0: I/O ports at dc00 [size=256]
    Region 1: Memory at dfadc000 (64-bit, non-prefetchable) [size=16K]
    Region 3: Memory at dfaf0000 (64-bit, non-prefetchable) [size=64K]
    Expansion ROM at dc100000 [disabled] [size=1M]
    Capabilities: [50] Power Management version 2
        Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [68] Express (v1) Endpoint, MSI 00
        DevCap: MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
            ExtTag+ AttnBtn- AttnInd- PwrInd- RBE- FLReset-
        DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported-
            RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
            MaxPayload 256 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr- UncorrErr+ FatalErr- UnsuppReq+ AuxPwr- TransPend-
        LnkCap: Port #0, Speed 2.5GT/s, Width x8, ASPM L0s L1, Latency L0 <64ns, L1 <1us
            ClockPM- Surprise- LLActRep- BwNot-
        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 2.5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
    Capabilities: [98] MSI: Enable- Count=1/1 Maskable- 64bit+
        Address: 0000000000000000 Data: 0000
    Capabilities: [b0] MSI-X: Enable- Count=1 Masked-
        Vector table: BAR=1 offset=00002000
        PBA: BAR=1 offset=00003000
    Capabilities: [100] Advanced Error Reporting
        UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
        UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
        CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
        CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
        AERCap: First Error Pointer: 14, GenCap+ CGenEn- ChkCap+ ChkEn-
    Kernel driver in use: mptfc
    Kernel modules: mptfc
00: 00 10 46 06 07 00 10 00 02 00 04 0c 10 00 80 00
10: 01 dc 00 00 04 c0 ad df 00 00 00 00 04 00 af df
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 10 70 10
30: 00 00 b0 df 50 00 00 00 00 00 00 00 0a 02 00 00
40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
50: 01 68 02 06 00 00 00 00 00 00 00 00 00 00 00 00
60: 00 00 00 00 01 25 00 00 10 98 01 00 25 00 00 00
70: 36 28 0a 00 81 0c 00 00 40 00 81 10 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 05 b0 80 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 11 00 00 00 01 20 00 00 01 30 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

#### lspci for HBA device on HVM Fedora 12:

# lspci
...
00:04.0 Fibre Channel: LSI Logic / Symbios Logic FC949ES Fibre Channel Adapter (rev 02)
00:05.0 Fibre Channel: LSI Logic / Symbios Logic FC949ES Fibre Channel Adapter (rev 02)
00:06.0 Fibre Channel: LSI Logic / Symbios Logic FC949ES Fibre Channel Adapter (rev 02)
00:07.0 Fibre Channel: LSI Logic / Symbios Logic FC949ES Fibre Channel Adapter (rev 02)

# lspci -vvvxxx -s 00:07.0
00:07.0 Fibre Channel: LSI Logic / Symbios Logic FC949ES Fibre Channel Adapter (rev 02)
    Subsystem: LSI Logic / Symbios Logic Device 1070
    Physical Slot: 7
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 128
    Interrupt: pin B routed to IRQ 45
    Region 0: I/O ports at c400 [size=256]
    Region 1: Memory at f344c000 (64-bit, non-prefetchable) [size=16K]
    Region 3: Memory at f3430000 (64-bit, non-prefetchable) [size=64K]
    Expansion ROM at f3300000 [disabled] [size=1M]
    Capabilities: [50] Power Management version 2
        Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [68] Express (v1) Endpoint, MSI 00
        DevCap: MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
            ExtTag+ AttnBtn- AttnInd- PwrInd- RBE- FLReset-
        DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
            RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
            MaxPayload 128 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr- UncorrErr+ FatalErr- UnsuppReq+ AuxPwr- TransPend-
        LnkCap: Port #0, Speed 2.5GT/s, Width x8, ASPM L0s L1, Latency L0 <64ns, L1 <1us
            ClockPM- Surprise- LLActRep- BwNot-
        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk-
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 2.5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
    Capabilities: [98] MSI: Enable- Count=1/1 Maskable- 64bit+
        Address: 0000000000000000 Data: 0000
    Capabilities: [b0] MSI-X: Enable- Count=1 Masked-
        Vector table: BAR=1 offset=00002000
        PBA: BAR=1 offset=00003000
    Kernel driver in use: mptfc
    Kernel modules: mptfc
00: 00 10 46 06 07 00 10 00 02 00 04 0c 00 80 80 00
10: 01 c4 00 00 04 c0 44 f3 00 00 00 00 04 00 43 f3
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 10 70 10
30: 00 00 30 f3 50 00 00 00 00 00 00 00 05 02 00 00
40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
50: 01 68 02 06 08 00 00 00 00 00 00 00 00 00 00 00
60: 00 00 00 00 01 25 00 00 10 98 01 00 25 00 00 00
70: 10 28 0a 00 81 0c 00 00 00 00 81 10 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 05 b0 80 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 11 00 00 00 01 20 00 00 01 30 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
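Since the two dumps are long, a byte-wise comparison of the raw config space may be easier than reading them side by side. The following is a minimal sketch, not part of the original post: `parse_lspci_hex` and `diff_config` are hypothetical helper names, and only rows 00, 50 and 70 of each dump are copied in to keep it short (the BAR rows are left out because the guest is expected to see remapped addresses).

```python
def parse_lspci_hex(dump: str) -> dict:
    """Parse `lspci -x` style rows ("70: 36 28 0a ...") into {offset: byte}."""
    space = {}
    for line in dump.strip().splitlines():
        prefix, _, rest = line.partition(":")
        base = int(prefix, 16)  # row offset, e.g. 0x70
        for i, tok in enumerate(rest.split()):
            space[base + i] = int(tok, 16)
    return space


def diff_config(a: dict, b: dict) -> list:
    """Return (offset, byte_in_a, byte_in_b) for offsets present in both that differ."""
    return [(off, a[off], b[off])
            for off in sorted(a.keys() & b.keys())
            if a[off] != b[off]]


# Excerpts (rows 00, 50, 70 only) copied from the dumps in this post.
BAREMETAL = """
00: 00 10 46 06 07 00 10 00 02 00 04 0c 10 00 80 00
50: 01 68 02 06 00 00 00 00 00 00 00 00 00 00 00 00
70: 36 28 0a 00 81 0c 00 00 40 00 81 10 00 00 00 00
"""
HVM = """
00: 00 10 46 06 07 00 10 00 02 00 04 0c 00 80 80 00
50: 01 68 02 06 08 00 00 00 00 00 00 00 00 00 00 00
70: 10 28 0a 00 81 0c 00 00 00 00 81 10 00 00 00 00
"""

if __name__ == "__main__":
    for off, bare, hvm in diff_config(parse_lspci_hex(BAREMETAL),
                                      parse_lspci_hex(HVM)):
        print(f"offset 0x{off:02x}: baremetal={bare:02x} hvm={hvm:02x}")
```

On these three rows the differing offsets are 0x0c, 0x0d, 0x54, 0x70 and 0x78. They line up with differences already visible in the decoded lspci text: Cache Line Size and Latency, NoSoftRst, DevCtl (error reporting and MaxPayload 256 vs 128 bytes), and LnkCtl CommClk. Whether any of them is related to the I/O errors is not established here.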