Patrick M. Hausen
2019-Apr-12 11:58 UTC
NVME aborting outstanding i/o and controller resets
Hi all, my problems seem not to be TRIM related after all ? and I can now quickly reproduce it. ====root at freenas01[~]# sysctl vfs.zfs.trim.enabled vfs.zfs.trim.enabled: 0 ====root at freenas01[~]# cd /mnt/zfs root at freenas01[/mnt/zfs]# dd if=/dev/urandom of=hurz bs=10m ^C ? system freezes temporarily ====Apr 12 13:42:16 freenas01 nvme6: resetting controller Apr 12 13:42:16 freenas01 nvme6: aborting outstanding i/o Apr 12 13:42:16 freenas01 nvme6: WRITE sqid:1 cid:117 nsid:1 lba:981825104 len:176 Apr 12 13:42:16 freenas01 nvme6: ABORTED - BY REQUEST (00/07) sqid:1 cid:117 cdw0:0 Apr 12 13:42:49 freenas01 nvme6: resetting controller Apr 12 13:42:50 freenas01 nvme6: aborting outstanding i/o Apr 12 13:42:50 freenas01 nvme6: WRITE sqid:1 cid:127 nsid:1 lba:984107936 len:96 Apr 12 13:42:50 freenas01 nvme6: ABORTED - BY REQUEST (00/07) sqid:1 cid:127 cdw0:0 Apr 12 13:43:35 freenas01 nvme6: resetting controller Apr 12 13:43:35 freenas01 nvme6: aborting outstanding i/o Apr 12 13:43:35 freenas01 nvme6: WRITE sqid:1 cid:112 nsid:1 lba:976172032 len:176 Apr 12 13:43:35 freenas01 nvme6: ABORTED - BY REQUEST (00/07) sqid:1 cid:112 cdw0:0 Apr 12 13:44:06 freenas01 nvme7: resetting controller Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:111 nsid:1 lba:976199176 len:248 Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1 cid:111 cdw0:0 Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:102 nsid:1 lba:976199432 len:248 Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1 cid:102 cdw0:0 Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:112 nsid:1 lba:976199680 len:8 Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1 cid:112 cdw0:0 Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:105 nsid:1 lba:976199752 len:64 Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1 cid:105 cdw0:0 Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:122 nsid:1 lba:976199816 len:64 Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1 cid:122 cdw0:0 Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:103 nsid:1 lba:976199688 len:64 Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1 cid:103 cdw0:0 Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:126 nsid:1 lba:976200136 len:56 Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1 cid:126 cdw0:0 Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:106 nsid:1 lba:976200192 len:8 Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1 cid:106 cdw0:0 Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:107 nsid:1 lba:976200200 len:64 Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1 cid:107 cdw0:0 Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:127 nsid:1 lba:976200264 len:64 Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1 cid:127 cdw0:0 Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:113 nsid:1 lba:976200328 len:120 Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1 cid:113 cdw0:0 Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:108 nsid:1 lba:976200448 len:72 Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1 cid:108 cdw0:0 Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:116 nsid:1 lba:976200520 len:64 Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1 cid:116 cdw0:0 ====root at freenas01[~]# nvmecontrol identify nvme6 Controller Capabilities/Features ===============================Vendor ID: 8086 Subsystem Vendor ID: 8086 Serial Number: BTLJ90230EC61P0FGN Model Number: INTEL SSDPE2KX010T8 Firmware Version: VDV10131 Recommended Arb Burst: 0 IEEE OUI Identifier: e4 d2 5c Multi-Interface Cap: 00 Max Data Transfer Size: 131072 Controller ID: 0x00 Admin Command Set Attributes ===========================Security Send/Receive: Not Supported Format NVM: Supported Firmware Activate/Download: Supported Namespace Managment: Supported Abort Command Limit: 4 Async Event Request Limit: 4 Number of Firmware Slots: 1 Firmware Slot 1 Read-Only: No Per-Namespace SMART Log: No Error Log Page Entries: 64 Number of Power States: 1 NVM Command Set Attributes =========================Submission Queue Entry Size Max: 64 Min: 64 Completion Queue Entry Size Max: 16 Min: 16 Number of Namespaces: 1 Compare Command: Not Supported Write Uncorrectable Command: Supported Dataset Management Command: Supported Volatile Write Cache: Not Present Namespace Drive Attributes =========================NVM total cap: 1000204886016 NVM unallocated cap: 0 ====root at freenas01[~]# zpool status pool: freenas-boot state: ONLINE scan: scrub repaired 0 in 0 days 00:00:03 with 0 errors on Sun Apr 7 03:45:03 2019 config: NAME STATE READ WRITE CKSUM freenas-boot ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 nvd0p2 ONLINE 0 0 0 nvd1p2 ONLINE 0 0 0 errors: No known data errors pool: zfs state: ONLINE scan: scrub repaired 0 in 0 days 00:01:53 with 0 errors on Fri Mar 22 19:53:37 2019 config: NAME STATE READ WRITE CKSUM zfs ONLINE 0 0 0 raidz2-0 ONLINE 0 0 0 gptid/97d0a7ce-44e5-11e9-982e-0025905f99ac ONLINE 0 0 0 gptid/98053880-44e5-11e9-982e-0025905f99ac ONLINE 0 0 0 gptid/983a9468-44e5-11e9-982e-0025905f99ac ONLINE 0 0 0 gptid/987100f2-44e5-11e9-982e-0025905f99ac ONLINE 0 0 0 gptid/98aa6e88-44e5-11e9-982e-0025905f99ac ONLINE 0 0 0 gptid/98f20b8c-44e5-11e9-982e-0025905f99ac ONLINE 0 0 0 errors: No known data errors ==== The problem only appears on the data pool built from 6 INTEL SSDPE2KX010T8. The system pool built from two KXG50ZNV256G TOSHIBA does not show any problem with write load. All the Intel drives have the latest firmware according to the Intel support website. Could it possibly help to tweak dev.nvme.7.ioq0.num_entries and similar entries? What about switching to the nda device instead of nvd? Kind regards, Patrick -- punkt.de GmbH Internet - Dienstleistungen - Beratung Kaiserallee 13a Tel.: 0721 9109-0 Fax: -100 76133 Karlsruhe info at punkt.de http://punkt.de AG Mannheim 108285 Gf: Juergen Egeling
On Fri, Apr 12, 2019 at 6:00 AM Patrick M. Hausen <hausen at punkt.de> wrote:> Hi all, > > my problems seem not to be TRIM related after all ? and I can now > quickly reproduce it. > > ====> root at freenas01[~]# sysctl vfs.zfs.trim.enabled > vfs.zfs.trim.enabled: 0 > ====> root at freenas01[~]# cd /mnt/zfs > root at freenas01[/mnt/zfs]# dd if=/dev/urandom of=hurz bs=10m > ^C ? system freezes temporarily >This does one I/O at a time to the filesystem, which then repackages the I/Os such that multiple I/Os are going on with the NVMe card.> ====> Apr 12 13:42:16 freenas01 nvme6: resetting controller >OK. This means that whatever I/O workload we've done has caused the NVME card to stop responding for 30s, so we reset it.> Apr 12 13:42:16 freenas01 nvme6: aborting outstanding i/o > Apr 12 13:42:16 freenas01 nvme6: WRITE sqid:1 cid:117 nsid:1 lba:981825104 > len:176 > Apr 12 13:42:16 freenas01 nvme6: ABORTED - BY REQUEST (00/07) sqid:1 > cid:117 cdw0:0 >But only one request was in flight... And we keep doing it over and over again, but to different LBAs suggesting that we're stuttering: a few go through and then things wedge again. This happens every 30ish seconds.> Apr 12 13:42:49 freenas01 nvme6: resetting controller > Apr 12 13:42:50 freenas01 nvme6: aborting outstanding i/o > Apr 12 13:42:50 freenas01 nvme6: WRITE sqid:1 cid:127 nsid:1 lba:984107936 > len:96 > Apr 12 13:42:50 freenas01 nvme6: ABORTED - BY REQUEST (00/07) sqid:1 > cid:127 cdw0:0 > Apr 12 13:43:35 freenas01 nvme6: resetting controller > Apr 12 13:43:35 freenas01 nvme6: aborting outstanding i/o > Apr 12 13:43:35 freenas01 nvme6: WRITE sqid:1 cid:112 nsid:1 lba:976172032 > len:176 > Apr 12 13:43:35 freenas01 nvme6: ABORTED - BY REQUEST (00/07) sqid:1 > cid:112 cdw0:0 > Apr 12 13:44:06 freenas01 nvme7: resetting controller >And then this one goes wonkies.> Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o > Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:111 nsid:1 lba:976199176 > len:248 > Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1 > cid:111 cdw0:0 > Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o > Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:102 nsid:1 lba:976199432 > len:248 > Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1 > cid:102 cdw0:0 > Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o > Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:112 nsid:1 lba:976199680 > len:8 > Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1 > cid:112 cdw0:0 > Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o > Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:105 nsid:1 lba:976199752 > len:64 > Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1 > cid:105 cdw0:0 > Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o > Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:122 nsid:1 lba:976199816 > len:64 > Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1 > cid:122 cdw0:0 > Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o > Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:103 nsid:1 lba:976199688 > len:64 > Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1 > cid:103 cdw0:0 > Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o > Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:126 nsid:1 lba:976200136 > len:56 > Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1 > cid:126 cdw0:0 > Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o > Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:106 nsid:1 lba:976200192 > len:8 > Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1 > cid:106 cdw0:0 > Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o > Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:107 nsid:1 lba:976200200 > len:64 > Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1 > cid:107 cdw0:0 > Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o > Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:127 nsid:1 lba:976200264 > len:64 > Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1 > cid:127 cdw0:0 > Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o > Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:113 nsid:1 lba:976200328 > len:120 > Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1 > cid:113 cdw0:0 > Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o > Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:108 nsid:1 lba:976200448 > len:72 > Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1 > cid:108 cdw0:0 > Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o > Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:116 nsid:1 lba:976200520 > len:64 > Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1 > cid:116 cdw0:0 > ====> root at freenas01[~]# nvmecontrol identify nvme6 > Controller Capabilities/Features > ===============================> Vendor ID: 8086 > Subsystem Vendor ID: 8086 > Serial Number: BTLJ90230EC61P0FGN > Model Number: INTEL SSDPE2KX010T8 >So it's an intel card.> Firmware Version: VDV10131 > Recommended Arb Burst: 0 > IEEE OUI Identifier: e4 d2 5c > Multi-Interface Cap: 00 > Max Data Transfer Size: 131072 > Controller ID: 0x00 > > Admin Command Set Attributes > ===========================> Security Send/Receive: Not Supported > Format NVM: Supported > Firmware Activate/Download: Supported > Namespace Managment: Supported > Abort Command Limit: 4 > Async Event Request Limit: 4 > Number of Firmware Slots: 1 > Firmware Slot 1 Read-Only: No > Per-Namespace SMART Log: No > Error Log Page Entries: 64 > Number of Power States: 1 > > NVM Command Set Attributes > =========================> Submission Queue Entry Size > Max: 64 > Min: 64 > Completion Queue Entry Size > Max: 16 > Min: 16 > Number of Namespaces: 1 > Compare Command: Not Supported > Write Uncorrectable Command: Supported > Dataset Management Command: Supported > Volatile Write Cache: Not Present > > Namespace Drive Attributes > =========================> NVM total cap: 1000204886016 > NVM unallocated cap: 0 > ====> root at freenas01[~]# zpool status > pool: freenas-boot > state: ONLINE > scan: scrub repaired 0 in 0 days 00:00:03 with 0 errors on Sun Apr 7 > 03:45:03 2019 > config: > > NAME STATE READ WRITE CKSUM > freenas-boot ONLINE 0 0 0 > mirror-0 ONLINE 0 0 0 > nvd0p2 ONLINE 0 0 0 > nvd1p2 ONLINE 0 0 0 > > errors: No known data errors > > pool: zfs > state: ONLINE > scan: scrub repaired 0 in 0 days 00:01:53 with 0 errors on Fri Mar 22 > 19:53:37 2019 > config: > > NAME STATE READ > WRITE CKSUM > zfs ONLINE 0 > 0 0 > raidz2-0 ONLINE 0 > 0 0 > gptid/97d0a7ce-44e5-11e9-982e-0025905f99ac ONLINE 0 > 0 0 > gptid/98053880-44e5-11e9-982e-0025905f99ac ONLINE 0 > 0 0 > gptid/983a9468-44e5-11e9-982e-0025905f99ac ONLINE 0 > 0 0 > gptid/987100f2-44e5-11e9-982e-0025905f99ac ONLINE 0 > 0 0 > gptid/98aa6e88-44e5-11e9-982e-0025905f99ac ONLINE 0 > 0 0 > gptid/98f20b8c-44e5-11e9-982e-0025905f99ac ONLINE 0 > 0 0 > > errors: No known data errors > ====> > The problem only appears on the data pool built from 6 INTEL > SSDPE2KX010T8. The system > pool built from two KXG50ZNV256G TOSHIBA does not show any problem with > write load. >OK. That suggests Intel has a problem with their firmware.> All the Intel drives have the latest firmware according to the Intel > support website. > > Could it possibly help to tweak dev.nvme.7.ioq0.num_entries and similar > entries? >It might. Maybe. I'm unsure what the intel firmware bug is.> What about switching to the nda device instead of nvd? >I doubt that would have any effect. They both throw as much I/O onto the card as possible in the default config. There's been some minor improvements in -current here. Any chance you could experimentally try that with this test? You won't get as many I/O abort errors (since we don't print those), and we have a few more workarounds for the reset path (though honestly, it's still kinda stinky). Warner