> On 29 Jul 2016, at 17:44, Jim Harris <jim.harris at gmail.com> wrote:
>
> On Fri, Jul 29, 2016 at 1:10 AM, Borja Marcos <borjam at sarenet.es> wrote:
>
> > On 28 Jul 2016, at 19:25, Jim Harris <jim.harris at gmail.com> wrote:
> >
> > Yes, you should worry.
> >
> > Normally we could use the dump_debug sysctls to help debug this - these
> > sysctls will dump the NVMe I/O submission and completion queues. But in
> > this case the LBA data is in the payload, not the NVMe submission entries,
> > so dump_debug will not help as much as dumping the NVMe DSM payload
> > directly.
> >
> > Could you try the attached patch and send output after recreating your pool?
>
> Just in case the evil anti-spam ate my answer, sent the results to your Gmail account.
>
> Thanks Borja.
>
> It looks like all of the TRIM commands are formatted properly. The failures
> do not happen until about 10 seconds after the last TRIM to each drive was
> submitted, and immediately before TRIMs start to the next drive, so I'm
> assuming the failures are for the last few TRIM commands but cannot say for
> sure. Could you apply patch v2 (attached) which will dump the TRIM payload
> contents inline with the failure messages?

Sure, this is the complete /var/log/messages starting with the system boot.
Before booting I destroyed the pool so that you could capture what happens
when booting, zpool create, etc.

Remember that the drives are in LBA format #3 (4 KB blocks). As far as I know
that's preferred to the old 512 byte blocks.

Thank you very much and sorry about the belated response.

Borja.
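For context on what the dumped DSM payload contains: a TRIM reaches the drive as an NVMe Dataset Management command whose payload is a list of 16-byte range entries, each holding context attributes, a length in logical blocks, and a starting LBA. With LBA format #3 the logical block is 4096 bytes, so the values in the dump are in 4 KB units rather than 512-byte units. The sketch below is illustrative only and is not the FreeBSD driver's code; the struct mirrors the range-entry layout described in the NVMe specification, and the helper name and example extent are made up.

/*
 * Illustrative sketch (not driver code): build one NVMe DSM (TRIM)
 * range entry from a byte offset/length, given the logical block size
 * of the namespace.  With LBA format #3 the block size is 4096, so the
 * starting LBA and length are expressed in 4 KB logical blocks.
 */
#include <stdint.h>
#include <stdio.h>

struct dsm_range {              /* 16-byte entry per the NVMe spec */
	uint32_t attributes;    /* context attributes (0 = none) */
	uint32_t length;        /* number of logical blocks */
	uint64_t starting_lba;  /* first logical block to deallocate */
};

/* Hypothetical helper: convert a byte extent into a DSM range entry. */
static struct dsm_range
make_range(uint64_t byte_off, uint64_t byte_len, uint32_t lb_size)
{
	struct dsm_range r;

	r.attributes   = 0;
	r.starting_lba = byte_off / lb_size;
	r.length       = (uint32_t)(byte_len / lb_size);
	return (r);
}

int
main(void)
{
	/* Example extent: 1 MiB starting at 4 GiB on a 4 KB-block namespace. */
	struct dsm_range r = make_range(4ULL << 30, 1ULL << 20, 4096);

	printf("start_lba=%llu num_blocks=%u\n",
	    (unsigned long long)r.starting_lba, r.length);
	return (0);
}

One thing a payload dump like this makes easy to check is whether any entry's starting LBA plus length runs past the namespace size reported by Identify Namespace, which would be consistent with failures showing up only on the last few TRIMs sent to a drive.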
FWIW, I've had similar issues with Intel 750 PCIe NVMe drives when attempting
to use 4K blocks on Linux with EXT4 on top of MD RAID1 (software mirror). I
didn't dig into it much because there were too many layers to reduce at the
time, but it looked like the drive misreported the number of blocks, and a
subsequent TRIM command or write of the last sector then errored. I mention it
because, despite the differences, the similarities (Intel NVMe, LBA format #3
/ 4K, and an error writing to a nonexistent block) might give someone enough
info to figure it out fully.

On Monday, August 1, 2016, Borja Marcos <borjam at sarenet.es> wrote:
>
> Sure, this is the complete /var/log/messages starting with the system boot.
> Before booting I destroyed the pool so that you could capture what happens
> when booting, zpool create, etc.
>
> Remember that the drives are in LBA format #3 (4 KB blocks). As far as I
> know that's preferred to the old 512 byte blocks.
>
> Thank you very much and sorry about the belated response.
>
> Borja.

-- 
"Genius might be described as a supreme capacity for getting its possessors
into trouble of all kinds." -- Samuel Butler
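The boundary condition described above is cheap to probe without risking data: ask the block layer what it thinks the device geometry is, then try to read the last reported sector. The sketch below is a hypothetical Linux-side check, not something taken from this thread; the device path is a placeholder, and a discard-based variant would exercise the TRIM path instead of the read path.

/*
 * Hypothetical, read-only probe of the last addressable sector of a
 * Linux block device.  If the drive under-reports or over-reports its
 * capacity, a read (or TRIM/write) at the last reported sector is
 * where it would show up.  The device path is a placeholder.
 */
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>       /* BLKGETSIZE64, BLKSSZGET */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

int
main(void)
{
	const char *dev = "/dev/nvme0n1";   /* placeholder device */
	uint64_t size_bytes;
	int lb_size;
	int fd = open(dev, O_RDONLY);

	if (fd < 0) {
		perror("open");
		return (1);
	}
	if (ioctl(fd, BLKGETSIZE64, &size_bytes) < 0 ||
	    ioctl(fd, BLKSSZGET, &lb_size) < 0) {
		perror("ioctl");
		return (1);
	}

	uint64_t last_lba = size_bytes / (uint64_t)lb_size - 1;
	char *buf = malloc(lb_size);
	ssize_t n = pread(fd, buf, lb_size, (off_t)(last_lba * (uint64_t)lb_size));

	printf("%s: %llu bytes, %d-byte sectors, last LBA %llu: ",
	    dev, (unsigned long long)size_bytes, lb_size,
	    (unsigned long long)last_lba);
	if (n == lb_size)
		printf("read OK\n");
	else
		printf("read failed (%s)\n", n < 0 ? strerror(errno) : "short read");
	free(buf);
	close(fd);
	return (0);
}

If a read like this succeeds but a discard at the same offset fails, the reported geometry is probably fine and the problem more likely lies in how the TRIM range is built or handled.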
On Mon, Aug 1, 2016 at 7:38 AM, Borja Marcos <borjam at sarenet.es> wrote:
>
> Sure, this is the complete /var/log/messages starting with the system boot.
> Before booting I destroyed the pool so that you could capture what happens
> when booting, zpool create, etc.
>
> Remember that the drives are in LBA format #3 (4 KB blocks). As far as I
> know that's preferred to the old 512 byte blocks.
>
> Thank you very much and sorry about the belated response.

Hi Borja,

Thanks for the additional testing. This has all of the detail that I need
for now.

-Jim