On Mon, Aug 1, 2016 at 7:38 AM, Borja Marcos <borjam at sarenet.es> wrote:
>
> > On 29 Jul 2016, at 17:44, Jim Harris <jim.harris at gmail.com> wrote:
> >
> >
> >
> > On Fri, Jul 29, 2016 at 1:10 AM, Borja Marcos <borjam at sarenet.es> wrote:
> >
> > > On 28 Jul 2016, at 19:25, Jim Harris <jim.harris at gmail.com> wrote:
> > >
> > > Yes, you should worry.
> > >
> > > Normally we could use the dump_debug sysctls to help debug this - these
> > > sysctls will dump the NVMe I/O submission and completion queues. But in
> > > this case the LBA data is in the payload, not the NVMe submission entries,
> > > so dump_debug will not help as much as dumping the NVMe DSM payload
> > > directly.
> > >
> > > Could you try the attached patch and send output after recreating your
> > > pool?
> >
> > Just in case the evil anti-spam ate my answer, I sent the results to your
> > Gmail account.
> >
> >
> > Thanks Borja.
> >
> > It looks like all of the TRIM commands are formatted properly. The
> > failures do not happen until about 10 seconds after the last TRIM to each
> > drive was submitted, and immediately before TRIMs start to the next drive,
> > so I'm assuming the failures are for the last few TRIM commands but
> > cannot say for sure. Could you apply patch v2 (attached) which will dump
> > the TRIM payload contents inline with the failure messages?
>
> Sure, this is the complete /var/log/messages starting with the system
> boot. Before booting I destroyed the pool
> so that you could capture what happens when booting, zpool create, etc.
>
> Remember that the drives are in LBA format #3 (4 KB blocks). As far as I
> know that's preferred to the old 512 byte blocks.
>
> Thank you very much and sorry about the belated response.
Hi Borja,

Thanks for the additional testing. This has all of the detail that I need
for now.

-Jim