On 17/05/2016 08:49, Borja Marcos wrote:
>> On 05 May 2016, at 16:39, Warner Losh <imp at bsdimp.com> wrote:
>>
>>> What do you think? In some cases it's clear that TRIM can do more harm than good.
>> I think it's best we not overreact.
> I agree. But with this issue the system is almost unusable for now.
>
>> This particular case is caused by the nvd driver, not the Intel P3500 NVMe drive.
>> You need a solution (3): fix the driver.
>>
>> Specifically, ZFS is pushing down a boatload of BIO_DELETE requests. In ata/da land,
>> these requests are queued up, then collapsed together as much as makes sense (or is
>> possible). This vastly helps performance (even with the extra sorting I forced in
>> there, which I need to fix before 11). The nvd driver needs to do the same thing.
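(Aside for anyone following along: the coalescing Warner describes is essentially range merging: sort the queued deletes by offset and fold adjacent or overlapping ranges into one bigger request before it reaches the device. A toy userland sketch of the idea, with made-up struct and function names rather than the real CAM code:

/*
 * Hypothetical sketch of BIO_DELETE coalescing: sort the pending delete
 * requests by offset and merge adjacent/overlapping ranges into one
 * larger TRIM before handing it to the device.  Illustrative only; the
 * real ata/da logic lives in sys/cam and differs in detail.
 */
#include <stdio.h>
#include <stdlib.h>

struct del_req {            /* hypothetical stand-in for struct bio */
    unsigned long offset;   /* starting LBA of the delete */
    unsigned long length;   /* number of blocks to delete */
};

static int
cmp_offset(const void *a, const void *b)
{
    const struct del_req *x = a, *y = b;

    if (x->offset < y->offset)
        return (-1);
    return (x->offset > y->offset);
}

/* Collapse sorted requests in place; returns the new count. */
static size_t
coalesce(struct del_req *q, size_t n)
{
    size_t out = 0;

    qsort(q, n, sizeof(q[0]), cmp_offset);
    for (size_t i = 1; i < n; i++) {
        struct del_req *cur = &q[out];
        if (q[i].offset <= cur->offset + cur->length) {
            /* Adjacent or overlapping: extend the current range. */
            unsigned long end = q[i].offset + q[i].length;
            if (end > cur->offset + cur->length)
                cur->length = end - cur->offset;
        } else {
            q[++out] = q[i];    /* gap: start a new range */
        }
    }
    return (n ? out + 1 : 0);
}

int
main(void)
{
    struct del_req q[] = {
        { 0, 8 }, { 8, 8 }, { 32, 16 }, { 40, 8 }, { 100, 4 },
    };
    size_t n = coalesce(q, sizeof(q) / sizeof(q[0]));

    for (size_t i = 0; i < n; i++)
        printf("TRIM offset=%lu length=%lu\n", q[i].offset, q[i].length);
    return (0);
}

After a big rm the queue is full of contiguous ranges, so a pass like this can turn thousands of small TRIMs into a handful of large ones.)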
> I understand that, but I don't think it's a good thing for ZFS to depend blindly on a
> driver feature like that. Of course, it's great to exploit it.
>
> I have also noticed that ZFS has a good throttling mechanism for write operations. A
> similar mechanism should throttle TRIM requests so that they don't clog the whole
> system.
It already does.
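(To expand on that: throttling here means capping how many deletes are outstanding at once and deferring the rest, not dropping them, which matters given Warner's point below about TRIMs and wear. A minimal sketch of the concept, with invented names and an arbitrary cap, not the actual ZFS code:

#include <stdbool.h>
#include <stdio.h>

#define TRIM_MAX_INFLIGHT 4     /* assumed cap; a real system would tune this */

static unsigned trim_inflight;  /* deletes currently queued to the device */

/*
 * Admit a TRIM only while the device queue is below the cap;
 * otherwise the caller defers it (it is NOT dropped).
 */
static bool
trim_admit(void)
{
    if (trim_inflight >= TRIM_MAX_INFLIGHT)
        return (false);
    trim_inflight++;
    return (true);
}

static void
trim_done(void)
{
    if (trim_inflight > 0)
        trim_inflight--;
}

int
main(void)
{
    unsigned deferred = 0;

    for (int i = 0; i < 10; i++)
        if (!trim_admit())
            deferred++;         /* retried later instead of dropped */
    printf("in flight: %u, deferred for later: %u\n",
        trim_inflight, deferred);
    trim_done();
    return (0);
}

In the real system the deferred requests would simply be picked up again in a later transaction group.)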
>> I'd be extremely hesitant to toss away TRIMs. They are actually quite important for
>> the FTL in the drive's firmware to properly manage the NAND wear. More free space
>> always reduces write amplification. It tends to go as 1 / freespace, so simply
>> dropping them on the floor should be done with great reluctance.
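(Rough numbers for the 1 / freespace point, illustrative only: at 25% free space write amplification sits around 1/0.25 = 4x, and at 10% free around 1/0.10 = 10x, so letting the effective free pool shrink can more than double NAND wear for the same host writes. The exact figures depend on the FTL, but the direction is what matters.)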
> I understand. I was wondering about choosing the lesser of two evils: a 15 minute
> I/O stall (I deleted 2 TB of data, that's a lot, but not so unrealistic) or setting
> TRIMs aside during peak activity.
>
> I see that I was wrong on that, as a throttling mechanism would probably be more
> than enough, unless the system is close to running out of space.
>
> I've filed a bug report anyway, and I'm copying this to -stable.
>
>
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209571
>
TBH it sounds like you may have badly behaved HW. We've used ZFS + TRIM for years
on large production boxes, and while we've seen slowdowns we haven't experienced
the total lockups you're describing.
The graphs on your ticket seem to indicate a peak throughput of 250MB/s, which is
extremely slow for standard SSDs, let alone NVMe ones, and when you add in the fact
that you have 10 of them it seems like something is VERY wrong.
I just did a quick test on our DB box here, creating and then deleting a 2G file as
you describe, and I couldn't even spot the delete in the general noise, it was that
quick to process, and that's a 6-disk machine with P3700s.
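For anyone who wants to repeat the test, the shape of it is just: write a big file, sync it out, delete it, and see whether the delete stalls anything. A crude standalone version below; the file name and sizes are arbitrary choices, and note that ZFS issues the actual TRIMs asynchronously after the unlink, so in practice you'd also watch something like gstat -d for the BIO_DELETE storm rather than relying on the timing alone.

/*
 * Crude reproduction of the test: write a 2G file, sync it out,
 * then time the unlink + sync.  File name and sizes are arbitrary.
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

#define TESTFILE "trimtest.dat"
#define CHUNK    (1024 * 1024)          /* 1 MiB writes */
#define NCHUNKS  2048                   /* 2 GiB total */

int
main(void)
{
    static char buf[CHUNK];
    struct timespec t0, t1;
    int fd;

    memset(buf, 0xa5, sizeof(buf));
    fd = open(TESTFILE, O_CREAT | O_TRUNC | O_WRONLY, 0644);
    if (fd < 0) {
        perror("open");
        return (1);
    }
    for (int i = 0; i < NCHUNKS; i++)
        if (write(fd, buf, sizeof(buf)) != sizeof(buf)) {
            perror("write");
            return (1);
        }
    fsync(fd);                          /* make sure the blocks hit disk */
    close(fd);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    if (unlink(TESTFILE) != 0)
        perror("unlink");
    sync();                             /* push the frees out */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    printf("delete took %.3f s\n",
        (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9);
    return (0);
}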
Regards
Steve