On 17/05/2016 08:49, Borja Marcos wrote:
>> On 05 May 2016, at 16:39, Warner Losh <imp at bsdimp.com> wrote:
>>
>>> What do you think? In some cases it's clear that TRIM can do more harm than good.
>> I think it's best we not overreact.
> I agree. But with this issue the system is almost unusable for now.
>
>> This particular case is caused by the nvd driver, not the Intel P3500 NVMe drive.
>> You need a solution (3): fix the driver.
>>
>> Specifically, ZFS is pushing down a boatload of BIO_DELETE requests. In ata/da land,
>> these requests are queued up, then collapsed together as much as makes sense (or is
>> possible). This vastly helps performance (even with the extra sorting I forced in
>> there, which I need to fix before 11). The nvd driver needs to do the same thing.
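(Aside for anyone following along: the coalescing Warner describes is essentially range merging: sort the queued deletes by offset and fold adjacent or overlapping ranges into one bigger request before it reaches the device. A toy userland sketch of the idea, with made-up struct and function names rather than the real CAM code:

/*
 * Hypothetical sketch of BIO_DELETE coalescing: sort the pending delete
 * requests by offset and merge adjacent/overlapping ranges into one
 * larger TRIM before handing it to the device.  Illustrative only; the
 * real ata/da logic lives in sys/cam and differs in detail.
 */
#include <stdio.h>
#include <stdlib.h>

struct del_req {            /* hypothetical stand-in for struct bio */
    unsigned long offset;   /* starting LBA of the delete */
    unsigned long length;   /* number of blocks to delete */
};

static int
cmp_offset(const void *a, const void *b)
{
    const struct del_req *x = a, *y = b;

    if (x->offset < y->offset)
        return (-1);
    return (x->offset > y->offset);
}

/* Collapse sorted requests in place; returns the new count. */
static size_t
coalesce(struct del_req *q, size_t n)
{
    size_t out = 0;

    qsort(q, n, sizeof(q[0]), cmp_offset);
    for (size_t i = 1; i < n; i++) {
        struct del_req *cur = &q[out];
        if (q[i].offset <= cur->offset + cur->length) {
            /* Adjacent or overlapping: extend the current range. */
            unsigned long end = q[i].offset + q[i].length;
            if (end > cur->offset + cur->length)
                cur->length = end - cur->offset;
        } else {
            q[++out] = q[i];    /* gap: start a new range */
        }
    }
    return (n ? out + 1 : 0);
}

int
main(void)
{
    struct del_req q[] = {
        { 0, 8 }, { 8, 8 }, { 32, 16 }, { 40, 8 }, { 100, 4 },
    };
    size_t n = coalesce(q, sizeof(q) / sizeof(q[0]));

    for (size_t i = 0; i < n; i++)
        printf("TRIM offset=%lu length=%lu\n", q[i].offset, q[i].length);
    return (0);
}

After a big rm the queue is full of contiguous ranges, so a pass like this can turn thousands of small TRIMs into a handful of large ones.)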
> I understand that, but I don't think it's a good thing for ZFS to depend blindly on a
> driver feature like that. Of course, it's great to exploit it.
>
> I have also noticed that ZFS has a good throttling mechanism for write operations. A
> similar mechanism should throttle TRIM requests so that they don't clog the whole
> system.
It already does.
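(To expand on that: throttling here means capping how many deletes are outstanding at once and deferring the rest, not dropping them, which matters given Warner's point below about TRIMs and wear. A minimal sketch of the concept, with invented names and an arbitrary cap, not the actual ZFS code:

#include <stdbool.h>
#include <stdio.h>

#define TRIM_MAX_INFLIGHT 4     /* assumed cap; a real system would tune this */

static unsigned trim_inflight;  /* deletes currently queued to the device */

/*
 * Admit a TRIM only while the device queue is below the cap;
 * otherwise the caller defers it (it is NOT dropped).
 */
static bool
trim_admit(void)
{
    if (trim_inflight >= TRIM_MAX_INFLIGHT)
        return (false);
    trim_inflight++;
    return (true);
}

static void
trim_done(void)
{
    if (trim_inflight > 0)
        trim_inflight--;
}

int
main(void)
{
    unsigned deferred = 0;

    for (int i = 0; i < 10; i++)
        if (!trim_admit())
            deferred++;         /* retried later instead of dropped */
    printf("in flight: %u, deferred for later: %u\n",
        trim_inflight, deferred);
    trim_done();
    return (0);
}

In the real system the deferred requests would simply be picked up again in a later transaction group.)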
>> I'd be extremely hesitant to toss away TRIMs. They are actually quite important for
>> the FTL in the drive's firmware to properly manage the NAND wear. More free space
>> always reduces write amplification. It tends to go as 1 / freespace, so simply
>> dropping them on the floor should be done with great reluctance.
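(Rough numbers for the 1 / freespace point, illustrative only: at 25% free space write amplification sits around 1/0.25 = 4x, and at 10% free around 1/0.10 = 10x, so letting the effective free pool shrink can more than double NAND wear for the same host writes. The exact figures depend on the FTL, but the direction is what matters.)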
> I understand. I was wondering about choosing the lesser of two evils: a 15 minute
> I/O stall (I deleted 2 TB of data, that's a lot, but not so unrealistic) or setting
> TRIMs aside during peak activity.
>
> I see that I was wrong on that, as a throttling mechanism would probably be more
> than enough, unless the system is close to running out of space.
>
> I've filed a bug report anyway, and I'm copying this to -stable.
>
>
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209571
>
TBH it sounds like you may have badly behaved HW. We've used ZFS + TRIM for years
on large production boxes, and while we've seen slowdowns we haven't experienced
the total lockups you're describing.
The graphs on your ticket seem to indicate a peak throughput of 250MB/s, which is
extremely slow for standard SSDs, let alone NVMe ones, and when you add in the fact
that you have 10 of them it seems like something is VERY wrong.
I just did a quick test on our DB box here, creating and then deleting a 2G file as
you describe, and I couldn't even spot the delete in the general noise, it was that
quick to process, and that's a 6-disk machine with P3700s.
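For anyone who wants to repeat the test, the shape of it is just: write a big file, sync it out, delete it, and see whether the delete stalls anything. A crude standalone version below; the file name and sizes are arbitrary choices, and note that ZFS issues the actual TRIMs asynchronously after the unlink, so in practice you'd also watch something like gstat -d for the BIO_DELETE storm rather than relying on the timing alone.

/*
 * Crude reproduction of the test: write a 2G file, sync it out,
 * then time the unlink + sync.  File name and sizes are arbitrary.
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

#define TESTFILE "trimtest.dat"
#define CHUNK    (1024 * 1024)          /* 1 MiB writes */
#define NCHUNKS  2048                   /* 2 GiB total */

int
main(void)
{
    static char buf[CHUNK];
    struct timespec t0, t1;
    int fd;

    memset(buf, 0xa5, sizeof(buf));
    fd = open(TESTFILE, O_CREAT | O_TRUNC | O_WRONLY, 0644);
    if (fd < 0) {
        perror("open");
        return (1);
    }
    for (int i = 0; i < NCHUNKS; i++)
        if (write(fd, buf, sizeof(buf)) != sizeof(buf)) {
            perror("write");
            return (1);
        }
    fsync(fd);                          /* make sure the blocks hit disk */
    close(fd);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    if (unlink(TESTFILE) != 0)
        perror("unlink");
    sync();                             /* push the frees out */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    printf("delete took %.3f s\n",
        (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9);
    return (0);
}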
Regards
Steve