Edmund White
2011-Jun-11 13:35 UTC
[zfs-discuss] Impact of L2ARC device failure and SSD recommendations
Posted in greater detail at Server Fault - http://serverfault.com/q/277966/13325

I have an HP ProLiant DL380 G7 system running NexentaStor. The server has
36GB RAM, 2 LSI 9211-8i SAS controllers (no SAS expanders), 2 SAS system
drives, 12 SAS data drives, a hot-spare disk, an Intel X25-M L2ARC cache
and a DDRdrive PCI ZIL accelerator. This system serves NFS to multiple
VMware hosts. I also have about 90-100GB of deduplicated data on the array.

I've had two incidents where performance tanked suddenly, leaving the VM
guests and Nexenta SSH/Web consoles inaccessible and requiring a full
reboot of the array to restore functionality. In both cases, it was the
Intel X25-M L2ARC SSD that failed or was "offlined". NexentaStor failed to
alert me on the cache failure; however, the general ZFS FMA alert was
visible on the (unresponsive) console screen.

The "zpool status" output showed:

    cache
      c6t5001517959467B45d0  FAULTED  2  542  0  too many errors

This did not trigger any alerts from within Nexenta.

I was under the impression that an L2ARC failure would not impact the
system, but in this case it was the culprit. I've never seen any
recommendations to RAID L2ARC for resiliency. Removing the bad SSD
entirely from the server got me back running, but I'm concerned about the
impact of the device failure and the lack of notification from NexentaStor.

What's the current best-choice SSD for L2ARC cache applications these
days? It seems as though the Intel units are no longer well-regarded.

--
Edmund White
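A hedged aside for anyone hitting the same failure: because cache (L2ARC)
devices hold no unique pool data, ZFS can normally drop them from the pool
online rather than requiring the SSD to be pulled physically. A minimal
sketch, assuming a pool named "tank" (the pool name is a placeholder, not
taken from the original post) and the device name from the zpool status
output above:

    # Show pool health, including the cache (L2ARC) vdevs
    zpool status -v tank

    # Drop the faulted L2ARC device from the pool; cache vdevs can be
    # removed online because they hold no unique data
    zpool remove tank c6t5001517959467B45d0

    # Later, add the repaired or replacement SSD back as a cache vdev
    zpool add tank cache c6t5001517959467B45d0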
Pasi Kärkkäinen
2011-Jun-11 15:15 UTC
[zfs-discuss] Impact of L2ARC device failure and SSD recommendations
On Sat, Jun 11, 2011 at 08:35:19AM -0500, Edmund White wrote:
> Posted in greater detail at Server Fault
> - http://serverfault.com/q/277966/13325
>
> I have an HP ProLiant DL380 G7 system running NexentaStor. The server has
> 36GB RAM, 2 LSI 9211-8i SAS controllers (no SAS expanders), 2 SAS system
> drives, 12 SAS data drives, a hot-spare disk, an Intel X25-M L2ARC cache
> and a DDRdrive PCI ZIL accelerator. This system serves NFS to multiple
> VMware hosts. I also have about 90-100GB of deduplicated data on the
> array.
>
> I've had two incidents where performance tanked suddenly, leaving the VM
> guests and Nexenta SSH/Web consoles inaccessible and requiring a full
> reboot of the array to restore functionality. In both cases, it was the
> Intel X25-M L2ARC SSD that failed or was "offlined". NexentaStor failed to
> alert me on the cache failure; however, the general ZFS FMA alert was
> visible on the (unresponsive) console screen.
>
> The "zpool status" output showed:
>
>     cache
>       c6t5001517959467B45d0  FAULTED  2  542  0  too many errors
>
> This did not trigger any alerts from within Nexenta.
>
> I was under the impression that an L2ARC failure would not impact the
> system, but in this case it was the culprit. I've never seen any
> recommendations to RAID L2ARC for resiliency. Removing the bad SSD
> entirely from the server got me back running, but I'm concerned about the
> impact of the device failure and the lack of notification from
> NexentaStor.
>
> What's the current best-choice SSD for L2ARC cache applications these
> days? It seems as though the Intel units are no longer well-regarded.
>

IIRC there was a recent discussion on this list about a firmware bug on the
Intel X25 SSDs causing them to fail under high disk IO with "reset storms".

Maybe you're hitting that firmware bug.

-- Pasi
Edmund White
2011-Jun-11 15:28 UTC
[zfs-discuss] Impact of L2ARC device failure and SSD recommendations
So, can this be fixed in firmware? How can I determine if the drive is
actually bad?

--
Edmund White
ewwhite at mac.com

On 6/11/11 10:15 AM, "Pasi Kärkkäinen" <pasik at iki.fi> wrote:

> On Sat, Jun 11, 2011 at 08:35:19AM -0500, Edmund White wrote:
>> Posted in greater detail at Server Fault
>> - http://serverfault.com/q/277966/13325
>>
>> I have an HP ProLiant DL380 G7 system running NexentaStor. The server
>> has 36GB RAM, 2 LSI 9211-8i SAS controllers (no SAS expanders), 2 SAS
>> system drives, 12 SAS data drives, a hot-spare disk, an Intel X25-M
>> L2ARC cache and a DDRdrive PCI ZIL accelerator. This system serves NFS
>> to multiple VMware hosts. I also have about 90-100GB of deduplicated
>> data on the array.
>>
>> I've had two incidents where performance tanked suddenly, leaving the VM
>> guests and Nexenta SSH/Web consoles inaccessible and requiring a full
>> reboot of the array to restore functionality. In both cases, it was the
>> Intel X25-M L2ARC SSD that failed or was "offlined". NexentaStor failed
>> to alert me on the cache failure; however, the general ZFS FMA alert was
>> visible on the (unresponsive) console screen.
>>
>> The "zpool status" output showed:
>>
>>     cache
>>       c6t5001517959467B45d0  FAULTED  2  542  0  too many errors
>>
>> This did not trigger any alerts from within Nexenta.
>>
>> I was under the impression that an L2ARC failure would not impact the
>> system, but in this case it was the culprit. I've never seen any
>> recommendations to RAID L2ARC for resiliency. Removing the bad SSD
>> entirely from the server got me back running, but I'm concerned about
>> the impact of the device failure and the lack of notification from
>> NexentaStor.
>>
>> What's the current best-choice SSD for L2ARC cache applications these
>> days? It seems as though the Intel units are no longer well-regarded.
>>
>
> IIRC there was a recent discussion on this list about a firmware bug on
> the Intel X25 SSDs causing them to fail under high disk IO with "reset
> storms".
>
> Maybe you're hitting that firmware bug.
>
> -- Pasi
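One hedged way to start answering the "is the drive actually bad?" question
from the OS side, before trusting any firmware-bug theory: check the
kernel's per-device error counters and the drive's own SMART data. These
are generic Solaris/illumos tools, the device path is illustrative, and
smartctl is only present if smartmontools happens to be installed:

    # Per-device soft/hard/transport error counters, plus vendor, model,
    # firmware revision and serial number as reported by the drive
    iostat -En

    # If smartmontools is available, pull SMART health and firmware version
    # (path is illustrative; some controllers need an extra -d option)
    smartctl -a /dev/rdsk/c6t5001517959467B45d0s0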
Jim Klimov
2011-Jun-11 16:26 UTC
[zfs-discuss] Impact of L2ARC device failure and SSD recommendations
2011-06-11 19:15, Pasi Kärkkäinen wrote:
> On Sat, Jun 11, 2011 at 08:35:19AM -0500, Edmund White wrote:
>> I've had two incidents where performance tanked suddenly, leaving the VM
>> guests and Nexenta SSH/Web consoles inaccessible and requiring a full
>> reboot of the array to restore functionality. In both cases, it was the
>> Intel X25-M L2ARC SSD that failed or was "offlined". NexentaStor failed
>> to alert me on the cache failure; however, the general ZFS FMA alert was
>> visible on the (unresponsive) console screen.
>>
>> The "zpool status" output showed:
>>
>>     cache
>>       c6t5001517959467B45d0  FAULTED  2  542  0  too many errors
>>
>> This did not trigger any alerts from within Nexenta.
>>
>> I was under the impression that an L2ARC failure would not impact the
>> system, but in this case it was the culprit. I've never seen any
>> recommendations to RAID L2ARC for resiliency. Removing the bad SSD
>> entirely from the server got me back running, but I'm concerned about
>> the impact of the device failure and the lack of notification from
>> NexentaStor.
> IIRC there was a recent discussion on this list about a firmware bug
> on the Intel X25 SSDs causing them to fail under high disk IO with
> "reset storms".

Even if so, this does not forgive ZFS hanging - especially if it detected
the drive failure, and especially if this drive is not required for
redundant operation.

I've seen similar bad behaviour on my oi_148a box when I tested USB flash
devices as L2ARC caches and occasionally they died by slightly moving out
of the USB socket due to vibration or whatever reason ;)

Similarly, this oi_148a box hung upon loss of the SATA connection to a
drive in the raidz2 disk set due to unreliable cable connectors, while it
should have stalled IOs to that pool but otherwise remained responsive
(I tested failmode=continue and failmode=wait on different occasions).

So I can relate - these things happen, they do annoy, and I hope they will
be fixed sometime soon so that ZFS matches its docs and promises ;)

//Jim Klimov
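For context on the failmode behaviour Jim mentions: failmode is a per-pool
ZFS property that controls how the pool reacts to catastrophic I/O failure.
A minimal sketch, with "tank" again used as a placeholder pool name rather
than anything from Jim's systems:

    # Inspect the current failure-mode policy of a pool
    zpool get failmode tank

    # wait (default): block I/O until the device recovers or is cleared
    # continue: return EIO to new writes while trying to keep the rest of
    #           the system responsive
    # panic:    crash dump and reboot on catastrophic pool failure
    zpool set failmode=continue tank

Note that failmode governs loss of pool-critical devices; as this thread
shows, a misbehaving cache device can still drag the system down before
that policy ever comes into play.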
Pasi Kärkkäinen
2011-Jun-12 11:05 UTC
[zfs-discuss] Impact of L2ARC device failure and SSD recommendations
On Sat, Jun 11, 2011 at 08:26:34PM +0400, Jim Klimov wrote:
> 2011-06-11 19:15, Pasi Kärkkäinen wrote:
>> On Sat, Jun 11, 2011 at 08:35:19AM -0500, Edmund White wrote:
>>> I've had two incidents where performance tanked suddenly, leaving the
>>> VM guests and Nexenta SSH/Web consoles inaccessible and requiring a
>>> full reboot of the array to restore functionality. In both cases, it
>>> was the Intel X25-M L2ARC SSD that failed or was "offlined".
>>> NexentaStor failed to alert me on the cache failure; however, the
>>> general ZFS FMA alert was visible on the (unresponsive) console screen.
>>>
>>> The "zpool status" output showed:
>>>
>>>     cache
>>>       c6t5001517959467B45d0  FAULTED  2  542  0  too many errors
>>>
>>> This did not trigger any alerts from within Nexenta.
>>>
>>> I was under the impression that an L2ARC failure would not impact the
>>> system, but in this case it was the culprit. I've never seen any
>>> recommendations to RAID L2ARC for resiliency. Removing the bad SSD
>>> entirely from the server got me back running, but I'm concerned about
>>> the impact of the device failure and the lack of notification from
>>> NexentaStor.
>> IIRC there was a recent discussion on this list about a firmware bug
>> on the Intel X25 SSDs causing them to fail under high disk IO with
>> "reset storms".
> Even if so, this does not forgive ZFS hanging - especially
> if it detected the drive failure, and especially if this drive
> is not required for redundant operation.
>
> I've seen similar bad behaviour on my oi_148a box when
> I tested USB flash devices as L2ARC caches and
> occasionally they died by slightly moving out of the
> USB socket due to vibration or whatever reason ;)
>
> Similarly, this oi_148a box hung upon loss of the SATA
> connection to a drive in the raidz2 disk set due to
> unreliable cable connectors, while it should have
> stalled IOs to that pool but otherwise remained
> responsive (I tested failmode=continue and failmode=wait
> on different occasions).
>
> So I can relate - these things happen, they do annoy,
> and I hope they will be fixed sometime soon so that
> ZFS matches its docs and promises ;)
>

True, it definitely sounds like a bug in ZFS as well.

-- Pasi
Richard Elling
2011-Jun-12 19:52 UTC
[zfs-discuss] Impact of L2ARC device failure and SSD recommendations
On Jun 11, 2011, at 6:35 AM, Edmund White wrote:

> Posted in greater detail at Server Fault -
> http://serverfault.com/q/277966/13325

Replied in greater detail at same.

> I have an HP ProLiant DL380 G7 system running NexentaStor. The server has
> 36GB RAM, 2 LSI 9211-8i SAS controllers (no SAS expanders), 2 SAS system
> drives, 12 SAS data drives, a hot-spare disk, an Intel X25-M L2ARC cache
> and a DDRdrive PCI ZIL accelerator. This system serves NFS to multiple
> VMware hosts. I also have about 90-100GB of deduplicated data on the
> array.
>
> I've had two incidents where performance tanked suddenly, leaving the VM
> guests and Nexenta SSH/Web consoles inaccessible and requiring a full
> reboot of the array to restore functionality.

The reboot is your decision; the software will, eventually, recover.

> In both cases, it was the Intel X25-M L2ARC SSD that failed or was
> "offlined". NexentaStor failed to alert me on the cache failure; however,
> the general ZFS FMA alert was visible on the (unresponsive) console
> screen.

NexentaStor fault triggers run in addition to the existing FMA and syslog
services.

> The "zpool status" output showed:
>
>     cache
>       c6t5001517959467B45d0  FAULTED  2  542  0  too many errors
>
> This did not trigger any alerts from within Nexenta.

The NexentaStor volume-check runner looks for zpool status error messages.
Check your configuration for the runner schedule; by default it is hourly.

> I was under the impression that an L2ARC failure would not impact the
> system.

With all due respect, that is a naive assumption. Any system failure can
impact the system. The worst kinds of failures are those that impact
performance. In this case, the broken SSD firmware causes very slow
response to I/O requests. It does not return an error code that says "I'm
broken"; it just responds very slowly, perhaps after other parts of the
system ask it to reset and retry a few times.

> But in this case, it was the culprit. I've never seen any recommendations
> to RAID L2ARC for resiliency. Removing the bad SSD entirely from the
> server got me back running, but I'm concerned about the impact of the
> device failure and the lack of notification from NexentaStor.

We have made some improvements in notification for this type of failure in
the 3.1 release. Why? Because we have seen a large number of these errors
from various disk and SSD manufacturers recently. You will notice that
Nexenta does not support these SSDs behind SAS expanders for this very
reason.

At the end of the day, the resolution is to get the device fixed or
replaced. Contact your hardware provider for details.

> What's the current best-choice SSD for L2ARC cache applications these
> days? It seems as though the Intel units are no longer well-regarded.

No device is perfect. Some have better firmware, components, or design than
others. YMMV.
 -- richard
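A brief aside on the FMA and syslog services Richard mentions: on an
illumos-based system such as NexentaStor, the fault that appeared on the
console can usually also be inspected from a shell after the fact. A
minimal sketch using stock fault-management commands (output and paths are
generic, not taken from this system):

    # List devices FMA currently considers faulty, with fault class,
    # affected FRU and suggested action
    fmadm faulty

    # Dump the underlying error telemetry (ereports) that led to the
    # "too many errors" diagnosis; -V adds device paths and details
    fmdump -e
    fmdump -eV

    # Tail syslog for the corresponding ZFS/FMA messages
    tail -50 /var/adm/messages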
Richard Elling
2011-Jun-12 19:57 UTC
[zfs-discuss] Impact of L2ARC device failure and SSD recommendations
On Jun 11, 2011, at 9:26 AM, Jim Klimov wrote:

> 2011-06-11 19:15, Pasi Kärkkäinen wrote:
>> On Sat, Jun 11, 2011 at 08:35:19AM -0500, Edmund White wrote:
>>> I've had two incidents where performance tanked suddenly, leaving the
>>> VM guests and Nexenta SSH/Web consoles inaccessible and requiring a
>>> full reboot of the array to restore functionality. In both cases, it
>>> was the Intel X25-M L2ARC SSD that failed or was "offlined".
>>> NexentaStor failed to alert me on the cache failure; however, the
>>> general ZFS FMA alert was visible on the (unresponsive) console screen.
>>>
>>> The "zpool status" output showed:
>>>
>>>     cache
>>>       c6t5001517959467B45d0  FAULTED  2  542  0  too many errors
>>>
>>> This did not trigger any alerts from within Nexenta.
>>>
>>> I was under the impression that an L2ARC failure would not impact the
>>> system, but in this case it was the culprit. I've never seen any
>>> recommendations to RAID L2ARC for resiliency. Removing the bad SSD
>>> entirely from the server got me back running, but I'm concerned about
>>> the impact of the device failure and the lack of notification from
>>> NexentaStor.
>> IIRC there was a recent discussion on this list about a firmware bug
>> on the Intel X25 SSDs causing them to fail under high disk IO with
>> "reset storms".
> Even if so, this does not forgive ZFS hanging - especially
> if it detected the drive failure, and especially if this drive
> is not required for redundant operation.

How long should it wait? Before you answer, read through the thread:
http://lists.illumos.org/pipermail/developer/2011-April/001996.html
Then add your comments :-)
 -- richard
Jim Klimov
2011-Jun-12 23:18 UTC
[zfs-discuss] Impact of L2ARC device failure and SSD recommendations
2011-06-12 23:57, Richard Elling wrote:
>
> How long should it wait? Before you answer, read through the thread:
> http://lists.illumos.org/pipermail/developer/2011-April/001996.html
> Then add your comments :-)
> -- richard

Interesting thread. I did not quite get the resentment against a tunable
value instead of a hard-coded #define, though.

Especially since we might want to somehow tune it per-device: a CD-ROM,
enterprise SAS drives, commodity drives or a USB stick (or a VMware
emulated HDD, as Ceri pointed out) might all be plugged into the same box
and require different timeouts that only the sysadmin might know about
(the numeric values per-device). So I'd rather go with some hardcoded
default and many tuned lines in sd.conf, probably.

But the point of my previous comment was that, according to the original
poster, after a while his disk did get marked as "faulted" or "offlined".
If this happened during the system's initial uptime, but it froze anyway,
that is a problem.

What I do not know is whether he rebooted the box within the 5 minutes set
aside for the timeout, or whether some other processes gave up during the
5 minutes of no IO and effectively hung the system.

If it is somehow the latter - that the inaccessible drive did (lead to)
hang(ing) the system past any set IO retry timeouts - that is a bug, I
think.

But maybe I'm just too annoyed with my box hanging with a more-or-less
reproducible scenario, and now I'm barking up any tree that looks like a
system freeze related to IO ;)

//Jim
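For what it's worth, per-device tuning along the lines Jim describes does
exist in the illumos sd(7D) driver via the sd-config-list property, though
whether any of it would help against a reset storm is untested here. A
hypothetical /kernel/drv/sd.conf fragment; the vendor/product string and
the values are illustrative assumptions, not a recommendation from this
thread:

    # Match on the 8-character-padded vendor ID plus product ID reported
    # by the drive, then apply per-device tunables (see sd(7D))
    sd-config-list =
        "ATA     INTEL SSDSA2M160", "retries-timeout:1,retries-busy:1";
    # Fewer retries means the framework gives up on a wedged device sooner,
    # at the cost of declaring marginal devices failed more aggressively.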
Richard Elling
2011-Jun-12 23:34 UTC
[zfs-discuss] Impact of L2ARC device failure and SSD recommendations
On Jun 12, 2011, at 4:18 PM, Jim Klimov wrote:

> 2011-06-12 23:57, Richard Elling wrote:
>>
>> How long should it wait? Before you answer, read through the thread:
>> http://lists.illumos.org/pipermail/developer/2011-April/001996.html
>> Then add your comments :-)
>> -- richard
>
> Interesting thread. I did not quite get the resentment against
> a tunable value instead of a hard-coded #define, though.

Tunables are evil. They increase complexity and lead to local optimizations
that interfere with systemic optimizations.

> Especially since we might want to somehow tune it per-device: a CD-ROM,
> enterprise SAS drives, commodity drives or a USB stick (or a VMware
> emulated HDD, as Ceri pointed out) might all be plugged into the same
> box and require different timeouts that only the sysadmin might know
> about (the numeric values per-device). So I'd rather go with some
> hardcoded default and many tuned lines in sd.conf, probably.

Yuck. I'd rather have my eye poked out with a sharp stick.

> But the point of my previous comment was that, according to the original
> poster, after a while his disk did get marked as "faulted" or "offlined".
> If this happened during the system's initial uptime, but it froze anyway,
> that is a problem.
>
> What I do not know is whether he rebooted the box within the 5 minutes
> set aside for the timeout, or whether some other processes gave up during
> the 5 minutes of no IO and effectively hung the system.

Not likely. Much more likely that that which you were expecting was
blocked.

> If it is somehow the latter - that the inaccessible drive did (lead to)
> hang(ing) the system past any set IO retry timeouts - that is a bug, I
> think.
>
> But maybe I'm just too annoyed with my box hanging with a more-or-less
> reproducible scenario, and now I'm barking up any tree that looks like a
> system freeze related to IO ;)

Yep, a common reaction. I think we can be more creative...
 -- richard
Edmund White
2011-Jun-13 00:04 UTC
[zfs-discuss] Impact of L2ARC device failure and SSD recommendations
On 6/12/11 6:18 PM, "Jim Klimov" <jimklimov at cos.ru> wrote:

> 2011-06-12 23:57, Richard Elling wrote:
>>
>> How long should it wait? Before you answer, read through the thread:
>> http://lists.illumos.org/pipermail/developer/2011-April/001996.html
>> Then add your comments :-)
>> -- richard
>
> But the point of my previous comment was that, according to the original
> poster, after a while his disk did get marked as "faulted" or "offlined".
> If this happened during the system's initial uptime, but it froze anyway,
> that is a problem.
>
> What I do not know is whether he rebooted the box within the 5 minutes
> set aside for the timeout, or whether some other processes gave up during
> the 5 minutes of no IO and effectively hung the system.
>
> If it is somehow the latter - that the inaccessible drive did (lead to)
> hang(ing) the system past any set IO retry timeouts - that is a bug, I
> think.
>

Here's the timeline:

- The Intel X25-M was marked "FAULTED" Monday evening, 6pm. This was not
  detected by NexentaStor.
- The storage system performance diminished at 9am the next morning, with
  intermittent spikes in system load (of the VMs hosted on the unit).
- By 11am, the Nexenta interface and console were unresponsive and the
  virtual machines dependent on the underlying storage stalled completely.
- At 12pm, I gained physical access to the server, but I could not acquire
  console access (shell or otherwise). I did see the FMA error output on
  the screen indicating the actual device FAULT time.
- I powered the system off, removed the Intel X25-M, and powered back on.
  The VMs picked up where they left off and the system stabilized.

The total impact to end-users was 3 hours of either poor performance or
straight downtime.

--
Edmund White
ewwhite at mac.com
Richard Elling
2011-Jun-13 00:25 UTC
[zfs-discuss] Impact of L2ARC device failure and SSD recommendations
On Jun 12, 2011, at 5:04 PM, Edmund White wrote:

> On 6/12/11 6:18 PM, "Jim Klimov" <jimklimov at cos.ru> wrote:
>> 2011-06-12 23:57, Richard Elling wrote:
>>>
>>> How long should it wait? Before you answer, read through the thread:
>>> http://lists.illumos.org/pipermail/developer/2011-April/001996.html
>>> Then add your comments :-)
>>> -- richard
>>
>> But the point of my previous comment was that, according to the original
>> poster, after a while his disk did get marked as "faulted" or
>> "offlined". If this happened during the system's initial uptime, but it
>> froze anyway, that is a problem.
>>
>> What I do not know is whether he rebooted the box within the 5 minutes
>> set aside for the timeout, or whether some other processes gave up
>> during the 5 minutes of no IO and effectively hung the system.
>>
>> If it is somehow the latter - that the inaccessible drive did (lead to)
>> hang(ing) the system past any set IO retry timeouts - that is a bug, I
>> think.
>>
>
> Here's the timeline:
>
> - The Intel X25-M was marked "FAULTED" Monday evening, 6pm. This was not
>   detected by NexentaStor.

Is the volume-check runner enabled? All of the check runner results are
logged in the report database and sent to the system administrator via
email. I will assume that you have configured email for delivery, as it is
a required step in the installation procedure.

In any case, a disk declared FAULTED is no longer used by ZFS, except when
a pool is cleared. The volume-check runner can do this on your behalf, if
it is configured to do so. See Data Management -> Runners -> volume-check.
And, of course, these actions are recorded in the logs and report database.

> - The storage system performance diminished at 9am the next morning, with
>   intermittent spikes in system load (of the VMs hosted on the unit).

This is consistent with reset storms.

> - By 11am, the Nexenta interface and console were unresponsive and the
>   virtual machines dependent on the underlying storage stalled
>   completely.

Also consistent with reset storms.

> - At 12pm, I gained physical access to the server, but I could not
>   acquire console access (shell or otherwise). I did see the FMA error
>   output on the screen indicating the actual device FAULT time.
> - I powered the system off, removed the Intel X25-M, and powered back on.
>   The VMs picked up where they left off and the system stabilized.
>
> The total impact to end-users was 3 hours of either poor performance or
> straight downtime.

Yes, this is consistent with reset storms. Older Intel SSDs are not the
only devices that handle this poorly. In my experience a number of SATA
devices are poorly designed :-(
 -- richard
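For completeness, the manual equivalent of the "pool is cleared" step
Richard refers to is a zpool clear, which resets the error counters for a
FAULTED device and lets ZFS try to use it again. A minimal sketch, again
with "tank" as a placeholder pool name, and only appropriate once the
underlying device problem has actually been resolved:

    # Reset the error counters for the faulted cache device and let ZFS
    # attempt to reopen it
    zpool clear tank c6t5001517959467B45d0

    # Confirm the result
    zpool status -v tank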
Edmund White
2011-Jun-13 00:52 UTC
[zfs-discuss] Impact of L2ARC device failure and SSD recommendations
On 6/12/11 7:25 PM, "Richard Elling" <richard.elling at gmail.com> wrote:

>> Here's the timeline:
>>
>> - The Intel X25-M was marked "FAULTED" Monday evening, 6pm. This was not
>>   detected by NexentaStor.
>
> Is the volume-check runner enabled? All of the check runner results are
> logged in the report database and sent to the system administrator via
> email. I will assume that you have configured email for delivery, as it
> is a required step in the installation procedure.
>
> In any case, a disk declared FAULTED is no longer used by ZFS, except
> when a pool is cleared. The volume-check runner can do this on your
> behalf, if it is configured to do so. See Data Management -> Runners ->
> volume-check. And, of course, these actions are recorded in the logs and
> report database.
>
> -- richard

I checked seven of my NexentaStor installations (3.0.4 and 3.0.5). Six of
them had the disk-check fault trigger disabled by default. volume-check is
enabled on all and is set to run hourly.

Email notification is configured, and I actively receive other alerts (DDT
table, auto-sync) and reports.

--
Edmund White
ewwhite at mac.com
847-530-1605