Hi all,

First, I'll say my intent is not to spam a bunch of lists, but after posting to opensolaris-discuss I had someone communicate with me offline that these lists would possibly be a better place to start. So here we are. For those on all three lists, sorry for the repetition.

Second, this message is meant to solicit help in diagnosing the issue described below. Any hints on how DTrace may help, or where in general to start, would be much appreciated. Back to the subject at hand.

-----------------------

I'm testing an application which makes use of a large file mmap'd into memory, as if the application were using malloc(). The file is roughly 2x the size of physical ram. Basically, I'm seeing the system stall for long periods of time, 60+ seconds, and then resume. The file lives on an SSD (Intel X25-E), and I'm using zfs's lzjb compression to make more efficient use of the ~30G of space provided by that SSD.

The general flow of things is: start the application, ask it to use a 50G file. The file is created in a sparse manner at the location designated, then mmap is called on the entire file. All fine up to this point.

I then start loading data into the application, and it starts pushing data to the file as you'd expect. Data is pushed to the file early and often, as it's mmap'd with the MAP_SHARED flag. But when the application's resident size reaches about 80% of the physical ram on the system, the system starts paging; things still work relatively well, though slower, as expected.

Soon after, when reaching about 40G of data, I get stalls accessing the SSD (according to iostat); in other words, no IO to that drive. When I started looking into what could be causing it, such as IO timeouts, I ran dmesg and it hung after printing a timestamp. I can ctrl-c dmesg, but subsequent runs provide no better results. I see no new messages in /var/adm/messages, as I'd expect.

Eventually the system recovers; the latest case took over 10 minutes to recover, after killing the application mentioned above, and I do see disk timeouts in dmesg.

So, I can only assume that there's either a driver bug in the SATA/SAS controller I'm using and it's throwing timeouts, or the SSD is having issues. Looking at the zpool configuration, I see that failmode=wait, and since that SSD is the only member of the zpool I would expect IO to hang. But does that mean that dmesg should hang also? Does that mean that the kernel has at least one thread stuck? Would failmode=continue be more desirable, or more resilient?

During the hang, load-avg is artificially high, fmd being the one process that sticks out in prstat output. But fmdump -v doesn't show anything relevant.

Anyone have ideas on how to diagnose what's going on there?

Thanks,
Ethan

System: Sun x4240, dual AMD 2347, 32G of ram
SAS/SATA Controller: LSI 3081E
OS: osol snv_98
SSD: Intel X25-E
I'm no expert, but the first thing I'd ask is whether you could repeat that test without using compression? I'd be quite worried about how a system is going to perform when it's basically running off a 50GB compressed file.

There seem to be a lot of variables here, on quite a few new systems, and I'd want to try to focus on individual subsystems to ensure they're all working to spec first.

If you can try without compression, the next thing I'd want to test is a file that's smaller than RAM. That should hopefully eliminate paging or RAM pressure, and allow you to verify that the storage subsystem is performing as expected. You could then re-enable compression and see if it still works.
-- 
This message posted from opensolaris.org
Ethan Erchinger wrote:
> [...]
> During the hang, load-avg is artificially high, fmd being the one
> process that sticks out in prstat output. But fmdump -v doesn't show
> anything relevant.

I've seen these symptoms when a large number of errors were reported in a short period of time and memory was low. What does "fmdump -eV" show?

Also, it would help to know what OS release you are using.
 -- richard

> Anyone have ideas on how to diagnose what's going on there?
>
> Thanks,
> Ethan
>
> System: Sun x4240, dual AMD 2347, 32G of ram
> SAS/SATA Controller: LSI 3081E
> OS: osol snv_98
> SSD: Intel X25-E
Richard Elling wrote:
> I've seen these symptoms when a large number of errors were reported
> in a short period of time and memory was low. What does "fmdump -eV"
> show?

fmdump -eV shows lots of messages like this, and yes, I believe that to be sd16, which is the SSD:

Dec 03 2008 08:31:11.224690595 ereport.io.scsi.cmd.disk.dev.rqs.derr
nvlist version: 0
        class = ereport.io.scsi.cmd.disk.dev.rqs.derr
        ena = 0x76bbfe8ef9000c01
        detector = (embedded nvlist)
        nvlist version: 0
                version = 0x0
                scheme = dev
                device-path = /pci@0,0/pci10de,375@f/pci1000,3150@0/sd@f,0
                devid = id1,sd@n5001517958702bc7
        (end detector)

        driver-assessment = retry
        op-code = 0x2a
        cdb = 0x2a 0x0 0x1 0xe8 0x20 0x88 0x0 0x0 0x3 0x0
        pkt-reason = 0x0
        pkt-state = 0x37
        pkt-stats = 0x0
        stat-code = 0x2
        key = 0x6
        asc = 0x29
        ascq = 0x0
        sense-data = 0x70 0x0 0x6 0x0 0x0 0x0 0x0 0xa 0x0 0x0 0x0 0x0 0x29 0x0 0x0 0x0 0x0 0x0 0x0 0x0
        __ttl = 0x1
        __tod = 0x4936b44f 0xd6481a3

Dec 03 2008 08:31:11.224690595 ereport.io.scsi.cmd.disk.recovered
nvlist version: 0
        class = ereport.io.scsi.cmd.disk.recovered
        ena = 0x76bbfe8ef9000c01
        detector = (embedded nvlist)
        nvlist version: 0
                version = 0x0
                scheme = dev
                device-path = /pci@0,0/pci10de,375@f/pci1000,3150@0/sd@f,0
                devid = id1,sd@n5001517958702bc7
        (end detector)

        driver-assessment = recovered
        op-code = 0x2a
        cdb = 0x2a 0x0 0x1 0xe8 0x20 0x88 0x0 0x0 0x3 0x0
        pkt-reason = 0x0
        pkt-state = 0x1f
        pkt-stats = 0x0
        __ttl = 0x1
        __tod = 0x4936b44f 0xd6481a3

> Also, it would help to know what OS release you are using.

Oh, it was at the bottom of my original message, but I just realized that I'm actually running snv_99; I forgot that I had performed an image-update a while back.
Ross wrote:
> I'm no expert, but the first thing I'd ask is whether you could repeat
> that test without using compression? I'd be quite worried about how a
> system is going to perform when it's basically running off a 50GB
> compressed file.

Yes, this does occur with compression off, but it's harder to repeat. I think that's only because physical memory happens to closely match the size of the SSD: 32G of ram, 32G SSD, both with some overhead taking the respective usable sizes to 26-28G.

> There seem to be a lot of variables here, on quite a few new systems,
> and I'd want to try to focus on individual subsystems to ensure they're
> all working to spec first.

New systems? SSD, yes. But the application is very well tested, although using malloc(), not mmap(). Compression isn't new; lots of people use it. I've run this system, running MySQL, through many many tests and it's worked quite nicely. Many variables, yes; lots of new code paths, I'd hope not.

> If you can try without compression, the next thing I'd want to test is
> a file that's smaller than RAM. That should hopefully eliminate paging
> or RAM pressure, and allow you to verify that the storage subsystem is
> performing as expected. You could then re-enable compression and see if
> it still works.

Running with a file smaller than ram does not have this issue. As a test I've run things a little differently: I put the SSD as swap and used malloc() again, but allocated a large dataset > physical memory, and I've seen very similar stalls. There is 0 IO, but the application is blocked on something. I guess I should try to insert some debug code, or use dtruss to see if the application is waiting on a syscall.
On Thu, Dec 4, 2008 at 11:55 AM, Ethan Erchinger <ethan at plaxo.com> wrote:
> Running with a file smaller than ram does not have this issue. As a
> test I've run things a little differently: I put the SSD as swap and
> used malloc() again, but allocated a large dataset > physical memory,
> and I've seen very similar stalls. There is 0 IO, but the application
> is blocked on something.

Are you leaving ANY ram for zfs to do its thing? If you're consuming ALL system memory for just this file/application, I would expect the system to fall over and die.

--Tim
Tim wrote:
> Are you leaving ANY ram for zfs to do its thing? If you're consuming
> ALL system memory for just this file/application, I would expect the
> system to fall over and die.

Hmm. I believe that the kernel should manage that relationship for me. If the system cannot manage swap or paging on large mmap() files, there is something very wrong with the virtual memory manager in the kernel.
Ethan Erchinger wrote:
> fmdump -eV shows lots of messages like this, and yes, I believe that to
> be sd16, which is the SSD:
>
> Dec 03 2008 08:31:11.224690595 ereport.io.scsi.cmd.disk.dev.rqs.derr
> [...]
>         device-path = /pci@0,0/pci10de,375@f/pci1000,3150@0/sd@f,0
>         devid = id1,sd@n5001517958702bc7
> [...]
>         driver-assessment = retry

We retry, which is why there is a recovery e-report later.

>         op-code = 0x2a
>         cdb = 0x2a 0x0 0x1 0xe8 0x20 0x88 0x0 0x0 0x3 0x0
>         pkt-reason = 0x0
>         pkt-state = 0x37
>         pkt-stats = 0x0
>         stat-code = 0x2
>         key = 0x6
>         asc = 0x29
>         ascq = 0x0

ASC/ASCQ 29/00 is POWER ON, RESET, OR BUS DEVICE RESET OCCURRED
http://www.t10.org/lists/asc-num.htm#ASC_29

[this should be more descriptive, as the codes are, more or less, standardized; I'll try to file an RFE, unless someone beat me to it]

Depending on which system did the reset, it should be noted in the /var/adm/messages log. This makes me suspect the hardware (firmware, actually).
 -- richard
Richard Elling wrote:
>> asc = 0x29
>> ascq = 0x0
>
> ASC/ASCQ 29/00 is POWER ON, RESET, OR BUS DEVICE RESET OCCURRED
> http://www.t10.org/lists/asc-num.htm#ASC_29
>
> Depending on which system did the reset, it should be noted in the
> /var/adm/messages log. This makes me suspect the hardware (firmware,
> actually).

Firmware of the SSD, or something else?
Ethan Erchinger wrote:
> Firmware of the SSD, or something else?

The answer may lie in the /var/adm/messages file, which should report if a reset was received or sent.
 -- richard
Richard Elling wrote:
> The answer may lie in the /var/adm/messages file, which should report
> if a reset was received or sent.

Here is a sample set of messages at that time. It looks like timeouts on the SSD for various requested blocks. Maybe I need to talk with Intel about this issue.

Ethan

=========================
Dec  2 20:14:01 opensolaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci10de,375@f/pci1000,3150@0/sd@f,0 (sd16):
Dec  2 20:14:01 opensolaris     Error for Command: write    Error Level: Retryable
Dec  2 20:14:01 opensolaris scsi: [ID 107833 kern.notice]   Requested Block: 840    Error Block: 840
Dec  2 20:14:01 opensolaris scsi: [ID 107833 kern.notice]   Vendor: ATA    Serial Number: CVEM840201EU
Dec  2 20:14:01 opensolaris scsi: [ID 107833 kern.notice]   Sense Key: Unit_Attention
Dec  2 20:14:01 opensolaris scsi: [ID 107833 kern.notice]   ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0
Dec  2 20:15:08 opensolaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci10de,375@f/pci1000,3150@0 (mpt0):
Dec  2 20:15:08 opensolaris     Disconnected command timeout for Target 15
Dec  2 20:15:09 opensolaris scsi: [ID 365881 kern.info] /pci@0,0/pci10de,375@f/pci1000,3150@0 (mpt0):
Dec  2 20:15:09 opensolaris     Log info 0x31140000 received for target 15.
Dec  2 20:15:09 opensolaris     scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc
Dec  2 20:15:09 opensolaris scsi: [ID 365881 kern.info] /pci@0,0/pci10de,375@f/pci1000,3150@0 (mpt0):
Dec  2 20:15:09 opensolaris     Log info 0x31140000 received for target 15.
Dec  2 20:15:09 opensolaris     scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc
Dec  2 20:15:09 opensolaris scsi: [ID 365881 kern.info] /pci@0,0/pci10de,375@f/pci1000,3150@0 (mpt0):
Dec  2 20:15:09 opensolaris     Log info 0x31140000 received for target 15.
Dec  2 20:15:09 opensolaris     scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc
Dec  2 20:15:09 opensolaris scsi: [ID 365881 kern.info] /pci@0,0/pci10de,375@f/pci1000,3150@0 (mpt0):
Dec  2 20:15:09 opensolaris     Log info 0x31140000 received for target 15.
Dec  2 20:15:09 opensolaris     scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc
Dec  2 20:15:12 opensolaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci10de,375@f/pci1000,3150@0/sd@f,0 (sd16):
Dec  2 20:15:12 opensolaris     Error for Command: write    Error Level: Retryable
Dec  2 20:15:12 opensolaris scsi: [ID 107833 kern.notice]   Requested Block: 810    Error Block: 810
Dec  2 20:15:12 opensolaris scsi: [ID 107833 kern.notice]   Vendor: ATA    Serial Number: CVEM840201EU
Dec  2 20:15:12 opensolaris scsi: [ID 107833 kern.notice]   Sense Key: Unit_Attention
Dec  2 20:15:12 opensolaris scsi: [ID 107833 kern.notice]   ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0
Dec  2 20:16:19 opensolaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci10de,375@f/pci1000,3150@0 (mpt0):
Dec  2 20:16:19 opensolaris     Disconnected command timeout for Target 15
Dec  2 20:16:21 opensolaris scsi: [ID 365881 kern.info] /pci@0,0/pci10de,375@f/pci1000,3150@0 (mpt0):
Dec  2 20:16:21 opensolaris     Log info 0x31140000 received for target 15.
Dec  2 20:16:21 opensolaris     scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc
Dec  2 20:16:21 opensolaris scsi: [ID 365881 kern.info] /pci@0,0/pci10de,375@f/pci1000,3150@0 (mpt0):
Dec  2 20:16:21 opensolaris     Log info 0x31140000 received for target 15.
Dec  2 20:16:21 opensolaris     scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc
Dec  2 20:16:21 opensolaris scsi: [ID 365881 kern.info] /pci@0,0/pci10de,375@f/pci1000,3150@0 (mpt0):
Dec  2 20:16:21 opensolaris     Log info 0x31140000 received for target 15.
Dec  2 20:16:21 opensolaris     scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc
Dec  2 20:16:21 opensolaris scsi: [ID 365881 kern.info] /pci@0,0/pci10de,375@f/pci1000,3150@0 (mpt0):
Dec  2 20:16:21 opensolaris     Log info 0x31140000 received for target 15.
Dec  2 20:16:21 opensolaris     scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc
Ethan Erchinger wrote:
> Here is a sample set of messages at that time. It looks like timeouts
> on the SSD for various requested blocks. Maybe I need to talk with
> Intel about this issue.

Keeping everyone up-to-date, for those who care: I've RMA'd the Intel drive, and will retest when the replacement arrives. I'm working under the assumption that I have a bad drive.

Ethan