Has anybody here got any thoughts on how to resolve this problem: http://www.opensolaris.org/jive/thread.jspa?messageID=261204&tstart=0

It sounds like two of us have been affected by this now, and it's a bit of a nuisance having your entire server hang when a drive is removed; it makes you worry about how Solaris would handle a drive failure.

Has anybody tried pulling a drive on a live Thumper? Surely they don't hang like this? Having said that, I do remember they have a great big warning in the manual about using cfgadm to stop the disk before removal:

"Caution - You must follow these steps before removing a disk from service. Failure to follow the procedure can corrupt your data or render your file system inoperable."

Ross
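For reference, the procedure that warning refers to looks roughly like this (a sketch only; the pool name tank and attachment point sata1/7 are placeholders, so check your own Ap_Ids with cfgadm first):

# cfgadm                              # list attachment points and find the Ap_Id of the disk to remove
# zpool offline tank c2t7d0           # if the disk is part of a redundant pool, offline it in ZFS first
# cfgadm -c unconfigure sata1/7       # stop the disk so it can be removed safely
# cfgadm                              # confirm the port shows "unconfigured" before pulling the drive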
I've discovered this as well - b81 to b93 (latest I've tried). I switched from my on-board SATA controller to AOC-SAT2-MV8 cards because the MCP55 controller caused random disk hangs. Now the SAT2-MV8 works as long as the drives are working correctly, but the system can't handle a drive failure or disconnect. :(

I don't think there's a bug filed for it. That would probably be the first step to getting this resolved (might also post to storage-discuss).

-- Dave

Ross wrote:
> Has anybody here got any thoughts on how to resolve this problem:
> http://www.opensolaris.org/jive/thread.jspa?messageID=261204&tstart=0
> [snip]
Yeah, I thought of the storage forum today and found somebody else with the problem, and since my post a couple of people have reported similar issues on Thumpers. I guess the storage thread is the best place for this now: http://www.opensolaris.org/jive/thread.jspa?threadID=42507&tstart=0
Ok, after doing a lot more testing of this I've found it's not the Supermicro controller causing problems. It's purely ZFS, and it causes some major problems! I've even found one scenario that appears to cause huge data loss without any warning from ZFS - up to 30,000 files and 100MB of data missing after a reboot, with ZFS reporting that the pool is OK.

***********************************************************************
1. Solaris handles USB and SATA hot plug fine

If disks are not in use by ZFS, you can unplug USB or SATA devices and cfgadm will recognise the disconnection. USB devices are recognised automatically as you reconnect them; SATA devices need reconfiguring. Cfgadm even recognises the SATA device as an empty bay:

# cfgadm
Ap_Id            Type         Receptacle   Occupant     Condition
sata1/7          sata-port    empty        unconfigured ok
usb1/3           unknown      empty        unconfigured ok

-- insert devices --

# cfgadm
Ap_Id            Type         Receptacle   Occupant     Condition
sata1/7          disk         connected    unconfigured unknown
usb1/3           usb-storage  connected    configured   ok

To bring the SATA drive online it's just a case of running:

# cfgadm -c configure sata1/7

***********************************************************************
2. If ZFS is using a hot plug device, disconnecting it will hang all ZFS status tools.

While pools remain accessible, any attempt to run "zpool status" will hang. I don't know if there is any way to recover these tools once this happens. While this is a pretty big problem in itself, it also makes me worry whether other types of error could have the same effect. I see potential for this leaving a server in a state whereby you know there are errors in a pool, but have no way of finding out what those errors might be without rebooting the server.

***********************************************************************
3. Once ZFS status tools are hung, the computer will not shut down.

The only way I've found to recover from this is to physically power down the server. The Solaris shutdown process simply hangs.

***********************************************************************
4. While reading an offline disk causes errors, writing does not!
*** CAUSES DATA LOSS ***

This is a big one: ZFS can continue writing to an unavailable pool. It doesn't always generate errors (I've seen it copy over 100MB before erroring), and if not spotted, this *will* cause data loss after you reboot.

I discovered this while testing how ZFS coped with the removal of a hot plug SATA drive. I knew that the ZFS admin tools were hanging, but that redundant pools remained available. I wanted to see whether it was just the ZFS admin tools that were failing, or whether ZFS was also failing to send appropriate error messages back to the OS.

These are the tests I carried out:

Zpool: Single drive zpool, consisting of one 250GB SATA drive in a hot plug bay.
Test data: A folder tree containing 19,160 items, 71.1MB in total.

TEST1: Opened File Browser, copied the test data to the pool. Half way through the copy I pulled the drive. THE COPY COMPLETED WITHOUT ERROR. Zpool list reports the pool as online, however zpool status hung as expected.

Not quite believing the results, I rebooted and tried again.

TEST2: Opened File Browser, copied the data to the pool. Pulled the drive half way through. The copy again finished without error. Checking the properties shows 19,160 files in the copy. ZFS list again shows the filesystem as ONLINE.

Now I decided to see how many files I could copy before it errored. I started the copy again. File Browser managed a further 9,171 files before it stopped. That's nearly 30,000 files before any error was detected. Again, despite the copy having finally errored, zpool list shows the pool as online, even though zpool status hangs.

I rebooted the server, and found that after the reboot my first copy contains just 10,952 items, and my second copy is completely missing. That's a loss of almost 20,000 files. Zpool status however reports NO ERRORS.

For the third test I decided to see if these files are actually accessible before the reboot:

TEST3: This time I pulled the drive *before* starting the copy. The copy started much slower this time and only got to 2,939 files before reporting an error. At this point I copied all the files that had made it onto the test pool over to another pool, and then rebooted.

After the reboot, the folder in the test pool had disappeared completely, but the copy I took before rebooting was fine and contains 2,938 items, approximately 12MB of data. Again, zpool status reports no errors.

Further tests revealed that reading the pool results in an error almost immediately. Writing to the pool appears very inconsistent.

This is a huge problem. Data can be written without error, and is still served to users. It is only later on that the server will begin to issue errors, but at that point the ZFS admin tools are useless. The only possible recovery is a server reboot, which will lose recent data written to the pool, and will do so without any warning at all from ZFS.

Needless to say I have a lot less faith in ZFS's error checking after having seen it lose 30,000 files without error.

***********************************************************************
5. If you are using CIFS and pull a drive from the volume, the whole server hangs!

This appears to be the original problem I found. While ZFS doesn't handle drive removal well, the combination of ZFS and CIFS is worse. If you pull a drive from a ZFS pool (redundant or not) which is serving CIFS data, the entire server freezes until you re-insert the drive.

Note that ZFS itself does not recover after the drive is inserted; admin tools will still hang. However the re-insertion of the drive is enough to unfreeze the server.

Of course, you still need a physical reboot to get your ZFS admin tools back, but in the meantime data is accessible again.
Mattias Pantzare
2008-Jul-28 16:39 UTC
[zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed
> 4. While reading an offline disk causes errors, writing does not!
> *** CAUSES DATA LOSS ***
>
> This is a big one: ZFS can continue writing to an unavailable pool. It doesn't always generate errors (I've seen it copy over 100MB
> before erroring), and if not spotted, this *will* cause data loss after you reboot.
> [snip]

This is not unique to ZFS. If you need to know that your writes have reached stable storage, you have to call fsync(). It is not enough to close a file.

This is true even for UFS, but UFS won't delay writes for all operations, so you will notice faster. You will still lose data, though. I have been able to undo rm -rf / on a FreeBSD system by pulling the power cord before it wrote the changes...

Databases call fsync() (or similar) before they close a transaction; that is one of the reasons databases like hardware write caches. cp does not.
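To make that concrete at the shell level: a cp that "succeeds" only means the data reached the cache. A rough sketch along these lines (the paths are the ones from Ross's tests, and whether sync/lockfs surface the failure promptly on ZFS is exactly what's in question in this thread) at least forces a flush before you trust the copy:

# cp -r /rc-pool/copytest /test/copytest   # cp can return success with the data still only in memory
# sync                                     # schedule a flush of dirty data to disk
# lockfs -f /test                          # flush and wait; an error or a hang here is your first real clue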
Bob Friesenhahn
2008-Jul-28 18:03 UTC
[zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed
On Mon, 28 Jul 2008, Ross wrote:
> TEST1: Opened File Browser, copied the test data to the pool.
> Half way through the copy I pulled the drive. THE COPY COMPLETED
> WITHOUT ERROR. Zpool list reports the pool as online, however zpool
> status hung as expected.

Are you sure that this reference software you call "File Browser" actually responds to errors? Maybe it is typical Linux-derived software which does not check for or handle errors, and ZFS is reporting errors all along while the program pretends to copy the lost files. If you were using Microsoft Windows, its file browser would probably report "Unknown error: 666", but at least you would see an error dialog and you could visit the Microsoft knowledge base to learn that message ID 666 means "Unknown error". The other possibility is that all of these files fit in the ZFS write cache, so the error reporting is delayed.

The DTrace Toolkit provides a very useful DTrace script called 'errinfo' which will list every system call that returns an error. This is very useful and informative. If you run it, you will see every error reported to the application level.

Bob
======================================
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
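In practice, running errinfo in a second terminal while repeating the copy test would show whether errors are actually reaching the application. A sketch (assuming the DTraceToolkit is installed under /opt/DTT; adjust the path to wherever the toolkit lives on your system):

# /opt/DTT/errinfo                     # prints each system call that returns an error, as it happens
# /opt/DTT/errinfo | grep nautilus     # or watch just the file manager process during the copy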
Ross Smith
2008-Jul-28 18:09 UTC
[zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed
"File Browser" is the name of the program that Solaris opens when you open "Computer" on the desktop. It''s the default graphical file manager. It does eventually stop copying with an error, but it takes a good long while for ZFS to throw up that error, and even when it does, the pool doesn''t report any problems at all.> Date: Mon, 28 Jul 2008 13:03:24 -0500> From: bfriesen at simple.dallas.tx.us> To: myxiplx at hotmail.com> CC: zfs-discuss at opensolaris.org> Subject: Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed> > On Mon, 28 Jul 2008, Ross wrote:> >> > TEST1: Opened File Browser, copied the test data to the pool. > > Half way through the copy I pulled the drive. THE COPY COMPLETED > > WITHOUT ERROR. Zpool list reports the pool as online, however zpool > > status hung as expected.> > Are you sure that this reference software you call "File Browser" > actually responds to errors? Maybe it is typical Linux-derived > software which does not check for or handle errors and ZFS is > reporting errors all along while the program pretends to copy the lost > files. If you were using Microsoft Windows, its file browser would > probably report "Unknown error: 666" but at least you would see an > error dialog and you could visit the Microsoft knowledge base to learn > that message ID 666 means "Unknown error". The other possibility is > that all of these files fit in the ZFS write cache so the error > reporting is delayed.> > The Dtrace Toolkit provides a very useful DTrace script called > ''errinfo'' which will list every system call which reports and error. > This is very useful and informative. If you run it, you will see > every error reported to the application level.> > Bob> ======================================> Bob Friesenhahn> bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/> GraphicsMagick Maintainer, http://www.GraphicsMagick.org/>_________________________________________________________________ Invite your Facebook friends to chat on Messenger http://clk.atdmt.com/UKM/go/101719649/direct/01/ -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20080728/7ccd448c/attachment.html>
Ross Smith
2008-Jul-28 18:10 UTC
[zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed
snv_91. I downloaded snv_94 today so I'll be testing with that tomorrow.

> Date: Mon, 28 Jul 2008 09:58:43 -0700
> From: Richard.Elling at Sun.COM
> Subject: Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed
>
> Which OS and revision?
> -- richard
> [snip]
Ross Smith
2008-Jul-28 18:41 UTC
[zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed
Heh, sounds like there are a few problems with that tool then. I guess that's one of the benefits of me being so new to Solaris. I'm still learning all the command line tools, so I'm playing with the graphical stuff as much as possible. :)

Regarding the delay, I plan to have a go tomorrow and see just how much of a delay there can be. I've definitely had the system up for 10 minutes still reading data that's going to disappear on reboot, and I suspect I can stretch it a lot longer than that.

The biggest concern for me with the delay is that the data appears fine to all intents and purposes. You can read it off the pool and copy it elsewhere. There doesn't seem to be any indication that it's going to disappear after a reboot.

> Date: Mon, 28 Jul 2008 13:35:21 -0500
> From: bfriesen at simple.dallas.tx.us
> Subject: RE: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed
>
> On Mon, 28 Jul 2008, Ross Smith wrote:
> > "File Browser" is the name of the program that Solaris opens when
> > you open "Computer" on the desktop. It's the default graphical file
> > manager.
>
> Got it. I have brought it up once or twice. I tend to distrust such
> tools since I am not sure if their implementation is sound. In fact,
> usually it is not.
>
> Now that you mention this tool, I am going to see what happens when it
> enters my test directory containing a million files. Hmmm, this turd
> says "Loading" and I see that system error messages are scrolling by
> as fast as dtrace can report them:
>
> nautilus ioctl 25 Inappropriate ioctl for device
> nautilus acl 89 Unsupported file system operation
> nautilus ioctl 25 Inappropriate ioctl for device
> nautilus acl 89 Unsupported file system operation
> nautilus ioctl 25 Inappropriate ioctl for device
>
> We shall see if it crashes or if it eventually returns. Ahhh, it has
> returned and declared that my directory with a million files is
> "(Empty)". So much for a short stint of trusting this tool.
>
> > It does eventually stop copying with an error, but it takes a good
> > long while for ZFS to throw up that error, and even when it does,
> > the pool doesn't report any problems at all.
>
> The delayed error report may be ok, but the pool not reporting a
> problem does not seem very ok.
>
> Bob
Miles Nordin
2008-Jul-29 01:24 UTC
[zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed
>>>>> "mp" == Mattias Pantzare <pantzer at ludd.ltu.se> writes:>> This is a big one: ZFS can continue writing to an unavailable >> pool. It doesn''t always generate errors (I''ve seen it copy >> over 100MB before erroring), and if not spotted, this *will* >> cause data loss after you reboot. mp> This is not unique for zfs. If you need to know that your mp> writes has reached stable store you have to call fsync(). seconded. How about this: * start the copy * pull the disk, without waiting for an error reported to the application * type ''lockfs -fa''. Does either lockfs hang, or you get an immediate error after requesting the lockfs? If so, I think it''s ok and within the unix tradition to allow all these writes, it''s just maybe a more extreme version of the tradition, which might not be an entirely bad compromise if ZFS can keep up this behavior, and actually retry the unreported failed writes, when confronted with FC, iSCSI, USB, FW targets that bounce. I''m not sure if it can ever do that yet or not, but architecturally I wouldn''t want to demand that it return failure to the app too soon, so long as fsync() still behaves correctly w.r.t. power failures. However the other problems you report are things I''ve run into, also. ''zpool status'' should not be touching the disk at all. so, we have: * ''zpool list'' shows ONLINE several minutes after a drive is yanked. At the time ''zpool list'' still shows ONLINE, ''zpool status'' doesn''t show anything at all because it hangs, so ONLINE seems too positive a report for the situation. I''d suggest: + ''zpool list'' should not borrow the ONLINE terminology from ''zpool status'' if the list command means something different by the word ONLINE. maybe SEEMS_TO_BE_AROUND_SOMEWHERE is more appropriate. + during this problem, ''zpool list'' is available while ''zpool status'' is not working. Fine, maybe, during a failure, not all status tools will be available. However it would be nice if, as a minimum, some status tool capable of reporting ``pool X is failing'''' were available. In the absence of that, you may have to reboot the machine without ever knowing even which pool failed to bring it down. * maybe sometimes certain types of status and statistics aren''t available, but no status-reporting tools should ever be subject to blocking inside the kernel. At worst they should refuse to give information, and return to a prompt, immediately. I''m in the habit of typing ''zpool status &'' during serious problems so I don''t lose control of the console. * ''zpool status'' is used when things are failing. Cabling and driver state machines are among the failures from which a volume manager should protect us---that''s why we say ``buy redundant controllers if possible.'''' In this scenario, a read is an intrusive act, because it could provoke a problem. so even if ''zpool status'' is only reading, not writing to disk nor to data structures inside the kernel, it is still not really a status tool. It''s an invasive poking/pinging/restarting/breaking tool. Such tools should be segregated, and shouldn''t substitute for the requirement to have true status tools that only read data structures kept in the kernel, not update kernel structures and not touch disks. This would be like if ''ps'' made an implicit call to rcapd, or activated some swapping thread, or something like that. ``My machine is sluggish. I wonder what''s slowing it down. ...''ps''... 
oh, shit, now it''s not responding at all, and I''ll never know why.'''' There can be other tools, too, but I think LVM2 and SVM both have carefully non-invasive status tools, don''t they? This principle should be followed everywhere. For example, ''iscsiadm list discovery-address'' should simply list the discovery addresses. It should not implicitly attempt to contact each discovery address in its list, while I wait. -----8<----- terabithia:/# time iscsiadm list discovery-address Discovery Address: 10.100.100.135:3260 Discovery Address: 10.100.100.138:3260 real 0m45.935s user 0m0.006s sys 0m0.019s terabithia:/# jobs [1]+ Running zpool status & terabithia:/# -----8<----- now, if you''re really scalable, try the above again with 100 iSCSI targets and 20 pools. A single ''iscsiadm list discovery-address'' command, even if it''s sort-of ``working'''', can take hours to complete. This does not happen on Linux where I configure through text files and inspect status through ''cat /proc/...'' In other words, it''s not just that the information ''zpool status'' gives is inaccurate. It''s not just that some information is hidden (like how sometimes a device listed as ONLINE will say ``no valid replicas'''' when you try to offline it, and sometimes it won''t, and the only way to tell the difference is to attempt to offline the device---so trying to ''zpool offline'' each device in turn is a way to get some more indication of pool health than what ''zpool status'' gives on its own). It''s also that I don''t trust ''zpool status'' not to affect the information it''s supposed to be reporting. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 304 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20080728/dd4f305b/attachment.bin>
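Condensing that suggested test into commands, roughly (a sketch using Ross's pool and paths from earlier in the thread; run the lockfs in the background or from a second terminal in case it hangs):

# cp -r /rc-pool/copytest /test/copytest &    # start the copy

-- pull the drive while the copy is still running --

# lockfs -fa &                                # flush all file systems; note whether it errors or hangs
# jobs                                        # a lockfs stuck in "Running" is itself an answer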
Ross Smith
2008-Jul-29 10:07 UTC
[zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed
A little more information today. I had a feeling that ZFS would continue quite some time before giving an error, and today I've shown that you can carry on working with the filesystem for at least half an hour with the disk removed.

I suspect on a system with little load you could carry on working for several hours without any indication that there is a problem. It looks to me like ZFS is caching reads & writes, and that provided requests can be fulfilled from the cache, it doesn't care whether the disk is present or not. I would guess that ZFS is attempting to write to the disk in the background, and that this is silently failing.

Here's the log of the tests I did today. After removing the drive, over a period of 30 minutes I copied folders to the filesystem, created an archive, set permissions, and checked properties. I did this both from the command line and with the graphical file manager tool in Solaris. Neither reported any errors, and all the data could be read & written fine. Until the reboot, at which point all the data was lost, again without error.

If you're not interested in the detail, please skip to the end where I've got some thoughts on just how many problems there are here.

# zpool status test
  pool: test
 state: ONLINE
 scrub: none requested
config:
        NAME        STATE     READ WRITE CKSUM
        test        ONLINE       0     0     0
          c2t7d0    ONLINE       0     0     0
errors: No known data errors
# zfs list test
NAME   USED  AVAIL  REFER  MOUNTPOINT
test   243M   228G   242M  /test
# zpool list test
NAME   SIZE   USED  AVAIL   CAP  HEALTH  ALTROOT
test   232G   243M   232G    0%  ONLINE  -

-- drive removed --

# cfgadm | grep sata1/7
sata1/7      sata-port    empty        unconfigured ok

-- cfgadm knows the drive is removed. How come ZFS does not? --

# cp -r /rc-pool/copytest /test/copytest
# zpool list test
NAME   SIZE   USED  AVAIL   CAP  HEALTH  ALTROOT
test   232G  73.4M   232G    0%  ONLINE  -
# zfs list test
NAME   USED  AVAIL  REFER  MOUNTPOINT
test   142K   228G    18K  /test

-- Yup, still up. Let's start the clock --

# date
Tue Jul 29 09:31:33 BST 2008
# du -hs /test/copytest
 667K   /test/copytest

-- 5 minutes later, still going strong --

# date
Tue Jul 29 09:36:30 BST 2008
# zpool list test
NAME   SIZE   USED  AVAIL   CAP  HEALTH  ALTROOT
test   232G  73.4M   232G    0%  ONLINE  -
# cp -r /rc-pool/copytest /test/copytest2
# ls /test
copytest   copytest2
# du -h -s /test
 1.3M   /test
# zpool list test
NAME   SIZE   USED  AVAIL   CAP  HEALTH  ALTROOT
test   232G  73.4M   232G    0%  ONLINE  -
# find /test | wc -l
    2669
# find /test/copytest | wc -l
    1334
# find /rc-pool/copytest | wc -l
    1334
# du -h -s /rc-pool/copytest
 5.3M   /rc-pool/copytest

-- Not sure why the original pool has 5.3MB of data when I use du. --
-- File Manager reports that they both have the same size --

-- 15 minutes later it's still working. I can read data fine --

# date
Tue Jul 29 09:43:04 BST 2008
# chmod 777 /test/*
# mkdir /rc-pool/test2
# cp -r /test/copytest2 /rc-pool/test2/copytest2
# find /rc-pool/test2/copytest2 | wc -l
    1334
# zpool list test
NAME   SIZE   USED  AVAIL   CAP  HEALTH  ALTROOT
test   232G  73.4M   232G    0%  ONLINE  -

-- and yup, the drive is still offline --

# cfgadm | grep sata1/7
sata1/7      sata-port    empty        unconfigured ok

-- And finally, after 30 minutes the pool is still going strong --

# date
Tue Jul 29 09:59:56 BST 2008
# tar -cf /test/copytest.tar /test/copytest/*
# ls -l
total 3
drwxrwxrwx   3 root     root           3 Jul 29 09:30 copytest
-rwxrwxrwx   1 root     root     4626432 Jul 29 09:59 copytest.tar
drwxrwxrwx   3 root     root           3 Jul 29 09:39 copytest2
# zpool list test
NAME   SIZE   USED  AVAIL   CAP  HEALTH  ALTROOT
test   232G  73.4M   232G    0%  ONLINE  -

After a full 30 minutes there's no indication whatsoever of any problem. Checking properties of the folder in File Browser reports 2665 items, totalling 9.0MB.

At this point I tried "# zfs set sharesmb=on test". I didn't really expect it to work, and sure enough, that command hung. zpool status also hung, so I had to reboot the server.

-- Rebooted server --

Now I found that not only are all the files I've written in the last 30 minutes missing, but in fact files that I had deleted several minutes prior to removing the drive have re-appeared.

-- /test mount point is still present, I'll probably have to remove that manually --

# cd /
# ls
bin         export      media       proc        system
boot        home        mnt         rc-pool     test
dev         kernel      net         rc-usb      tmp
devices     lib         opt         root        usr
etc         lost+found  platform    sbin        var

-- ZFS still has the pool mounted, but at least now it realises it's not working --

# zpool list
NAME      SIZE   USED  AVAIL   CAP  HEALTH    ALTROOT
rc-pool  2.27T  52.6G  2.21T    2%  DEGRADED  -
test        -      -      -     -   FAULTED   -
# zpool status test
  pool: test
 state: UNAVAIL
status: One or more devices could not be opened. There are insufficient
        replicas for the pool to continue functioning.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-3C
 scrub: none requested
config:
        NAME        STATE     READ WRITE CKSUM
        test        UNAVAIL      0     0     0  insufficient replicas
          c2t7d0    UNAVAIL      0     0     0  cannot open

-- At least re-activating the pool is simple, but gotta love the "No known data errors" line --

# cfgadm -c configure sata1/7
# zpool status test
  pool: test
 state: ONLINE
 scrub: none requested
config:
        NAME        STATE     READ WRITE CKSUM
        test        ONLINE       0     0     0
          c2t7d0    ONLINE       0     0     0
errors: No known data errors

-- But of course, although ZFS thinks it's online, it didn't mount properly --

# cd /test
# ls
# zpool export test
# rm -r /test
# zpool import test
# cd test
# ls
var (copy)   var2

-- Now that's unexpected. Those folders should be long gone. Let's see how many files ZFS failed to delete --

# du -h -s /test
  77M   /test
# find /test | wc -l
   19033

So in addition to working for a full half hour creating files, it's also failed to remove 77MB of data contained in nearly 20,000 files. And it's done all that without reporting any error or problem with the pool.

In fact, if I didn't know what I was looking for, there would be no indication of a problem at all. Before the reboot I can't find out what's going on as "zpool status" hangs. After the reboot it says there's no problem. Both ZFS and its troubleshooting tools fail in a big way here.

As others have said, "zpool status" should not hang. ZFS has to know the state of all the drives and pools it's currently using, and "zpool status" should simply report the current known status from ZFS's internal state. It shouldn't need to scan anything. ZFS's internal state should also be checked against cfgadm so that it knows if a disk isn't there. It should also be updated if the cache can't be flushed to disk, and "zfs list" / "zpool list" need to borrow state information from the status commands so that they don't say ONLINE when the pool has problems.

ZFS needs to deal more intelligently with mount points when a pool has problems. Leaving the folder lying around in a way that prevents the pool mounting properly when the drives are recovered is not good. When the pool appears to come back online without errors, it would be very easy for somebody to assume the data was lost from the pool without realising that it simply hasn't mounted and they're actually looking at an empty folder. Firstly ZFS should be removing the mount point when problems occur, and secondly, zfs list or zpool status should include information to inform you that the pool could not be mounted properly.

zpool status really should be warning of any ZFS errors that occur, including things like being unable to mount the pool, CIFS mounts failing, etc.

And finally, if ZFS does find problems writing from the cache, it really needs to log somewhere the names of all the files affected, and the action that could not be carried out. ZFS knows the files it was meant to delete here; it also knows the files that were written. I can accept that with delayed writes files may occasionally be lost when a failure happens, but I don't accept that we need to lose all knowledge of the affected files when the filesystem has complete knowledge of what is affected. If there are any working filesystems on the server, ZFS should make an attempt to store a log of the problem; failing that, it should e-mail the data out. The admin really needs to know which files have been affected so that they can notify users of the data loss. I don't know where you would store this information, but wherever that is, "zpool status" should be reporting the error and directing the admin to the log file.

I would probably say this could be safely stored on the system drive. Would it be possible to have a number of possible places to store this log? What I'm thinking is that if the system drive is unavailable, ZFS could try each pool in turn and attempt to store the log there.

In fact e-mail alerts or external error logging would be a great addition to ZFS. Surely it makes sense that filesystem errors would be better off being stored and handled externally?

Ross

> Date: Mon, 28 Jul 2008 12:28:34 -0700
> From: Richard.Elling at Sun.COM
> Subject: Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed
>
> I'm trying to reproduce and will let you know what I find.
> -- richard
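Coming back to the suggestion above that ZFS should be consulting cfgadm: until something like that exists, the cross-checks have to be run by hand, outside ZFS. A rough sketch of what can be done today while "zpool status" is hung (all standard Solaris tools, though how much they show for a pulled SATA drive will vary):

# cfgadm | grep sata       # does the controller still see the disk at all?
# iostat -En               # per-device soft/hard/transport error counters
# fmdump -e                # FMA error reports logged against the device, if any
# fmadm faulty             # faults FMA has actually diagnosed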
David Collier-Brown
2008-Jul-29 15:59 UTC
[zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed
Just a side comment: this discussion shows all the classic symptoms of two groups of people with different basic assumptions, each wondering why the other said what they did. Getting these out in the open would be A Good Thing (;-))

--dave

Jonathan Loran wrote:
> I think the important point here is that this makes the case for ZFS
> handling at least one layer of redundancy. If the disk you pulled was
> part of a mirror or raidz, there wouldn't be data loss when your system
> was rebooted.
> [snip]

--
David Collier-Brown            | Always do right. This will gratify
Sun Microsystems, Toronto      | some people and astonish the rest
davecb at sun.com                 |                      -- Mark Twain
cell: (647) 833-9377, bridge: (877) 385-4099 code: 506 9191#
Jonathan Loran
2008-Jul-29 19:23 UTC
[zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed
I think the important point here is that this makes the case for ZFS handling at least one layer of redundancy. If the disk you pulled was part of a mirror or raidz, there wouldn''t be data loss when your system was rebooted. In fact, the zpool status commands would likely keep working, and a reboot wouldn''t be necessary at all. I think it''s unreasonable to expect a system with any file system to recover from a single drive being pulled. Of course, loosing extra work because of the delayed notification is bad, but none the less, this is not a reasonable test. Basically, always provide redundancy in your zpool config. Jon Ross Smith wrote:> A little more information today. I had a feeling that ZFS would > continue quite some time before giving an error, and today I''ve shown > that you can carry on working with the filesystem for at least half an > hour with the disk removed. > > I suspect on a system with little load you could carry on working for > several hours without any indication that there is a problem. It > looks to me like ZFS is caching reads & writes, and that provided > requests can be fulfilled from the cache, it doesn''t care whether the > disk is present or not. > > I would guess that ZFS is attempting to write to the disk in the > background, and that this is silently failing. > > Here''s the log of the tests I did today. After removing the drive, > over a period of 30 minutes I copied folders to the filesystem, > created an archive, set permissions, and checked properties. I did > this both in the command line and with the graphical file manager tool > in Solaris. Neither reported any errors, and all the data could be > read & written fine. Until the reboot, at which point all the data > was lost, again without error. > > If you''re not interested in the detail, please skip to the end where > I''ve got some thoughts on just how many problems there are here. > > > # zpool status test > pool: test > state: ONLINE > scrub: none requested > config: > NAME STATE READ WRITE CKSUM > test ONLINE 0 0 0 > c2t7d0 ONLINE 0 0 0 > errors: No known data errors > # zfs list test > NAME USED AVAIL REFER MOUNTPOINT > test 243M 228G 242M /test > # zpool list test > NAME SIZE USED AVAIL CAP HEALTH ALTROOT > test 232G 243M 232G 0% ONLINE - > > > -- drive removed -- > > > # cfgadm |grep sata1/7 > sata1/7 sata-port empty unconfigured ok > > > -- cfgadmin knows the drive is removed. How come ZFS does not? -- > > > # cp -r /rc-pool/copytest /test/copytest > # zpool list test > NAME SIZE USED AVAIL CAP HEALTH ALTROOT > test 232G 73.4M 232G 0% ONLINE - > # zfs list test > NAME USED AVAIL REFER MOUNTPOINT > test 142K 228G 18K /test > > > -- Yup, still up. Let''s start the clock -- > > > # date > Tue Jul 29 09:31:33 BST 2008 > # du -hs /test/copytest > 667K /test/copytest > > > -- 5 minutes later, still going strong -- > > > # date > Tue Jul 29 09:36:30 BST 2008 > # zpool list test > NAME SIZE USED AVAIL CAP HEALTH ALTROOT > test 232G 73.4M 232G 0% ONLINE - > # cp -r /rc-pool/copytest /test/copytest2 > # ls /test > copytest copytest2 > # du -h -s /test > 1.3M /test > # zpool list test > NAME SIZE USED AVAIL CAP HEALTH ALTROOT > test 232G 73.4M 232G 0% ONLINE - > # find /test | wc -l > 2669 > # find //test/copytest | wc -l > 1334 > # find /rc-pool/copytest | wc -l > 1334 > # du -h -s /rc-pool/copytest > 5.3M /rc-pool/copytest > > > -- Not sure why the original pool has 5.3MB of data when I use du. 
-- > -- File Manager reports that they both have the same size -- > > > -- 15 minutes later it''s still working. I can read data fine -- > > # date > Tue Jul 29 09:43:04 BST 2008 > # chmod 777 /test/* > # mkdir /rc-pool/test2 > # cp -r /test/copytest2 /rc-pool/test2/copytest2 > # find /rc-pool/test2/copytest2 | wc -l > 1334 > # zpool list test > NAME SIZE USED AVAIL CAP HEALTH ALTROOT > test 232G 73.4M 232G 0% ONLINE - > > > -- and yup, the drive is still offline -- > > > # cfgadm | grep sata1/7 > sata1/7 sata-port empty unconfigured ok > > > -- And finally, after 30 minutes the pool is still going strong -- > > > # date > Tue Jul 29 09:59:56 BST 2008 > # tar -cf /test/copytest.tar /test/copytest/* > # ls -l > total 3 > drwxrwxrwx 3 root root 3 Jul 29 09:30 copytest > -rwxrwxrwx 1 root root 4626432 Jul 29 09:59 copytest.tar > drwxrwxrwx 3 root root 3 Jul 29 09:39 copytest2 > # zpool list test > NAME SIZE USED AVAIL CAP HEALTH ALTROOT > test 232G 73.4M 232G 0% ONLINE - > > > After a full 30 minutes there''s no indication whatsoever of any > problem. Checking properties of the folder in File Browser reports > 2665 items, totalling 9.0MB. > > At this point I tried "# zfs set sharesmb=on test". I didn''t really > expect it to work, and sure enough, that command hung. zpool status > also hung, so I had to reboot the server. > > > -- Rebooted server -- > > > Now I found that not only are all the files I''ve written in the last > 30 minutes missing, but in fact files that I had deleted several > minutes prior to removing the drive have re-appeared. > > > -- /test mount point is still present, I''ll probably have to remove > that manually -- > > > # cd / > # ls > bin export media proc system > boot home mnt rc-pool test > dev kernel net rc-usb tmp > devices lib opt root usr > etc lost+found platform sbin var > > > -- ZFS still has the pool mounted, but at least now it realises it''s > not working -- > > > # zpool list > NAME SIZE USED AVAIL CAP HEALTH ALTROOT > rc-pool 2.27T 52.6G 2.21T 2% DEGRADED - > test - - - - FAULTED - > # zpool status test > pool: test > state: UNAVAIL > status: One or more devices could not be opened. There are insufficient > replicas for the pool to continue functioning. > action: Attach the missing device and online it using ''zpool online''. > see: http://www.sun.com/msg/ZFS-8000-3C > scrub: none requested > config: > NAME STATE READ WRITE CKSUM > test UNAVAIL 0 0 0 insufficient replicas > c2t7d0 UNAVAIL 0 0 0 cannot open > > > -- At least re-activating the pool is simple, but gotta love the "No > known data errors" line -- > > > # cfgadm -c configure sata1/7 > # zpool status test > pool: test > state: ONLINE > scrub: none requested > config: > NAME STATE READ WRITE CKSUM > test ONLINE 0 0 0 > c2t7d0 ONLINE 0 0 0 > errors: No known data errors > > > -- But of course, although ZFS thinks it''s online, it didn''t mount > properly -- > > > # cd /test > # ls > # zpool export test > # rm -r /test > # zpool import test > # cd test > # ls > var (copy) var2 > > > -- Now that''s unexpected. Those folders should be long gone. Let''s > see how many files ZFS failed to delete -- > > > # du -h -s /test > 77M /test > # find /test | wc -l > 19033 > > > So in addition to working for a full half hour creating files, it''s > also failed to remove 77MB of data contained in nearly 20,000 files. > And it''s done all that without reporting any error or problem with the > pool. > > In fact, if I didn''t know what I was looking for, there would be no > indication of a problem at all. 
Before the reboot I can't find what's > going on as "zpool status" hangs. After the reboot it says there's no > problem. Both ZFS and its troubleshooting tools fail in a big way > here. > > As others have said, "zpool status" should not hang. ZFS has to know > the state of all the drives and pools it's currently using, "zpool > status" should simply report the current known status from ZFS's > internal state. It shouldn't need to scan anything. ZFS's internal > state should also be checking with cfgadm so that it knows if a disk > isn't there. It should also be updated if the cache can't be flushed > to disk, and "zfs list / zpool list" needs to borrow state information > from the status commands so that they don't say 'online' when the pool > has problems. > > ZFS needs to deal more intelligently with mount points when a pool has > problems. Leaving the folder lying around in a way that prevents the > pool mounting properly when the drives are recovered is not good. > When the pool appears to come back online without errors, it would be > very easy for somebody to assume the data was lost from the pool > without realising that it simply hasn't mounted and they're actually > looking at an empty folder. Firstly ZFS should be removing the mount > point when problems occur, and secondly, zfs list or zpool status should > include information to inform you that the pool could not be mounted > properly. > > zpool status really should be warning of any ZFS errors that occur. > Including things like being unable to mount the pool, CIFS mounts > failing, etc... > > And finally, if ZFS does find problems writing from the cache, it > really needs to log somewhere the names of all the files affected, and > the action that could not be carried out. ZFS knows the files it was > meant to delete here, it also knows the files that were written. I > can accept that with delayed writes files may occasionally be lost > when a failure happens, but I don't accept that we need to lose all > knowledge of the affected files when the filesystem has complete > knowledge of what is affected. If there are any working filesystems > on the server, ZFS should make an attempt to store a log of the > problem, failing that it should e-mail the data out. The admin really > needs to know what files have been affected so that they can notify > users of the data loss. I don't know where you would store this > information, but wherever that is, "zpool status" should be reporting > the error and directing the admin to the log file. > > I would probably say this could be safely stored on the system drive. > Would it be possible to have a number of possible places to store this > log? What I'm thinking is that if the system drive is unavailable, > ZFS could try each pool in turn and attempt to store the log there. > > In fact e-mail alerts or external error logging would be a great > addition to ZFS. Surely it makes sense that filesystem errors would > be better off being stored and handled externally? > > Ross > > > > > Date: Mon, 28 Jul 2008 12:28:34 -0700 > > From: Richard.Elling at Sun.COM > > Subject: Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive > removed > > To: myxiplx at hotmail.com > > > > I'm trying to reproduce and will let you know what I find. > > -- richard
-- - _____/ _____/ / - Jonathan Loran - - - / / / IT Manager - - _____ / _____ / / Space Sciences Laboratory, UC Berkeley - / / / (510) 643-5146 jloran at ssl.berkeley.edu - ______/ ______/ ______/ AST:7731^29u18e3
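For anyone following along, here is a minimal sketch of the kind of redundant layout Jon is arguing for; the device names are placeholders rather than anything from the tests above, and the two create commands are alternatives, not a sequence:

# zpool create test mirror c2t7d0 c2t8d0
-- or, for single-parity raidz --
# zpool create test raidz c2t7d0 c2t8d0 c2t9d0
# zpool status -v test

With either layout, pulling one drive should leave the pool DEGRADED but still readable and writable after a reboot, rather than silently dropping writes the way the single-disk pool did here.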
Well yeah, this is obviously not a valid setup for my data, but if you read my first e-mail, the whole point of this test was that I had seen Solaris hang when a drive was removed from a fully redundant array (five sets of three way mirrors), and wanted to see what was going on. So I started with the most basic pool I could to see how ZFS and Solaris actually reacted to a drive being removed. I was fully expecting ZFS to simply error when the drive was removed, and move the test on to more complex pools. I did not expect to find so many problems with such a simple setup. And the problems I have found also lead to potential data loss in a redundant array, although it would have been much more difficult to spot:

Imagine you had a raid-z array and pulled a drive as I'm doing here. Because ZFS isn't aware of the removal it keeps writing to that drive as if it's valid. That means ZFS still believes the array is online when in fact it should be degraded. If any other drive now fails, ZFS will consider the status degraded instead of faulted, and will continue writing data. The problem is, ZFS is writing some of that data to a drive which doesn't exist, meaning all that data will be lost on reboot. This message posted from opensolaris.org
Bob Friesenhahn
2008-Jul-30 14:48 UTC
[zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed
On Wed, 30 Jul 2008, Ross wrote:> > Imagine you had a raid-z array and pulled a drive as I''m doing here. > Because ZFS isn''t aware of the removal it keeps writing to that > drive as if it''s valid. That means ZFS still believes the array is > online when in fact it should be degrated. If any other drive now > fails, ZFS will consider the status degrated instead of faulted, and > will continue writing data. The problem is, ZFS is writing some of > that data to a drive which doesn''t exist, meaning all that data will > be lost on reboot.While I do believe that device drivers. or the fault system, should notify ZFS when a device fails (and ZFS should appropriately react), I don''t think that ZFS should be responsible for fault monitoring. ZFS is in a rather poor position for device fault monitoring, and if it attempts to do so then it will be slow and may misbehave in other ways. The software which communicates with the device (i.e. the device driver) is in the best position to monitor the device. The primary goal of ZFS is to be able to correctly read data which was successfully committed to disk. There are programming interfaces (e.g. fsync(), msync()) which may be used to ensure that data is committed to disk, and which should return an error if there is a problem. If you were performing your tests over an NFS mount then the results should be considerably different since NFS requests that its data be committed to disk. Bob =====================================Bob Friesenhahn bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Ross Smith
2008-Jul-30 15:03 UTC
[zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed
I agree that device drivers should perform the bulk of the fault monitoring, however I disagree that this absolves ZFS of any responsibility for checking for errors. The primary goal of ZFS is to be a filesystem and maintain data integrity, and that entails both reading and writing data to the devices. It is no good having checksumming when reading data if you are losing huge amounts of data when a disk fails. I'm not saying that ZFS should be monitoring disks and drivers to ensure they are working, just that if ZFS attempts to write data and doesn't get the response it's expecting, an error should be logged against the device regardless of what the driver says. If ZFS is really about end-to-end data integrity, then you do need to consider the possibility of a faulty driver. Now I don't know what the root cause of this error is, but I suspect it will be either a bad response from the SATA driver, or something within ZFS that is not working correctly. Either way however I believe ZFS should have caught this. It's similar to the iSCSI problem I posted a few months back where the ZFS pool hangs for 3 minutes when a device is disconnected. There's absolutely no need for the entire pool to hang when the other half of the mirror is working fine. ZFS is often compared to hardware raid controllers, but so far its ability to handle problems is falling short. Ross> Date: Wed, 30 Jul 2008 09:48:34 -0500> From: bfriesen at simple.dallas.tx.us> To: myxiplx at hotmail.com> CC: zfs-discuss at opensolaris.org> Subject: Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed> > On Wed, 30 Jul 2008, Ross wrote:> >> > Imagine you had a raid-z array and pulled a drive as I'm doing here. > > Because ZFS isn't aware of the removal it keeps writing to that > > drive as if it's valid. That means ZFS still believes the array is > > online when in fact it should be degraded. If any other drive now > > fails, ZFS will consider the status degraded instead of faulted, and > > will continue writing data. The problem is, ZFS is writing some of > > that data to a drive which doesn't exist, meaning all that data will > > be lost on reboot.> > While I do believe that device drivers. or the fault system, should > notify ZFS when a device fails (and ZFS should appropriately react), I > don't think that ZFS should be responsible for fault monitoring. ZFS > is in a rather poor position for device fault monitoring, and if it > attempts to do so then it will be slow and may misbehave in other > ways. The software which communicates with the device (i.e. the > device driver) is in the best position to monitor the device.> > The primary goal of ZFS is to be able to correctly read data which was > successfully committed to disk. There are programming interfaces > (e.g. fsync(), msync()) which may be used to ensure that data is > committed to disk, and which should return an error if there is a > problem. If you were performing your tests over an NFS mount then the > results should be considerably different since NFS requests that its > data be committed to disk.> > Bob> ======================================> Bob Friesenhahn> bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/> GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Bob Friesenhahn
2008-Jul-30 15:21 UTC
[zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed
On Wed, 30 Jul 2008, Ross Smith wrote:> > I'm not saying that ZFS should be monitoring disks and drivers to > ensure they are working, just that if ZFS attempts to write data and > doesn't get the response it's expecting, an error should be logged > against the device regardless of what the driver says. If ZFS is...

A few things to consider:

* Maybe the device driver has not yet reported (or fails to report) an error and just seems "slow".

* ZFS is at such a high level that in many cases it has no useful knowledge of actual devices. For example, MPXIO (multipath) may be layered on top, or maybe an ethernet network is involved.

If ZFS experiences a temporary problem with reaching a device, does that mean the device has failed, or does it perhaps indicate that a path is temporarily slow? If one device is a local disk and the other device is accessed via iSCSI and is located on the other end of the country, should ZFS refuse to operate if the remote disk is slow or stops responding for several minutes? This would be a typical situation when using mirroring, and one mirror device is remote. The parameters that a device driver for a local device uses to decide if there is a fault will be (and should be) substantially different than the parameters for a remote device. That is why most responsibility is left to the device driver. ZFS will behave according to how the device driver behaves.

Bob
=====================================
Bob Friesenhahn bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Peter Cudhea
2008-Jul-30 15:27 UTC
[zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed
Your point is well taken that ZFS should not duplicate functionality that is already or should be available at the device driver level. In this case, I think it misses the point of what ZFS should be doing that it is not. ZFS does its own periodic commits to the disk, and it knows if those commit points have reached the disk or not, or whether they are getting errors. In this particular case, those commits to disk are presumably failing, because one of the disks they depend on has been removed from the system. (If the writes are not being marked as failures, that would definitely be an error in the device driver, as you say.) In this case, however, the ZIL log has stopped being updated, but ZFS does nothing to announce that this has happened, or to indicate that a remedy is required. At the very least, it would be extremely helpful if ZFS had a status to report that indicates that the ZIL log is out of date, or that there are troubles writing to the ZIL log, or something like that. An additional feature would be to have user-selectable behavior when the ZIL log is significantly out of date. For example, if the ZIL log is more than X seconds out of date, then new writes to the system should pause, or give errors or continue to silently succeed. In an earlier phase of my career when I worked for a database company, I was responsible for a similar bug. It caused a major customer to lose a major amount of data when a system rebooted when not all good data had been successfully committed to disk. The resulting stink caused us to add a feature to detect the cases when the writing-to-disk process had fallen too far behind, and to pause new writes to the database until the situation was resolved. Peter Bob Friesenhahn wrote:> While I do believe that device drivers. or the fault system, should > notify ZFS when a device fails (and ZFS should appropriately react), I > don''t think that ZFS should be responsible for fault monitoring. ZFS > is in a rather poor position for device fault monitoring, and if it > attempts to do so then it will be slow and may misbehave in other > ways. The software which communicates with the device (i.e. the > device driver) is in the best position to monitor the device. > > The primary goal of ZFS is to be able to correctly read data which was > successfully committed to disk. There are programming interfaces > (e.g. fsync(), msync()) which may be used to ensure that data is > committed to disk, and which should return an error if there is a > problem. If you were performing your tests over an NFS mount then the > results should be considerably different since NFS requests that its > data be committed to disk. > > Bob > =====================================> Bob Friesenhahn > bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ > GraphicsMagick Maintainer, http://www.GraphicsMagick.org/ > > _______________________________________________ > zfs-discuss mailing list > zfs-discuss at opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss >
Richard Elling
2008-Jul-30 18:17 UTC
[zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed
I was able to reproduce this in b93, but might have a different interpretation of the conditions. More below... Ross Smith wrote:> A little more information today. I had a feeling that ZFS would > continue quite some time before giving an error, and today I''ve shown > that you can carry on working with the filesystem for at least half an > hour with the disk removed. > > I suspect on a system with little load you could carry on working for > several hours without any indication that there is a problem. It > looks to me like ZFS is caching reads & writes, and that provided > requests can be fulfilled from the cache, it doesn''t care whether the > disk is present or not.In my USB-flash-disk-sudden-removal-while-writing-big-file-test, 1. I/O to the missing device stopped (as I expected) 2. FMA kicked in, as expected. 3. /var/adm/messages recorded "Command failed to complete... device gone." 4. After exactly 9 minutes, 17,951 e-reports had been processed and the diagnosis was complete. FMA logged the following to /var/adm/messages Jul 30 10:33:44 grond scsi: [ID 107833 kern.warning] WARNING: /pci at 0,0/pci1458,5004 at b,1/storage at 8/disk at 0,0 (sd1): Jul 30 10:33:44 grond Command failed to complete...Device is gone Jul 30 10:42:31 grond fmd: [ID 441519 daemon.error] SUNW-MSG-ID: ZFS-8000-FD, TYPE: Fault, VER: 1, SEVERITY: Major Jul 30 10:42:31 grond EVENT-TIME: Wed Jul 30 10:42:30 PDT 2008 Jul 30 10:42:31 grond PLATFORM: , CSN: , HOSTNAME: grond Jul 30 10:42:31 grond SOURCE: zfs-diagnosis, REV: 1.0 Jul 30 10:42:31 grond EVENT-ID: d99769aa-28e8-cf16-d181-945592130525 Jul 30 10:42:31 grond DESC: The number of I/O errors associated with a ZFS device exceeded Jul 30 10:42:31 grond acceptable levels. Refer to http://sun.com/msg/ZFS-8000-FD for more information. Jul 30 10:42:31 grond AUTO-RESPONSE: The device has been offlined and marked as faulted. An attempt Jul 30 10:42:31 grond will be made to activate a hot spare if available. Jul 30 10:42:31 grond IMPACT: Fault tolerance of the pool may be compromised. Jul 30 10:42:31 grond REC-ACTION: Run ''zpool status -x'' and replace the bad device. The above URL shows what you expect, but more (and better) info is available from zpool status -xv pool: rmtestpool state: UNAVAIL status: One or more devices are faultd in response to IO failures. action: Make sure the affected devices are connected, then run ''zpool clear''. see: http://www.sun.com/msg/ZFS-8000-HC scrub: none requested config: NAME STATE READ WRITE CKSUM rmtestpool UNAVAIL 0 15.7K 0 insufficient replicas c2t0d0p0 FAULTED 0 15.7K 0 experienced I/O failures errors: Permanent errors have been detected in the following files: /rmtestpool/random.data If you surf to http://www.sun.com/msg/ZFS-8000-HC you''ll see words to the effect that, The pool has experienced I/O failures. Since the ZFS pool property ''failmode'' is set to ''wait'', all I/Os (reads and writes) are blocked. See the zpool(1M) manpage for more information on the ''failmode'' property. Manual intervention is required for I/Os to be serviced.> > I would guess that ZFS is attempting to write to the disk in the > background, and that this is silently failing.It is clearly not silently failing. However, the default failmode property is set to "wait" which will patiently wait forever. If you would rather have the I/O fail, then you should change the failmode to "continue" I would not normally recommend a failmode of "panic" Now to figure out how to recover gracefully... zpool clear isn''t happy... 
[sidebar] while performing this experiment, I noticed that fmd was checkpointing the diagnosis engine to disk in the /var/fm/fmd/ckpt/zfs-diagnosis directory. If this had been the boot disk, with failmode=wait, I''m not convinced that we''d get a complete diagnosis... I''ll explore that later. [/sidebar] -- richard
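For anyone who wants to experiment with the behaviour Richard describes, a rough sketch of checking and switching the failmode property (pool name taken from his test, adjust to suit; the property only exists on builds that include PSARC 2007/567):

# zpool get failmode rmtestpool
# zpool set failmode=continue rmtestpool
# zpool set failmode=wait rmtestpool

With failmode=continue, new writes to a faulted pool should come back with an error rather than blocking; wait is the default shown above, and panic is also accepted but is rarely what you want on a server hosting other pools.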
Paul Fisher
2008-Jul-30 18:24 UTC
[zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed
Richard Elling wrote:> I was able to reproduce this in b93, but might have a different > interpretation of the conditions. More below... > > Ross Smith wrote: > >> A little more information today. I had a feeling that ZFS would >> continue quite some time before giving an error, and today I''ve shown >> that you can carry on working with the filesystem for at least half an >> hour with the disk removed. >> >> I suspect on a system with little load you could carry on working for >> several hours without any indication that there is a problem. It >> looks to me like ZFS is caching reads & writes, and that provided >> requests can be fulfilled from the cache, it doesn''t care whether the >> disk is present or not. >> > > In my USB-flash-disk-sudden-removal-while-writing-big-file-test, > 1. I/O to the missing device stopped (as I expected) > 2. FMA kicked in, as expected. > 3. /var/adm/messages recorded "Command failed to complete... device gone." > 4. After exactly 9 minutes, 17,951 e-reports had been processed and the > diagnosis was complete. FMA logged the following to /var/adm/messages >Wow! Who knew that 17, 951 was the magic number... Seriously, this does seem like an "excessive amount of certainty". -- paul
Neil Perrin
2008-Jul-30 18:41 UTC
[zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed
Peter Cudhea wrote:> Your point is well taken that ZFS should not duplicate functionality > that is already or should be available at the device driver level. In > this case, I think it misses the point of what ZFS should be doing that > it is not. > > ZFS does its own periodic commits to the disk, and it knows if those > commit points have reached the disk or not, or whether they are getting > errors. In this particular case, those commits to disk are presumably > failing, because one of the disks they depend on has been removed from > the system. (If the writes are not being marked as failures, that > would definitely be an error in the device driver, as you say.) In this > case, however, the ZIL log has stopped being updated, but ZFS does > nothing to announce that this has happened, or to indicate that a remedy > is required.I think you have some misconceptions about how the ZIL works. It doesn''t provide journalling like UFS. The following might help: http://blogs.sun.com/perrin/entry/the_lumberjack The ZIL isn''t used at all unless there''s fsync/O_DSYNC activity.> > At the very least, it would be extremely helpful if ZFS had a status to > report that indicates that the ZIL log is out of date, or that there are > troubles writing to the ZIL log, or something like that.If the ZIL cannot be written then we force a transaction group (txg) commit. That is the only recourse to force data to stable storage before returning to the application.> > An additional feature would be to have user-selectable behavior when the > ZIL log is significantly out of date. For example, if the ZIL log is > more than X seconds out of date, then new writes to the system should > pause, or give errors or continue to silently succeed.Again this doesn''t make sense given how the ZIL works.> > In an earlier phase of my career when I worked for a database company, I > was responsible for a similar bug. It caused a major customer to lose > a major amount of data when a system rebooted when not all good data had > been successfully committed to disk. The resulting stink caused us to > add a feature to detect the cases when the writing-to-disk process had > fallen too far behind, and to pause new writes to the database until the > situation was resolved. > > Peter > > Bob Friesenhahn wrote: >> While I do believe that device drivers. or the fault system, should >> notify ZFS when a device fails (and ZFS should appropriately react), I >> don''t think that ZFS should be responsible for fault monitoring. ZFS >> is in a rather poor position for device fault monitoring, and if it >> attempts to do so then it will be slow and may misbehave in other >> ways. The software which communicates with the device (i.e. the >> device driver) is in the best position to monitor the device. >> >> The primary goal of ZFS is to be able to correctly read data which was >> successfully committed to disk. There are programming interfaces >> (e.g. fsync(), msync()) which may be used to ensure that data is >> committed to disk, and which should return an error if there is a >> problem. If you were performing your tests over an NFS mount then the >> results should be considerably different since NFS requests that its >> data be committed to disk. 
>> >> Bob >> =====================================>> Bob Friesenhahn >> bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ >> GraphicsMagick Maintainer, http://www.GraphicsMagick.org/ >> >> _______________________________________________ >> zfs-discuss mailing list >> zfs-discuss at opensolaris.org >> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss >> > _______________________________________________ > zfs-discuss mailing list > zfs-discuss at opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
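A quick way to see Neil's point about the ZIL only being used for synchronous activity is to count zil_commit() calls while a workload runs; a rough DTrace sketch, on the assumption that the fbt probe name matches the build in question:

# dtrace -n 'fbt::zil_commit:entry { @[execname] = count(); }'

A plain cp of the test folder shouldn't register here at all, while anything calling fsync() or writing O_DSYNC (a database, an NFS server) should.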
Peter Cudhea
2008-Jul-30 19:42 UTC
[zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed
Thanks, this is helpful. I was definitely misunderstanding the part that the ZIL plays in ZFS. I found Richard Elling''s discussion of the FMA response to the failure very informative. I see how the device driver, the fault analysis layer and the ZFS layer are all working together. Though the customer''s complaint that the change in state from "working" to "not working" is taking too long seems pretty valid. Peter Neil Perrin wrote:> > > Peter Cudhea wrote: >> Your point is well taken that ZFS should not duplicate functionality >> that is already or should be available at the device driver level. >> In this case, I think it misses the point of what ZFS should be doing >> that it is not. >> >> ZFS does its own periodic commits to the disk, and it knows if those >> commit points have reached the disk or not, or whether they are >> getting errors. In this particular case, those commits to disk are >> presumably failing, because one of the disks they depend on has been >> removed from the system. (If the writes are not being marked as >> failures, that would definitely be an error in the device driver, as >> you say.) In this case, however, the ZIL log has stopped being >> updated, but ZFS does nothing to announce that this has happened, or >> to indicate that a remedy is required. > > I think you have some misconceptions about how the ZIL works. > It doesn''t provide journalling like UFS. The following might help: > > http://blogs.sun.com/perrin/entry/the_lumberjack > > The ZIL isn''t used at all unless there''s fsync/O_DSYNC activity. > >> >> At the very least, it would be extremely helpful if ZFS had a status >> to report that indicates that the ZIL log is out of date, or that >> there are troubles writing to the ZIL log, or something like that. > > If the ZIL cannot be written then we force a transaction group (txg) > commit. That is the only recourse to force data to stable storage before > returning to the application. >> >> An additional feature would be to have user-selectable behavior when >> the ZIL log is significantly out of date. For example, if the ZIL >> log is more than X seconds out of date, then new writes to the system >> should pause, or give errors or continue to silently succeed. > > Again this doesn''t make sense given how the ZIL works. > >> >> In an earlier phase of my career when I worked for a database >> company, I was responsible for a similar bug. It caused a major >> customer to lose a major amount of data when a system rebooted when >> not all good data had been successfully committed to disk. The >> resulting stink caused us to add a feature to detect the cases when >> the writing-to-disk process had fallen too far behind, and to pause >> new writes to the database until the situation was resolved. >> >> Peter >> >> Bob Friesenhahn wrote: >>> While I do believe that device drivers. or the fault system, should >>> notify ZFS when a device fails (and ZFS should appropriately react), >>> I don''t think that ZFS should be responsible for fault monitoring. >>> ZFS is in a rather poor position for device fault monitoring, and if >>> it attempts to do so then it will be slow and may misbehave in other >>> ways. The software which communicates with the device (i.e. the >>> device driver) is in the best position to monitor the device. >>> >>> The primary goal of ZFS is to be able to correctly read data which >>> was successfully committed to disk. There are programming >>> interfaces (e.g. 
fsync(), msync()) which may be used to ensure that >>> data is committed to disk, and which should return an error if there >>> is a problem. If you were performing your tests over an NFS mount >>> then the results should be considerably different since NFS requests >>> that its data be committed to disk. >>> >>> Bob >>> =====================================>>> Bob Friesenhahn >>> bfriesen at simple.dallas.tx.us, >>> http://www.simplesystems.org/users/bfriesen/ >>> GraphicsMagick Maintainer, http://www.GraphicsMagick.org/ >>> >>> _______________________________________________ >>> zfs-discuss mailing list >>> zfs-discuss at opensolaris.org >>> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss >>> >> _______________________________________________ >> zfs-discuss mailing list >> zfs-discuss at opensolaris.org >> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Richard Elling
2008-Jul-30 21:04 UTC
[zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed
Peter Cudhea wrote:> Thanks, this is helpful. I was definitely misunderstanding the part that > the ZIL plays in ZFS. > > I found Richard Elling''s discussion of the FMA response to the failure > very informative. I see how the device driver, the fault analysis > layer and the ZFS layer are all working together. Though the > customer''s complaint that the change in state from "working" to "not > working" is taking too long seems pretty valid. >I wish there was a simple answer to the can-of-worms^TM that this question opens. But there really isn''t. As Paul Fisher points out, logging 17,951 e-reports in 9 minutes seems like a lot, but I''m quite sure that is CPU bound and I could log more with a faster system :-) The key here is that 9 minutes represents some combination of timeouts in the sd/scsa2usb/usb stack. The myth of layered software says that timeouts compound, so digging around for a better collection might or might not be generally satisfying. Since this is not a ZFS timeout, perhaps the conversation should be continued in a more appropriate forum? -- richard
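For completeness, if anyone wants to experiment with shrinking that window, the knobs live in the disk driver rather than in ZFS; this is only a sketch, and whether the sd(7D) tunable below is honoured on the scsa2usb path is an assumption that would need checking on the build in question:

# echo 'sd`sd_io_time/D' | mdb -k
-- to change it, add the following to /etc/system and reboot --
set sd:sd_io_time=20

Shortening the per-command timeout trades a faster diagnosis for more false positives on slow or busy devices, which is exactly the tension Miles raises later in the thread.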
Jonathan Loran
2008-Jul-30 21:44 UTC
[zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed
From a reporting perspective, yes, zpool status should not hang, and should report an error if a drive goes away, or is in any way behaving badly. No arguments there. From the data integrity perspective, the only event zfs needs to know about is when a bad drive is replaced, such that a resilver is triggered. If a drive is suddenly gone, but it is only one component of a redundant set, your data should still be fine. Now, if enough drives go away to break the redundancy, that''s a different story altogether. Jon Ross Smith wrote:> I agree that device drivers should perform the bulk of the fault > monitoring, however I disagree that this absolves ZFS of any > responsibility for checking for errors. The primary goal of ZFS is to > be a filesystem and maintain data integrity, and that entails both > reading and writing data to the devices. It is no good having > checksumming when reading data if you are loosing huge amounts of data > when a disk fails. > > I''m not saying that ZFS should be monitoring disks and drivers to > ensure they are working, just that if ZFS attempts to write data and > doesn''t get the response it''s expecting, an error should be logged > against the device regardless of what the driver says. If ZFS is > really about end-to-end data integrity, then you do need to consider > the possibility of a faulty driver. Now I don''t know what the root > cause of this error is, but I suspect it will be either a bad response > from the SATA driver, or something within ZFS that is not working > correctly. Either way however I believe ZFS should have caught this. > > It''s similar to the iSCSI problem I posted a few months back where the > ZFS pool hangs for 3 minutes when a device is disconnected. There''s > absolutely no need for the entire pool to hang when the other half of > the mirror is working fine. ZFS is often compared to hardware raid > controllers, but so far it''s ability to handle problems is falling short. > > Ross > > > > Date: Wed, 30 Jul 2008 09:48:34 -0500 > > From: bfriesen at simple.dallas.tx.us > > To: myxiplx at hotmail.com > > CC: zfs-discuss at opensolaris.org > > Subject: Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive > removed > > > > On Wed, 30 Jul 2008, Ross wrote: > > > > > > Imagine you had a raid-z array and pulled a drive as I''m doing here. > > > Because ZFS isn''t aware of the removal it keeps writing to that > > > drive as if it''s valid. That means ZFS still believes the array is > > > online when in fact it should be degrated. If any other drive now > > > fails, ZFS will consider the status degrated instead of faulted, and > > > will continue writing data. The problem is, ZFS is writing some of > > > that data to a drive which doesn''t exist, meaning all that data will > > > be lost on reboot. > > > > While I do believe that device drivers. or the fault system, should > > notify ZFS when a device fails (and ZFS should appropriately react), I > > don''t think that ZFS should be responsible for fault monitoring. ZFS > > is in a rather poor position for device fault monitoring, and if it > > attempts to do so then it will be slow and may misbehave in other > > ways. The software which communicates with the device (i.e. the > > device driver) is in the best position to monitor the device. > > > > The primary goal of ZFS is to be able to correctly read data which was > > successfully committed to disk. There are programming interfaces > > (e.g. 
fsync(), msync()) which may be used to ensure that data is > > committed to disk, and which should return an error if there is a > > problem. If you were performing your tests over an NFS mount then the > > results should be considerably different since NFS requests that its > > data be committed to disk. > > > > Bob >-- - _____/ _____/ / - Jonathan Loran - - - / / / IT Manager - - _____ / _____ / / Space Sciences Laboratory, UC Berkeley - / / / (510) 643-5146 jloran at ssl.berkeley.edu - ______/ ______/ ______/ AST:7731^29u18e3
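The event Jon is describing is driven from the command line; a short sketch using the pool and device names from Ross's single-disk test (the replacement disk name is a placeholder):

# zpool online test c2t7d0
-- or, if the disk itself was swapped --
# zpool replace test c2t7d0 c2t8d0
# zpool status -v test

Either command is what actually triggers the resilver Jon mentions.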
Ross Smith
2008-Jul-31 12:28 UTC
[zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed
I'm not sure you're actually seeing the same problem there Richard. It seems that for you I/O is stopping on removal of the device, whereas for me I/O continues for some considerable time. You are also able to obtain a result from "zpool status" whereas that completely hangs for me. To illustrate the difference, this is what I saw today in snv_94, with a pool created from a single external USB hard drive.

1. As before I started a copy of a directory using Solaris' file manager. About 1/3 of the way through I pulled the plug on the drive.

2. File manager continued to copy a further 30MB+ of files across. Checking the properties of the copy shows it contains 71.1MB of data and 19,160 files, despite me pulling the drive at around 8,000 files.

3. 8:24am I ran "zpool status":

# zpool status rc-usb
  pool: rc-usb
 state: ONLINE
status: One or more devices has experienced an error resulting in data corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested

That is as far as it gets. It never gives me any further information. I left it two hours, and it still had not displayed the status of the drive in the pool. I also did a "zfs list", that also hangs now although I'm pretty sure that if you run "zfs list" before "zpool status" it works fine.

As you can see from /var/adm/messages, I am getting nothing at all from FMA:

Jul 31 08:16:46 unknown usba: [ID 912658 kern.info] USB 2.0 device (usbd49,7350) operating at hi speed (USB 2.x) on USB 2.0 root hub: storage at 3, scsa2usb0 at bus address 2
Jul 31 08:16:46 unknown usba: [ID 349649 kern.info] Maxtor OneTouch 2HAP70DZ
Jul 31 08:16:46 unknown genunix: [ID 936769 kern.info] scsa2usb0 is /pci at 0,0/pci15d9,a011 at 2,1/storage at 3
Jul 31 08:16:46 unknown genunix: [ID 408114 kern.info] /pci at 0,0/pci15d9,a011 at 2,1/storage at 3 (scsa2usb0) online
Jul 31 08:16:46 unknown scsi: [ID 193665 kern.info] sd17 at scsa2usb0: target 0 lun 0
Jul 31 08:16:46 unknown genunix: [ID 936769 kern.info] sd17 is /pci at 0,0/pci15d9,a011 at 2,1/storage at 3/disk at 0,0
Jul 31 08:16:46 unknown genunix: [ID 340201 kern.warning] WARNING: Page83 data not standards compliant Maxtor OneTouch 0125
Jul 31 08:16:46 unknown genunix: [ID 408114 kern.info] /pci at 0,0/pci15d9,a011 at 2,1/storage at 3/disk at 0,0 (sd17) online
Jul 31 08:16:49 unknown pcplusmp: [ID 444295 kern.info] pcplusmp: ide (ata) instance #1 vector 0xf ioapic 0x4 intin 0xf is bound to cpu 3
Jul 31 08:16:49 unknown scsi: [ID 193665 kern.info] sd14 at marvell88sx1: target 7 lun 0
Jul 31 08:16:49 unknown genunix: [ID 936769 kern.info] sd14 is /pci at 1,0/pci1022,7458 at 2/pci11ab,11ab at 1/disk at 7,0
Jul 31 08:16:49 unknown genunix: [ID 408114 kern.info] /pci at 1,0/pci1022,7458 at 2/pci11ab,11ab at 1/disk at 7,0 (sd14) online
Jul 31 08:21:35 unknown usba: [ID 691482 kern.warning] WARNING: /pci at 0,0/pci15d9,a011 at 2,1/storage at 3 (scsa2usb0): Disconnected device was busy, please reconnect.
Jul 31 08:21:38 unknown scsi: [ID 107833 kern.warning] WARNING: /pci at 0,0/pci15d9,a011 at 2,1/storage at 3/disk at 0,0 (sd17):
Jul 31 08:21:38 unknown Command failed to complete...Device is gone
Jul 31 08:21:38 unknown scsi: [ID 107833 kern.warning] WARNING: /pci at 0,0/pci15d9,a011 at 2,1/storage at 3/disk at 0,0 (sd17):
Jul 31 08:21:38 unknown Command failed to complete...Device is gone
Jul 31 08:21:38 unknown scsi: [ID 107833 kern.warning] WARNING: /pci at 0,0/pci15d9,a011 at 2,1/storage at 3/disk at 0,0 (sd17):
Jul 31 08:21:38 unknown Command failed to complete...Device is gone
Jul 31 08:21:38 unknown scsi: [ID 107833 kern.warning] WARNING: /pci at 0,0/pci15d9,a011 at 2,1/storage at 3/disk at 0,0 (sd17):
Jul 31 08:21:38 unknown Command failed to complete...Device is gone
Jul 31 08:21:38 unknown scsi: [ID 107833 kern.warning] WARNING: /pci at 0,0/pci15d9,a011 at 2,1/storage at 3/disk at 0,0 (sd17):
Jul 31 08:21:38 unknown Command failed to complete...Device is gone
Jul 31 08:21:38 unknown scsi: [ID 107833 kern.warning] WARNING: /pci at 0,0/pci15d9,a011 at 2,1/storage at 3/disk at 0,0 (sd17):
Jul 31 08:21:38 unknown Command failed to complete...Device is gone
Jul 31 08:21:38 unknown scsi: [ID 107833 kern.warning] WARNING: /pci at 0,0/pci15d9,a011 at 2,1/storage at 3/disk at 0,0 (sd17):
Jul 31 08:21:38 unknown Command failed to complete...Device is gone
Jul 31 08:21:38 unknown scsi: [ID 107833 kern.warning] WARNING: /pci at 0,0/pci15d9,a011 at 2,1/storage at 3/disk at 0,0 (sd17):
Jul 31 08:21:38 unknown Command failed to complete...Device is gone
Jul 31 08:24:26 unknown scsi: [ID 107833 kern.warning] WARNING: /pci at 0,0/pci15d9,a011 at 2,1/storage at 3/disk at 0,0 (sd17):
Jul 31 08:24:26 unknown Command failed to complete...Device is gone
Jul 31 08:24:26 unknown scsi: [ID 107833 kern.warning] WARNING: /pci at 0,0/pci15d9,a011 at 2,1/storage at 3/disk at 0,0 (sd17):
Jul 31 08:24:26 unknown Command failed to complete...Device is gone
Jul 31 08:24:26 unknown scsi: [ID 107833 kern.warning] WARNING: /pci at 0,0/pci15d9,a011 at 2,1/storage at 3/disk at 0,0 (sd17):
Jul 31 08:24:26 unknown drive offline
Jul 31 08:27:43 unknown smbd[603]: [ID 766186 daemon.error] NbtDatagramDecode[11]: too small packet
Jul 31 08:39:43 unknown smbd[603]: [ID 766186 daemon.error] NbtDatagramDecode[11]: too small packet
Jul 31 08:44:50 unknown /sbin/dhcpagent[95]: [ID 732317 daemon.warning] accept_v4_acknak: ACK packet on nge0 missing mandatory lease option, ignored
Jul 31 08:44:58 unknown last message repeated 3 times
Jul 31 08:45:06 unknown /sbin/dhcpagent[95]: [ID 732317 daemon.warning] accept_v4_acknak: ACK packet on nge0 missing mandatory lease option, ignored
Jul 31 08:45:06 unknown last message repeated 1 time
Jul 31 08:51:44 unknown smbd[603]: [ID 766186 daemon.error] NbtDatagramDecode[11]: too small packet
Jul 31 09:03:44 unknown smbd[603]: [ID 766186 daemon.error] NbtDatagramDecode[11]: too small packet
Jul 31 09:13:51 unknown /sbin/dhcpagent[95]: [ID 732317 daemon.warning] accept_v4_acknak: ACK packet on nge0 missing mandatory lease option, ignored
Jul 31 09:14:09 unknown last message repeated 5 times
Jul 31 09:15:44 unknown smbd[603]: [ID 766186 daemon.error] NbtDatagramDecode[11]: too small packet
Jul 31 09:27:44 unknown smbd[603]: [ID 766186 daemon.error] NbtDatagramDecode[11]: too small packet
Jul 31 09:27:55 unknown pcplusmp: [ID 444295 kern.info] pcplusmp: ide (ata) instance #1 vector 0xf ioapic 0x4 intin 0xf is bound to cpu 3

cfgadm reports that the port is empty but still configured:

# cfgadm
Ap_Id Type Receptacle Occupant Condition
usb1/3 unknown empty configured unusable

4. 9:32am I now tried writing more data to the pool, to see if I can trigger the I/O error you are seeing. I tried making a second copy of the files on the USB drive in the Solaris File manager, but that attempt simply hung the copy dialog. I'm still seeing nothing else that appears relevant in /var/adm/messages.

5. 10:08am While checking free space, I found that although df works, "df -kh" hangs, apparently when it tries to query any zfs pool:

# df
/ (/dev/dsk/c1t0d0s0 ): 2504586 blocks 656867 files
/devices (/devices ): 0 blocks 0 files
/dev (/dev ): 0 blocks 0 files
/system/contract (ctfs ): 0 blocks 2147483609 files
/proc (proc ): 0 blocks 29902 files
/etc/mnttab (mnttab ): 0 blocks 0 files
/etc/svc/volatile (swap ): 9850928 blocks 1180374 files
/system/object (objfs ): 0 blocks 2147483409 files
/etc/dfs/sharetab (sharefs ): 0 blocks 2147483646 files
/lib/libc.so.1 (/usr/lib/libc/libc_hwcap2.so.1): 2504586 blocks 656867 files
/dev/fd (fd ): 0 blocks 0 files
/tmp (swap ): 9850928 blocks 1180374 files
/var/run (swap ): 9850928 blocks 1180374 files
/export/home (/dev/dsk/c1t0d0s7 ):881398942 blocks 53621232 files
/rc-pool (rc-pool ):4344346098 blocks 4344346098 files
/rc-pool/admin (rc-pool/admin ):4344346098 blocks 4344346098 files
/rc-pool/ross-home (rc-pool/ross-home ):4344346098 blocks 4344346098 files
/rc-pool/vmware (rc-pool/vmware ):4344346098 blocks 4344346098 files
/rc-usb (rc-usb ):153725153 blocks 153725153 files
# df -kh
Filesystem size used avail capacity Mounted on
/dev/dsk/c1t0d0s0 7.2G 6.0G 1.1G 85% /
/devices 0K 0K 0K 0% /devices
/dev 0K 0K 0K 0% /dev
ctfs 0K 0K 0K 0% /system/contract
proc 0K 0K 0K 0% /proc
mnttab 0K 0K 0K 0% /etc/mnttab
swap 4.7G 1.1M 4.7G 1% /etc/svc/volatile
objfs 0K 0K 0K 0% /system/object
sharefs 0K 0K 0K 0% /etc/dfs/sharetab
/usr/lib/libc/libc_hwcap2.so.1 7.2G 6.0G 1.1G 85% /lib/libc.so.1
fd 0K 0K 0K 0% /dev/fd
swap 4.7G 48K 4.7G 1% /tmp
swap 4.7G 76K 4.7G 1% /var/run
/dev/dsk/c1t0d0s7 425G 4.8G 416G 2% /export/home

6. 10:35am It's now been two hours, neither "zpool status" nor "zfs list" have ever finished. The file copy attempt has also been hung for over an hour (although that's not unexpected with 'wait' as the failmode). Richard, you say ZFS is not silently failing, well for me it appears that it is. I can't see any warnings from ZFS, I can't get any status information. I see no way that I could find out what files are going to be lost on this server. Yes, I'm now aware that the pool has hung since file operations are hanging, however had that been my first indication of a problem I believe I am now left in a position where I cannot find out either the cause, nor the files affected. I don't believe I have any way to find out which operations had completed without error, but are not currently committed to disk. I certainly don't get the status message you do saying permanent errors have been found in files. I plugged the USB drive back in now, Solaris detected it ok, but ZFS is still hung. 
The rest of /var/adm/messages is: Jul 31 09:39:44 unknown smbd[603]: [ID 766186 daemon.error] NbtDatagramDecode[11]: too small packetJul 31 09:45:22 unknown /sbin/dhcpagent[95]: [ID 732317 daemon.warning] accept_v4_acknak: ACK packet on nge0 missing mandatory lease option, ignoredJul 31 09:45:38 unknown last message repeated 5 timesJul 31 09:51:44 unknown smbd[603]: [ID 766186 daemon.error] NbtDatagramDecode[11]: too small packetJul 31 10:03:44 unknown last message repeated 2 timesJul 31 10:14:27 unknown /sbin/dhcpagent[95]: [ID 732317 daemon.warning] accept_v4_acknak: ACK packet on nge0 missing mandatory lease option, ignoredJul 31 10:14:45 unknown last message repeated 5 timesJul 31 10:15:44 unknown smbd[603]: [ID 766186 daemon.error] NbtDatagramDecode[11]: too small packetJul 31 10:27:45 unknown smbd[603]: [ID 766186 daemon.error] NbtDatagramDecode[11]: too small packet Jul 31 10:36:25 unknown usba: [ID 691482 kern.warning] WARNING: /pci at 0,0/pci15d9,a011 at 2,1/storage at 3 (scsa2usb0): Reinserted device is accessible again.Jul 31 10:39:45 unknown smbd[603]: [ID 766186 daemon.error] NbtDatagramDecode[11]: too small packetJul 31 10:45:53 unknown /sbin/dhcpagent[95]: [ID 732317 daemon.warning] accept_v4_acknak: ACK packet on nge0 missing mandatory lease option, ignoredJul 31 10:46:09 unknown last message repeated 5 timesJul 31 10:51:45 unknown smbd[603]: [ID 766186 daemon.error] NbtDatagramDecode[11]: too small packet 7. 10:55am Gave up on ZFS ever recovering. A shutdown attempt hung as expected. I hard-reset the computer. Ross> Date: Wed, 30 Jul 2008 11:17:08 -0700> From: Richard.Elling at Sun.COM> Subject: Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed> To: myxiplx at hotmail.com> CC: zfs-discuss at opensolaris.org> > I was able to reproduce this in b93, but might have a different> interpretation of the conditions. More below...> > Ross Smith wrote:> > A little more information today. I had a feeling that ZFS would > > continue quite some time before giving an error, and today I''ve shown > > that you can carry on working with the filesystem for at least half an > > hour with the disk removed.> > > > I suspect on a system with little load you could carry on working for > > several hours without any indication that there is a problem. It > > looks to me like ZFS is caching reads & writes, and that provided > > requests can be fulfilled from the cache, it doesn''t care whether the > > disk is present or not.> > In my USB-flash-disk-sudden-removal-while-writing-big-file-test,> 1. I/O to the missing device stopped (as I expected)> 2. FMA kicked in, as expected.> 3. /var/adm/messages recorded "Command failed to complete... device gone."> 4. After exactly 9 minutes, 17,951 e-reports had been processed and the> diagnosis was complete. FMA logged the following to /var/adm/messages> > Jul 30 10:33:44 grond scsi: [ID 107833 kern.warning] WARNING: > /pci at 0,0/pci1458,5004 at b,1/storage at 8/disk at 0,0 (sd1):> Jul 30 10:33:44 grond Command failed to complete...Device is gone> Jul 30 10:42:31 grond fmd: [ID 441519 daemon.error] SUNW-MSG-ID: > ZFS-8000-FD, TYPE: Fault, VER: 1, SEVERITY: Major> Jul 30 10:42:31 grond EVENT-TIME: Wed Jul 30 10:42:30 PDT 2008> Jul 30 10:42:31 grond PLATFORM: , CSN: , HOSTNAME: grond> Jul 30 10:42:31 grond SOURCE: zfs-diagnosis, REV: 1.0> Jul 30 10:42:31 grond EVENT-ID: d99769aa-28e8-cf16-d181-945592130525> Jul 30 10:42:31 grond DESC: The number of I/O errors associated with a > ZFS device exceeded> Jul 30 10:42:31 grond acceptable levels. 
Refer to > http://sun.com/msg/ZFS-8000-FD for more information.> Jul 30 10:42:31 grond AUTO-RESPONSE: The device has been offlined and > marked as faulted. An attempt> Jul 30 10:42:31 grond will be made to activate a hot spare if > available.> Jul 30 10:42:31 grond IMPACT: Fault tolerance of the pool may be > compromised.> Jul 30 10:42:31 grond REC-ACTION: Run ''zpool status -x'' and replace > the bad device.> > The above URL shows what you expect, but more (and better) info> is available from zpool status -xv> > pool: rmtestpool> state: UNAVAIL> status: One or more devices are faultd in response to IO failures.> action: Make sure the affected devices are connected, then run ''zpool > clear''.> see: http://www.sun.com/msg/ZFS-8000-HC> scrub: none requested> config:> > NAME STATE READ WRITE CKSUM> rmtestpool UNAVAIL 0 15.7K 0 insufficient replicas> c2t0d0p0 FAULTED 0 15.7K 0 experienced I/O failures> > errors: Permanent errors have been detected in the following files:> > /rmtestpool/random.data> > > If you surf to http://www.sun.com/msg/ZFS-8000-HC you''ll> see words to the effect that,> The pool has experienced I/O failures. Since the ZFS pool property> ''failmode'' is set to ''wait'', all I/Os (reads and writes) are> blocked. See the zpool(1M) manpage for more information on the> ''failmode'' property. Manual intervention is required for I/Os to> be serviced.> > > > > I would guess that ZFS is attempting to write to the disk in the > > background, and that this is silently failing.> > It is clearly not silently failing.> > However, the default failmode property is set to "wait" which will patiently> wait forever. If you would rather have the I/O fail, then you should change> the failmode to "continue" I would not normally recommend a failmode of> "panic"> > Now to figure out how to recover gracefully... zpool clear isn''t happy...> > [sidebar]> while performing this experiment, I noticed that fmd was checkpointing> the diagnosis engine to disk in the /var/fm/fmd/ckpt/zfs-diagnosis > directory.> If this had been the boot disk, with failmode=wait, I''m not convinced> that we''d get a complete diagnosis... I''ll explore that later.> [/sidebar]> > -- richard>_________________________________________________________________ The John Lewis Clearance - save up to 50% with FREE delivery http://clk.atdmt.com/UKM/go/101719806/direct/01/ -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20080731/3dac936f/attachment.html>
Andrew Hisgen
2008-Aug-01 13:36 UTC
[zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed
Question embedded below... Richard Elling wrote: ...> If you surf to http://www.sun.com/msg/ZFS-8000-HC you''ll > see words to the effect that, > The pool has experienced I/O failures. Since the ZFS pool property > ''failmode'' is set to ''wait'', all I/Os (reads and writes) are > blocked. See the zpool(1M) manpage for more information on the > ''failmode'' property. Manual intervention is required for I/Os to > be serviced. > >> >> I would guess that ZFS is attempting to write to the disk in the >> background, and that this is silently failing. > > It is clearly not silently failing. > > However, the default failmode property is set to "wait" which will patiently > wait forever. If you would rather have the I/O fail, then you should change > the failmode to "continue" I would not normally recommend a failmode of > "panic"Hi Richard, Does failmode==wait cause ZFS itself to retry i/o, that is, to retry an i/o where an earlier request (of that same i/o) returned from the driver with an error? If so, that will compound timeouts even further. I''m also confused by your statement that wait means wait forever, given that the actual circumstances here are that zfs (and the rest of the i/o stack) returned after 9 minutes. thanks, Andy
Richard Elling
2008-Aug-01 15:59 UTC
[zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed
Hi Andy, answer & pointer below... Andrew Hisgen wrote:> Question embedded below... > > Richard Elling wrote: > ... >> If you surf to http://www.sun.com/msg/ZFS-8000-HC you''ll >> see words to the effect that, >> The pool has experienced I/O failures. Since the ZFS pool property >> ''failmode'' is set to ''wait'', all I/Os (reads and writes) are >> blocked. See the zpool(1M) manpage for more information on the >> ''failmode'' property. Manual intervention is required for I/Os to >> be serviced. >> >>> >>> I would guess that ZFS is attempting to write to the disk in the >>> background, and that this is silently failing. >> >> It is clearly not silently failing. >> >> However, the default failmode property is set to "wait" which will >> patiently >> wait forever. If you would rather have the I/O fail, then you should >> change >> the failmode to "continue" I would not normally recommend a failmode of >> "panic" > > Hi Richard, > > Does failmode==wait cause ZFS itself to retry i/o, that is, to retry an > i/o where an earlier request (of that same i/o) returned from the driver > with an error? If so, that will compound timeouts even further. > > I''m also confused by your statement that wait means wait forever, given > that the actual circumstances here are that zfs (and the rest of the > i/o stack) returned after 9 minutes.The details are in PSARC/2007/567. Externally available at: http://www.opensolaris.org/os/community/arc/caselog/2007/567/ With failmode=wait, I/Os will wait until "manual intervention" which is shown as an administrator running zpool clear on the affected pool. I see the need for a document to help people work through these cases as they can be complex at many different levels. -- richard
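Tying Richard's "manual intervention" to the cfgadm step from earlier in the thread, the recovery path for a pool stuck with failmode=wait looks roughly like this (port and pool names are the ones from Ross's tests):

# cfgadm -c configure sata1/7
# zpool clear test
# zpool status -xv test

This is a sketch of the documented path rather than a guarantee; as Ross's results and CR 6667199 suggest, the zpool clear itself can hang on a non-redundant pool on these builds.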
Miles Nordin
2008-Aug-05 01:16 UTC
[zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed
>>>>> "re" == Richard Elling <Richard.Elling at Sun.COM> writes: >>>>> "pf" == Paul Fisher <pfisher at alertlogic.net> writes:re> I was able to reproduce this in b93, but might have a re> different interpretation You weren''t able to reproduce the hang of ''zpool status''? Your ''zpool status'' was after the FMA fault kicked in, though. How about before FMA decided to mark the pool faulted---did ''zpool status'' hang, or work? If it worked, what did it report? The ''zpool status'' hanging happens for me on b71 when an iSCSI target goes away. (IIRC ''iscsiadm remove discovery-address ...'' unwedges zpool status for me, but my notes could be more careful.) re> However, the default failmode property is set to "wait" which re> will patiently wait forever. If you would rather have the I/O re> fail, then you should change the failmode to "continue" for him, it sounds like it''s not doing either. I think he does not have the failmode property, since it is so new? It sounds like ''continue'' should return I/O errors sooner than 9 minutes after the unredundant disks generate them (but not at all for degraded redundant pools of course). And it sounds like ''wait'' should block the writing program, forever if necessary, like an NFS hard mount. (1) Is the latter what ''wait'' actually did for you? Or did the writing process get I/O errors after the 9-minutes-later FMA diagnosis? (2) is it like NFS ''hard'' or is it like ''hard,intr''? :) It''s great to see these things improving. pf> Wow! Who knew that 17, 951 was the magic number... Seriously, pf> this does seem like an "excessive amount of certainty". I agree it''s an awfully forgiving constant, so big that it sounds like it might not be a constant manually set to 16384 or something, but rather an accident. I''m surprised to find FMA is responsible for deciding the length of this 9-minute (or more, for Ross) delay. note that, if the false positives one is trying to filter out are things like USB/SAN cabling spasms and drive recalibrations, the right metric is time, not number of failed CDB''s. The hugely-delayed response may be a blessing in disguise though, because arranging for the differnet FMA states to each last tens of minutes means it''s possible to evaluate the system''s behavior in each state, to see if it''s correct. For example, within this 9-minute window: * what does ''zpool status'' say before the FMA faulting * what do applications experience, ex., + is it possible to get an I/O error during this window with failmode=wait? how about with failmode=continue? + are reads and writes that block interruptible or uninterruptible? + What about fsync()? o what about fsync() if there is a slog? * is the system stable or are there ``lazy panic'''' cases? + what if you ``ask for it'''' by calling ''zpool clear'' or ''zpool scrub'' within the 9-minute window? * are other pools that don''t include failed devices affected (for reading/writing. but, also, if ''zpool status'' is frozen for all pools, then other pools are affected.) * probably other stuff... God willing some day some of the states can be shortened to values more like 1 second or 1 minute, or really aggressive variance-and-average-based threshholds like TCP timers, so that FMA is actually useful rather than a step backwards from SVM as it seems to me right now. The NetApp paper Richard posted earlier was saying NetApp never waits the 30 seconds for an ATAPI error, they just ignore the disk if it doesn''t answer within 1000ms or so. 
But my crappy Linux iSCSI targets would probably miss 1000ms timeouts all the time just because they're heavily loaded---you could get pools that go FAULTED whenever they get heavy use. So some of FMA's states maybe should be short, but they're harder to observe when they're so short.

The point of FMA, AIUI, is to make the failure state machine really complicated. We want it complicated to deal with both NetApp's good example of aggressive timers and also with my crappy Linux IET setup, so that increasingly hairy rules can be written with experience. Complicated means that observing each state is important to verify the complicated system's correctness. And observing means the states can't be 1 second long even if that's the appropriate length. But I don't know if that's really the developers' intent, or just my dreaming and hoping.
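For anyone who wants to poke at those states themselves, a rough sketch of the commands involved -- the pool name "rmtestpool" and the 09:30 timestamp are only placeholders, and this assumes a build recent enough to have the failmode property at all:

# zpool get failmode rmtestpool
    (shows the current setting; "wait" is the default)
# zpool set failmode=continue rmtestpool
    (ask for EIO back to applications rather than blocking forever)
# fmstat -m zfs-diagnosis 5
    (watch the zfs-diagnosis engine working through ereports, sampled every 5 seconds)
# fmdump -e -t 09:30
    (list the error reports logged since 09:30, i.e. since the drive was pulled)
# fmdump
    (shows whether fmd has actually produced a fault diagnosis yet)

None of this is a fix, of course; it just makes the 9-minute window observable.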
Ross Smith
2008-Aug-05 14:04 UTC
[zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed
Ok, I think I've got to the bottom of all this now, but it took some work to figure out everything that was going on. I couldn't think of any way to sensibly write this all up in an e-mail, so I've written up my findings and they're all in the attached PDF.

The initial problem can be summarised as: ZFS can cause silent data loss if you accidentally remove a device from a pool that's in a non-redundant state. But that breaks down into several individual issues:

 - SATA hot plug is poorly supported on the Supermicro AOC-SAT2-MV8 card, which uses a Marvell 88SX6081 controller.
 - ZFS is inconsistent in its handling of SATA devices going offline.
 - FMA takes too long to diagnose a device removal, and can generate hundreds of MB of errors while doing so.
 - ZFS can continue to read and write from a pool for some considerable time after it has gone offline.
 - "zpool status" can not only hang, but can lock out other tools.
 - BUG: 6667199 "zpool clear" hangs on single drives (and probably also hangs for any pool in a non-redundant state). Probably related to BUG: 667208 "zpool status" doesn't report if there has been a problem mounting the pool.

Ross

> Date: Thu, 31 Jul 2008 09:17:46 -0700
> From: Richard.Elling at Sun.COM
> Subject: Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive removed
> To: myxiplx at hotmail.com
>
> Ross Smith wrote:
> > Ok, in snv_94, with a USB drive that I pulled half way through the
> > standard copy of 19k files (71MB).
> >
> > This time the copy operation paused after just 5-10MB more, and it's
> > currently sat there. FMdump doesn't have a lot to say, fmdump -e has
> > been scrolling zfs io & data messages down the screen for nearly 10
> > minutes now.
>
> OK, that is what I saw. There is a transaction group which is waiting
> to get out and it has up to 5 seconds of writes in it.
>
> There is a couple of rounds of logic going on here with the diagnosis
> and feedback to ZFS to stop trying. These things can get very complex
> to solve for the general case, but the current state seems to be
> suboptimal.
>
> >
> > # fmdump
> > TIME UUID SUNW-MSG-ID
> > Jul 25 11:27:27.2858 08faf2a3-e39f-e435-8229-d409514f8531 ZFS-8000-D3
>
> Interesting... you got 3 -D3 diagnoses and one -HC (which is what I also
> got). The -D3 is similar, but may also lead to a different zpool status -x
> result (which has yet another diagnosis).
>
> > Jul 29 16:27:56.5151 c2537861-80bb-6154-c8d2-cac9fb1674ae ZFS-8000-D3
> > Jul 30 14:11:08.8059 7e33e484-728e-4ffe-cbdc-e9d8a05e33aa ZFS-8000-HC
> > Jul 31 11:45:12.3883 d76fcc2c-acee-6b62-f70f-b770651ea5ad ZFS-8000-D3
> >
> > The fmdump -e lines are all along the lines of:
> > Jul 31 08:21:38.9999 ereport.fs.zf.io
> > Jul 31 08:21:38.9999 ereport.fs.zf.data
>
> Yes, these are error reports where ZFS hit an I/O error and that
> will stimulate a data error report, too. The correlation and analysis
> of these errors is done by FMA (actually fmd). I also noticed a
> lot of activity on the /var file system as fmd was busy checkpointing
> the zfs diagnosis.
This is probably redundant, redundant also.> > > > > I plugged the USB disk in again, /var/adm/messages says:> > > > Jul 31 16:45:06 unknown usba: [ID 691482 kern.warning] WARNING: > > /pci at 0,0/pci15d9,a011 at 2,1/storage at 3 (scsa2usb0): Disconnected device > > was busy, please reconnect.> > Jul 31 16:45:07 unknown scsi: [ID 107833 kern.warning] WARNING: > > /pci at 0,0/pci15d9,a011 at 2,1/storage at 3/disk at 0,0 (sd17):> > Jul 31 16:45:07 unknown Command failed to complete...Device is gone> > Jul 31 16:45:07 unknown scsi: [ID 107833 kern.warning] WARNING: > > /pci at 0,0/pci15d9,a011 at 2,1/storage at 3/disk at 0,0 (sd17):> > Jul 31 16:45:07 unknown Command failed to complete...Device is gone> > Jul 31 16:45:07 unknown scsi: [ID 107833 kern.warning] WARNING: > > /pci at 0,0/pci15d9,a011 at 2,1/storage at 3/disk at 0,0 (sd17):> > Jul 31 16:45:07 unknown Command failed to complete...Device is gone> > Jul 31 16:45:07 unknown scsi: [ID 107833 kern.warning] WARNING: > > /pci at 0,0/pci15d9,a011 at 2,1/storage at 3/disk at 0,0 (sd17):> > Jul 31 16:45:07 unknown Command failed to complete...Device is gone> > Jul 31 16:45:07 unknown scsi: [ID 107833 kern.warning] WARNING: > > /pci at 0,0/pci15d9,a011 at 2,1/storage at 3/disk at 0,0 (sd17):> > Jul 31 16:45:07 unknown Command failed to complete...Device is gone> > Jul 31 16:45:07 unknown scsi: [ID 107833 kern.warning] WARNING: > > /pci at 0,0/pci15d9,a011 at 2,1/storage at 3/disk at 0,0 (sd17):> > Jul 31 16:45:07 unknown Command failed to complete...Device is gone> > Jul 31 16:45:07 unknown scsi: [ID 107833 kern.warning] WARNING: > > /pci at 0,0/pci15d9,a011 at 2,1/storage at 3/disk at 0,0 (sd17):> > Jul 31 16:45:07 unknown Command failed to complete...Device is gone> > Jul 31 16:45:17 unknown smbd[516]: [ID 766186 daemon.error] > > NbtDatagramDecode[11]: too small packet> > Jul 31 16:47:17 unknown last message repeated 1 time> > Jul 31 16:49:06 unknown /sbin/dhcpagent[100]: [ID 732317 > > daemon.warning] accept_v4_acknak: ACK packet on nge0 missing mandatory > > lease option, ignored> > Jul 31 16:49:22 unknown last message repeated 5 times> > Jul 31 16:49:54 unknown usba: [ID 691482 kern.warning] WARNING: > > /pci at 0,0/pci15d9,a011 at 2,1/storage at 3 (scsa2usb0): Reinserted device is > > accessible again.> >> > After a few minutes (at 16:51), fmdump -e changed from the above lines to:> > fmdump: warning: skipping record: log file corruption detected> > > > Checking /var/adm/messages now gives:> > Jul 31 16:50:17 unknown smbd[516]: [ID 766186 daemon.error] > > NbtDatagramDecode[11]: too small packet> > Jul 31 16:50:50 unknown fmd: [ID 441519 daemon.error] SUNW-MSG-ID: > > ZFS-8000-FD, TYPE: Fault, VER: 1, SEVERITY: Major> > Jul 31 16:50:50 unknown EVENT-TIME: Thu Jul 31 16:50:49 BST 2008> > Jul 31 16:50:50 unknown PLATFORM: H8DM3-2, CSN: 1234567890, HOSTNAME: > > unknown> > Jul 31 16:50:50 unknown SOURCE: zfs-diagnosis, REV: 1.0> > Jul 31 16:50:50 unknown EVENT-ID: 3a12b357-2d61-491f-e8ab-9247ebcea342> > Jul 31 16:50:50 unknown DESC: The number of I/O errors associated with > > a ZFS device exceeded> > Jul 31 16:50:50 unknown acceptable levels. Refer to > > http://sun.com/msg/ZFS-8000-FD for more information.> > Jul 31 16:50:50 unknown AUTO-RESPONSE: The device has been offlined > > and marked as faulted. 
An attempt> > Jul 31 16:50:50 unknown will be made to activate a hot spare if > > available.> > Jul 31 16:50:50 unknown IMPACT: Fault tolerance of the pool may be > > compromised.> > Jul 31 16:50:50 unknown REC-ACTION: Run ''zpool status -x'' and replace > > the bad device.> > > > Which looks pretty similar to what you saw. zpool status still > > appears to hang though.> > Yes. The hang is due to the failmode property. A process waiting on I/O> in UNIX will not receive any signals until it wakes from the wait... which> won''t happen because the failmode=wait. I''m going to try another test> with failmode=continue and see what happens.> > FWIW, there is considerable debate about whether failmode=wait or> continue is the best default. wait works like the default for NFS, which> works like most PC-like operating systems. For highly available systems,> we''d actually rather ''get off the pot'' than ''sh*t'' so we tend to prefer> panic, with a compromise on continue.> > > > > Running fmdump again, I now have this line at the bottom:> > > > TIME UUID SUNW-MSG-ID> > Jul 31 16:50:49.9906 3a12b357-2d61-491f-e8ab-9247ebcea342 ZFS-8000-FD> >> > This is the first time I''ve ever seen that FMD message appear in > > /var/adm/messages. I wonder if it''s the zpool status hanging that''s > > causing the FMD stuff to not work? What happens if you try to > > reproduce this there and run zpool status as you remove your drive?> > Some zpool commands will wait, but I had good luck with> zpool status -x... but now that seems to be hanging too. I don''t> think zpool status should hang, ever, so this looks like a real> bug.> -- richard> > > > > > Ross> > > >> >> > > Date: Thu, 31 Jul 2008 07:42:48 -0700> > > From: Richard.Elling at Sun.COM> > > Subject: Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive > > removed> > > To: myxiplx at hotmail.com> > >> > > [off-alias, as the e-mails may get large...]> > > what does fmdump and fmdump -e say?> > > -- richard> > >> > >> > > Ross Smith wrote:> > > > I''m not sure you''re actually seeing the same problem there Richard.> > > > It seems that for you I/O is stopping on removal of the device,> > > > whereas for me I/O continues for some considerable time. You are also> > > > able to obtain a result from ''zpool status'' whereas that completely> > > > hangs for me.> > > >> > > > To illustrate the difference, this is what I saw today in snv_94, > > with> > > > a pool created from a single external USB hard drive.> > > >> > > > 1. As before I started a copy of a directory using Solaris'' file> > > > manager. About 1/3 of the way through I pulled the plug on the drive.> > > > 2. File manager continued to copy a further 30MB+ of files across.> > > > Checking the properties of the copy shows it contains 71.1MB of data> > > > and 19,160 files, despite me pulling the drive at around 8,000 files.> > > >> > > > 3. 8:24am I ran ''zpool status'':> > > > # zpool status rc-usb> > > > pool: rc-usb> > > > state: ONLINE> > > > status: One or more devices has experienced an error resulting in data> > > > corruption. Applications may be affected.> > > > action: Restore the file in question if possible. Otherwise > > restore the> > > > entire pool from backup.> > > > see: http://www.sun.com/msg/ZFS-8000-8A> > > > scrub: none requested> > > >> > > > That is as far as it gets. It never gives me any further> > > > information. I left it two hours, and it still had not displayed the> > > > status of the drive in the pool. 
I also did a ''zfs list'', that also> > > > hangs now although I''m pretty sure that if you run ''zfs list'' before> > > > ''zpool status'' it works fine.> > > >> > > > As you can see from /var/adm/messages, I am getting nothing at all> > > > from FMA:> > > > Jul 31 08:16:46 unknown usba: [ID 912658 kern.info] USB 2.0 device> > > > (usbd49,7350) operating at hi speed (USB 2.x) on USB 2.0 root hub:> > > > storage at 3 <mailto:storage at 3>, scsa2usb0 at bus address 2> > > > Jul 31 08:16:46 unknown usba: [ID 349649 kern.info] Maxtor> > > > OneTouch 2HAP70DZ> > > > Jul 31 08:16:46 unknown genunix: [ID 936769 kern.info] scsa2usb0 is> > > > /pci at 0,0/pci15d9,a011 at 2,1/storage at 3> > > > Jul 31 08:16:46 unknown genunix: [ID 408114 kern.info]> > > > /pci at 0,0/pci15d9,a011 at 2,1/storage at 3 (scsa2usb0) online> > > > Jul 31 08:16:46 unknown scsi: [ID 193665 kern.info] sd17 at > > scsa2usb0:> > > > target 0 lun 0> > > > Jul 31 08:16:46 unknown genunix: [ID 936769 kern.info] sd17 is> > > > /pci at 0,0/pci15d9,a011 at 2,1/storage at 3/disk at 0,0> > > > Jul 31 08:16:46 unknown genunix: [ID 340201 kern.warning] WARNING:> > > > Page83 data not standards compliant Maxtor OneTouch 0125> > > > Jul 31 08:16:46 unknown genunix: [ID 408114 kern.info]> > > > /pci at 0,0/pci15d9,a011 at 2,1/storage at 3/disk at 0,0 (sd17) online> > > > Jul 31 08:16:49 unknown pcplusmp: [ID 444295 kern.info] pcplusmp: ide> > > > (ata) instance #1 vector 0xf ioapic 0x4 intin 0xf is bound to cpu 3> > > > Jul 31 08:16:49 unknown scsi: [ID 193665 kern.info] sd14 at> > > > marvell88sx1: target 7 lun 0> > > > Jul 31 08:16:49 unknown genunix: [ID 936769 kern.info] sd14 is> > > > /pci at 1,0/pci1022,7458 at 2/pci11ab,11ab at 1/disk at 7,0> > > > Jul 31 08:16:49 unknown genunix: [ID 408114 kern.info]> > > > /pci at 1,0/pci1022,7458 at 2/pci11ab,11ab at 1/disk at 7,0 (sd14) online> > > > Jul 31 08:21:35 unknown usba: [ID 691482 kern.warning] WARNING:> > > > /pci at 0,0/pci15d9,a011 at 2,1/storage at 3 (scsa2usb0): Disconnected device> > > > was busy, please reconnect.> > > > Jul 31 08:21:38 unknown scsi: [ID 107833 kern.warning] WARNING:> > > > /pci at 0,0/pci15d9,a011 at 2,1/storage at 3/disk at 0,0 (sd17):> > > > Jul 31 08:21:38 unknown Command failed to complete...Device is gone> > > > Jul 31 08:21:38 unknown scsi: [ID 107833 kern.warning] WARNING:> > > > /pci at 0,0/pci15d9,a011 at 2,1/storage at 3/disk at 0,0 (sd17):> > > > Jul 31 08:21:38 unknown Command failed to complete...Device is gone> > > > Jul 31 08:21:38 unknown scsi: [ID 107833 kern.warning] WARNING:> > > > /pci at 0,0/pci15d9,a011 at 2,1/storage at 3/disk at 0,0 (sd17):> > > > Jul 31 08:21:38 unknown Command failed to complete...Device is gone> > > > Jul 31 08:21:38 unknown scsi: [ID 107833 kern.warning] WARNING:> > > > /pci at 0,0/pci15d9,a011 at 2,1/storage at 3/disk at 0,0 (sd17):> > > > Jul 31 08:21:38 unknown Command failed to complete...Device is gone> > > > Jul 31 08:21:38 unknown scsi: [ID 107833 kern.warning] WARNING:> > > > /pci at 0,0/pci15d9,a011 at 2,1/storage at 3/disk at 0,0 (sd17):> > > > Jul 31 08:21:38 unknown Command failed to complete...Device is gone> > > > Jul 31 08:21:38 unknown scsi: [ID 107833 kern.warning] WARNING:> > > > /pci at 0,0/pci15d9,a011 at 2,1/storage at 3/disk at 0,0 (sd17):> > > > Jul 31 08:21:38 unknown Command failed to complete...Device is gone> > > > Jul 31 08:21:38 unknown scsi: [ID 107833 kern.warning] WARNING:> > > > /pci at 0,0/pci15d9,a011 at 2,1/storage at 3/disk at 0,0 (sd17):> > > > Jul 31 08:21:38 unknown Command 
failed to complete...Device is gone> > > > Jul 31 08:21:38 unknown scsi: [ID 107833 kern.warning] WARNING:> > > > /pci at 0,0/pci15d9,a011 at 2,1/storage at 3/disk at 0,0 (sd17):> > > > Jul 31 08:21:38 unknown Command failed to complete...Device is gone> > > > Jul 31 08:24:26 unknown scsi: [ID 107833 kern.warning] WARNING:> > > > /pci at 0,0/pci15d9,a011 at 2,1/storage at 3/disk at 0,0 (sd17):> > > > Jul 31 08:24:26 unknown Command failed to complete...Device is gone> > > > Jul 31 08:24:26 unknown scsi: [ID 107833 kern.warning] WARNING:> > > > /pci at 0,0/pci15d9,a011 at 2,1/storage at 3/disk at 0,0 (sd17):> > > > Jul 31 08:24:26 unknown Command failed to complete...Device is gone> > > > Jul 31 08:24:26 unknown scsi: [ID 107833 kern.warning] WARNING:> > > > /pci at 0,0/pci15d9,a011 at 2,1/storage at 3/disk at 0,0 (sd17):> > > > Jul 31 08:24:26 unknown drive offline> > > > Jul 31 08:27:43 unknown smbd[603]: [ID 766186 daemon.error]> > > > NbtDatagramDecode[11]: too small packet> > > > Jul 31 08:39:43 unknown smbd[603]: [ID 766186 daemon.error]> > > > NbtDatagramDecode[11]: too small packet> > > > Jul 31 08:44:50 unknown /sbin/dhcpagent[95]: [ID 732317> > > > daemon.warning] accept_v4_acknak: ACK packet on nge0 missing > > mandatory> > > > lease option, ignored> > > > Jul 31 08:44:58 unknown last message repeated 3 times> > > > Jul 31 08:45:06 unknown /sbin/dhcpagent[95]: [ID 732317> > > > daemon.warning] accept_v4_acknak: ACK packet on nge0 missing > > mandatory> > > > lease option, ignored> > > > Jul 31 08:45:06 unknown last message repeated 1 time> > > > Jul 31 08:51:44 unknown smbd[603]: [ID 766186 daemon.error]> > > > NbtDatagramDecode[11]: too small packet> > > > Jul 31 09:03:44 unknown smbd[603]: [ID 766186 daemon.error]> > > > NbtDatagramDecode[11]: too small packet> > > > Jul 31 09:13:51 unknown /sbin/dhcpagent[95]: [ID 732317> > > > daemon.warning] accept_v4_acknak: ACK packet on nge0 missing > > mandatory> > > > lease option, ignored> > > > Jul 31 09:14:09 unknown last message repeated 5 times> > > > Jul 31 09:15:44 unknown smbd[603]: [ID 766186 daemon.error]> > > > NbtDatagramDecode[11]: too small packet> > > > Jul 31 09:27:44 unknown smbd[603]: [ID 766186 daemon.error]> > > > NbtDatagramDecode[11]: too small packet> > > > Jul 31 09:27:55 unknown pcplusmp: [ID 444295 kern.info] pcplusmp: ide> > > > (ata) instance #1 vector 0xf ioapic 0x4 intin 0xf is bound to cpu 3> > > >> > > > cfgadm reports that the port is empty but still configured:> > > > # cfgadm> > > > Ap_Id Type Receptacle Occupant> > > > Condition> > > > usb1/3 unknown empty configured> > > > unusable> > > >> > > > 4. 9:32am I now tried writing more data to the pool, to see if I can> > > > trigger the I/O error you are seeing. I tried making a second copy of> > > > the files on the USB drive in the Solaris File manager, but that> > > > attempt simply hung the copy dialog. I''m still seeing nothing else> > > > that appears relevant in /var/adm/messages.> > > >> > > > 5. 
10:08am While checking free space, I found that although df works,> > > > ''df -kh'' hangs, apparently when it tries to query any zfs pool:> > > > # df> > > > / (/dev/dsk/c1t0d0s0 ): 2504586 blocks 656867 files> > > > /devices (/devices ): 0 blocks 0 files> > > > /dev (/dev ): 0 blocks 0 files> > > > /system/contract (ctfs ): 0 blocks 2147483609 files> > > > /proc (proc ): 0 blocks 29902 files> > > > /etc/mnttab (mnttab ): 0 blocks 0 files> > > > /etc/svc/volatile (swap ): 9850928 blocks 1180374 files> > > > /system/object (objfs ): 0 blocks 2147483409 files> > > > /etc/dfs/sharetab (sharefs ): 0 blocks 2147483646 files> > > > /lib/libc.so.1 (/usr/lib/libc/libc_hwcap2.so.1): 2504586 blocks> > > > 656867 files> > > > /dev/fd (fd ): 0 blocks 0 files> > > > /tmp (swap ): 9850928 blocks 1180374 files> > > > /var/run (swap ): 9850928 blocks 1180374 files> > > > /export/home (/dev/dsk/c1t0d0s7 ):881398942 blocks 53621232 files> > > > /rc-pool (rc-pool ):4344346098 blocks 4344346098 files> > > > /rc-pool/admin (rc-pool/admin ):4344346098 blocks 4344346098 files> > > > /rc-pool/ross-home (rc-pool/ross-home ):4344346098 blocks > > 4344346098 files> > > > /rc-pool/vmware (rc-pool/vmware ):4344346098 blocks 4344346098 files> > > > /rc-usb (rc-usb ):153725153 blocks 153725153 files> > > > # df -kh> > > > Filesystem size used avail capacity Mounted on> > > > /dev/dsk/c1t0d0s0 7.2G 6.0G 1.1G 85% /> > > > /devices 0K 0K 0K 0% /devices> > > > /dev 0K 0K 0K 0% /dev> > > > ctfs 0K 0K 0K 0% /system/contract> > > > proc 0K 0K 0K 0% /proc> > > > mnttab 0K 0K 0K 0% /etc/mnttab> > > > swap 4.7G 1.1M 4.7G 1% /etc/svc/volatile> > > > objfs 0K 0K 0K 0% /system/object> > > > sharefs 0K 0K 0K 0% /etc/dfs/sharetab> > > > /usr/lib/libc/libc_hwcap2.so.1> > > > 7.2G 6.0G 1.1G 85% /lib/libc.so.1> > > > fd 0K 0K 0K 0% /dev/fd> > > > swap 4.7G 48K 4.7G 1% /tmp> > > > swap 4.7G 76K 4.7G 1% /var/run> > > > /dev/dsk/c1t0d0s7 425G 4.8G 416G 2% /export/home> > > >> > > > 6. 10:35am It''s now been two hours, neither ''zpool status'' nor ''zfs> > > > list'' have ever finished. The file copy attempt has also been hung> > > > for over an hour (although that''s not unexpected with ''wait'' as the> > > > failmode).> > > >> > > > Richard, you say ZFS is not silently failing, well for me it appears> > > > that it is. I can''t see any warnings from ZFS, I can''t get any status> > > > information. I see no way that I could find out what files are going> > > > to be lost on this server.> > > >> > > > Yes, I''m now aware that the pool has hung since file operations are> > > > hanging, however had that been my first indication of a problem I> > > > believe I am now left in a position where I cannot find out either > > the> > > > cause, nor the files affected. I don''t believe I have any way to find> > > > out which operations had completed without error, but are not> > > > currently committed to disk. I certainly don''t get the status message> > > > you do saying permanent errors have been found in files.> > > >> > > > I plugged the USB drive back in now, Solaris detected it ok, but ZFS> > > > is still hung. 
The rest of /var/adm/messages is:> > > > Jul 31 09:39:44 unknown smbd[603]: [ID 766186 daemon.error]> > > > NbtDatagramDecode[11]: too small packet> > > > Jul 31 09:45:22 unknown /sbin/dhcpagent[95]: [ID 732317> > > > daemon.warning] accept_v4_acknak: ACK packet on nge0 missing > > mandatory> > > > lease option, ignored> > > > Jul 31 09:45:38 unknown last message repeated 5 times> > > > Jul 31 09:51:44 unknown smbd[603]: [ID 766186 daemon.error]> > > > NbtDatagramDecode[11]: too small packet> > > > Jul 31 10:03:44 unknown last message repeated 2 times> > > > Jul 31 10:14:27 unknown /sbin/dhcpagent[95]: [ID 732317> > > > daemon.warning] accept_v4_acknak: ACK packet on nge0 missing > > mandatory> > > > lease option, ignored> > > > Jul 31 10:14:45 unknown last message repeated 5 times> > > > Jul 31 10:15:44 unknown smbd[603]: [ID 766186 daemon.error]> > > > NbtDatagramDecode[11]: too small packet> > > > Jul 31 10:27:45 unknown smbd[603]: [ID 766186 daemon.error]> > > > NbtDatagramDecode[11]: too small packet> > > > Jul 31 10:36:25 unknown usba: [ID 691482 kern.warning] WARNING:> > > > /pci at 0,0/pci15d9,a011 at 2,1/storage at 3 (scsa2usb0): Reinserted device is> > > > accessible again.> > > > Jul 31 10:39:45 unknown smbd[603]: [ID 766186 daemon.error]> > > > NbtDatagramDecode[11]: too small packet> > > > Jul 31 10:45:53 unknown /sbin/dhcpagent[95]: [ID 732317> > > > daemon.warning] accept_v4_acknak: ACK packet on nge0 missing > > mandatory> > > > lease option, ignored> > > > Jul 31 10:46:09 unknown last message repeated 5 times> > > > Jul 31 10:51:45 unknown smbd[603]: [ID 766186 daemon.error]> > > > NbtDatagramDecode[11]: too small packet> > > >> > > > 7. 10:55am Gave up on ZFS ever recovering. A shutdown attempt hung> > > > as expected. I hard-reset the computer.> > > >> > > > Ross> > > >> > > >> > > >> > > >> > > > > Date: Wed, 30 Jul 2008 11:17:08 -0700> > > > > From: Richard.Elling at Sun.COM> > > > > Subject: Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 hang when drive> > > > removed> > > > > To: myxiplx at hotmail.com> > > > > CC: zfs-discuss at opensolaris.org> > > > >> > > > > I was able to reproduce this in b93, but might have a different> > > > > interpretation of the conditions. More below...> > > > >> > > > > Ross Smith wrote:> > > > > > A little more information today. I had a feeling that ZFS would> > > > > > continue quite some time before giving an error, and today > > I''ve shown> > > > > > that you can carry on working with the filesystem for at least> > > > half an> > > > > > hour with the disk removed.> > > > > >> > > > > > I suspect on a system with little load you could carry on > > working for> > > > > > several hours without any indication that there is a problem. It> > > > > > looks to me like ZFS is caching reads & writes, and that provided> > > > > > requests can be fulfilled from the cache, it doesn''t care > > whether the> > > > > > disk is present or not.> > > > >> > > > > In my USB-flash-disk-sudden-removal-while-writing-big-file-test,> > > > > 1. I/O to the missing device stopped (as I expected)> > > > > 2. FMA kicked in, as expected.> > > > > 3. /var/adm/messages recorded ''Command failed to complete... device> > > > gone.''> > > > > 4. After exactly 9 minutes, 17,951 e-reports had been processed > > and the> > > > > diagnosis was complete. 
FMA logged the following to > > /var/adm/messages> > > > >> > > > > Jul 30 10:33:44 grond scsi: [ID 107833 kern.warning] WARNING:> > > > > /pci at 0,0/pci1458,5004 at b,1/storage at 8/disk at 0,0 (sd1):> > > > > Jul 30 10:33:44 grond Command failed to complete...Device is gone> > > > > Jul 30 10:42:31 grond fmd: [ID 441519 daemon.error] SUNW-MSG-ID:> > > > > ZFS-8000-FD, TYPE: Fault, VER: 1, SEVERITY: Major> > > > > Jul 30 10:42:31 grond EVENT-TIME: Wed Jul 30 10:42:30 PDT 2008> > > > > Jul 30 10:42:31 grond PLATFORM: , CSN: , HOSTNAME: grond> > > > > Jul 30 10:42:31 grond SOURCE: zfs-diagnosis, REV: 1.0> > > > > Jul 30 10:42:31 grond EVENT-ID: d99769aa-28e8-cf16-d181-945592130525> > > > > Jul 30 10:42:31 grond DESC: The number of I/O errors associated > > with a> > > > > ZFS device exceeded> > > > > Jul 30 10:42:31 grond acceptable levels. Refer to> > > > > http://sun.com/msg/ZFS-8000-FD for more information.> > > > > Jul 30 10:42:31 grond AUTO-RESPONSE: The device has been > > offlined and> > > > > marked as faulted. An attempt> > > > > Jul 30 10:42:31 grond will be made to activate a hot spare if> > > > > available.> > > > > Jul 30 10:42:31 grond IMPACT: Fault tolerance of the pool may be> > > > > compromised.> > > > > Jul 30 10:42:31 grond REC-ACTION: Run ''zpool status -x'' and replace> > > > > the bad device.> > > > >> > > > > The above URL shows what you expect, but more (and better) info> > > > > is available from zpool status -xv> > > > >> > > > > pool: rmtestpool> > > > > state: UNAVAIL> > > > > status: One or more devices are faultd in response to IO failures.> > > > > action: Make sure the affected devices are connected, then run > > ''zpool> > > > > clear''.> > > > > see: http://www.sun.com/msg/ZFS-8000-HC> > > > > scrub: none requested> > > > > config:> > > > >> > > > > NAME STATE READ WRITE CKSUM> > > > > rmtestpool UNAVAIL 0 15.7K 0 insufficient replicas> > > > > c2t0d0p0 FAULTED 0 15.7K 0 experienced I/O failures> > > > >> > > > > errors: Permanent errors have been detected in the following files:> > > > >> > > > > /rmtestpool/random.data> > > > >> > > > >> > > > > If you surf to http://www.sun.com/msg/ZFS-8000-HC you''ll> > > > > see words to the effect that,> > > > > The pool has experienced I/O failures. Since the ZFS pool property> > > > > ''failmode'' is set to ''wait'', all I/Os (reads and writes) are> > > > > blocked. See the zpool(1M) manpage for more information on the> > > > > ''failmode'' property. Manual intervention is required for I/Os to> > > > > be serviced.> > > > >> > > > > >> > > > > > I would guess that ZFS is attempting to write to the disk in the> > > > > > background, and that this is silently failing.> > > > >> > > > > It is clearly not silently failing.> > > > >> > > > > However, the default failmode property is set to ''wait'' which will> > > > patiently> > > > > wait forever. If you would rather have the I/O fail, then you > > should> > > > change> > > > > the failmode to ''continue'' I would not normally recommend a > > failmode of> > > > > ''panic''> > > > >> > > > > Now to figure out how to recover gracefully... zpool clear isn''t> > > > happy...> > > > >> > > > > [sidebar]> > > > > while performing this experiment, I noticed that fmd was > > checkpointing> > > > > the diagnosis engine to disk in the /var/fm/fmd/ckpt/zfs-diagnosis> > > > > directory.> > > > > If this had been the boot disk, with failmode=wait, I''m not > > convinced> > > > > that we''d get a complete diagnosis... 
I''ll explore that later.> > > > > [/sidebar]> > > > >> > > > > -- richard
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Problems with ZFS + SATA hot plug.pdf
Type: application/pdf
Size: 84624 bytes
Desc: not available
URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20080805/55e5b972/attachment.pdf>
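For reference, a sketch of what a clean removal looks like, as opposed to simply pulling the drive -- the pool, device and attachment-point names below ("rc-pool", "c2t7d0", "sata1/7") are only placeholders and need to be adjusted to match the "zpool status" and "cfgadm" output on the machine in question:

# zpool offline rc-pool c2t7d0
    (tell ZFS to stop using the disk first; only possible if the pool has sufficient redundancy)
# cfgadm -c unconfigure sata1/7
    (detach the disk from the OS so it is safe to remove)
-- physically remove the disk, insert the replacement --
# cfgadm -c configure sata1/7
# zpool replace rc-pool c2t7d0
    (or "zpool online rc-pool c2t7d0" if the original disk is going back in)

None of that helps with the surprise-removal case this thread is about, but it at least avoids tripping over the bugs listed above.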