Issues with ZFS and Sun Cluster

If a cluster node crashes and an HAStoragePlus resource group containing a ZFS
structure (i.e. a zpool) is transitioned to a surviving node, the zpool import
can cause the surviving node to panic. The zpool was obviously not exported in
a controlled fashion because of the hard crash.

The storage structure is a HW RAID-protected LUN from the array, with the zpool
built on that single HW LUN. The zpool was created on a full device
(zpool create HAzpool c1t8d0s2). No RAID-Z is configured.

I was under the impression that ZFS was always maintained in a perpetually
consistent state. The panic of the surviving node appears to be related to some
form of silent corruption in ZFS, but I thought the whole design of ZFS was to
prevent this very thing. Is RAID-Z required to achieve this resiliency?

ZFS is officially supported under Sun Cluster, but this situation concerns me
greatly because the whole purpose of a cluster is undermined if a single
resource group using HA ZFS causes a panic on import. The effect could bring
the whole cluster down.

Has anyone got any thoughts, comments or similar experiences on this? thx
-- 
This message posted from opensolaris.org
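PS: for reference, this is roughly the sequence involved. The import -f step is
my understanding of what HAStoragePlus effectively does on takeover, since the
pool was never cleanly exported by the crashed node; treat it as a sketch
rather than the agent's exact internals:

    # on the original node, at build time (single HW RAID LUN, no ZFS-level redundancy)
    zpool create HAzpool c1t8d0s2

    # what the surviving node effectively ends up doing on failover
    zpool import -f HAzpool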
James C. McPherson
2008-Sep-11 22:50 UTC
[zfs-discuss] ZFS Panicing System Cluster Crash effect
Jack Dumson wrote:
> Issues with ZFS and Sun Cluster
>
> If a cluster node crashes and an HAStoragePlus resource group containing a
> ZFS structure (i.e. a zpool) is transitioned to a surviving node, the zpool
> import can cause the surviving node to panic. The zpool was obviously not
> exported in a controlled fashion because of the hard crash. The storage
> structure is a HW RAID-protected LUN from the array, with the zpool built
> on that single HW LUN.

You've got no redundancy from a ZFS perspective.

> The zpool was created on a full device (zpool create HAzpool c1t8d0s2).
> No RAID-Z is configured.
>
> I was under the impression that ZFS was always maintained in a
> perpetually consistent state.

It's a *lot* easier for ZFS to achieve this if you provide it with a
redundant configuration for it to manage.

> The panic of the surviving node appears to be related to some form of
> silent corruption in ZFS, but I thought the whole design of ZFS was to
> prevent this very thing. Is RAID-Z required to achieve this resiliency?

Again, redundancy from ZFS' perspective.

> ZFS is officially supported under Sun Cluster, but this situation concerns
> me greatly because the whole purpose of a cluster is undermined if a
> single resource group using HA ZFS causes a panic on import. The effect
> could bring the whole cluster down.
> Has anyone got any thoughts, comments or similar experiences on this?

I'm pretty sure that the Cluster configuration guide will have spelt out
the ZFS requirements. Apart from that, please look at the ZFS Best
Practices site:

http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide


James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp   http://www.jmcp.homeunix.com/blog
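PS: as a sketch only (the LUN names below are placeholders, not taken from
your config), a mirrored layout across two array LUNs gives ZFS a second copy
to repair from when it hits a bad checksum:

    # two LUNs, ideally presented from separate controllers or trays
    zpool create HAzpool mirror c1t8d0 c1t9d0

    # confirm ZFS now has redundancy it can use for self-healing
    zpool status HAzpool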
Ricardo M. Correia
2008-Sep-11 23:02 UTC
[zfs-discuss] ZFS Panicing System Cluster Crash effect
Hi Jack,

On Thu, 2008-09-11 at 15:37 -0700, Jack Dumson wrote:
> Issues with ZFS and Sun Cluster
>
> If a cluster node crashes and an HAStoragePlus resource group containing
> a ZFS structure (i.e. a zpool) is transitioned to a surviving node, the
> zpool import can cause the surviving node to panic.

zpool import should not cause a node to panic under any circumstances.

Can you provide us with details about the problem, e.g., which Solaris
version you are running, log messages with the panic stack trace, etc.?

In any case, you should file a new CR if this is not a known problem
already.

Best regards,
Ricardo
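PS: if you have a crash dump from the panicking node, something along these
lines should get us the stack. This is a sketch, assuming savecore is enabled
and wrote the dump to the default /var/crash/<hostname> directory:

    cd /var/crash/`hostname`
    mdb -k unix.0 vmcore.0
    > ::status     # panic string
    > ::stack      # panic stack trace
    > ::msgbuf     # console messages leading up to the panic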
Ricardo M. Correia
2008-Sep-12 03:21 UTC
[zfs-discuss] ZFS Panicing System Cluster Crash effect
Hi John,

On Thu, 2008-09-11 at 20:23 -0600, John Antonio wrote:
> It is operating with Sol 10 u3 and also u4. Sun support is claiming
> the issue is related to quiet corruptions.

Probably, yes.

> Since the ZFS structure was not cleanly exported because of the event
> (node crash), the statement from support is that these types of
> corruptions could occur.

I don't think the cause of the corruption is that the pool wasn't cleanly
exported. If corruption only happens when a node crashes, there are 3
likely causes for this problem:

1) The storage subsystem is ignoring disk write cache flush requests and
   allows writes to go out of order, making the uberblock reach the disk
   before other important metadata blocks.
2) Or you're running into a bug that is corrupting metadata.
3) Or you're experiencing memory corruption.

The first one should be fixable and there is a bug open for this already
(CR 6667683), the second one is fixable once identified, the third one is
harder to solve.

The ZFS team and a few folks in the Lustre group are looking into making
ZFS more resilient against corrupted metadata, but this is definitely a
hard-to-solve issue.

> The panic response is apparently the expected behavior during a zpool
> import if this situation occurs.

I wouldn't say that is the expected behavior... :-)
I'd say a panic when importing a pool is a bug.

> Apparently in u6, there will be additional zpool import options that
> will make the identification of a corruption a passive event. The pool
> won't import, but instead of panicking the server it would respond with
> a failure status.

Interesting... I'd love to see the CR for this.

> Regardless of a passive response or not, it concerns me that the
> condition can occur, period. It's not that other filesystems don't
> experience silent corruptions; the concern here is that ZFS had been
> promoted as overcoming these exact issues.

"Silent corruptions" is a bit vague :-)
ZFS is promoted as being resilient against most kinds of disk corruption,
but memory corruption and potential bugs are different issues.

Note that there may be several causes for panicking when importing a pool,
depending on which metadata was corrupted. Some may be easily fixable,
others may be harder. That's why providing a stack trace of the panic would
help identify which particular issue you're running into.

Also note that there are efforts being made to solve these problems. As an
example, Victor Latushkin has very recently identified a similar panic when
importing a pool (CR 6720531) and provided a patch that allows the corrupted
pool to be successfully imported (only for that particular kind of
corruption, of course).

> Given that it has been certified to work in a cluster deployment, this
> situation suggests that it may not be ready or that a significant bug
> exists.

Yes, it does appear that you've run into a significant bug. Knowing the
exact bug you're running into would be helpful.

Best regards,
Ricardo
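PS: one thing you can try in the meantime: zdb runs in userland, so it can
inspect the pool's on-disk state without risking another panic. A rough
sketch, assuming the pool name and device from the original post (I believe
-e lets zdb look at a pool that has not been imported):

    # dump the ZFS labels and uberblocks straight from the device
    zdb -l /dev/rdsk/c1t8d0s2

    # walk the metadata of the un-imported pool; read-only, may take a while
    zdb -e HAzpool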
What version of Solaris are you running there? For a long while the default
response on encountering unrecoverable errors was to panic, but I believe
that has been improved in newer builds.

Also, part of your problem may be down to running with just a single disk.
With just one disk, ZFS still uses checksumming to see if data is corrupted,
but when it finds a bad checksum it has no way to correct the data. On a
cluster I would always use some kind of redundancy within ZFS, regardless of
your underlying storage hardware.

Having said that, though, I still wouldn't have thought it would be easy to
get ZFS to panic like that. How did you crash the first node?
-- 
This message posted from opensolaris.org
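PS: if your build is new enough to have the failmode pool property (I don't
believe u3/u4 have it, so treat this as a sketch for newer bits), you can ask
the pool to wait or return errors on a catastrophic I/O failure rather than
panic the node:

    # check whether the property exists on your build
    zpool get failmode HAzpool

    # wait or continue are friendlier than panic on a cluster node
    zpool set failmode=continue HAzpool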