I'm trying to add some additional devices to my existing pool, but it's not working. I'm adding a raidz group of five 300 GB drives, but the command always fails:

root@kronos:/ # zpool add raid raidz c8t8d0 c8t13d0 c7t8d0 c3t8d0 c5t8d0
Assertion failed: nvlist_lookup_string(cnv, "path", &path) == 0, file zpool_vdev.c, line 631
Abort (core dumped)

The disks all work and were labeled easily using 'format' after zfs and other tools refused to look at them. Creating a UFS filesystem on them with newfs runs with no issues, but I can't add them to the existing zpool. I can use the same devices to create a NEW zpool without issue. I fully patched the system after encountering this problem; no change.

The zpool to which I am adding them is fairly large and in a degraded state (three resilvers running: one that never seems to complete, and two related to trying to add these new disks), but I didn't think that should prevent me from adding another vdev. For those who would suggest waiting 20 minutes for the resilver to finish: it has been estimating < 30 minutes for the last 12 hours, and we're running out of space, so I wanted to add the new devices sooner rather than later.

Can anyone help? Extra details below:

root@kronos:/ # uname -a
SunOS kronos 5.10 Generic_137137-09 sun4u sparc SUNW,Sun-Fire-480R

root@kronos:/ # smpatch analyze
137276-01 SunOS 5.10: uucico patch
122470-02 Gnome 2.6.0: GNOME Java Help Patch
121430-31 SunOS 5.8 5.9 5.10: Live Upgrade Patch
121428-11 SunOS 5.10: Live Upgrade Zones Support Patch

root@kronos:patch # zpool list
NAME   SIZE   USED  AVAIL  CAP  HEALTH    ALTROOT
raid  4.32T  4.23T  92.1G  97%  DEGRADED  -

root@kronos:patch # zpool status
  pool: raid
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
        repaired.
 scrub: resilver in progress for 12h22m, 97.25% done, 0h20m to go
config:

        NAME              STATE     READ WRITE CKSUM
        raid              DEGRADED     0     0     0
          raidz1          ONLINE       0     0     0
            c9t0d0        ONLINE       0     0     0
            c6t0d0        ONLINE       0     0     0
            c2t0d0        ONLINE       0     0     0
            c4t0d0        ONLINE       0     0     0
            c10t0d0       ONLINE       0     0     0
          raidz1          ONLINE       0     0     0
            c9t1d0        ONLINE       0     0     0
            c6t1d0        ONLINE       0     0     0
            c2t1d0        ONLINE       0     0     0
            c4t1d0        ONLINE       0     0     0
            c10t1d0       ONLINE       0     0     0
          raidz1          ONLINE       0     0     0
            c9t3d0        ONLINE       0     0     0
            c6t3d0        ONLINE       0     0     0
            c2t3d0        ONLINE       0     0     0
            c4t3d0        ONLINE       0     0     0
            c10t3d0       ONLINE       0     0     0
          raidz1          DEGRADED     0     0     0
            c9t4d0        ONLINE       0     0     0
            spare         DEGRADED     0     0     0
              c5t13d0     ONLINE       0     0     0
              c6t4d0      FAULTED      0 12.3K     0  too many errors
            c2t4d0        ONLINE       0     0     0
            c4t4d0        ONLINE       0     0     0
            c10t4d0       ONLINE       0     0     0
          raidz1          DEGRADED     0     0     0
            c9t5d0        ONLINE       0     0     0
            spare         DEGRADED     0     0     0
              replacing   DEGRADED     0     0     0
                c6t5d0s0/o  UNAVAIL    0     0     0  cannot open
                c6t5d0      ONLINE     0     0     0
              c11t13d0    ONLINE       0     0     0
            c2t5d0        ONLINE       0     0     0
            c4t5d0        ONLINE       0     0     0
            c10t5d0       ONLINE       0     0     0
          raidz1          ONLINE       0     0     0
            c5t9d0        ONLINE       0     0     0
            c7t9d0        ONLINE       0     0     0
            c3t9d0        ONLINE       0     0     0
            c8t9d0        ONLINE       0     0     0
            c11t9d0       ONLINE       0     0     0
          raidz1          ONLINE       0     0     0
            c5t10d0       ONLINE       0     0     0
            c7t10d0       ONLINE       0     0     0
            c3t10d0       ONLINE       0     0     0
            c8t10d0       ONLINE       0     0     0
            c11t10d0      ONLINE       0     0     0
          raidz1          ONLINE       0     0     0
            c5t11d0       ONLINE       0     0     0
            c7t11d0       ONLINE       0     0     0
            c3t11d0       ONLINE       0     0     0
            c8t11d0       ONLINE       0     0     0
            c11t11d0      ONLINE       0     0     0
          raidz1          ONLINE       0     0     0
            c5t12d0       ONLINE       0     0     0
            c7t12d0       ONLINE       0     0     0
            c3t12d0       ONLINE       0     0     0
            c8t12d0       ONLINE       0     0     0
            c11t12d0      ONLINE       0     0     0
          raidz1          ONLINE       0     0     0
            c9t2d0        ONLINE       0     0     0
            c6t2d0        ONLINE       0     0     0
            replacing     ONLINE       0     0     0
              c11t8d0     ONLINE       0     0     0
              c2t2d0      ONLINE       0     0     0
            c4t2d0        ONLINE       0     0     0
            c10t2d0       ONLINE       0     0     0
        spares
          c6t4d0          INUSE     currently in use
          c3t13d0         AVAIL
          c7t13d0         AVAIL
          c11t13d0        INUSE     currently in use

errors: No known data errors
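In case it helps anyone reproduce this, here is roughly how I verified the new drives before trying the add (same five devices as in the failing command; the s0 slice and the "testpool" name below are just placeholders for what I actually used):

echo | format                        # list every disk the system sees, non-interactively
prtvtoc /dev/rdsk/c8t8d0s2           # sanity-check the label on one of the new drives
newfs /dev/rdsk/c8t8d0s0             # UFS creation succeeds, so the device itself is writable
zpool create testpool raidz c8t8d0 c8t13d0 c7t8d0 c3t8d0 c5t8d0   # a NEW pool with the same disks works
zpool destroy testpool               # clean up before retrying the add against 'raid'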
Problem solved... after the resilvers completed, the status reported that the filesystem needed an upgrade. I did a zpool upgrade -a, and after that completed and there was no resilvering going on, the zpool add ran successfully.

I would like to suggest, however, that the behavior be fixed: it should report something more intelligent, either "cannot add to pool during resilver" or "cannot add to pool until the filesystem is upgraded", whichever is correct, instead of dumping core.
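For the archives, the sequence that finally worked was essentially the following (pool and device names as in my first message; the plain 'zpool upgrade' listing step is my assumption about the tidiest way to confirm the version problem, since what actually tipped me off was the note in 'zpool status'):

zpool status raid        # wait here until no resilver is reported
zpool upgrade            # with no arguments, lists pools still on an older on-disk version
zpool upgrade -a         # upgrade all pools to the version the installed bits support
zpool add raid raidz c8t8d0 c8t13d0 c7t8d0 c3t8d0 c5t8d0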
Brad Plecs wrote:
> Problem solved... after the resilvers completed, the status reported that the filesystem needed an upgrade.
>
> I did a zpool upgrade -a, and after that completed and there was no resilvering going on, the zpool add ran successfully.
>
> I would like to suggest, however, that the behavior be fixed: it should report something more intelligent, either "cannot add to pool during resilver" or "cannot add to pool until the filesystem is upgraded", whichever is correct, instead of dumping core.

Are you sure this isn't a case of CR 6433264, which was fixed
long ago but arrived in patch 118833-36 for Solaris 10?

http://bugs.opensolaris.org/view_bug.do?bug_id=6433264
http://docs.sun.com/app/docs/doc/820-1259/apa-sparc-23587?a=view

 -- richard
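P.S. A quick way to confirm whether that patch (or a later revision) is actually on the box, assuming the stock Solaris 10 patch tools:

showrev -p | grep 118833     # lists the installed revision(s) of the patch, if any
patchadd -p | grep 118833    # equivalent check via patchadd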
> Are you sure this isn't a case of CR 6433264, which was fixed
> long ago but arrived in patch 118833-36 for Solaris 10?

It certainly looks similar, but this system already had 118833-36 when the error occurred, so if that bug is truly fixed, this must be something else. Then again, I wasn't adding spares, I was adding a raidz1 group, so maybe it was patched for adding spares but not for other vdev types.

I looked at the bug ID but couldn't tell whether there is a simple test I could perform to determine if this is the same bug, a related one, or something completely new. The error message is the same except for the reported line number. Here's some mdb output similar to what was in the original bug report:

root@kronos:/ # mdb core
Loading modules: [ libumem.so.1 libnvpair.so.1 libuutil.so.1 libc.so.1 libavl.so.1 libsysevent.so.1 ld.so.1 ]
> $c
libc.so.1`_lwp_kill+8(6, 0, ff1c3058, ff12bed8, ffffffff, 6)
libc.so.1`abort+0x110(ffbfb760, 1, 0, fcba0, ff1c13d8, 0)
libc.so.1`_assert+0x64(213a8, 213d8, 277, 8d990, fc8bc, 32008)
0x1afe8(11, 0, 1a2d78, dff40, 16f2a400, 4)
0x1b028(8df60, 8cfd0, 0, 0, 0, 4)
make_root_vdev+0x9c(abe48, 0, 1, 0, 8df60, 8cfd0)
0x1342c(8, abe48, 0, 7, 0, ffbffdca)
main+0x154(9, ffbffce4, 9, 3, 33400, ffbffdc6)
_start+0x108(0, 0, 0, 0, 0, 0)

I'm happy to further poke at the core file or provide other data if anyone's interested...
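Specifically, this is the kind of poking I had in mind (assuming the proc tools on this release accept a core file, which I believe they do):

pstack core        # user-level stack, same information as the $c output above
pargs core         # recover the exact zpool command line that produced the core
mdb core           # then ::status for the fault summary and $C for frames with arguments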
Brad Plecs wrote:
>> Are you sure this isn't a case of CR 6433264, which was fixed
>> long ago but arrived in patch 118833-36 for Solaris 10?
>
> It certainly looks similar, but this system already had 118833-36 when the error occurred, so if that bug is truly fixed, this must be something else. Then again, I wasn't adding spares, I was adding a raidz1 group, so maybe it was patched for adding spares but not for other vdev types.
>
> I looked at the bug ID but couldn't tell whether there is a simple test I could perform to determine if this is the same bug, a related one, or something completely new. The error message is the same except for the reported line number.

Assertion failures are always bugs.
http://en.wikipedia.org/wiki/Assertion_(computing)

The question is whether this is a new one or not. It sounds like a new one, so please file a bug.
http://bugs.opensolaris.org

 -- richard
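P.S. When you file it, the data worth attaching is roughly the following; treat this as a sketch of what is usually asked for rather than the exact bug-template wording:

uname -a                   # OS release and kernel patch level
showrev -p | grep 118833   # which revision of patch 118833 is actually installed
zpool upgrade -v           # on-disk versions the installed bits support
zpool status               # pool state at the time of the failure
# plus the $c stack from mdb and, if possible, the core file itself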