I'm trying to add some additional devices to my existing pool, but it's not working. I'm adding a raidz group of five 300 GB drives, but the command always fails:

root@kronos:/ # zpool add raid raidz c8t8d0 c8t13d0 c7t8d0 c3t8d0 c5t8d0
Assertion failed: nvlist_lookup_string(cnv, "path", &path) == 0, file zpool_vdev.c, line 631
Abort (core dumped)

The disks all work and were labeled easily using 'format' after zfs and other tools refused to look at them. Creating a UFS filesystem on them with newfs runs with no issues, but I can't add them to the existing zpool. I can use the same devices to create a NEW zpool without issue. I fully patched the system after encountering this problem; no change.

The zpool to which I am adding them is fairly large and in a degraded state (three resilvers running: one that never seems to complete, and two related to trying to add these new disks), but I didn't think that should prevent me from adding another vdev. For those who would suggest waiting 20 minutes for the resilver to finish: it has been estimating < 30 minutes for the last 12 hours, and we're running out of space, so I wanted to add the new devices sooner rather than later.

Can anyone help? Extra details below:

root@kronos:/ # uname -a
SunOS kronos 5.10 Generic_137137-09 sun4u sparc SUNW,Sun-Fire-480R

root@kronos:/ # smpatch analyze
137276-01 SunOS 5.10: uucico patch
122470-02 Gnome 2.6.0: GNOME Java Help Patch
121430-31 SunOS 5.8 5.9 5.10: Live Upgrade Patch
121428-11 SunOS 5.10: Live Upgrade Zones Support Patch

root@kronos:patch # zpool list
NAME   SIZE   USED  AVAIL  CAP  HEALTH    ALTROOT
raid  4.32T  4.23T  92.1G  97%  DEGRADED  -

root@kronos:patch # zpool status
  pool: raid
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
        repaired.
 scrub: resilver in progress for 12h22m, 97.25% done, 0h20m to go
config:

        NAME              STATE     READ WRITE CKSUM
        raid              DEGRADED     0     0     0
          raidz1          ONLINE       0     0     0
            c9t0d0        ONLINE       0     0     0
            c6t0d0        ONLINE       0     0     0
            c2t0d0        ONLINE       0     0     0
            c4t0d0        ONLINE       0     0     0
            c10t0d0       ONLINE       0     0     0
          raidz1          ONLINE       0     0     0
            c9t1d0        ONLINE       0     0     0
            c6t1d0        ONLINE       0     0     0
            c2t1d0        ONLINE       0     0     0
            c4t1d0        ONLINE       0     0     0
            c10t1d0       ONLINE       0     0     0
          raidz1          ONLINE       0     0     0
            c9t3d0        ONLINE       0     0     0
            c6t3d0        ONLINE       0     0     0
            c2t3d0        ONLINE       0     0     0
            c4t3d0        ONLINE       0     0     0
            c10t3d0       ONLINE       0     0     0
          raidz1          DEGRADED     0     0     0
            c9t4d0        ONLINE       0     0     0
            spare         DEGRADED     0     0     0
              c5t13d0     ONLINE       0     0     0
              c6t4d0      FAULTED      0 12.3K     0  too many errors
            c2t4d0        ONLINE       0     0     0
            c4t4d0        ONLINE       0     0     0
            c10t4d0       ONLINE       0     0     0
          raidz1          DEGRADED     0     0     0
            c9t5d0        ONLINE       0     0     0
            spare         DEGRADED     0     0     0
              replacing   DEGRADED     0     0     0
                c6t5d0s0/o  UNAVAIL    0     0     0  cannot open
                c6t5d0      ONLINE     0     0     0
              c11t13d0    ONLINE       0     0     0
            c2t5d0        ONLINE       0     0     0
            c4t5d0        ONLINE       0     0     0
            c10t5d0       ONLINE       0     0     0
          raidz1          ONLINE       0     0     0
            c5t9d0        ONLINE       0     0     0
            c7t9d0        ONLINE       0     0     0
            c3t9d0        ONLINE       0     0     0
            c8t9d0        ONLINE       0     0     0
            c11t9d0       ONLINE       0     0     0
          raidz1          ONLINE       0     0     0
            c5t10d0       ONLINE       0     0     0
            c7t10d0       ONLINE       0     0     0
            c3t10d0       ONLINE       0     0     0
            c8t10d0       ONLINE       0     0     0
            c11t10d0      ONLINE       0     0     0
          raidz1          ONLINE       0     0     0
            c5t11d0       ONLINE       0     0     0
            c7t11d0       ONLINE       0     0     0
            c3t11d0       ONLINE       0     0     0
            c8t11d0       ONLINE       0     0     0
            c11t11d0      ONLINE       0     0     0
          raidz1          ONLINE       0     0     0
            c5t12d0       ONLINE       0     0     0
            c7t12d0       ONLINE       0     0     0
            c3t12d0       ONLINE       0     0     0
            c8t12d0       ONLINE       0     0     0
            c11t12d0      ONLINE       0     0     0
          raidz1          ONLINE       0     0     0
            c9t2d0        ONLINE       0     0     0
            c6t2d0        ONLINE       0     0     0
            replacing     ONLINE       0     0     0
              c11t8d0     ONLINE       0     0     0
              c2t2d0      ONLINE       0     0     0
            c4t2d0        ONLINE       0     0     0
            c10t2d0       ONLINE       0     0     0
        spares
          c6t4d0          INUSE     currently in use
          c3t13d0         AVAIL
          c7t13d0         AVAIL
          c11t13d0        INUSE     currently in use

errors: No known data errors
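In case it helps anyone reproduce this, here is roughly how I verified the new drives before trying the add (same five devices as in the failing command; the s0 slice and the "testpool" name below are just placeholders for what I actually used):

echo | format                        # list every disk the system sees, non-interactively
prtvtoc /dev/rdsk/c8t8d0s2           # sanity-check the label on one of the new drives
newfs /dev/rdsk/c8t8d0s0             # UFS creation succeeds, so the device itself is writable
zpool create testpool raidz c8t8d0 c8t13d0 c7t8d0 c3t8d0 c5t8d0   # a NEW pool with the same disks works
zpool destroy testpool               # clean up before retrying the add against 'raid'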
Problem solved... after the resilvers completed, the status reported that the filesystem needed an upgrade. I did a zpool upgrade -a, and after that completed and there was no resilvering going on, the zpool add ran successfully.

I would like to suggest, however, that the behavior be fixed: it should report something more intelligent, either "cannot add to pool during resilver" or "cannot add to pool until the filesystem is upgraded", whichever is correct, instead of dumping core.
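For the archives, the sequence that finally worked was essentially the following (pool and device names as in my first message; the plain 'zpool upgrade' listing step is my assumption about the tidiest way to confirm the version problem, since what actually tipped me off was the note in 'zpool status'):

zpool status raid        # wait here until no resilver is reported
zpool upgrade            # with no arguments, lists pools still on an older on-disk version
zpool upgrade -a         # upgrade all pools to the version the installed bits support
zpool add raid raidz c8t8d0 c8t13d0 c7t8d0 c3t8d0 c5t8d0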
Brad Plecs wrote:
> Problem solved... after the resilvers completed, the status reported that the filesystem needed an upgrade.
>
> I did a zpool upgrade -a, and after that completed and there was no resilvering going on, the zpool add ran successfully.
>
> I would like to suggest, however, that the behavior be fixed: it should report something more intelligent, either "cannot add to pool during resilver" or "cannot add to pool until the filesystem is upgraded", whichever is correct, instead of dumping core.

Are you sure this isn't a case of CR 6433264, which was fixed
long ago but arrived in patch 118833-36 for Solaris 10?

http://bugs.opensolaris.org/view_bug.do?bug_id=6433264
http://docs.sun.com/app/docs/doc/820-1259/apa-sparc-23587?a=view

 -- richard
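P.S. A quick way to confirm whether that patch (or a later revision) is actually on the box, assuming the stock Solaris 10 patch tools:

showrev -p | grep 118833     # lists the installed revision(s) of the patch, if any
patchadd -p | grep 118833    # equivalent check via patchadd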
> Are you sure this isn't a case of CR 6433264, which was fixed
> long ago but arrived in patch 118833-36 for Solaris 10?

It certainly looks similar, but this system already had 118833-36 when the error occurred, so if that bug is truly fixed, this must be something else. Then again, I wasn't adding spares, I was adding a raidz1 group, so maybe it was patched for adding spares but not for other vdev types.

I looked at the bug ID but couldn't tell whether there is a simple test I could perform to determine if this is the same bug, a related one, or something completely new. The error message is the same except for the reported line number. Here's some mdb output similar to what was in the original bug report:

root@kronos:/ # mdb core
Loading modules: [ libumem.so.1 libnvpair.so.1 libuutil.so.1 libc.so.1 libavl.so.1 libsysevent.so.1 ld.so.1 ]
> $c
libc.so.1`_lwp_kill+8(6, 0, ff1c3058, ff12bed8, ffffffff, 6)
libc.so.1`abort+0x110(ffbfb760, 1, 0, fcba0, ff1c13d8, 0)
libc.so.1`_assert+0x64(213a8, 213d8, 277, 8d990, fc8bc, 32008)
0x1afe8(11, 0, 1a2d78, dff40, 16f2a400, 4)
0x1b028(8df60, 8cfd0, 0, 0, 0, 4)
make_root_vdev+0x9c(abe48, 0, 1, 0, 8df60, 8cfd0)
0x1342c(8, abe48, 0, 7, 0, ffbffdca)
main+0x154(9, ffbffce4, 9, 3, 33400, ffbffdc6)
_start+0x108(0, 0, 0, 0, 0, 0)

I'm happy to further poke at the core file or provide other data if anyone's interested...
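Specifically, this is the kind of poking I had in mind (assuming the proc tools on this release accept a core file, which I believe they do):

pstack core        # user-level stack, same information as the $c output above
pargs core         # recover the exact zpool command line that produced the core
mdb core           # then ::status for the fault summary and $C for frames with arguments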
Brad Plecs wrote:
>> Are you sure this isn't a case of CR 6433264, which was fixed
>> long ago but arrived in patch 118833-36 for Solaris 10?
>
> It certainly looks similar, but this system already had 118833-36 when the error occurred, so if that bug is truly fixed, this must be something else. Then again, I wasn't adding spares, I was adding a raidz1 group, so maybe it was patched for adding spares but not for other vdev types.
>
> I looked at the bug ID but couldn't tell whether there is a simple test I could perform to determine if this is the same bug, a related one, or something completely new. The error message is the same except for the reported line number.

Assertion failures are always bugs.
http://en.wikipedia.org/wiki/Assertion_(computing)

The question is whether this is a new one or not. It sounds like a new one, so please file a bug.
http://bugs.opensolaris.org

 -- richard
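P.S. When you file it, the data worth attaching is roughly the following; treat this as a sketch of what is usually asked for rather than the exact bug-template wording:

uname -a                   # OS release and kernel patch level
showrev -p | grep 118833   # which revision of patch 118833 is actually installed
zpool upgrade -v           # on-disk versions the installed bits support
zpool status               # pool state at the time of the failure
# plus the $c stack from mdb and, if possible, the core file itself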