I have a zpool with one dataset and a handful of snapshots. I cannot delete two of the snapshots. The message I get is "dataset is busy". Neither fuser nor lsof show anything holding open the .zfs/snapshot/<snapshot name> directory. What can cause this?

xxx> uname -a
SunOS nyc-sed3 5.10 Generic_142909-17 sun4u sparc SUNW,SPARC-Enterprise
xxx> zpool upgrade
This system is currently running ZFS pool version 22.
All pools are formatted using this version.
xxx> zpool get all zpool-01
NAME      PROPERTY       VALUE                 SOURCE
zpool-01  size           74.9T                 -
zpool-01  capacity       10%                   -
zpool-01  altroot        -                     default
zpool-01  health         ONLINE                -
zpool-01  guid           6976165213827467407   default
zpool-01  version        22                    default
zpool-01  bootfs         -                     default
zpool-01  delegation     on                    default
zpool-01  autoreplace    off                   default
zpool-01  cachefile      -                     default
zpool-01  failmode       wait                  default
zpool-01  listsnapshots  on                    default
zpool-01  autoexpand     off                   default
zpool-01  free           67.2T                 -
zpool-01  allocated      7.75T                 -
xxx> zfs upgrade
This system is currently running ZFS filesystem version 4.
All filesystems are formatted with the current version.
xxx> zfs get all zpool-01/dataset-01
NAME                 PROPERTY              VALUE                  SOURCE
zpool-01/dataset-01  type                  filesystem             -
zpool-01/dataset-01  creation              Tue Jan 25 10:02 2011  -
zpool-01/dataset-01  used                  4.60T                  -
zpool-01/dataset-01  available             39.3T                  -
zpool-01/dataset-01  referenced            1.09M                  -
zpool-01/dataset-01  compressratio         1.54x                  -
zpool-01/dataset-01  mounted               yes                    -
zpool-01/dataset-01  quota                 none                   default
zpool-01/dataset-01  reservation           none                   default
zpool-01/dataset-01  recordsize            32K                    inherited from zpool-01
zpool-01/dataset-01  mountpoint            /zpool-01/dataset-01   default
zpool-01/dataset-01  sharenfs              off                    default
zpool-01/dataset-01  checksum              on                     default
zpool-01/dataset-01  compression           on                     inherited from zpool-01
zpool-01/dataset-01  atime                 on                     default
zpool-01/dataset-01  devices               on                     default
zpool-01/dataset-01  exec                  on                     default
zpool-01/dataset-01  setuid                on                     default
zpool-01/dataset-01  readonly              off                    default
zpool-01/dataset-01  zoned                 off                    default
zpool-01/dataset-01  snapdir               hidden                 default
zpool-01/dataset-01  aclmode               passthrough            inherited from zpool-01
zpool-01/dataset-01  aclinherit            passthrough            inherited from zpool-01
zpool-01/dataset-01  canmount              on                     default
zpool-01/dataset-01  shareiscsi            off                    default
zpool-01/dataset-01  xattr                 on                     default
zpool-01/dataset-01  copies                1                      default
zpool-01/dataset-01  version               4                      -
zpool-01/dataset-01  utf8only              off                    -
zpool-01/dataset-01  normalization         none                   -
zpool-01/dataset-01  casesensitivity       sensitive              -
zpool-01/dataset-01  vscan                 off                    default
zpool-01/dataset-01  nbmand                off                    default
zpool-01/dataset-01  sharesmb              off                    default
zpool-01/dataset-01  refquota              none                   default
zpool-01/dataset-01  refreservation        none                   default
zpool-01/dataset-01  primarycache          all                    default
zpool-01/dataset-01  secondarycache        all                    default
zpool-01/dataset-01  usedbysnapshots       4.60T                  -
zpool-01/dataset-01  usedbydataset         1.09M                  -
zpool-01/dataset-01  usedbychildren        0                      -
zpool-01/dataset-01  usedbyrefreservation  0                      -
zpool-01/dataset-01  logbias               latency                default
xxx> zfs list | grep zpool-01/dataset-01
zpool-01/dataset-01             4.60T  39.3T  1.09M  /zpool-01/dataset-01
zpool-01/dataset-01@1299636001   117G      -  1.12T  -
zpool-01/dataset-01@1300233615  3.48T      -  4.48T  -
zpool-01/dataset-01@1301950939      0      -  1.02M  -
zpool-01/dataset-01@1301951162      0      -  1.02M  -
zpool-01/dataset-01@1302004805      0      -  1.09M  -
zpool-01/dataset-01@1302005162      0      -  1.09M  -
zpool-01/dataset-01@1302005414      0      -  1.09M  -
xxx> sudo zfs destroy zpool-01/dataset-01@1299636001
Password:
cannot destroy 'zpool-01/dataset-01@1299636001': dataset is busy
xxx> sudo zfs destroy zpool-01/dataset-01@1300233615
cannot destroy 'zpool-01/dataset-01@1300233615': dataset is busy
xxx>

--
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company ( http://www.sloctheater.org/ )
-> Technical Advisor, RPI Players
On 04/ 6/11 12:28 AM, Paul Kraus wrote:
> I have a zpool with one dataset and a handful of snapshots. I
> cannot delete two of the snapshots. The message I get is "dataset is
> busy". Neither fuser nor lsof show anything holding open the
> .zfs/snapshot/<snapshot name> directory. What can cause this?
>
Do you have any clones?

--
Ian.
On Tue, Apr 5, 2011 at 5:29 PM, Ian Collins <ian@ianshome.com> wrote:
> On 04/ 6/11 12:28 AM, Paul Kraus wrote:
>> I have a zpool with one dataset and a handful of snapshots. I
>> cannot delete two of the snapshots. The message I get is "dataset is
>> busy". Neither fuser nor lsof show anything holding open the
>> .zfs/snapshot/<snapshot name> directory. What can cause this?
>>
> Do you have any clones?

Nope, just basic snapshots. I did a `zfs destroy -d` and that did not complain, so I'll see if they magically disappear at some point in the future. I just can't figure out what could be holding those snapshots open and preventing the destroy. It reminds me of the first time I could not umount a UFS filesystem while fuser/lsof showed nothing... it was NFS shared, and the kernel does not show up in fuser/lsof.

--
Paul Kraus
On 04/05/11 17:29, Ian Collins wrote:
> On 04/ 6/11 12:28 AM, Paul Kraus wrote:
>> I have a zpool with one dataset and a handful of snapshots. I
>> cannot delete two of the snapshots. The message I get is "dataset is
>> busy". Neither fuser nor lsof show anything holding open the
>> .zfs/snapshot/<snapshot name> directory. What can cause this?
>>
> Do you have any clones?

If there are clones then zfs destroy should report that. The error being reported is "dataset is busy", which would be reported if there are user holds on the snapshots that can't be deleted.

Try running "zfs holds zpool-01/dataset-01@1299636001".

--
Rich
> From: zfs-discuss-bounces@opensolaris.org [mailto:zfs-discuss-bounces@opensolaris.org] On Behalf Of Paul Kraus
>
> I have a zpool with one dataset and a handful of snapshots. I
> cannot delete two of the snapshots. The message I get is "dataset is
> busy". Neither fuser nor lsof show anything holding open the
> .zfs/snapshot/<snapshot name> directory. What can cause this?

This may not apply to you, but in another, unrelated situation it was useful...

Try zdb -d poolname

In an older version of zpool, under certain conditions, there would sometimes be "hidden" clones listed with a % in the name. Maybe the % won't be there in your case, but perhaps you have some other manifestation of the hidden-clone problem?
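A quick way to check for that condition is to filter the dataset listing for the '%' marker. A minimal sketch, assuming the pool name from this thread (zpool-01):

    # lists only datasets whose names contain '%', i.e. the hidden clones described above
    zdb -d zpool-01 | grep '%'
    # no output means zdb lists no hidden (%-named) clone datasets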
On Tue, Apr 5, 2011 at 6:56 PM, Rich Morris <rich.morris@oracle.com> wrote:
> On 04/05/11 17:29, Ian Collins wrote:
> If there are clones then zfs destroy should report that. The error being
> reported is "dataset is busy", which would be reported if there are user
> holds on the snapshots that can't be deleted.
>
> Try running "zfs holds zpool-01/dataset-01@1299636001".

xxx> zfs holds zpool-01/dataset-01@1299636001
NAME                            TAG            TIMESTAMP
zpool-01/dataset-01@1299636001  .send-18440-0  Tue Mar 15 20:00:39 2011
xxx> zfs holds zpool-01/dataset-01@1300233615
NAME                            TAG            TIMESTAMP
zpool-01/dataset-01@1300233615  .send-18440-0  Tue Mar 15 20:00:47 2011
xxx>

That is what I was looking for. Looks like when a zfs send got killed it left a hanging lock (hold) around. I assume these will clear on the next export/import (not likely, as this is a production zpool) or on a reboot (which will happen eventually, and I can wait), unless there is a way to force-clear the hold.

Thanks Rich.

--
Paul Kraus
On Tue, Apr 5, 2011 at 9:26 PM, Edward Ned Harvey
<opensolarisisdeadlongliveopensolaris@nedharvey.com> wrote:
> This may not apply to you, but in another, unrelated situation it was
> useful...
>
> Try zdb -d poolname
>
> In an older version of zpool, under certain conditions, there would
> sometimes be "hidden" clones listed with a % in the name. Maybe the % won't
> be there in your case, but perhaps you have some other manifestation of the
> hidden-clone problem?

I have seen a dataset with a '%' in the name, but that was during a zfs recv (and if the zfs recv dies, it sometimes hangs around and has to be destroyed, and the zfs destroy claims to fail even though it succeeds ;-), but not in this case. The snapshots are all valid (I just can't destroy two of them); we are snapshotting on a fairly frequent basis as we load data. Thanks for the suggestion.

xxx> zdb -d zpool-01
Dataset mos [META], ID 0, cr_txg 4, 18.7G, 745 objects
Dataset zpool-01/dataset-01@1302019202 [ZPL], ID 140, cr_txg 654658, 38.9G, 990842 objects
Dataset zpool-01/dataset-01@1302051600 [ZPL], ID 158, cr_txg 655776, 71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1302062401 [ZPL], ID 189, cr_txg 656162, 71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1301951162 [ZPL], ID 108, cr_txg 652292, 1.02M, 478 objects
Dataset zpool-01/dataset-01@1302087601 [ZPL], ID 254, cr_txg 657065, 71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1302105601 [ZPL], ID 291, cr_txg 657710, 71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1302058800 [ZPL], ID 164, cr_txg 656033, 71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1299636001 [ZPL], ID 48, cr_txg 560375, 1.12T, 28468324 objects
Dataset zpool-01/dataset-01@1302007173 [ZPL], ID 125, cr_txg 654202, 1.09M, 506 objects
Dataset zpool-01/dataset-01@1302055201 [ZPL], ID 161, cr_txg 655905, 71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1302080401 [ZPL], ID 248, cr_txg 656807, 71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1302044400 [ZPL], ID 152, cr_txg 655518, 71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1301950939 [ZPL], ID 106, cr_txg 652280, 1.02M, 478 objects
Dataset zpool-01/dataset-01@1302015602 [ZPL], ID 137, cr_txg 654530, 10.3G, 175879 objects
Dataset zpool-01/dataset-01@1302030001 [ZPL], ID 143, cr_txg 655029, 71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1300233615 [ZPL], ID 79, cr_txg 594951, 4.48T, 99259515 objects
Dataset zpool-01/dataset-01@1302094801 [ZPL], ID 282, cr_txg 657323, 71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1302066001 [ZPL], ID 214, cr_txg 656291, 71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1302006933 [ZPL], ID 120, cr_txg 654181, 1.09M, 506 objects
Dataset zpool-01/dataset-01@1302098401 [ZPL], ID 285, cr_txg 657452, 71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1302007755 [ZPL], ID 131, cr_txg 654240, 1.09M, 506 objects
Dataset zpool-01/dataset-01@1302048001 [ZPL], ID 155, cr_txg 655647, 71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1302005414 [ZPL], ID 116, cr_txg 654119, 1.09M, 506 objects
Dataset zpool-01/dataset-01@1302007469 [ZPL], ID 128, cr_txg 654221, 1.09M, 506 objects
Dataset zpool-01/dataset-01@1302084001 [ZPL], ID 251, cr_txg 656936, 71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1302076801 [ZPL], ID 245, cr_txg 656678, 71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1302069601 [ZPL], ID 217, cr_txg 656420, 71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1302073201 [ZPL], ID 242, cr_txg 656549, 71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1302102001 [ZPL], ID 288, cr_txg 657581, 71.1G, 1845553 objects
Dataset zpool-01/dataset-01@1302005162 [ZPL], ID 112, cr_txg 654101, 1.09M, 506 objects
Dataset zpool-01/dataset-01@1302012001 [ZPL], ID 134, cr_txg 654391, 1.18G, 63312 objects
Dataset zpool-01/dataset-01@1302004805 [ZPL], ID 110, cr_txg 654085, 1.09M, 506 objects
Dataset zpool-01/dataset-01@1302006769 [ZPL], ID 118, cr_txg 654171, 1.09M, 506 objects
Dataset zpool-01/dataset-01@1302091201 [ZPL], ID 257, cr_txg 657194, 71.1G, 1845553 objects
Dataset zpool-01/dataset-01 [ZPL], ID 84, cr_txg 439406, 71.1G, 1845553 objects
Dataset zpool-01 [ZPL], ID 16, cr_txg 1, 39.3K, 5 objects
xxx>

--
Paul Kraus
On 04/06/11 12:43, Paul Kraus wrote:
> xxx> zfs holds zpool-01/dataset-01@1299636001
> NAME                            TAG            TIMESTAMP
> zpool-01/dataset-01@1299636001  .send-18440-0  Tue Mar 15 20:00:39 2011
> xxx> zfs holds zpool-01/dataset-01@1300233615
> NAME                            TAG            TIMESTAMP
> zpool-01/dataset-01@1300233615  .send-18440-0  Tue Mar 15 20:00:47 2011
> xxx>
>
> That is what I was looking for. Looks like when a zfs send got
> killed it left a hanging lock (hold) around. I assume the next
> export/import (not likely as this is a production zpool) or a reboot
> (will happen eventually, and I can wait) these will clear. Unless
> there is a way to force clear the hold.

The user holds won't be released by an export/import or a reboot.

"zfs get defer_destroy snapname" will show whether this snapshot is marked for deferred destroy, and "zfs release .send-18440-0 snapname" will clear that hold. If the snapshot is marked for deferred destroy then the release of the last tag will also destroy it.

--
Rich
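Putting this advice together with the earlier "zfs holds" output, the cleanup sequence would look roughly like this; a sketch only, using the hold tag and the first snapshot name reported above (repeat for the second snapshot):

    # confirm the hold and check whether the snapshot is already marked for deferred destroy
    zfs holds zpool-01/dataset-01@1299636001
    zfs get defer_destroy zpool-01/dataset-01@1299636001

    # release the stale hold left behind by the killed zfs send
    zfs release .send-18440-0 zpool-01/dataset-01@1299636001

    # if defer_destroy was on, the release above also destroys the snapshot;
    # otherwise destroy it explicitly
    zfs destroy zpool-01/dataset-01@1299636001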
On Wed, Apr 6, 2011 at 1:58 PM, Rich Morris <rich.morris@oracle.com> wrote:
> On 04/06/11 12:43, Paul Kraus wrote:
>> [...]
>> That is what I was looking for. Looks like when a zfs send got
>> killed it left a hanging lock (hold) around.
>
> The user holds won't be released by an export/import or a reboot.
>
> "zfs get defer_destroy snapname" will show whether this snapshot is marked
> for deferred destroy, and "zfs release .send-18440-0 snapname" will clear
> that hold. If the snapshot is marked for deferred destroy then the release
> of the last tag will also destroy it.

Sorry I did not get back on this last week; it got busy late in the week.

I tried the `zfs release` and it appeared to hang, so I just let it be. A few hours later the server experienced a resource crunch of some type (fork errors about being unable to allocate resources). The load also varied between about 16 and 50 (it is a 16-CPU M4000).

Users who had an open Samba connection seemed OK, but eventually we needed to reboot the box (I did let it sit in that state as long as I could). Since I could not even get on the XSCF console, I had to `break` it to the OK prompt and sync it. The first boot hung. I then did a boot -rv, hoping to see a device probe that caused the hang, but it looked like it was getting past all the device discovery before it also hung. Finally a boot -srv got me to a login prompt. I logged in as root, then logged out, and it came up to multiuser-server without a hitch.

I do not know what the root cause of the initial resource problem was, as I did not get a good core dump. I *hope* it was not the `zfs release`, but it may have been.

After the boot cycle(s) the zfs snapshots are no longer held and I could destroy them.

Thanks to all those who helped. This discussion is one of the best sources, if not THE best source, of zfs support and knowledge.

--
Paul Kraus
On Apr 11, 2011, at 3:22 PM, Paul Kraus wrote:
> On Wed, Apr 6, 2011 at 1:58 PM, Rich Morris <rich.morris@oracle.com> wrote:
>> "zfs get defer_destroy snapname" will show whether this snapshot is marked
>> for deferred destroy, and "zfs release .send-18440-0 snapname" will clear
>> that hold. If the snapshot is marked for deferred destroy then the release
>> of the last tag will also destroy it.
>
> [...]
>
> After the boot cycle(s) the zfs snapshots are no longer held and I
> could destroy them.
>
> Thanks to all those who helped. This discussion is one of the best
> sources, if not THE best source, of zfs support and knowledge.

I hate to dredge up this "old" email thread, but I just wanted to:

a) say thanks ("thanks!"), as I had exactly this same issue crop up on Sol10u9 (zpool rev 22) and, sure enough, the snapshot had a hold from a previous send.

b) mention (for those who may find this thread in the future) that once I found the hold, the "zfs release [hold] [snapname]" method mentioned above worked swimmingly for me. I was nervous doing this during production hours, but the release command returned in about 5-7 seconds with no apparent adverse effects. I was then able to destroy the snap.

I was initially afraid that it was somehow the "memory bug" mentioned in the current thread (when things are fresh in your mind, they seem more likely), so I'm glad this thread was out there.

matt
Hi,

Through my carelessness, I added two disks to a raidz2 zpool as normal data disks, when in fact I wanted to make them ZIL (log) devices. Is there any remedy?

Many thanks.

Fred
From: Edward Ned Harvey
Date: 2011-Sep-19 11:16 UTC
Subject: [zfs-discuss] remove wrongly added device from zpool
> From: zfs-discuss-bounces@opensolaris.org [mailto:zfs-discuss-bounces@opensolaris.org] On Behalf Of Fred Liu
>
> Through my carelessness, I added two disks to a raidz2 zpool as normal data
> disks, when in fact I wanted to make them ZIL (log) devices.

That's a huge bummer, and it's the main reason why device removal has been a priority request for such a long time... There is no solution. You can only destroy & recreate your pool, or learn to live with it that way.

Sorry...
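For the "destroy & recreate" route, the usual approach is to replicate the datasets elsewhere first with zfs send/receive and then rebuild the pool with the intended layout. A rough sketch, not a tested recipe: the pool names and disk lists below are placeholders, and the receive flags assume a reasonably recent zfs version:

    # snapshot everything recursively and copy it to a scratch pool
    zfs snapshot -r mypool@migrate
    zfs send -R mypool@migrate | zfs receive -Fdu scratch

    # after verifying the copy: destroy the pool, recreate it with the layout
    # actually wanted (raidz2 data disks plus mirrored log devices),
    # then replicate the data back
    zpool destroy mypool
    zpool create mypool raidz2 disk1 disk2 disk3 disk4 disk5 disk6 \
        log mirror ssd1 ssd2
    zfs send -R scratch@migrate | zfs receive -Fdu mypool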
> That's a huge bummer, and it's the main reason why device removal has been a
> priority request for such a long time... There is no solution. You can
> only destroy & recreate your pool, or learn to live with it that way.
>
> Sorry...

Yeah, I also realized this as soon as I sent the message. In NetApp it is so easy to change the RAID group size; there is still a long way for ZFS to go. I hope I can see that in the future.

I also made another huge mistake which has really brought me deep pain: I physically removed the two added devices, because I thought the raidz2 could afford it. Now the whole pool is corrupt and I don't know where to go from here...

Any help will be tremendously appreciated.

Thanks.

Fred
On 19 September, 2011 - Fred Liu sent me these 0,9K bytes:

> > That's a huge bummer, and it's the main reason why device removal has been a
> > priority request for such a long time... There is no solution. You can
> > only destroy & recreate your pool, or learn to live with it that way.
> >
> > Sorry...
>
> Yeah, I also realized this as soon as I sent the message. In NetApp it is so
> easy to change the RAID group size; there is still a long way for ZFS to go.
> I hope I can see that in the future.
>
> I also made another huge mistake which has really brought me deep pain:
> I physically removed the two added devices, because I thought the raidz2
> could afford it. Now the whole pool is corrupt and I don't know where to
> go from here...
>
> Any help will be tremendously appreciated.

You can add mirrors to those lonely disks.

/Tomas
--
Tomas Forsman, stric@acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
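In other words, once the pool can be imported again, each accidentally added single-disk vdev can be turned into a two-way mirror so it is no longer a single point of failure. A sketch with placeholder device names:

    # attach a second disk to each lonely top-level device; zpool will resilver onto it
    zpool attach mypool lonely-disk-1 new-disk-1
    zpool attach mypool lonely-disk-2 new-disk-2
    zpool status mypool    # watch the resilver complete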
From: Edward Ned Harvey
Date: 2011-Sep-19 12:07 UTC
Subject: [zfs-discuss] remove wrongly added device from zpool
> From: Fred Liu [mailto:Fred_Liu@issi.com]
>
> Yeah, I also realized this as soon as I sent the message. In NetApp it is so
> easy to change the RAID group size; there is still a long way for ZFS to go.
> I hope I can see that in the future.

This one missing feature of ZFS, IMHO, does not amount to "a long way for zfs to go" in relation to NetApp. I shut off my NetApp 2 years ago in favor of ZFS, because ZFS performs so darn much better, and has such immensely greater robustness. Try doing NDMP, CIFS, NFS, iSCSI on NetApp (all extra licenses). Try experimenting with the new version of NetApp to see how good it is (you can't, unless you buy a whole new box). Try mirroring a production box onto a lower-cost secondary backup box (there is no such thing). Try storing your backup on disk and rotating your disks offsite. Try running any "normal" utilities - iostat, top, wireshark - you can't. Try backing up with commercial or otherwise modular (agent-based) backup software. You can't. You have to use CIFS/NFS/NDMP.

Just try finding a public mailing list like this one where you can even so much as begin such a conversation about NetApp... Been there, done that; it's not even in the same ballpark.

etc etc. (end rant.) I hate NetApp.

> I also made another huge mistake which has really brought me deep pain:
> I physically removed the two added devices, because I thought the raidz2
> could afford it.
> Now the whole pool is corrupt and I don't know where to go from here...
> Any help will be tremendously appreciated.

Um... Wanna post your "zpool status" and "cat /etc/release" and "zpool upgrade"?
> This one missing feature of ZFS, IMHO, does not amount to "a long way for
> zfs to go" in relation to NetApp. I shut off my NetApp 2 years ago in favor
> of ZFS, because ZFS performs so darn much better, and has such immensely
> greater robustness.
> [...]
> etc etc. (end rant.) I hate NetApp.

Yeah, it is kind of a touchy topic; we can discuss it more in the future. I want to focus on how to repair my pool first. ;-(

> Um... Wanna post your "zpool status" and "cat /etc/release" and "zpool
> upgrade"?

I exported the pool because I wanted to use zpool import -F to fix it. But now I get:

    one or more devices is currently unavailable
    Destroy and re-create the pool from a backup source.

I use OpenSolaris b134 and zpool version 22.

Thanks.

Fred
> You can add mirrors to those lonely disks.

Can that repair the pool?

Thanks.

Fred
On Mon, September 19, 2011 08:07, Edward Ned Harvey wrote:
> This one missing feature of ZFS, IMHO, does not amount to "a long way for
> zfs to go" in relation to NetApp. I shut off my NetApp 2 years ago in
> favor of ZFS, because ZFS performs so darn much better, and has such
> immensely greater robustness. Try doing NDMP, CIFS, NFS, iSCSI on NetApp
> (all extra licenses). Try experimenting with the new version of NetApp to
> see how good it is (you can't, unless you buy a whole new box).

As another datum, at $WORK we're going to Isilon. Our NetApp is being retired by the end of the year as it just can't handle the load of HPC. We also have the regular assortment of web, mail, code repositories, etc., VMs that also live on Isilon. We're quite happy, especially with the more recent Isilon hardware that uses SSDs to store/cache metadata. NFS and CIFS are quite good, but we haven't really tried their iSCSI stuff yet; they don't have FC at all. We also have a bunch of BlueArc, but find it much more finicky than Isilon. Perhaps Hitachi will help them stabilize things a bit.

As for experimenting with NetApp, they do have a "simulator" that you can run in a VM if you wish (or on actual hardware, AFAICT).

A bit more on topic: bp* rewrite has been a long time coming, and AFAICT it won't be in Solaris 11. As it stands, I don't care much about changing RAID levels, but not being able to remove a mistakenly added device is becoming more and more conspicuous. For better or worse I'm not doing as much Solaris stuff these days (especially with the new Ellison pricing model), but I still pay attention to what's going on, and this missing feature is one of those "WTF?" things, the fly in the otherwise very tasty soup that is ZFS.
From: Edward Ned Harvey
Date: 2011-Sep-19 13:25 UTC
Subject: [zfs-discuss] remove wrongly added device from zpool
> From: zfs-discuss-bounces@opensolaris.org [mailto:zfs-discuss-bounces@opensolaris.org] On Behalf Of Fred Liu
>
> Through my carelessness, I added two disks to a raidz2 zpool as normal data
> disks,

> -----Original Message-----
> From: Fred Liu [mailto:Fred_Liu@issi.com]
>
> I also made another huge mistake which has really brought me deep pain:
> I physically removed the two added devices, because I thought the raidz2
> could afford it.

So... You accidentally added non-redundant disks to a pool. They were not part of the raidz2, so the redundancy in the raidz2 did not help you. You removed the non-redundant disks, and now the pool is faulted.

The only thing you can do is: add the disks back to the pool (re-insert them into the system). Then you should be able to import the pool.

Now, you don't want these devices in the pool. You must either destroy & recreate your pool, or add redundancy to your non-redundant devices.
> So... You accidentally added non-redundant disks to a pool. They were not
> part of the raidz2, so the redundancy in the raidz2 did not help you. You
> removed the non-redundant disks, and now the pool is faulted.
>
> The only thing you can do is: add the disks back to the pool (re-insert
> them into the system). Then you should be able to import the pool.
>
> Now, you don't want these devices in the pool. You must either destroy &
> recreate your pool, or add redundancy to your non-redundant devices.

Yes. I have connected them back to the server, but it does not help. I am really sad now...
On Mon, Sep 19, 2011 at 9:29 AM, Fred Liu <Fred_Liu@issi.com> wrote:
> Yes. I have connected them back to the server, but it does not help.
> I am really sad now...

I cringed a little when I read the thread title. I did this by accident once as well, but "luckily" for me I had enough scratch storage around, in various sizes, to cobble together a JBOD (risky) and use it as a holding area for my data while I remade the pool.

I'm a home user with only around 21TB or so, so it was feasible for me. Probably not so feasible for you enterprise guys with 1000s of users and 100s of filesystems!

--khd
From: Edward Ned Harvey
Date: 2011-Sep-19 13:48 UTC
Subject: [zfs-discuss] remove wrongly added device from zpool
> From: Krunal Desai [mailto:movszx@gmail.com]
>
> On Mon, Sep 19, 2011 at 9:29 AM, Fred Liu <Fred_Liu@issi.com> wrote:
> > Yes. I have connected them back to the server, but it does not help.
> > I am really sad now...

I'll tell you what does not help: this email. Now that you know what you're trying to do, why don't you post the results of your "zpool import" command? How about an error message, and how you're trying to go about fixing your pool? Nobody here can help you without information.

> I cringed a little when I read the thread title. I did this by
> accident once as well, but "luckily" for me I had enough scratch
> storage around, in various sizes, to cobble together a JBOD (risky) and
> use it as a holding area for my data while I remade the pool.
>
> I'm a home user with only around 21TB or so, so it was feasible
> for me. Probably not so feasible for you enterprise guys with 1000s of
> users and 100s of filesystems!

No enterprise guys with 1000s of users and 100s of filesystems are making this mistake. Even if it does happen, on a pool that significant the obvious response is to add redundancy instead of recreating the pool.
> I'll tell you what does not help: this email. Now that you know what
> you're trying to do, why don't you post the results of your "zpool
> import" command? How about an error message, and how you're trying to
> go about fixing your pool? Nobody here can help you without
> information.

User     tty        login@  idle  JCPU  PCPU  what
root     console    9:25pm                    w
root@cn03:~# df
Filesystem           1K-blocks      Used Available Use% Mounted on
rpool/ROOT/opensolaris
                      94109412   6880699  87228713   8% /
swap                 108497952       344 108497608   1% /etc/svc/volatile
/usr/lib/libc/libc_hwcap1.so.1
                      94109412   6880699  87228713   8% /lib/libc.so.1
swap                 108497616         8 108497608   1% /tmp
swap                 108497688        80 108497608   1% /var/run
rpool/export             46864        23     46841   1% /export
rpool/export/home        46864        23     46841   1% /export/home
rpool/export/home/fred
                         48710      5300     43410  11% /export/home/fred
rpool                102155158        80 102155078   1% /rpool
root@cn03:~# !z
zpool import cn03
cannot import 'cn03': one or more devices is currently unavailable
        Destroy and re-create the pool from
        a backup source.

Thanks.

Fred
I also tried zpool import -fFX cn03 on both b134 and b151a (via an SX11 live CD). It results in a core dump and reboot after about 15 minutes, and I can see all the LEDs blinking on the HDDs during those 15 minutes.

Can replacing the empty ZIL devices help?

Thanks.

Fred

> -----Original Message-----
> From: Fred Liu
> Sent: Monday, September 19, 2011 21:54
> To: 'Edward Ned Harvey'; 'Krunal Desai'
> Cc: zfs-discuss@opensolaris.org
> Subject: RE: [zfs-discuss] remove wrongly added device from zpool
>
> [...]
> zpool import cn03
> cannot import 'cn03': one or more devices is currently unavailable
>         Destroy and re-create the pool from
>         a backup source.
The core dump:

        r10: ffffff19a5592000  r11: 0                 r12: 0
        r13: 0                 r14: 0                 r15: ffffff00ba4a5c60
        fsb: fffffd7fff172a00  gsb: ffffff19a5592000   ds: 0
         es: 0                  fs: 0                  gs: 0
        trp: e                 err: 0                 rip: fffffffff782f81a
         cs: 30                rfl: 10246             rsp: ffffff00b9bf0a40
         ss: 38

ffffff00b9bf0830 unix:die+10f ()
ffffff00b9bf0940 unix:trap+177b ()
ffffff00b9bf0950 unix:cmntrap+e6 ()
ffffff00b9bf0ab0 procfs:prchoose+72 ()
ffffff00b9bf0b00 procfs:prgetpsinfo+2b ()
ffffff00b9bf0ce0 procfs:pr_read_psinfo+4e ()
ffffff00b9bf0d30 procfs:prread+72 ()
ffffff00b9bf0da0 genunix:fop_read+6b ()
ffffff00b9bf0f00 genunix:pread+22c ()
ffffff00b9bf0f10 unix:brand_sys_syscall+20d ()

syncing file systems... done
dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
 0:17 100% done
100% done: 1041082 pages dumped, dump succeeded
rebooting...

> -----Original Message-----
> From: Fred Liu
> Sent: Monday, September 19, 2011 22:00
> Subject: RE: [zfs-discuss] remove wrongly added device from zpool
>
> I also tried zpool import -fFX cn03 on both b134 and b151a (via an SX11
> live CD). It results in a core dump and reboot after about 15 minutes,
> and I can see all the LEDs blinking on the HDDs during those 15 minutes.
> Can replacing the empty ZIL devices help?
On Sep 19, 2011, at 12:10 AM, Fred Liu <Fred_Liu@issi.com> wrote:
> Hi,
>
> Through my carelessness, I added two disks to a raidz2 zpool as normal data
> disks, when in fact I wanted to make them ZIL (log) devices.

You don't mention which OS you are using, but for the past 5 years of [Open]Solaris releases, the system prints a warning message and will not allow this to occur without using the force option (-f).
 -- richard
I use OpenSolaris b134.

Thanks.

Fred

> -----Original Message-----
> From: Richard Elling [mailto:richard.elling@gmail.com]
> Sent: Monday, September 19, 2011 22:21
> Subject: Re: [zfs-discuss] remove wrongly added device from zpool
>
> You don't mention which OS you are using, but for the past 5 years of
> [Open]Solaris releases, the system prints a warning message and will not
> allow this to occur without using the force option (-f).
> -- richard
> You don't mention which OS you are using, but for the past 5 years of
> [Open]Solaris releases, the system prints a warning message and will not
> allow this to occur without using the force option (-f).
> -- richard

Yes, there is a warning message; I used zpool add -f.

Thanks.

Fred
I have made some progress; the import now reports the following:

zpool import
  pool: cn03
    id: 1907858070511204110
 state: UNAVAIL
status: One or more devices are missing from the system.
action: The pool cannot be imported. Attach the missing
        devices and try again.
   see: http://www.sun.com/msg/ZFS-8000-6X
config:

        cn03                       UNAVAIL  missing device
          raidz2-0                 ONLINE
            c4t5000C5000970B70Bd0  ONLINE
            c4t5000C5000972C693d0  ONLINE
            c4t5000C500097009DBd0  ONLINE
            c4t5000C500097040BFd0  ONLINE
            c4t5000C5000970727Fd0  ONLINE
            c4t5000C50009707487d0  ONLINE
            c4t5000C50009724377d0  ONLINE
            c4t5000C50039F0B447d0  ONLINE
          c22t3d0                  ONLINE
          c4t50015179591C238Fd0    ONLINE
        logs
          c22t4d0                  ONLINE
          c22t5d0                  ONLINE

        Additional devices are known to be part of this pool, though their
        exact configuration cannot be determined.

Any suggestions?

Thanks.

Fred

> -----Original Message-----
> From: Fred Liu
> Sent: Monday, September 19, 2011 22:28
> To: 'Richard Elling'
> Cc: zfs-discuss@opensolaris.org
> Subject: RE: [zfs-discuss] remove wrongly added device from zpool
>
> Yes, there is a warning message; I used zpool add -f.
And:

format
Searching for disks...done

c22t2d0: configured with capacity of 1.77GB

AVAILABLE DISK SELECTIONS:
       0. c4t5000C5003AC39D5Fd0 <SEAGATE-ST3600057SS-ES64-558.91GB>
          /scsi_vhci/disk@g5000c5003ac39d5f
       1. c4t5000C50039F0B447d0 <SEAGATE-ST3600057SS-ES64-558.91GB>
          /scsi_vhci/disk@g5000c50039f0b447
       2. c4t5000C5000970B70Bd0 <SEAGATE-ST3600057SS-ES62-558.91GB>
          /scsi_vhci/disk@g5000c5000970b70b
       3. c4t5000C5000972C693d0 <SEAGATE-ST3600057SS-ES62-558.91GB>
          /scsi_vhci/disk@g5000c5000972c693
       4. c4t5000C500097009DBd0 <SEAGATE-ST3600057SS-ES62-558.91GB>
          /scsi_vhci/disk@g5000c500097009db
       5. c4t5000C500097040BFd0 <SEAGATE-ST3600057SS-ES62-558.91GB>
          /scsi_vhci/disk@g5000c500097040bf
       6. c4t5000C5000970727Fd0 <SEAGATE-ST3600057SS-ES62-558.91GB>
          /scsi_vhci/disk@g5000c5000970727f
       7. c4t5000C50009724377d0 <SEAGATE-ST3600057SS-ES62-558.91GB>
          /scsi_vhci/disk@g5000c50009724377
       8. c4t5000C50009707487d0 <SEAGATE-ST3600057SS-ES62-558.91GB>
          /scsi_vhci/disk@g5000c50009707487
       9. c4t50015179591C238Fd0 <ATA-INTEL SSDSA2M160-02HA-149.05GB>
          /scsi_vhci/disk@g50015179591c238f
      10. c4t500151795910D221d0 <DEFAULT cyl 24915 alt 2 hd 224 sec 56>
          /scsi_vhci/disk@g500151795910d221
      11. c22t2d0 <ATA-ANS9010_2NNN2NNN-_200 cyl 908 alt 2 hd 128 sec 32>
          /pci@0,0/pci15d9,400@1f,2/disk@2,0
      12. c22t3d0 <ATA-ANS9010_2NNN2NNN-_200-1.78GB>
          /pci@0,0/pci15d9,400@1f,2/disk@3,0
      13. c22t4d0 <ATA-ANS9010_2NNN2NNN-_200-1.78GB>
          /pci@0,0/pci15d9,400@1f,2/disk@4,0
      14. c22t5d0 <ATA-ANS9010_2NNN2NNN-_200-1.78GB>
          /pci@0,0/pci15d9,400@1f,2/disk@5,0

> -----Original Message-----
> From: Fred Liu
> Sent: Monday, September 19, 2011 23:35
> To: Fred Liu; Richard Elling
> Cc: zfs-discuss@opensolaris.org
> Subject: RE: [zfs-discuss] remove wrongly added device from zpool
>
> I have made some progress; the import now reports the following:
> [...]
> Any suggestions?
On Sep 19, 2011, at 8:34 AM, Fred Liu wrote:
> I have made some progress; the import now reports the following:
>
> zpool import
>  pool: cn03
>    id: 1907858070511204110
> state: UNAVAIL
> status: One or more devices are missing from the system.
> action: The pool cannot be imported. Attach the missing
>        devices and try again.
>   see: http://www.sun.com/msg/ZFS-8000-6X
> [...]
>        Additional devices are known to be part of this pool, though their
>        exact configuration cannot be determined.
>
> Any suggestions?

For each disk, look at the output of "zdb -l /dev/rdsk/DISKNAMEs0".
1. Confirm that each disk provides 4 labels.
2. Build the vdev tree by hand and look to see which disk is missing.

This can be tedious and time consuming.
 -- richard

--
ZFS and performance consulting
http://www.RichardElling.com
VMworld Copenhagen, October 17-20
OpenStorage Summit, San Jose, CA, October 24-27
LISA '11, Boston, MA, December 4-9
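The first check can be scripted rather than run disk by disk. A rough sketch, assuming a POSIX shell and that the c*t*d0s0 glob matches the right devices; each label that unpacks prints one "version:" line, so a count of 4 means all four labels are readable:

    for d in /dev/rdsk/c*t*d0s0; do
      echo "$d: `zdb -l $d 2>/dev/null | grep -c 'version:'` of 4 labels readable"
    done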
> For each disk, look at the output of "zdb -l /dev/rdsk/DISKNAMEs0".
> 1. Confirm that each disk provides 4 labels.
> 2. Build the vdev tree by hand and look to see which disk is missing.
>
> This can be tedious and time consuming.

Do I need to export the pool first?

Can you give more details about #2 -- "Build the vdev tree by hand and look to see which disk is missing"?

Thanks.

Fred
On Sep 19, 2011, at 9:16 AM, Fred Liu wrote:
>> For each disk, look at the output of "zdb -l /dev/rdsk/DISKNAMEs0".
>> 1. Confirm that each disk provides 4 labels.
>> 2. Build the vdev tree by hand and look to see which disk is missing.
>>
>> This can be tedious and time consuming.
>
> Do I need to export the pool first?

No, but your pool is not imported.

> Can you give more details about #2 -- "Build the vdev tree by hand and
> look to see which disk is missing"?

The label, as displayed by "zdb -l", contains the hierarchy of the expected pool config. The contents are used to build the output you see in the "zpool import" or "zpool status" commands. zpool is complaining that it cannot find one of these disks, so look at the labels on the disks to determine what is or is not missing. The next steps depend on this knowledge.
 -- richard
> No, but your pool is not imported.

Yes, I see.

> The label, as displayed by "zdb -l", contains the hierarchy of the
> expected pool config. The contents are used to build the output you see
> in the "zpool import" or "zpool status" commands. zpool is complaining
> that it cannot find one of these disks, so look at the labels on the
> disks to determine what is or is not missing. The next steps depend on
> this knowledge.

zdb -l /dev/rdsk/c22t2d0s0
cannot open '/dev/rdsk/c22t2d0s0': I/O error

root@cn03:~# zdb -l /dev/rdsk/c22t3d0s0
--------------------------------------------
LABEL 0
--------------------------------------------
    version: 22
    name: 'cn03'
    state: 0
    txg: 18269872
    pool_guid: 1907858070511204110
    hostid: 13564652
    hostname: 'cn03'
    top_guid: 11074483144412112931
    guid: 11074483144412112931
    vdev_children: 6
    vdev_tree:
        type: 'disk'
        id: 1
        guid: 11074483144412112931
        path: '/dev/dsk/c22t3d0s0'
        devid: 'id1,sd@s4154412020202020414e53393031305f324e4e4e324e4e4e2020202020202020353632383637390000005f31/a'
        phys_path: '/pci@0,0/pci15d9,400@1f,2/disk@3,0:a'
        whole_disk: 1
        metaslab_array: 37414
        metaslab_shift: 24
        ashift: 9
        asize: 1895563264
        is_log: 0
        create_txg: 18269863
--------------------------------------------
LABEL 1
--------------------------------------------
    version: 22
    name: 'cn03'
    state: 0
    txg: 18269872
    pool_guid: 1907858070511204110
    hostid: 13564652
    hostname: 'cn03'
    top_guid: 11074483144412112931
    guid: 11074483144412112931
    vdev_children: 6
    vdev_tree:
        type: 'disk'
        id: 1
        guid: 11074483144412112931
        path: '/dev/dsk/c22t3d0s0'
        devid: 'id1,sd@s4154412020202020414e53393031305f324e4e4e324e4e4e2020202020202020353632383637390000005f31/a'
        phys_path: '/pci@0,0/pci15d9,400@1f,2/disk@3,0:a'
        whole_disk: 1
        metaslab_array: 37414
        metaslab_shift: 24
        ashift: 9
        asize: 1895563264
        is_log: 0
        create_txg: 18269863
--------------------------------------------
LABEL 2
--------------------------------------------
failed to unpack label 2
--------------------------------------------
LABEL 3
--------------------------------------------
failed to unpack label 3

c22t2d0 and c22t3d0 are the devices I physically removed and connected back to the server.
How can I fix them?

Thanks.

Fred
More below...

On Sep 19, 2011, at 9:51 AM, Fred Liu wrote:
> zdb -l /dev/rdsk/c22t2d0s0
> cannot open '/dev/rdsk/c22t2d0s0': I/O error

Is this disk supposed to be available? You might need to check the partition table, if one exists, to determine whether s0 has a non-zero size.

> root@cn03:~# zdb -l /dev/rdsk/c22t3d0s0
> [...]
> --------------------------------------------
> LABEL 2
> --------------------------------------------
> failed to unpack label 2
> --------------------------------------------
> LABEL 3
> --------------------------------------------
> failed to unpack label 3

This is a bad sign, but it can be recoverable, depending on how you got here. zdb is saying that it could not find the labels at the end of the disk. Label 2 and label 3 are 256KB each, located at the end of the disk, aligned to a 256KB boundary. zpool import is smarter than zdb in these cases, and can often recover from it -- up to the loss of all 4 labels, but you need to make sure that the partition tables look reasonable and haven't changed.

> c22t2d0 and c22t3d0 are the devices I physically removed and connected back to the server.
> How can I fix them?

Unless I'm mistaken, these are ACARD SSDs that have an optional CF backup. Let's hope that the CF backup worked.
 -- richard
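Checking that the partition table still looks sane can be done with prtvtoc; the point is simply to confirm that slice 0 still covers essentially the whole device, since the labels sit at fixed offsets from the start and end of that slice. For example:

    prtvtoc /dev/rdsk/c22t2d0s0
    prtvtoc /dev/rdsk/c22t3d0s0
    # if slice 0 has shrunk or moved, labels 2 and 3 will no longer be
    # where ZFS expects to find them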
> -----Original Message-----
> From: Richard Elling [mailto:richard.elling@gmail.com]
> Sent: Tuesday, September 20, 2011 3:57
> Subject: Re: [zfs-discuss] remove wrongly added device from zpool
>
> Is this disk supposed to be available? You might need to check the
> partition table, if one exists, to determine whether s0 has a non-zero
> size.

Yes. I used format to write an EFI label to it, and that error is now gone. But all four labels now fail to unpack under "zdb -l".

> This is a bad sign, but it can be recoverable, depending on how you got
> here. zdb is saying that it could not find the labels at the end of the
> disk. Label 2 and label 3 are 256KB each, located at the end of the disk,
> aligned to a 256KB boundary. zpool import is smarter than zdb in these
> cases, and can often recover from it -- up to the loss of all 4 labels,
> but you need to make sure that the partition tables look reasonable and
> haven't changed.

I have tried zpool import -fFX cn03, but it core-dumps and reboots about an hour later.

> Unless I'm mistaken, these are ACARD SSDs that have an optional CF backup.
> Let's hope that the CF backup worked.

Yes, it is ACARD. Do you mean I should push the "restore from CF" button to see what happens?

Thanks for your kind help!

Fred
zdb -l /dev/rdsk/c22t2d0s0
--------------------------------------------
LABEL 0
--------------------------------------------
failed to unpack label 0
--------------------------------------------
LABEL 1
--------------------------------------------
failed to unpack label 1
--------------------------------------------
LABEL 2
--------------------------------------------
failed to unpack label 2
--------------------------------------------
LABEL 3
--------------------------------------------
failed to unpack label 3

> -----Original Message-----
> From: Fred Liu
> Sent: Tuesday, September 20, 2011 4:06
> To: 'Richard Elling'
> Cc: zfs-discuss@opensolaris.org
> Subject: RE: [zfs-discuss] remove wrongly added device from zpool
>
> Yes. I used format to write an EFI label to it, and that error is now gone.
> But all four labels now fail to unpack under "zdb -l".
> [...]
> I have tried zpool import -fFX cn03, but it core-dumps and reboots about
> an hour later.
> [...]
> Yes, it is ACARD. Do you mean I should push the "restore from CF" button
> to see what happens?
>
> Thanks for your kind help!
>
> Fred