I was asked by a coworker about recovering destroyed datasets on ZFS - whether it is possible at all. As a related question: if a filesystem dataset was recursively destroyed along with all its snapshots, is there at least some way to find pointers showing that it ever existed? I remember that "zpool import -D" can be used to import whole destroyed pools, but crawling around the disk with zdb has not yielded any results so far.

The colleague's situation as a whole is as follows (although slightly off-topic for this ZFS forum):

1) They have an OpenSolaris machine with some zones set up, each zone root being a filesystem dataset. Some zones also have delegated datasets for data.

2) The system was to be upgraded with Live Upgrade. Apparently something did not go well during lucreate, so the botched ABE was ludelete'd.

3) During ludelete my colleague noticed some messages about the inability to destroy some zones' ZFS datasets because they were mounted (luckily, the zones were booted), and aborted the ludelete operation. Apparently ludelete attempts to roll back and destroy zfs-cloned zone roots; however, since these were not created by lucreate, it seems ludelete worked on the most recent filesystems - the live datasets.

4) Now, after the second Live Upgrade went well, my colleague noticed that one of the zone's directories, which used to hold data, is now empty (the directory mtime/ctime is dated last year, approximately when the system was set up). The zone has a delegated dataset and some sub-filesystems mounted off it. It is possible (he's not certain) that this now-empty directory was also a mounted sub-filesystem dataset in a previous life, now destroyed by ludelete. This zone was not mentioned among the ludelete errors (inability to destroy), but that might not mean much: successful destructions would not produce errors. On the other hand, at least some other filesystems of the delegated dataset are intact. Either he was lucky to abort ludelete before the whole zone was wiped out, or ludelete hadn't gotten to it.

The problem is, we seem to have no way of knowing whether there was a filesystem dataset there in the first place, nor whether there were other such destroyed datasets that haven't been noticed yet. Since the zfs mounts are not listed in vfstab, and the zone snapshots were taken while the zones were down (so the available historical /etc/mnttab copies don't list zfs mounts either), no files we have looked at point to these delegated datasets. The base delegated dataset was never mounted by itself, so it holds no children's mountpoints either.

So, to reiterate, the questions stand as:

1) Is it possible to find out (with zdb or any other means) whether a specific zfs dataset has ever existed on an importable, valid pool?

2) Is it possible to find and list all destroyed datasets on a pool?

3) Is it possible to recover a destroyed dataset as a whole, or its files (zdb -R), and/or pointers to file data (the triplets which go as parameters to zdb -R)?

//Thanks in advance, we're expecting a busy weekend ;(
//Jim Klimov
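PS: For the record, the zdb crawling I mentioned has been roughly along these lines ("pond" is our pool; nothing useful has turned up yet):

    zpool import -D    # list destroyed pools available for re-import (not our case - the pool itself is alive and well)
    zdb -d pond        # enumerate the datasets the pool currently knows about - no trace of the suspect one
    zdb -dddd pond     # walk the meta-objset and datasets in detail, hoping for leftover pointers

If anyone knows a way to make zdb reach objects that belonged to an already-destroyed dataset, that would be most welcome.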
Hi Jim,

See if 'zpool history' gives you what you're looking for.

Regards,
markm
Jim Klimov wrote:
> 1) Is it possible to find (with zdb or any other means) whether a specific zfs
> dataset has ever existed on the importable valid pool?

'zpool history -il' should tell you that, plus it should tell you who deleted them and when.

I don't know how to go about recovering a deleted dataset, other than to suggest the obvious, which is to restore from backup or from a previous zfs send stream.

--
Darren J Moffat
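P.S. For example, something along these lines (replace "tank" with your pool's name) should show every destroy operation, including the internal ones, together with the timestamp, user and host:

    zpool history -il tank | grep -i destroy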
"zpool history" has shed a little light. Lots actually. The sub-dataset in question was indeed created, and at the time ludelete was run there are some entries along the lines of "zfs destroy -r pond/zones/zonename". There''s no precise details (names, mountpoints) about the destroyed datasets - and I think they should be included in the future. However the detailed log has some pointers to "txg" and "dataset" numbers. Can they help in recovering data? Perhaps, named transaction groups can be rolled back? According to the same zpool history, my target is recovery of "dataset = 370" which has the required mountpoint. Others are snapshots which are secondary targets. Namely, the detailed log displays this: 2009-05-27.22:34:24 [internal snapshot txg:710732] dataset = 1627 [user root on thumper] 2009-05-27.22:34:24 zfs snapshot pond/zones/DUMMY-server-java at snv_114 [user root on thumper:global] 2009-05-27.22:34:24 [internal create txg:710734] dataset = 1632 [user root on thumper] 2009-05-27.22:34:24 zfs clone pond/zones/DUMMY-server-java at snv_114 pond/zones/DUMMY-server-java-snv_114 [user root on thumper:global] Here the lucreate operation froze up until it was found hanging in the morning. As known from previous post, ludelete was issued, and it first destroyed the clone and snapshot created by lucreate; then it went on to massacre civilian datasets ;) 2009-05-28.11:43:49 [internal destroy_begin_sync txg:712314] dataset = 1632 [user root on thumper] 2009-05-28.11:43:52 [internal destroy txg:712316] dataset = 1632 [user root on thumper] 2009-05-28.11:43:52 [internal reservation set txg:712316] 0 dataset = 0 [user root on thumper] 2009-05-28.11:43:52 zfs destroy -r pond/zones/DUMMY-server-java-snv_114 [user root on thumper:global] 2009-05-28.11:43:52 [internal destroy txg:712318] dataset = 1627 [user root on thumper] 2009-05-28.11:43:53 zfs destroy -r pond/zones/DUMMY-server-java at snv_114 [user root on thumper:global] Main destruction was here: pond/zones/las 2009-05-28.11:43:58 [internal destroy txg:712320] dataset = 425 [user root on thumper] 2009-05-28.11:43:58 zfs destroy -r pond/zones/las [user root on thumper:global] 2009-05-28.11:43:58 [internal destroy txg:712322] dataset = 459 [user root on thumper] 2009-05-28.11:43:59 [internal destroy_begin_sync txg:712323] dataset = 370 [user root on thumper] 2009-05-28.11:44:03 [internal destroy txg:712325] dataset = 370 [user root on thumper] 2009-05-28.11:44:03 [internal reservation set txg:712325] 0 dataset = 0 [user root on thumper] 2009-05-28.11:44:04 [internal destroy txg:712326] dataset = 421 [user root on thumper] 2009-05-28.11:44:04 [internal destroy txg:712327] dataset = 455 [user root on thumper] 2009-05-28.11:44:05 [internal destroy txg:712328] dataset = 411 [user root on thumper] It also took a bite at pond/zones/ldap03, but the zone is intact (possibly missing a snapshot though - the set of snapshots differs from those available to other zones); then the massacre was aborted: 2009-05-28.11:44:05 zfs destroy -r pond/zones/ldap03 [user root on thumper:global] 2009-05-28.11:44:06 [internal destroy txg:712330] dataset = 445 [user root on thumper] //Jim PS: I guess I''m up to an RFE: "zfs destroy" should have an interactive option, perhaps (un-)set by default with an environment variable or presense of a terminal console (vs. automated scripted usage in installers, patches, crontabs, etc.). Then ludelete would not be so stupid as to destroy user''s data. How "enterprise" is that? :( -- This message posted from opensolaris.org
Hello Mark, Darren,

Thank you guys for suggesting "zpool history", upon which we had stumbled just before receiving your comments. The history results are posted above. Still no luck trying to dig out the dataset's data so far.

As I understand it, there are no (recent) backups, which is poor practice, I know. However, considering the sheer volume of a single Thumper (which they also have), they'd need a not-so-small tape library just to make a level-0 backup of one server.

People had rather counted on the promise of Solaris being so "enterprise" and friendly to users in anything related to data loss; they didn't expect a system utility to take such destructive actions without even asking - without even a flag to force it into asking. Besides, they expected (wrongly, it seems) ZFS to be as loss-resilient as touted on forums and blogs, and relied on snapshots to retrieve mis-deleted files, etc. Mixed technical expertise, swayed by marketing noise, led to poor decisions ;(

//Jim