Hello, I am having a problem destroying ZFS snapshots. The machine has been almost unresponsive for more than 4 hours since I started the command, and I can't run anything else during that time - I get "(bash): fork: Resource temporarily unavailable" errors. The machine is still responding somewhat, but very, very slowly.

The machine is a P4, 2.4 GHz, with 512 MB RAM and 8 x 750 GB disks as raidz, running Solaris 11/06.

Creating and renaming snapshots seem to be fine - each takes < 2 seconds. There are two ZFS file systems in the pool, both using ~3.5 TB out of 4.7 TB. About 5 snapshots were created for each of the file systems.

I am trying to destroy one of the snapshots and the machine just dies - no panic or reboot, but I can't start anything. I have 'top' still running in an ssh terminal; it shows the kernel using ~20% and ~2 MB of free memory.

How long should it usually take to destroy a ~2 TB snapshot (actually using ~30 GB)? Is this delay expected behavior, or am I hitting a bug of some kind?

Any help is appreciated.

Thanks,
Miro
Miroslav Pendev
2007-Apr-02 14:53 UTC
[zfs-discuss] Re: zfs destroy <snapshot> takes hours
I did some more testing, here is what I found:

- I can destroy older and newer snapshots, just not that particular snapshot.

- I added some more memory, total 1 GB. Now after I start the destroy command, ~500 MB RAM is taken right away; there is still ~200 MB or so left.

  o The machine is responsive.
  o If I run 'zfs list' it shows the snapshot as destroyed (it is gone from the list).
  o There is some ZFS activity for about 20 seconds - I can see the lights of the HDDs of the pool blinking - then it stops.
  o If I try to access any of the file systems of that pool - to read a file, for example - the machine stops responding (or it is very slow) and I can't run anything.

This is what 'top' shows when the above happens:

  last pid:  1168;  load avg:  0.10, 0.17, 0.25;  up 0+00:17:42    10:14:02
  76 processes: 73 sleeping, 2 running, 1 on cpu
  CPU states: 97.4% idle, 0.0% user, 2.6% kernel, 0.0% iowait, 0.0% swap
  Memory: 1024M phys mem, 264M free mem, 2048M swap, 2048M free swap

     PID USERNAME LWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
    1034 root       1  59    0 1768K 1284K cpu      0:01  0.40% top
    1101 root      15  59    0  116M   45M sleep    0:08  0.08% java
    1107 root       1  59    0 1964K 1068K sleep    0:00  0.07% iostat
    1026 miro       1  59    0 7076K 1840K run      0:00  0.04% sshd
    1053 root      12  59    0   86M   13M sleep    0:00  0.03% java
       7 root      12  59    0   10M 8984K sleep    0:02  0.01% svc.startd
     276 root       1  59    0 1064K  548K sleep    0:00  0.00% utmpd
       9 root      23  59    0 9480K 8328K sleep    0:05  0.00% svc.configd
     488 root       4  59    0 2500K 1780K sleep    0:04  0.00% vold
     397 root      11  59    0 8996K 7236K sleep    0:01  0.00% cctransport
     384 root      16  59    0   10M 6952K sleep    0:00  0.00% fmd

No disk activity... It can stay in this state for hours; nothing changes. If I 'cold reset' the machine, the snapshot I was trying to destroy is back in the 'zfs list' output.

Any ideas on what else to test or change are welcome. Is there a way to get rid of that snapshot, or to check its consistency somehow?

Thanks,
Miro
You are definitely hitting a bug. Not sure which one (hopefully someone else will chime in on that). It should take mere milliseconds to destroy a snapshot, regardless of size.

Do you have any disk errors? What would happen if you scrubbed the pool?

Eric
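For reference, a minimal sketch of checking for device errors and kicking off a scrub (the pool name 'tank' is just a placeholder; substitute your own):

  # Show per-device read/write/checksum error counters and any known data errors
  zpool status -v tank

  # Start a scrub; check its progress later with another 'zpool status'
  zpool scrub tank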
Miroslav Pendev wrote:
> I did some more testing, here is what I found:
>
> - I can destroy older and newer snapshots, just not that particular snapshot
>
> - I added some more memory total 1GB, now after I start the destroy command, ~500MB RAM are taken right away, there is still ~200MB or so left.
>
> o The machine is responsive,
>
> o If I run 'zfs list' it shows the snapshot as destroyed (it is gone from the list).
>
> o There is some zfs activity for about 20 seconds - I can see the lights of the HDDs of the pool blinking, then it stops

Can you take a crash dump when the system is "hung" (ie. after there is no more disk activity), and make it available to me?

Also, can you send the output of 'zdb -vvv <pool> <pool>' (repeat the pool name twice), and the name of the snapshot that can't be deleted?

--matt
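A rough sketch of how that information could be gathered on Solaris, assuming a pool named 'tank' and that a dump device and savecore are configured (check 'dumpadm' first):

  # Verify where crash dumps go and that savecore is enabled
  dumpadm

  # Option 1: save a crash dump of the live, running system
  savecore -L

  # Option 2: force a panic dump and reboot (more disruptive)
  reboot -d

  # Dump pool metadata; the pool name is repeated twice as requested above
  zdb -vvv tank tank > /var/tmp/zdb-tank.out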
Matthew Ahrens wrote:
> Miroslav Pendev wrote:
>> I did some more testing, here is what I found:
>>
>> - I can destroy older and newer snapshots, just not that particular
>> snapshot
>>
>> - I added some more memory total 1GB, now after I start the destroy
>> command, ~500MB RAM are taken right away, there is still ~200MB or so
>> left.
>> o The machine is responsive,
>> o If I run 'zfs list' it shows the snapshot as destroyed (it is gone
>> from the list).
>>
>> o There is some zfs activity for about 20 seconds - I can see the
>> lights of the HDDs of the pool blinking, then it stops
>
> Can you take a crash dump when the system is "hung" (ie. after there is
> no more disk activity), and make it available to me?

Miro supplied the dump, which I examined, and filed bug 6542681. The root cause is that the machine is out of memory (in this case, kernel virtual address space).

As a workaround, you can change kernelbase to allow the kernel to use more virtual address space.

--matt
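On 32-bit Solaris x86, changing kernelbase is typically done with the eeprom(1M) command; a sketch under that assumption (the value below is only an example to illustrate the mechanism, not a recommendation from this thread):

  # Show the current kernelbase setting
  eeprom kernelbase

  # Lower kernelbase so the kernel gets more virtual address space
  # (at the cost of less address space for 32-bit user processes)
  eeprom kernelbase=0x80000000

  # Reboot for the change to take effect
  init 6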
Miroslav Pendev
2007-Apr-05 14:01 UTC
[zfs-discuss] Re: Re: zfs destroy <snapshot> takes hours
After some discussions with Matt, I removed all the snapshots older than the one causing the memory issues. Guess what - it worked. I was able to remove that snapshot after I removed all the previous ones; it took 2 seconds.

I will definitely have to upgrade that machine to 64-bit and more memory one of these days.

Once again, thanks for your help, Matt! ZFS rocks :)
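For anyone hitting the same thing, a hypothetical sketch of that workaround - list the snapshots of the affected file system oldest first, destroy the older ones, then retry the problem snapshot. The dataset and snapshot names here are placeholders; double-check the list before destroying anything:

  # List snapshots of one file system in creation order (oldest first)
  zfs list -H -t snapshot -o name -s creation -r tank/fs

  # Destroy the older snapshots one at a time, then retry the problem one
  zfs destroy tank/fs@old-snap-1
  zfs destroy tank/fs@old-snap-2
  zfs destroy tank/fs@problem-snap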
I am having a similar problem - the system is hung on 'zfs destroy <snapshot>', at 50% CPU utilization, and it has been running for hours. How can I know if I have the same problem? Can you be specific about how to set the kernelbase?
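One way to get a rough idea of whether kernel memory is the bottleneck is to look at the kernel's memory usage while the destroy is hung; a sketch using standard Solaris tools (run as root):

  # Summary of physical memory usage by the kernel vs. everything else
  echo ::memstat | mdb -k

  # Kernel memory allocator statistics (large or fast-growing caches stand out)
  echo ::kmastat | mdb -k

  # Free memory and page scan rate over time
  vmstat 5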
The release notes:

http://docs.sun.com/app/docs/doc/817-0552/6mgbi4fgg?a=view

say an alternative to changing kernelbase is to upgrade to a 64-bit kernel - I'm already running on 64-bit SPARC. Maybe I have a different problem: my drives have spun down to sleepy mode, yet ZFS is still burning coal.
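To confirm which case applies, you can check whether the machine is actually running a 64-bit kernel (the kernelbase workaround is only relevant to the 32-bit x86 kernel); a quick check, assuming stock Solaris:

  # Reports the kernel's instruction set and whether it is 64-bit
  isainfo -kv

  # On 64-bit SPARC this typically prints something like:
  #   64-bit sparcv9 kernel modules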