Joep Vesseur
2009-Apr-17 14:55 UTC
[zfs-discuss] Much room for improvement for "zfs destroy -r" ...
All,

I was wondering why "zfs destroy -r" is so excruciatingly slow compared to parallel destroys.

On my x4500, for example, after having created 1000 filesystems named pool/blub2/0000 [...] pool/blub2/0999 and keeping them empty, a subsequent destroy with

    # time zfs destroy -r pool-0/blub2

yields

    real 5m17.657s
    user 0m0.184s
    sys 0m1.831s

while a little handiwork with

    # time for i in `zfs list | awk '/blub2\\// {print $1}'` ;\
        do ( zfs destroy $i & ) ; done

yields

    real 0m8.191s
    user 0m6.037s
    sys 0m16.096s

A 38.8-fold improvement (at the cost of some extra CPU load).

Why is there so much overhead in the sequential case? Or have I oversimplified the issues at hand with this simple test?

Joep
Kyle McDonald
2009-Apr-17 19:19 UTC
Joep Vesseur wrote:
> All,
>
> I was wondering why "zfs destroy -r" is so excruciatingly slow compared to
> parallel destroys.
>
> <SNIP>
>
> while a little handiwork with
>
>     # time for i in `zfs list | awk '/blub2\\// {print $1}'` ;\
>         do ( zfs destroy $i & ) ; done
>
> yields
>
>     real 0m8.191s
>     user 0m6.037s
>     sys 0m16.096s
>
> A 38.8-fold improvement (at the cost of some extra CPU load).
>
> Why is there so much overhead in the sequential case? Or have I oversimplified
> the issues at hand with this simple test?

One reason is that you're not timing how long it takes for the destroys to complete. You're only timing how long it takes to start all the jobs in the background.

-Kyle

> Joep
Joep Vesseur
2009-Apr-17 23:01 UTC
On 04/17/09 21:19, Kyle McDonald wrote:
> One reason is that you're not timing how long it takes for the destroys
> to complete. You're only timing how long it takes to start all the jobs
> in the background.

Right, I'm sorry, my example was an oversimplification of a script I made. That script included a wait after the for-loop. The same example again:

    # date ; for i in `zfs list | awk '/blub2\\// {print $1}'` ; \
        do ( zfs destroy $i & ) ; done ; wait ; date

yields

    Fri Apr 17 22:56:32 UTC 2009
    Fri Apr 17 22:56:40 UTC 2009

Still 8 seconds total, including waiting for all the "zfs destroy"s to complete. I can't tell whether the kernel is post-processing any of the destroys after the zfs command exits, but that's true for the "destroy -r" case as well, and that still takes 38 times as long.

Joep
Casper.Dik at Sun.COM
2009-Apr-18 10:47 UTC
>On 04/17/09 21:19, Kyle McDonald wrote:
>
>> One reason is that you're not timing how long it takes for the destroys
>> to complete. You're only timing how long it takes to start all the jobs
>> in the background.
>
>Right, I'm sorry, my example was an oversimplification of a script I made.
>That script included a wait after the for-loop. The same example again:
>
>    # date ; for i in `zfs list | awk '/blub2\\// {print $1}'` ; \
>        do ( zfs destroy $i & ) ; done ; wait ; date

That still doesn't wait for zfs destroy. Zfs destroy is run in a sub-shell "()" and you wait for that sub-shell but not for its children.

Casper
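The sub-shell behaviour described here can be checked without touching a pool. What follows is a minimal sketch, assuming a POSIX shell, with "sleep 2" standing in for "zfs destroy"; the variable names are made up for illustration. It times "wait" for a job started inside "( ... & )" and for a job started directly with "&".

```shell
#!/bin/sh
# Assumption: "sleep 2" stands in for a slow "zfs destroy".

# Case 1: job started inside a sub-shell "( ... & )". The sub-shell exits
# right after forking the job, and the job is not a child of this shell,
# so "wait" has nothing to wait for and returns at once.
start=$(date +%s)
( sleep 2 & )
wait
subshell_elapsed=$(( $(date +%s) - start ))

# Case 2: job started directly with "&". It is a child of this shell,
# so "wait" blocks until it has finished.
start=$(date +%s)
sleep 2 &
wait
direct_elapsed=$(( $(date +%s) - start ))

echo "sub-shell form: wait returned after ${subshell_elapsed}s"
echo "direct form:    wait returned after ${direct_elapsed}s"
```

In the loop from the quoted message, dropping the parentheses ("zfs destroy $i &") would make each destroy a direct child of the shell, so the trailing "wait" would cover all of them.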
Joep Vesseur
2009-Apr-18 11:26 UTC
> That still doesn't wait for zfs destroy. Zfs destroy is run in a
> sub-shell "()" and you wait for that sub-shell but not for its children.

If I extend the command with an additional "zfs list", all the filesystems that I told the system to destroy are gone...

Joep
Matthew Ahrens
2009-May-10 00:10 UTC
Joep Vesseur wrote:
> I was wondering why "zfs destroy -r" is so excruciatingly slow compared to
> parallel destroys.

This issue is bug #6631178. The problem is that "zfs destroy -r <filesystem>" destroys each filesystem and snapshot individually, and each one must wait for a txg to sync (0.1 - 10 seconds). (Note that "zfs destroy -r <snapshot>" does not suffer from this; it does all its work in one txg.)

If you issue multiple "zfs destroy" commands in parallel, many of them will end up waiting for the same txg to finish, so many fewer txgs must pass to complete all the destroys.

In the most general case, this problem is nontrivial due to the error and dependency handling: we need to collect error values from each individual destroy. However, there is much improvement that can be made for the common cases.

--matt
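The numbers reported earlier in the thread fit this explanation: 317.657 s for 1000 filesystems is about 0.32 s per destroy, inside the 0.1 - 10 second txg range. As a workaround in the spirit of the parallel loop, here is a minimal sketch, assuming a POSIX shell and a flat set of empty filesystems with no snapshots, clones, or nested children. The function name destroy_parallel and the ZFS_CMD variable are hypothetical, added here so the loop can be dry-run; this is not how "zfs destroy -r" works internally. Each destroy runs as a direct child of the shell, and the exit status of each one is collected.

```shell
#!/bin/sh
# Hypothetical helper, not part of ZFS. ZFS_CMD is an assumption added so
# the loop can be dry-run (for example ZFS_CMD=echo) without a real pool.
ZFS_CMD=${ZFS_CMD:-zfs}

# destroy_parallel DATASET...
# Starts one "zfs destroy" per dataset in the background, then waits for
# each one and counts failures. Returns non-zero if any destroy failed.
# Note: plain sh has no local variables, so fs, pids, pid and failures
# are global.
destroy_parallel() {
    pids=""
    for fs in "$@"; do
        # No sub-shell parentheses: the job stays a child of this shell.
        $ZFS_CMD destroy "$fs" &
        pids="$pids $!"
    done

    failures=0
    for pid in $pids; do
        # "wait PID" returns the exit status of that particular job.
        wait "$pid" || failures=$((failures + 1))
    done

    if [ "$failures" -ne 0 ]; then
        echo "destroy_parallel: $failures destroy(s) failed" >&2
        return 1
    fi
    return 0
}
```

One possible invocation for the layout used in this thread would be

    # destroy_parallel `zfs list -H -o name -r -t filesystem pool/blub2 | grep '^pool/blub2/'`

where the grep drops the parent itself, since destroying it while its children still exist would fail. With nested filesystems the order matters, which is part of the dependency handling mentioned above and is not covered by this sketch.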