Anantha N. Srirama
2006-Dec-12 18:31 UTC
[zfs-discuss] Performance problems during 'destroy' (and bizarre Zone problem as well)
[b]Setting:[/b] We've been operating in the following setup for well over 60 days.
- E2900 (24 x 92)
- 2 x 2Gbps FC to EMC SAN
- Solaris 10 Update 2 (06/06)
- ZFS with compression turned on
- Global zone + 1 local zone (sparse)
- Local zone is fed ZFS clones from the global zone

[b]Daily Routine[/b]
- Shut down local zone
- Recreate ZFS clones
- Restart local zone
- End-to-end timing for this refresh is anywhere between 5 and 30 minutes. The bulk of the time is spent in the ZFS 'destroy' phase.

[b]Problem[/b]
- We had extensive read/write activity in the global and local zones yesterday. I estimate that we wrote 1/4 of one large ZFS filesystem, ~160GB of writes.
- This morning we had a fair amount of activity on the system when the refresh started; zpool was reporting around 150MB/s of writes.
- Our 'zfs destroy' commands took what I consider a 'normal' amount of time; the FS that was fielding the bulk of the I/O took 15 minutes. During this time everything was crawling or, more accurately, had come to a dead stop. A simple 'rm' would hang. I've reported this problem to the forum in the past. I also believe the fix for the problem is in Update 3 for Solaris 10, right?
- [b]Surprisingly, today the ZFS 'snapshot & clone' took an inordinate amount of time. I observed that each snapshot & clone activity together took 10+ minutes. In the past the same activity has taken no more than a few seconds, even during busy times. The total end-to-end timing for all snapshots/clones was a whopping 1:44:00![/b]
- Even more surprising was that the local zone refused to start up (zoneadm -z bluenile boot), with no error messages.
- I was able to start the zone only an hour or so after the completion of the ZFS commands.

[b]Questions:[/b]
- Why is the destroy phase taking so long?
- What can explain the unduly long snapshot/clone times?
- Why didn't the zone start up?
- More surprisingly, why did the zone start up after an hour?

Thanks in advance.

This message posted from opensolaris.org
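
For readers who want to reproduce the per-phase timings described above, the daily routine can be sketched roughly as follows. This is only an illustration, not the poster's actual script: the dataset names (`tank/data`, `tank/data_clone`), the snapshot name, and the use of `zoneadm halt` for the shutdown step are all assumptions. Only the zone name `bluenile` and the `zoneadm -z bluenile boot` command come from the post. The sketch defaults to a dry run so that nothing is executed unless explicitly requested.

```python
import subprocess
import time


def refresh_commands(zone, source_fs, clone_fs, snap_name):
    """Build the refresh steps as argv lists, in the order the post describes.

    All dataset and snapshot names are hypothetical placeholders. snap_name
    should be unique per run; cleanup of older snapshots is omitted here.
    """
    snapshot = f"{source_fs}@{snap_name}"
    return [
        ["zoneadm", "-z", zone, "halt"],        # shut down local zone (assumed: halt)
        ["zfs", "destroy", clone_fs],           # the slow 'destroy' phase
        ["zfs", "snapshot", snapshot],          # take a fresh snapshot
        ["zfs", "clone", snapshot, clone_fs],   # recreate the clone from it
        ["zoneadm", "-z", zone, "boot"],        # restart local zone
    ]


def run_timed(commands, dry_run=True):
    """Run each command in order and return (argv, elapsed_seconds) pairs.

    With dry_run=True nothing is executed; the commands are only printed.
    """
    results = []
    for argv in commands:
        start = time.monotonic()
        if dry_run:
            print("would run:", " ".join(argv))
        else:
            subprocess.run(argv, check=True)  # stop at the first failing step
        results.append((argv, time.monotonic() - start))
    return results


if __name__ == "__main__":
    steps = refresh_commands("bluenile", "tank/data", "tank/data_clone", "refresh")
    for argv, elapsed in run_timed(steps, dry_run=True):
        print(f"{elapsed:8.2f}s  {' '.join(argv)}")
```

Logging the elapsed time per step, rather than only end to end, makes it easier to tell whether a slow refresh is dominated by the destroy, by the snapshot/clone pair, or by the zone boot.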
Matthew Ahrens
2006-Dec-12 23:45 UTC
[zfs-discuss] Performance problems during 'destroy' (and bizarre Zone problem as well)
Anantha N. Srirama wrote:
> - Why is the destroy phase taking so long?

Destroying clones will be much faster with build 53 or later (or the unreleased s10u4 or later) -- see bug 6484044.

> - What can explain the unduly long snapshot/clone times
> - Why didn't the Zone startup?
> - More surprisingly why did the Zone startup after an hour?

Perhaps there was so much activity on the system that we couldn't push out transaction groups in the usual < 5 seconds. 'zfs snapshot' and 'zfs clone' take at least 1 transaction group to complete, so this could explain it. We've seen this problem as well and are working on a fix...

--mat
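
As a back-of-the-envelope reading of the explanation above, one can ask what transaction group time the reported numbers would imply. The sketch below assumes that each 'zfs snapshot' or 'zfs clone' waits for exactly one transaction group and that nothing else contributes to the delay. Both are simplifications: the reply only says "at least 1". The function name and parameters are made up for illustration.

```python
def implied_txg_seconds(observed_seconds, commands, txgs_per_command=1):
    """Average transaction group time implied by an observed delay.

    Assumes every command waits for txgs_per_command transaction groups
    and that this wait accounts for the entire observed time.
    """
    return observed_seconds / (commands * txgs_per_command)


# Normal case: two commands (snapshot + clone), each waiting on a txg that
# completes in under 5 seconds, so the pair finishes in roughly 10 s or less.
normal_pair_upper_bound = 2 * 5

# Reported case: one snapshot + clone pair took 10+ minutes.
implied = implied_txg_seconds(10 * 60, commands=2)

print(normal_pair_upper_bound)  # 10
print(implied)                  # 300.0
```

Under these assumptions, a 10-minute snapshot and clone pair corresponds to transaction groups taking on the order of minutes rather than seconds, which is consistent with the suggestion that the system could not push them out at the usual rate.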