Brandon High
2010-Apr-30 05:11 UTC
[zfs-discuss] Panic when deleting a large dedup snapshot
I tried destroying a large (710GB) snapshot from a dataset that had been written with dedup on. The host locked up almost immediately, but there wasn't a stack trace on the console and the host required a power cycle, but seemed to reboot normally. Once up, the snapshot was still there. I was able to get a dump from this. The data was written with b129, and the system is currently at b134.

I tried destroying it again, and the host started behaving badly. 'less' would hang, and there were several zfs-auto-snapshot processes that were over an hour old, and the 'zfs snapshot' processes were stuck on the first dataset of the pool. Eventually the host became unusable and I rebooted again.

The host seems to be fine now, and is currently running a scrub.

Any ideas on how to avoid this in the future? I'm no longer using dedup due to performance issues with it, which implies that the DDT is very large.

bhigh at basestar:~$ pfexec zdb -DD tank
DDT-sha256-zap-duplicate: 5339247 entries, size 348 on disk, 162 in core
DDT-sha256-zap-unique: 1479972 entries, size 1859 on disk, 1070 in core

-B
--
Brandon High : bhigh at freaks.com
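[For a rough sense of why freeing a large deduped snapshot is so painful, the DDT entry counts in the zdb -DD output above can be turned into a memory estimate. A minimal sketch, assuming (not confirmed in this thread) that the "in core" figure zdb reports is the average per-entry size in bytes for each DDT object:]

```python
# Back-of-the-envelope DDT memory estimate from the zdb -DD output above.
# Assumption: "in core" is the average per-entry size in bytes; the entry
# counts and sizes below are copied verbatim from the zdb output.
tables = {
    "DDT-sha256-zap-duplicate": (5339247, 162),   # (entries, bytes in core)
    "DDT-sha256-zap-unique":    (1479972, 1070),
}

total_entries = sum(entries for entries, _ in tables.values())
total_bytes = sum(entries * size for entries, size in tables.values())

print(f"total DDT entries:      {total_entries:,}")
print(f"approx in-core DDT:     {total_bytes / 2**30:.2f} GiB")
```

[Under that assumption the table works out to roughly 2.3 GiB of in-core DDT, every entry of which must be visited and decremented when the snapshot's blocks are freed, which is consistent with the long hang reported here.]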
Cindy Swearingen
2010-Apr-30 16:53 UTC
[zfs-discuss] Panic when deleting a large dedup snapshot
Brandon,

You're probably hitting this CR:

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6924824

I'm tracking the existing dedup issues here:

http://hub.opensolaris.org/bin/view/Community+Group+zfs/dedup

Thanks,

Cindy

On 04/29/10 23:11, Brandon High wrote:
> [...]
Looks like I am hitting the same issue now, from the earlier post that you responded to: http://opensolaris.org/jive/thread.jspa?threadID=128532&tstart=15

I continued my test migration with dedup=off and synced a couple more file systems. I decided to merge two of the file systems together by copying one file system into the other. Then, when I tried to delete directories in the first file system, the whole system hung.

The file system was written with dedup turned on halfway through the sync, then turned off, since I wasn't able to finish the initial sync with dedup enabled. But now it looks like there is no way to get rid of a deduped file system safely.

--
This message posted from opensolaris.org
Roy Sigurd Karlsbakk
2010-May-01 09:41 UTC
[zfs-discuss] Panic when deleting a large dedup snapshot
----- "Cindy Swearingen" <cindy.swearingen at oracle.com> wrote:

> Brandon,
>
> You're probably hitting this CR:
>
> http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6924824

Interesting - reported in February and still no fix?

roy