Steve Radich, BitShop, Inc.
2010-Jul-03 00:09 UTC
[zfs-discuss] Hang ZFS due to asynchronous delete?
Like most others, I've been having issues with dedup. Here's my situation: a 4TB pool for daily backups of SQL Server, with dedup enabled, so a typical directory has 100+ files that are mostly identical (in some directories all are identical).

If I do rm *, OpenSolaris is dead, ZFS hung, etc. Sometimes it comes back after hours; sometimes I reboot (only if it's a directory where I don't care if a scrub loses files; I know the reboot is likely risky).

ZFS seems to asynchronously return that the file is deleted, so potentially I have thousands or tens of thousands of I/Os outstanding to actually do a delete, and the system moves on to the next file, which adds to this queue. If I interrupt a delete, the disk I/O stays high for some time afterwards, hence this theory.

If I can keep the I/O queue depth reasonable, things seem to work fine. This requires deleting a few files, waiting, then proceeding with more files. This script has delays between the files and uses find <directory> | grep to find which files to remove. It's not overly sophisticated, but it seems to help a lot, since it goes slowly enough to also keep an eye on disk I/O load.

Is there some reason that delete would return before the operation completes? It seems like a simple change in this behavior, blocking until the file is actually deleted, would possibly resolve this issue.

--------- Perl script to sleep between deletes -------------
#!/usr/bin/perl
use Time::HiRes qw(usleep);    # usleep() is not a Perl builtin

cleanup("/tankmir1/sqlbackups/", "_2010_05", ".trn");

sub cleanup {
    my $dir    = $_[0];
    my $search = $_[1];
    my $type   = $_[2];

    # List everything under $dir, keeping paths that match both patterns.
    my $cmd = "find $dir | grep \"$type\" | grep \"$search\"";

    print "Searching for $search in $dir\n";
    print " Command: $cmd\n";

    open(IN, "$cmd |") or die "Cannot run find: $!";
    while (my $file = <IN>) {
        chomp $file;
        print "$file - Delete..";
        unlink($file);
        # You may want to vary this sleep time (currently 2 seconds)..
        usleep(500000*4);
        print "\n";
    }
    close(IN);
}

--
This message posted from opensolaris.org
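A variation on the script above, as a rough sketch only: it deletes a few files at a time and then pauses, which is closer to the "delete a few files, wait, then proceed" approach described earlier. The names matching_files and throttled_delete, the batch size, and the pause length are made up for illustration and are not part of the original script. Unlike the find | grep pipeline, it matches plain substrings against the file name only. It also has no way of knowing when ZFS has actually finished freeing blocks; it only spaces the unlink calls out.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use Time::HiRes qw(sleep);    # sleep() that accepts fractional seconds

# Hypothetical helper: collect regular files under $dir whose file names
# contain both $search and $type as plain substrings (not regexes).
sub matching_files {
    my ($dir, $search, $type) = @_;
    my @files;
    find(sub {
        push @files, $File::Find::name
            if -f $_ && index($_, $search) >= 0 && index($_, $type) >= 0;
    }, $dir);
    @files = sort @files;
    return @files;
}

# Hypothetical helper: unlink files in batches of $batch_size, sleeping
# $pause seconds between batches. Returns the number of files removed.
sub throttled_delete {
    my ($files, $batch_size, $pause) = @_;
    my $deleted = 0;
    my @queue   = @$files;
    while (my @batch = splice(@queue, 0, $batch_size)) {
        for my $file (@batch) {
            if (unlink $file) { $deleted++ }
            else              { warn "Could not delete $file: $!\n" }
        }
        print "Deleted $deleted so far\n";
        # Pause only if more files remain and a pause was requested.
        sleep($pause) if @queue && $pause > 0;
    }
    return $deleted;
}

# Example call (batch size 5 and a 2 second pause are placeholders to tune):
#   my @files = matching_files("/tankmir1/sqlbackups/", "_2010_05", ".trn");
#   throttled_delete(\@files, 5, 2);
```

The batch size and pause would need tuning by watching disk I/O load while it runs, the same as the sleep time in the original script.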