Torsten "Paul" Eichstädt
2008-Jun-26 17:58 UTC
[zfs-discuss] zfs send/receive locks system threads (Bug?)
Dear experts,

When migrating from a 6-disk to a 4-disk raidz1 zpool with 'zfs send -R | zfs receive', the system's response time dropped noticeably. The run queue increased to 28-32, and 'prstat -Lm' showed that several system threads were locked repeatedly - most noticeably fmd, devfsd and many of the svc.{configd,startd} threads. The system is an E4500 with 4 US II CPUs, so I added 6 CPUs on the fly, which helped a bit, but the run queue was still around 10. As expected, things were worst while the migration was processing gzip-compressed file systems, but surprisingly the run queue was >2 with uncompressed file systems, too. The OS is SXDE snv79b.

2nd observation: even when copying gzip-compressed file systems, the CPUs had about 30% idle cycles and zero wio%. How come? Is this (locking of system threads -> increased run queue) a bug or expected behaviour?

3rd: the zpool property 'delegation' was not copied. Same question: bug or expected? Note: the migration was from zpool version 8 to zpool version 10.

4th: performance was way below my expectations. I set up regular recursive snapshots via cron, so every file system had many (42) snapshots, most of them empty or small. Even the empty snapshots took 5 seconds each (34 bytes / 5 sec), and the maximum throughput was 8 MB/s for the few snapshots that held some GB of data. I understand that even an empty snapshot takes some time to process, and I guess the reason for the poor performance is the locking described above. Still, the same question arises: is a maximum throughput of 8 MB/s when writing to a raidz1 of 4 FC disks normal for send/receive?

Workaround: if you recursively send/receive a zpool with many snapshots, consider decreasing the number of snapshots first. E.g. when you have rotating snapshots every 10 minutes (data@{min-00,min-10,min-20,...}) and every hour (data@{hourly-00,hourly-01,...}), delete these snapshots prior to the send/receive operation (see the example commands below).

Thanks in advance,
Paul
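P.S. For reference, a rough sketch of the workaround and the migration step described above. The pool names 'data' and 'newpool' and the exact snapshot list are only examples from my rotating-snapshot scheme; adjust them to your own setup before running anything:

    # prune the rotating snapshots (recursively, across all child file systems)
    for snap in min-00 min-10 min-20 min-30 min-40 min-50; do
        zfs destroy -r data@$snap
    done

    # take one recursive snapshot and replicate the whole pool
    zfs snapshot -r data@migrate
    zfs send -R data@migrate | zfs receive -F -d newpool

Note that 'zfs receive -F' rolls the target back before receiving and '-d' recreates the source dataset layout under 'newpool', so be careful when pointing it at a pool that already holds data.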