Torsten "Paul" Eichstädt
2008-Jun-26  17:58 UTC
[zfs-discuss] zfs send/receive locks system threads (Bug?)
Dear experts, 
when migrating from a 6-disk to a 4-disk raidz1 zpool with 'zfs send -R | zfs
receive', the system's responsiveness dropped noticeably.
The run queue increased to 28-32, and 'prstat -Lm' showed that several system
threads were locked repeatedly, notably fmd, devfsd and many of the
svc.{configd,startd} threads.
The system is an E4500 with 4 US II CPUs, so I added 6 CPUs on the fly, which
helped a bit, but the run queue was still about 10. As expected, things were
worst when the migration was processing gzip-compressed FS's, but surprisingly
the run queue was >2 with uncompressed FS's, too.
OS is SXDE snv79b.
2nd: Even when copying gzip'ed FS's, the CPUs had about 30% idle cycles and
zero wio%.
How come? Is this (locking of system threads -> increased run queue) a bug or
expected behaviour?
3rd: The zpool property 'delegation' was not copied. Same Q: bug or expected?
Note: migration was from zpool version 8 to zpool version 10.
4th: Performance was way below my expectations. I set up regular recursive
snapshots via cron, so every FS had many (42) snapshots, many of them empty or
small. Even the empty snaps took 5 seconds each (34 bytes/5 sec), and max.
throughput was 8 MB/s with the few snaps that had some GB of data. I understand
that even copying an empty snapshot needs some time to process, and I guess the
reason for the poor performance is the locking described above. Still the same
Q arises: Is a max. throughput of 8 MB/s when writing to a raidz1 of 4 FC disks
ok with send/receive?
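To put the per-snapshot cost in perspective, here is a rough back-of-the-envelope
sketch. It assumes every snapshot costs at least the 5 seconds seen for the
empty ones; the helper name and the filesystem-count argument are only
illustrative, not something measured on the system.

```shell
# Rough lower bound on the fixed per-snapshot overhead of a recursive send/receive.
# Arguments: snapshots per filesystem, seconds per snapshot, number of filesystems.
# (The function name and the third argument are made up for this illustration.)
snapshot_overhead_seconds() {
    echo $(( $1 * $2 * $3 ))
}

# 42 snapshots at about 5 seconds each, for a single filesystem:
snapshot_overhead_seconds 42 5 1
```

That is at least about 210 seconds per filesystem before any real data moves.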
Workaround: If you recursively send/receive a zpool with many snapshots,
consider reducing the number of snapshots first. E.g. when you have rotating
snapshots taken every 10 minutes (data@{min-00,min-10,min-20,...}) and every
hour (data@{hourly-00,hourly-01,...}), delete these snapshots prior to the
send/receive operation.
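A minimal dry-run sketch of that cleanup, assuming the snapshot names start
with "min-" or "hourly-" as in the example above. The function names are made
up for illustration, and the sketch only prints the 'zfs destroy' commands
instead of running them, so the list can be checked first.

```shell
#!/bin/sh
# Dry run: print the 'zfs destroy' commands for rotating snapshots.
# Assumption: rotating snapshots are named @min-* or @hourly-*.

# Read snapshot names (one per line) on stdin, keep only the rotating ones.
filter_rotating() {
    while IFS= read -r snap; do
        case "$snap" in
            *@min-*|*@hourly-*) printf '%s\n' "$snap" ;;
        esac
    done
}

# Turn each snapshot name into a destroy command without executing it.
print_destroy_cmds() {
    while IFS= read -r snap; do
        printf 'zfs destroy %s\n' "$snap"
    done
}

# Intended use on a live system (left as a comment so the sketch has no side effects;
# "data" is the example pool name from above):
#   zfs list -H -r -t snapshot -o name data | filter_rotating | print_destroy_cmds
```

Review the printed commands, then pipe them to sh once they look right.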
Thanks in advance, Paul
 
 
This message posted from opensolaris.org