Richard Lee
2010-Jul-11 18:46 UTC
Serious zfs slowdown when mixed with another file system (ufs/msdosfs/etc.).
This is on clean FreeBSD 8.1 RC2, amd64, with 4GB memory.

The closest I found by Googling was this:
http://forums.freebsd.org/showthread.php?t=9935

It talks about all kinds of little tweaks, but in the end, the only thing that actually works is the stupid 1-line perl code that forces the kernel to free the memory allocated to the (non-ZFS) disk cache, which is the "Inact"ive memory in "top."

I have a 4-disk raidz pool, but that's unlikely to matter.

Try to copy large files from a non-ZFS disk to a ZFS disk. FreeBSD will cache the data read from the non-ZFS disk in memory, and free memory will go down. This is as expected, obviously.

Once there's very little free memory, one would expect whatever is more important to kick out the cached data (Inact) and make memory available.

But when almost all of the memory is taken by the disk cache (of the non-ZFS file system), the ZFS disks start thrashing like mad and the write throughput drops to single-digit MB/second.

I believe it should be extremely easy to duplicate. Just plug in a big USB drive formatted in UFS (msdosfs will likely do the same), and copy large files from that USB drive to the ZFS pool.

Right after a clean boot, gstat will show something like 20+MB/s movement from the USB device (da*), and occasional bursts of activity on the zpool devices at a very high rate. Once free memory is exhausted, the zpool devices switch to constant low-speed activity, with the disks thrashing constantly.

I tried enabling/disabling prefetch, messing with vnode counts, zfs.vdev.min/max_pending, etc. The only thing that works is that stupid perl 1-liner (perl -e '$x="x"x1500000000'), which returns the activity to that seen right after a clean boot. It doesn't last very long, though, as the disk cache again consumes all the memory.

Copying files between ZFS devices doesn't seem to affect anything.

I understand the ZFS subsystem has its own memory/cache management.
Can a ZFS expert please comment on this?

And is there a way to force the kernel to not cache non-ZFS disk data?

--rich
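[For anyone who would rather not reach for perl, the same memory-pressure trick can be written as a few lines of C. This is only a sketch of what the one-liner does (allocate a large buffer and touch every page so the pagedaemon evicts the Inact cache); the 1.5 GB figure is simply the value from the one-liner, not a tuned number.]

/*
 * Rough C equivalent of the perl one-liner
 * (perl -e '$x="x"x1500000000'): allocate ~1.5 GB and touch every
 * page so the VM is pressured into evicting inactive (non-ZFS
 * cache) pages.  Sketch only; the size is not tuned.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int
main(void)
{
	size_t len = 1500000000UL;	/* same size the one-liner uses */
	char *buf = malloc(len);

	if (buf == NULL) {
		perror("malloc");
		return (1);
	}
	memset(buf, 'x', len);		/* touch every page to create real pressure */
	free(buf);
	return (0);
}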
Jeremy Chadwick
2010-Jul-11 20:50 UTC
Serious zfs slowdown when mixed with another file system (ufs/msdosfs/etc.).
On Sun, Jul 11, 2010 at 11:25:12AM -0700, Richard Lee wrote:
> This is on clean FreeBSD 8.1 RC2, amd64, with 4GB memory.
> [...]
> I understand the ZFS subsystem has its own memory/cache management.
> Can a ZFS expert please comment on this?
>
> And is there a way to force the kernel to not cache non-ZFS disk data?

I believe you may be describing two separate issues:

1) ZFS using a lot of memory but not freeing it as you expect
2) Lack of a disk I/O scheduler

For (1), try this in /boot/loader.conf and reboot:

# Disable UMA (uma(9)) for ZFS; amd64 was moved to exclusively use UMA
# on 2010/05/24.
# http://lists.freebsd.org/pipermail/freebsd-stable/2010-June/057162.html
vfs.zfs.zio.use_uma="0"

For (2), you may want to try gsched_rr:
http://svnweb.freebsd.org/viewvc/base/releng/8.1/sys/geom/sched/README?view=markup

-- 
| Jeremy Chadwick                                  jdc@parodius.com |
| Parodius Networking                      http://www.parodius.com/ |
| UNIX Systems Administrator                 Mountain View, CA, USA |
| Making life hard for others since 1977.           PGP: 4BD6C0CB   |
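[A quick way to see whether the UMA tunable (or the memory-pressure trick) is actually changing anything is to watch the ARC size next to the "Inact" page count while a copy runs. The sketch below is not from the thread; it assumes the stock FreeBSD sysctl names kstat.zfs.misc.arcstats.size and vm.stats.vm.v_inactive_count are available, which is the only thing it relies on beyond standard libc.]

/*
 * Sketch: sample the ZFS ARC size and the VM "Inact" page count so the
 * two caches can be watched side by side while a copy is running.
 * Assumption: the kstat.zfs.misc.arcstats.size and
 * vm.stats.vm.v_inactive_count sysctls are present (usual names on
 * FreeBSD with ZFS loaded).
 */
#include <sys/types.h>
#include <sys/sysctl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
	uint64_t arc_size;
	uint32_t inact_pages;
	size_t len;
	long pagesize = sysconf(_SC_PAGESIZE);

	for (;;) {
		len = sizeof(arc_size);
		if (sysctlbyname("kstat.zfs.misc.arcstats.size",
		    &arc_size, &len, NULL, 0) == -1)
			return (1);
		len = sizeof(inact_pages);
		if (sysctlbyname("vm.stats.vm.v_inactive_count",
		    &inact_pages, &len, NULL, 0) == -1)
			return (1);
		printf("ARC %ju MB   Inact %ld MB\n",
		    (uintmax_t)(arc_size >> 20),
		    ((long)inact_pages * pagesize) >> 20);
		sleep(1);
	}
}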
Peter Jeremy
2010-Jul-12 09:38 UTC
Serious zfs slowdown when mixed with another file system (ufs/msdosfs/etc.).
On 2010-Jul-11 11:25:12 -0700, Richard Lee <ricky@csua.berkeley.edu> wrote:
> But when almost all of the memory is taken by the disk cache (of the
> non-ZFS file system), the ZFS disks start thrashing like mad and the
> write throughput drops to single-digit MB/second.

It can go a lot lower than that...

Yes, this is a known problem. The underlying cause is a disconnect between the ZFS cache (ARC) and the VM cache used by everything else, which prevents ZFS from reclaiming RAM from the VM cache. For several months I was running a regular cron job that was a slightly fancier version of the perl one-liner.

For about a month I have been using the attached arc.patch1, based on a patch written by Artem Belevich <fbsdlist@src.cx> (see http://pastebin.com/ZCkzkWcs). I have had reasonable success with it (and junked my cron job), but have managed to wedge my system a couple of times whilst doing zfs send|recv.

Whilst looking at that diff, I just noticed a nasty signed/unsigned bug that could bite in low-memory conditions, and have revised it to arc.patch2 (untested as yet).

Independently, Martin Matuska <mm@FreeBSD.org> committed r209227, which corrects a number of ARC bugs reported on OpenSolaris. Whilst this patch doesn't add checks on "inactive" or "cache", some quick tests suggest it also helps (though I need to investigate further). See http://people.freebsd.org/~mm/patches/zfs/head-12636.patch

-- 
Peter Jeremy
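[For readers who haven't hit it before, the signed/unsigned trap mentioned above is worth spelling out, because free-memory arithmetic is exactly where it bites: subtracting a larger unsigned quantity from a smaller one wraps to an enormous positive value, so a memory shortfall can masquerade as a huge surplus. The snippet below is a generic illustration of that failure mode, not code from arc.patch1 or arc.patch2.]

/*
 * Generic illustration of the signed/unsigned trap referred to above;
 * this is not the actual ARC code.  Computing "free - target" in an
 * unsigned type wraps instead of going negative, so a shortfall looks
 * like a gigantic surplus.
 */
#include <stdio.h>

int
main(void)
{
	unsigned long target = 1000;	/* pages we want kept free */
	unsigned long freepg = 200;	/* pages actually free */

	/* Wrong: freepg < target, so the unsigned subtraction wraps. */
	unsigned long bogus = freepg - target;

	/* Safer: compare first, or do the arithmetic in a signed type. */
	long shortfall = (long)target - (long)freepg;

	printf("unsigned result:  %lu pages\n", bogus);		/* huge bogus value */
	printf("actual shortfall: %ld pages\n", shortfall);	/* 800 */
	return (0);
}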
Peter Jeremy
2010-Jul-12 09:39 UTC
Serious zfs slowdown when mixed with another file system (ufs/msdosfs/etc.).
Skipped content of type multipart/mixed (attachments scrubbed by the list archive).