Steve Radich, BitShop, Inc.
2008-May-21 21:27 UTC
[zfs-discuss] Pause Solaris with ZFS compression busy by doing a cp?
Hardware: Supermicro server with Adaptec 5405 SAS controller, LSI expander -> 24 drives. Currently using 2x 1 TB SAS drives striped and 1x 750 GB SATA as another pool.

I don't think the hardware is related, though, because with ZFS compression turned off everything is fine, and I get the same behavior on either pool. The only unusual thing I can think of is that I use a USB flash drive for root; performance on the root pool is horrible, but the system works fine.

If I do a copy with ZFS compression=gzip-9 then Solaris hangs for several seconds at a time. I have iostat -xcnCXTdz 5 running, so it SHOULD be displaying stats every 5 seconds. The results: 06:01:20, then 06:02:04 (44 seconds).

Thu May 22 06:01:20 2008
     cpu
 us sy wt id
  0 13  0 86
                    extended device statistics
    r/s    w/s     kr/s     kw/s wait actv wsvc_t asvc_t  %w  %b device
  253.5    0.0  16524.7      0.0  0.0 14.1    0.0   55.6   0  55 c4
  121.0    0.0   8140.8      0.0  0.0  8.5    0.0   70.2   0  30 c4t0d0
  132.6    0.0   8383.9      0.0  0.0  5.6    0.0   42.2   0  25 c4t1d0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 c4t2d0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 c4t3d0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 c4t4d0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 c5
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 c5t0d0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 c6
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 c6t0d0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 fd0

Thu May 22 06:02:04 2008
     cpu
 us sy wt id
  0 98  0  2
                    extended device statistics
    r/s    w/s     kr/s     kw/s wait actv wsvc_t asvc_t  %w  %b device
   42.4   38.7   2590.2   2752.5  0.0  1.9    0.0   24.0   0   8 c4
   21.5   19.1   1313.4   1353.3  0.0  1.1    0.0   26.2   0   4 c4t0d0
   20.8   19.5   1276.9   1399.2  0.0  0.9    0.0   21.7   0   4 c4t1d0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 c4t2d0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 c4t3d0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 c4t4d0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 c5
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 c5t0d0
    0.0    0.0      0.9      0.0  0.0  0.0    0.1   11.8   0   0 c6
    0.0    0.0      0.9      0.0  0.0  0.0    0.1   11.8   0   0 c6t0d0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 fd0

Thu May 22 06:02:09 2008
     cpu
 us sy wt id
  0  6  0 94
                    extended device statistics
    r/s    w/s     kr/s     kw/s wait actv wsvc_t asvc_t  %w  %b device
   27.4  249.4   2164.0  14078.1  0.0 68.9    0.0  249.1   0 200 c4
   15.0  128.8   1238.5   7252.9  0.0 34.3    0.0  238.8   0 100 c4t0d0
   12.4  120.6    925.5   6825.2  0.0 34.6    0.0  260.1   0 100 c4t1d0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 c4t2d0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 c4t3d0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 c4t4d0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 c5
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 c5t0d0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 c6
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 c6t0d0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 fd0

Thu May 22 06:02:16 2008
     cpu
 us sy wt id
  0 82  0 18
                    extended device statistics
    r/s    w/s     kr/s     kw/s wait actv wsvc_t asvc_t  %w  %b device
   54.4   14.8   3907.3    558.2  0.0  9.0    0.0  129.7   0  41 c4
   26.0    7.2   1891.3    282.6  0.0  4.2    0.0  126.7   0  18 c4t0d0
   28.3    7.6   2016.0    275.6  0.0  4.8    0.0  132.5   0  22 c4t1d0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 c4t2d0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 c4t3d0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 c4t4d0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 c5
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 c5t0d0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 c6
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 c6t0d0
    0.0    0.0      0.0      0.0  0.0  0.0    0.0    0.0   0   0 fd0

I notice the copy is still going, but I'm back to semi-responsive, possibly when the second file starts (7 seconds instead of 5). This looks like the compression thread(s) running at too high a priority.

The files I'm copying for my test are:

-rw-r--r-- 1 root root 2240902488 2008-05-21 19:32 it-20080106.zfs
-rw-r--r-- 1 root root 1381914720 2008-05-21 19:40 it-20080131.zfs

They are zfs send logs, so pretty large. They are also compressed.

What concerns me about this isn't that I've successfully overloaded the CPU, that's to be expected, but that NOTHING seems to run at that point. The scheduler IMHO should be taking care of other requests instead of giving ZFS compression all the CPU. For example, if I try to ssh to the box while this runs, I can't log in for almost a minute; it's just unresponsive. I didn't test enough other things, but I assume the entire system is hung.

I also noticed (perhaps by design) that a copy with compression off returns almost instantly, but the writes continue LONG after the cp process claims to be done. Is this normal? Wouldn't closing the file ensure it was written to disk? Is that tunable somewhere?

This message posted from opensolaris.org
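The intervals between the iostat samples quoted in this post can be checked with a few lines of Python. This is only a sketch: the helper names `sample_gaps` and `stalled` and the one-second `slack` threshold are illustrative assumptions, and the timestamps are copied from the output above.

```python
from datetime import datetime

# Timestamps copied from the iostat output quoted in the post.
SAMPLES = [
    "Thu May 22 06:01:20 2008",
    "Thu May 22 06:02:04 2008",
    "Thu May 22 06:02:09 2008",
    "Thu May 22 06:02:16 2008",
]


def sample_gaps(stamps, fmt="%a %b %d %H:%M:%S %Y"):
    """Return the seconds elapsed between consecutive timestamps."""
    times = [datetime.strptime(s, fmt) for s in stamps]
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


def stalled(gaps, interval=5, slack=1):
    """Return the gaps that overshoot the requested interval by more than slack."""
    # slack is a hypothetical tolerance, not something iostat defines.
    return [g for g in gaps if g > interval + slack]


if __name__ == "__main__":
    gaps = sample_gaps(SAMPLES)
    print("gaps between samples (s):", gaps)
    print("gaps longer than expected:", stalled(gaps))
```

For the four samples above the gaps come out as 44, 5 and 7 seconds, which matches the figures given in the post for a nominal 5-second interval.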
Neil Perrin
2008-May-22 16:10 UTC
[zfs-discuss] Pause Solaris with ZFS compression busy by doing a cp?
> I also noticed (perhaps by design) that a copy with compression off almost
> instantly returns, but the writes continue LONG after the cp process claims
> to be done. Is this normal?

Yes, this is normal. Unless the application is doing synchronous writes (e.g. a DB), the file will be written to disk at the convenience of the FS. Most file systems operate this way. It's too expensive to write out data synchronously, so it's batched up and written asynchronously.

> Wouldn't closing the file ensure it was written to disk?

No.

> Is that tunable somewhere?

No. For ZFS you can use sync(1M), which will force out all transactions for all files in the pool. That is expensive though.

Neil.
Bart Smaalders
2008-May-22 21:05 UTC
[zfs-discuss] Pause Solaris with ZFS compression busy by doing a cp?
Neil Perrin wrote:
>> I also noticed (perhaps by design) that a copy with compression off almost
>> instantly returns, but the writes continue LONG after the cp process claims
>> to be done. Is this normal?
>
> Yes this is normal. Unless the application is doing synchronous writes
> (eg DB) the file will be written to disk at the convenience of the FS.
> Most fs operate this way. It's too expensive to synchronously write
> out data, so it's batched up and written asynchronously.
>
>> Wouldn't closing the file ensure it was written to disk?
>
> No.
>
>> Is that tunable somewhere?
>
> No. For ZFS you can use sync(1M) which will force out all transactions
> for all files in the pool. That is expensive though.
>
> Neil.

Your application can call f[d]sync when it's done writing the file and before it does the close if it wants all the data on disk. This has been standard operating procedure for many, many years. From TFMP:

DESCRIPTION
     The fsync() function moves all modified data and attributes of the file descriptor fildes to a storage device. When fsync() returns, all in-memory modified copies of buffers associated with fildes have been written to the physical medium. The fsync() function is different from sync(), which schedules disk I/O for all files but returns before the I/O completes. The fsync() function forces all outstanding data operations to synchronized file integrity completion (see fcntl.h(3HEAD) definition of O_SYNC.)

...

USAGE
     The fsync() function should be used by applications that require that a file be in a known state. For example, an application that contains a simple transaction facility might use fsync() to ensure that all changes to a file or files caused by a given transaction were recorded on a storage medium.

- Bart

--
Bart Smaalders			Solaris Kernel Performance
barts at cyber.eng.sun.com	http://blogs.sun.com/barts
"You will contribute more with mercurial than with thunderbird."
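The fsync-before-close pattern described in this reply can be illustrated in Python, whose `os.fsync()` wraps the same system call. This is a minimal sketch rather than anything taken from the thread: the function name `write_durably` and the temporary file path are made up for the example.

```python
import os
import tempfile


def write_durably(path, data):
    """Write data to path and return only after fsync() has completed."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # push Python's userspace buffer down to the kernel
        os.fsync(f.fileno())  # ask the kernel to commit the file to stable storage
    # The file is closed here; close() alone would not have forced the write.
    return len(data)


if __name__ == "__main__":
    # Hypothetical target path, used only for demonstration.
    target = os.path.join(tempfile.mkdtemp(), "example.dat")
    written = write_durably(target, b"x" * 4096)
    print("bytes written and synced:", written)
```

A plain cp does not do this, which is consistent with the observation that cp returns long before the disk activity stops.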
Steve Radich, BitShop, Inc.
2008-May-27 13:31 UTC
[zfs-discuss] Pause Solaris with ZFS compression busy by doing a cp?
My bigger question doesn't have to do with the file close / sync, but with why I can hang Solaris with a simple write to a ZFS location with gzip-9, for example.

To reproduce: copy some files to your box (gigs of files), then create another ZFS mount point with compression=gzip-9 (any level works; 9 demonstrates it most easily).

cp /originallocation/* /gzip9location/

Try running iostat in another ssh window and you'll see it can't even gather stats every 5 seconds (below are the iostat timestamps at a 5-second interval):

Tue May 27 09:26:41 2008
Tue May 27 09:26:57 2008
Tue May 27 09:27:34 2008

That's on a Core 2 Quad Q6600 CPU with 8 GB RAM, with NOTHING ELSE on the box other than cp running.

The prioritization of the compression seems wrong, i.e. other tasks on the system need some CPU time. Even a directory listing via a share (CIFS in my case) takes forever with this running. A single operation shouldn't be able to slow the system to a crawl like this.

As to file close: I'd have to grab an NT resource kit and read, but I believe the proper behavior for CIFS is to sync upon close. That definitely happens on a network share on a genuine Windows server: if I set compression=on for the folder, create a file, and close it, then the compression finishes before the file closes. This sometimes has issues on large files (basically the same test as this). That, however, is only a curious behavior; the pause problem is the major concern, since a single user can almost hang a quad-core CPU with a simple copy.
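One way to put a number on this kind of stall, independent of iostat, is a small probe that sleeps for a fixed interval and records how late each wakeup is. The sketch below is illustrative only: `wakeup_lateness` is a hypothetical helper, and the one-second interval and ten rounds are arbitrary choices.

```python
import time


def wakeup_lateness(interval=1.0, rounds=10):
    """Sleep `interval` seconds `rounds` times; return how late each wakeup was."""
    late = []
    for _ in range(rounds):
        start = time.monotonic()
        time.sleep(interval)
        elapsed = time.monotonic() - start
        # Clamp at zero so clock granularity never reports a negative delay.
        late.append(max(0.0, elapsed - interval))
    return late


if __name__ == "__main__":
    for delay in wakeup_lateness():
        print(f"woke up {delay:.3f} s late")
```

On an idle machine the delays should stay small. Running the probe in a second session while the gzip-9 copy is in progress would show whether ordinary user processes are being starved in the way the iostat timestamps suggest.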
roland
2008-Jun-03 22:16 UTC
[zfs-discuss] Pause Solaris with ZFS compression busy by doing a cp?
> Try running iostat in another ssh window, you'll see it can't even gather stats every 5 seconds
> (below is iostats every 5 seconds):
> Tue May 27 09:26:41 2008
> Tue May 27 09:26:57 2008
> Tue May 27 09:27:34 2008

That should not happen! I'd call that a bug! How does vmstat behave with lzjb compression?