Anantha N. Srirama
2006-Aug-16 22:02 UTC
[zfs-discuss] ZFS write performance problem with compression set to ON
Test setup:

- E2900 with 12 US-IV+ 1.5GHz processors, 96GB memory, 2x2Gbps FC HBAs, MPxIO in round-robin config.
- 50x64GB EMC disks presented across both FCs.
- ZFS pool defined using all 50 disks.
- Multiple ZFS filesystems built on the above pool.

I'm observing the following:

- When the filesystems have compress=OFF and I do bulk reads/writes (8 parallel 'cp's running between ZFS filesystems) I observe approximately 200-250MB/s consolidated I/O, with writes in the 100MB/s range. I get these numbers running 'zpool iostat 5'. I see the same read/write ratio for the duration of the test.
- When the filesystems have compress=ON, reads from the compressed filesystems come in waves: zpool reports no read activity for long durations (60+ seconds), while the write activity is consistently reported at 20MB/s (no variation in the write rate throughout the test).
- The machine is mostly idling during the entire test, in both cases.
- ZFS reports a 4:1 compression ratio for my filesystem.

I'm puzzled by the following:

- Why do reads come in waves with compression=ON? It almost feels like ZFS reads a bunch of data and then proceeds to compress it before writing it out. This tells me there is not a read bottleneck, i.e. no starvation of the compress routine, given that CPU/machine/IO is not saturated in any shape or form.
- Why then does ZFS generate substantially lower write throughput (a magical 20MB/s spread evenly across the 50 disks, 0.4MB/s each)?

Can anybody shed any light on this anomaly? Mr. Bonwick, I hope you're reading this post.

BTW, we love ZFS and are looking forward to rolling it out aggressively in our new project. I'd like to take advantage of the compression since we're mostly I/O bound and we have plenty of CPU/memory.

Thanks.
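For reference, a minimal sketch of the test harness described above; the filesystem and file names are hypothetical:

    # kick off 8 parallel copies between ZFS filesystems in the pool
    for i in 1 2 3 4 5 6 7 8; do
        cp /tank/fs$i/bigfile /tank/scratch/fs$i/bigfile &
    done

    # meanwhile, in another terminal, watch consolidated pool I/O:
    #   zpool iostat 5

    wait    # block until all 8 copies complete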
Anantha N. Srirama
2006-Aug-16 22:05 UTC
[zfs-discuss] Re: ZFS write performance problem with compression set to ON
Completely forgot to mention the OS in my previous post: Solaris 10 06/06.
Jeff Bonwick
2006-Aug-17 05:28 UTC
[zfs-discuss] ZFS write performance problem with compression set to ON
> - When the filesystems have compress=ON, reads from the compressed
> filesystems come in waves: zpool reports no read activity for long
> durations (60+ seconds), while the write activity is consistently
> reported at 20MB/s (no variation in the write rate throughout the test).

My guess is that your benchmark is writing blocks of zeroes. With compression enabled, ZFS will not merely compress blocks of zeroes, it will eliminate them -- i.e. turn them into holes. The only thing left to write is znode updates (mtime, etc). This will appear as a very low disk I/O rate because it is; to measure the logical I/O rate, I'd suggest just timing it.

> BTW, we love ZFS and are looking forward to rolling it out aggressively
> in our new project. I'd like to take advantage of the compression since
> we're mostly I/O bound and we have plenty of CPU/memory.

Great! Please let us know if this explains the mystery.

Jeff
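A quick way to observe the hole elimination described above; pool and path names are hypothetical:

    # a compressed filesystem to experiment in
    zfs create tank/comp
    zfs set compression=on tank/comp

    # write 1GB of zeroes; with compression on, ZFS turns these
    # blocks into holes rather than writing them out
    dd if=/dev/zero of=/tank/comp/zeros bs=1024k count=1024

    ls -l /tank/comp/zeros    # logical size: ~1GB
    du -k /tank/comp/zeros    # blocks actually allocated: almost none

    # the logical write rate, per the timing suggestion above:
    ptime dd if=/dev/zero of=/tank/comp/zeros2 bs=1024k count=1024

If the timed logical rate is high while zpool iostat shows almost nothing, the low disk numbers are just compression (and hole elimination) doing their job.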
Anantha N. Srirama
2006-Aug-17 11:40 UTC
[zfs-discuss] Re: ZFS write performance problem with compression set to ON
Therein lies my dilemma:

- We know the I/O subsystem is capable of much higher I/O rates.
- Under the test setup I've SAS datasets which lend themselves to compression. This should manifest itself as lots of read I/O resulting in much smaller (4x) write I/O due to compression. That means my read rates should be driven higher to keep the compression fed. I don't see this; as I said in my original post, the reads come in waves.

I'm beginning to think my write rates are hitting a bottleneck in compression, as follows:

- ZFS issues reads.
- ZFS starts compressing the data before the write and cannot drain the input buffers fast enough; this causes the reads to stop.
- ZFS completes compression and writes out data at a much smaller rate due to the smaller compressed data stream.

I'm not a filesystem wizard, but shouldn't ZFS take advantage of my available CPUs to drain the input buffer faster (in parallel)? It is possible that you've some internal throttles in place to make ZFS a good citizen in the Solaris landscape, a la the algorithms that prevent cache flooding by one host/device in EMC/Hitachi arrays.

I'll perform some more tests with different datasets and report to the forum. Now if only I can convince my storage administrator to provision me raw disks instead of mirrored disks, so I can let ZFS do the mirroring for me; another battle for another day ;-)

Thanks.
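One way to test this hypothesis is to watch per-CPU utilization during the write phase; a sketch using stock Solaris tools, nothing ZFS-specific:

    # sample per-CPU statistics every 5 seconds while the copies run
    mpstat 5

If compression is serialized, the signature would be a single CPU showing high %sys (the kernel compression work) while the remaining CPUs sit near idle, even though the machine as a whole looks unloaded.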
Roch
2006-Aug-17 12:13 UTC
[zfs-discuss] Re: ZFS write performance problem with compression set to ON
Anantha N. Srirama writes:

> I'm beginning to think my write rates are hitting a bottleneck in
> compression [...] I'm not a filesystem wizard, but shouldn't ZFS take
> advantage of my available CPUs to drain the input buffer faster (in
> parallel)?

Yes, compression runs in the context of a single thread per pool, so it seems quite possible that it is causing the behavior you see. The issue is being tracked here:

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6460622

-r
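Until that fix arrives, one possible (untested) workaround follows from the one-thread-per-pool behavior: splitting the disks into several smaller pools gives each pool its own compression thread. Device names below are hypothetical:

    # two pools instead of one, so two compression threads can run
    zpool create tank1 c2t0d0 c2t1d0 c2t2d0
    zpool create tank2 c2t3d0 c2t4d0 c2t5d0
    zfs set compression=on tank1
    zfs set compression=on tank2

The tradeoff is losing a single unified pool of free space, so this only makes sense where the datasets partition naturally.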
Anantha N. Srirama
2006-Aug-21 16:58 UTC
[zfs-discuss] Re: ZFS write performance problem with compression set to ON
I've a few questions:

- Does 'zpool iostat' report numbers from the top of the ZFS stack or from the bottom? I've correlated the zpool iostat numbers with the system iostat numbers and they match up. This tells me the numbers are from the 'bottom' of the ZFS stack, right? Having said that, it'd be nice to have zpool iostat return numbers from the top of the stack as well. This becomes relevant when we've compression=ON.
- Secondly, I did some more tests and I see the same read waves and the same consistent write throughput. I've been reading another thread on this forum about Niagara and compression where Matt Ahrens noted that compression at this time is single-threaded. Further, he stated that there may be a bugfix released to use multiple threads. I eagerly await the fix.

Thanks again for a great feature. Looking forward to more fun stuff out of Sun and you, Mr. Bonwick.
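The correlation described above is easy to reproduce; the pool name is hypothetical:

    # pool-level statistics every 5 seconds, in one terminal
    zpool iostat tank 5

    # per-device statistics in another (-x extended, -n logical
    # device names, -z suppress idle devices)
    iostat -xnz 5

If the per-device numbers sum to roughly the zpool iostat figures, the pool statistics are physical (post-compression) rates; with compression=ON and a 4:1 ratio, the logical write rate would then be about four times what either tool reports.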