Using a Galaxy 4200 with one zpool striped on 3 73GB SAS 2.5 inch drives (no raid-z). I create one file system in the pool and measure write performance to a single file (I do enough I/O to discount the buffer cache before taking measurements).

Sequential write performance = ~155MB/sec - seems OK, since each drive tops out at about 55-60 MB/sec.

Random write performance = ~6MB/sec. This seems really slow, especially since I thought ZFS would do better at sequentializing random writes.

Is this consistent with what others have experienced? Is there something I can do to improve random write performance?

Chuck
Roch Bourbonnais - Performance Engineering
2006-Apr-28 15:06 UTC
[zfs-discuss] random i/o performance
Chuck Gehr writes:

> Using a Galaxy 4200 with one zpool striped on 3 73GB SAS 2.5 inch drives (no raid-z).
>
> I create one file system in the pool and measure write performance to
> a single file (I do enough I/O to discount the buffer cache before
> taking measurements).
>
> Sequential write performance = ~155MB/sec - seems OK since each drive
> tops out at about 55-60 MB/sec.

OK.

> Random write performance = ~6MB/sec. This seems really slow
> especially since I thought ZFS would do better at sequentializing
> random writes.

I would have expected 155MB/sec here.

> Is this consistent with what others have experienced? Is there
> something I can do to improve random write performance?

Can you describe your test and timing method?
I have written my own test program that uses open() and write(). All writes are a multiple of 512 bytes in size. For the random test, I use rand() to pick a starting block number. I've tried write sizes between 8k and 128k - for the random I/O test, write size didn't seem to have much effect on performance. For timing, I use gettimeofday().

Chuck
Roch Bourbonnais - Performance Engineering
2006-Apr-28 16:01 UTC
[zfs-discuss] random i/o performance
So something is wrong. Can you shoot me your test (that'll save me a few minutes) and I'll see what's up (but not before Tuesday). 128K random writes to an existing file (no O_DSYNC) should go at the top speed of the I/O subsystem; with O_DSYNC they should be a good fraction of a single spindle. Bob Sneed mentioned to me to try enabling the disk write cache using 'format -e'.

-r
On Fri, Apr 28, 2006 at 09:32:00AM -0600, Gehr, Chuck R wrote:

> I have written my own test program that uses open() and write(). All
> writes are a multiple of 512 bytes in size. For the random test, I use
> rand() to pick a starting block number. I've tried write sizes between
> 8k and 128k - for the random I/O test, write size didn't seem to have
> much effect on performance. For timing, I use gettimeofday().

You will get the best performance doing 128k-aligned, 128k I/Os. Can you try your random test like that, and without O_DSYNC, without fsync()?

It sounds like you might be doing 128k I/O to only 512-aligned offsets. If that's the case, then ZFS will have to read in the two blocks that your write partially covers.

--matt