Recently, I've been benchmarking all kinds of stuff on my systems, and one question I can't intelligently answer is what block size I should use in these tests.

I assume there is some tool that monitors current disk activity, which I could run on my production servers to collect statistics on the block sizes the users are actually reading and writing. I could then use that information to make an informed decision about the block size to use while benchmarking.

Is there a man page I should read to figure out how to monitor and get statistics on my real-life users' disk activity?

Thanks.
On Mar 6, 2010, at 1:02 PM, Edward Ned Harvey wrote:
> Recently, I'm benchmarking all kinds of stuff on my systems. And one question I can't intelligently answer is what blocksize I should use in these tests.
>
> I assume there is something which monitors present disk activity, that I could run on my production servers, to give me some statistics of the block sizes that the users are actually performing on the production server. And then I could use that information to make an informed decision about block size to use while benchmarking.
>
> Is there a man page I should read, to figure out how to monitor and get statistics on my real life users' disk activity?

It all depends on how they are connecting to the storage. iSCSI, CIFS, NFS, database, rsync, ...?

The reason I say this is because ZFS will coalesce writes, so just looking at iostat data (ops versus size) will not be appropriate. You need to look at the data flowing between ZFS and the users. fsstat works for file systems, but won't work for zvols, as an example.

-- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
http://nexenta-atlanta.eventbrite.com (March 16-18, 2010)
> It all depends on how they are connecting to the storage. iSCSI, CIFS, NFS,
> database, rsync, ...?
>
> The reason I say this is because ZFS will coalesce writes, so just looking at
> iostat data (ops versus size) will not be appropriate. You need to look at the
> data flowing between ZFS and the users. fsstat works for file systems, but
> won't work for zvols, as an example.
> -- richard

Actually, maybe that is right. Since the users are connecting via CIFS and NFS and ssh to use a ZFS volume, it stands to reason that ZFS is ultimately the thing which is performing all the read/write operations on the physical disks, right? So if I use iostat, and I see coalesced data ... I get statistics on ops and size ... which is truly the real world usage scenario for my system, right? Thus, when I am trying to optimize my disk configuration, benchmarking with iozone or whatever, those statistics will be the best measurement for me to use, when I tell iozone the blocksize it should test. Right?
On Mar 8, 2010, at 5:11 AM, Edward Ned Harvey wrote:
>> It all depends on how they are connecting to the storage. iSCSI, CIFS, NFS,
>> database, rsync, ...?
>>
>> The reason I say this is because ZFS will coalesce writes, so just looking at
>> iostat data (ops versus size) will not be appropriate. You need to look at the
>> data flowing between ZFS and the users. fsstat works for file systems, but
>> won't work for zvols, as an example.
>> -- richard
>
> Actually, maybe that is right. Since the users are connecting via CIFS and
> NFS and ssh to use a ZFS volume, it stands to reason that ZFS is ultimately
> the thing which is performing all the read/write operations on the physical
> disks, right? So if I use iostat, and I see coalesced data ... I get
> statistics on ops and size ... which is truly the real world usage scenario
> for my system, right? Thus, when I am trying to optimize my disk
> configuration, benchmarking with iozone or whatever, those statistics will
> be the best measurement for me to use, when I tell iozone the blocksize it
> should test. Right?

Unfortunately, tools like iozone aren't at all like the real world. If they were, then life would be simple and sweet :-).

Consider the case where your application can take advantage of the cache for some items and prefetching for others. It is possible that these workloads don't hit the physical disks. In other words, it is rare that an application acts like a 100% cache miss random workload or 100% sequential workload. This is one reason why more complex load generators, like filebench, exist -- a problem that has dogged capacity planners for a long time. But at the end of the day, your real workload will look different. Oh, and add compression and dedup to really spoil your capacity planner's day :-).

One thing is sure: YMMV.
-- richard
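
For anyone who wants to turn the "ops versus size" iostat figures discussed above into an average I/O size, here is a minimal Python sketch. It assumes one data row laid out as device, r/s, w/s, kr/s, kw/s, followed by other columns, which is the order I would expect from `iostat -x` style output; check the header on your own system first. The function names and the sample row are made up for illustration. As noted in the thread, the result describes what reaches the disks after ZFS has coalesced and cached the I/O, not what the users issued.

```python
def average_io_kb(ops_per_sec: float, kb_per_sec: float) -> float:
    """Average size of one I/O in KB, or 0.0 when there were no operations."""
    if ops_per_sec <= 0:
        return 0.0
    return kb_per_sec / ops_per_sec


def summarize(line: str) -> dict:
    """Parse one row assumed to be: device r/s w/s kr/s kw/s [other columns]."""
    fields = line.split()
    device = fields[0]
    reads, writes, kb_read, kb_written = (float(x) for x in fields[1:5])
    return {
        "device": device,
        "avg_read_kb": average_io_kb(reads, kb_read),
        "avg_write_kb": average_io_kb(writes, kb_written),
    }


# Hypothetical sample row, not a real measurement.
sample = "sd0 50.0 20.0 400.0 2560.0 0.0 0.1 1.5 0 3"

if __name__ == "__main__":
    print(summarize(sample))
```

With the made-up row above, reads average 8 KB and writes average 128 KB per operation. A gap like that between reads and writes would be consistent with write coalescing, which is why these numbers alone may not be a good guide to the block size users actually issue.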