besson3c
2010-Jun-07 23:59 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
Hello, I'm wondering if somebody can kindly direct me to a sort of newbie way of assessing whether my ZFS pool performance is a bottleneck that can be improved upon, and/or whether I ought to invest in a mirrored pair of SSDs for the ZIL. I'm a little confused by what the output of iostat, fsstat, the zilstat script, and other diagnostic tools actually tells me, and I'm definitely not completely confident in what I think I do understand. I'd like to start over from square one with my understanding of all of this.

So, instead of my posting a bunch of numbers, could you please help me with some basic tactics and techniques for making these assessments? I have some reason to believe that there are performance problems, as the load on the machine writing to these ZFS NFS shares can get pretty high during heavy writing of small files. Throw in the ZFS queue parameters on top of all these other numbers and variables and I'm a little confused as to where best to start. It is also possible that the ZFS server is not the bottleneck here, but I would love to feel a little more confident in my assessments.

Thanks for your help! I expect that this conversation will get pretty technical and that's cool (that's what I want too), but hopefully this is enough to get the ball rolling!
Khyron
2010-Jun-08 06:33 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
It would be helpful if you posted more information about your configuration. Numbers *are* useful too, but minimally, describing your setup, use case, the hardware and other such facts would give people a place to start.

There are much brighter stars on this list than myself, but if you are sharing your ZFS dataset(s) via NFS with a heavy traffic load (particularly writes), a mirrored SLOG will probably be useful. (The ZIL is a component of every ZFS pool. A SLOG is a device, usually an SSD or a mirrored pair of SSDs, on which you can locate your ZIL for enhanced *synchronous* write performance.) Since NFS generates sync writes, that might be a win for you, but again it depends on a lot of factors.

Help us (or rather, the community) help you by providing real information and data.

On Mon, Jun 7, 2010 at 19:59, besson3c <joe at netmusician.org> wrote:

> Hello,
>
> I'm wondering if somebody can kindly direct me to a sort of newbie way of
> assessing whether my ZFS pool performance is a bottleneck that can be
> improved upon, and/or whether I ought to invest in a SSD ZIL mirrored pair?
> [...]

--
"You can choose your friends, you can choose the deals." - Equity Private
"If Linux is faster, it's a Solaris bug." - Phil Harman

Blog - http://whatderass.blogspot.com/
Twitter - @khyron4eva
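P.S. If a SLOG does turn out to help, attaching one to an existing pool is a one-liner; something along these lines (pool and device names here are made up, substitute your own):

  zpool add tank log mirror c2t0d0 c2t1d0

That adds the two devices as a mirrored dedicated log vdev to the pool "tank"; the existing data vdevs are untouched.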
besson3c
2010-Jun-08 17:33 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
Khyron wrote:
> It would be helpful if you posted more information about your configuration.
> Numbers *are* useful too, but minimally, describing your setup, use case, the
> hardware and other such facts would provide people a place to start. There are
> much brighter stars on this list than myself, but if you are sharing your ZFS
> dataset(s) via NFS with a heavy traffic load (particularly writes), a mirrored
> SLOG will probably be useful.

Sure! The pool consists of 6 SATA drives configured as RAID-Z. There are no special read or write cache drives. This pool is shared to several VMs via NFS; these VMs manage email, web, and a Quickbooks server running on FreeBSD, Linux, and Windows.

On heavy reads or writes (writes seem to be more problematic) the load averages on my VM host shoot up and overall performance bogs down. I suspect that I do need a mirrored SLOG, but I'm wondering what the best way is to go about assessing this so that I can be more certain? I'm also wondering what other sorts of things can be tweaked software-wise, on either the VM host (running CentOS) or the Solaris side, to give me a little more headroom. The thought has crossed my mind that a dedicated SLOG pair of SSDs might be overkill for my needs; this is not a huge business (yet :)

Thanks for your help!
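For reference, the sort of thing I've been watching so far is roughly this, along with the zilstat script (the pool here is called "nm"; adjust to taste):

  # pool layout and health
  zpool status nm
  # per-vdev throughput and IOPS, 5 second samples
  zpool iostat -v nm 5
  # per-device service times and %busy
  iostat -xn 5
  # ZFS-level operation counts
  fsstat zfs 5

I'm just not sure how to read what comes out of these in terms of "is the pool the bottleneck or not."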
Brandon High
2010-Jun-08 18:08 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
On Tue, Jun 8, 2010 at 10:33 AM, besson3c <joe at netmusician.org> wrote:
> On heavy reads or writes (writes seem to be more problematic) my load
> averages on my VM host shoot up and overall performance is bogged down. I
> suspect that I do need a mirrored SLOG, but I'm wondering what the best way is

The load that you're seeing is probably iowait. If that's the case, it's almost certainly the write speed of your pool. A raidz will be slow for your purposes, and adding a slog (dedicated log device) may help. There's been lots of discussion in the archives about how to determine if a log device will help, such as using zilstat or disabling the ZIL and testing.

You may want to set the recordsize smaller for the datasets that contain vmdk files as well. With the default recordsize of 128k, a 4k write by the VM host can result in 128k being read from and written to the dataset.

What VM software are you using? There are a few knobs you can turn in VBox which will help with slow storage. See http://www.virtualbox.org/manual/ch12.html#id2662300 for instructions on reducing the flush interval.

-B

--
Brandon High : bhigh at freaks.com
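P.S. From memory, the flush interval knob is an extradata key that looks something like this (substitute your VM's name and disk number, and take the exact key and value from the manual page above; the value is a byte count, so roughly 10 MB here):

  VBoxManage setextradata "MyVM" \
      "VBoxInternal/Devices/piix3ide/0/LUN#0/Config/FlushInterval" 10485760

That makes VBox flush its buffered writes every so many bytes instead of letting them pile up. For a SATA controller the key starts with "VBoxInternal/Devices/ahci/..." instead of piix3ide.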
Joe Auty
2010-Jun-08 18:27 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
Brandon High wrote:
> The load that you're seeing is probably iowait. If that's the case,
> it's almost certainly the write speed of your pool. A raidz will be
> slow for your purposes, and adding a slog may help.
>
> You may want to set the recordsize smaller for the datasets that
> contain vmdk files as well. With the default recordsize of 128k, a 4k
> write by the VM host can result in 128k being read from and written to
> the dataset.
>
> What VM software are you using? There are a few knobs you can turn in
> VBox which will help with slow storage. See
> http://www.virtualbox.org/manual/ch12.html#id2662300 for instructions
> on reducing the flush interval.

I'd love to use VirtualBox, but right now it (3.2.2 commercial, which I'm evaluating; I haven't been able to compile OSE on the CentOS 5.5 host yet) is giving me kernel panics on the host while starting up VMs, which are obviously bothersome, so I'm exploring continuing to use VMware Server and seeing what I can do on the Solaris/ZFS side of things. I've also read this on a VMware forum, although I don't know if it is correct? This is in the context of me questioning why I don't seem to have these same load average problems running VirtualBox:

> The problem with the VirtualBox comparison is that caching is known to be
> broken in VirtualBox (it ignores cache flushes, which, by continuing to
> cache, can "speed up" IO at the expense of data integrity or loss). This
> could be playing in your favor from a performance perspective, but it puts
> your data at risk. Disabling disk caching altogether would be a big hit on
> the VirtualBox side... Neither solution is ideal.

If this is incorrect and I can get VirtualBox working stably, I'm happy to switch to it. It has definitely performed better prior to my panics, and others on the internet seem to agree that it outperforms VMware products in general. I'm definitely not opposed to this idea.

I've actually never seen much, if any, iowait (%w in iostat output, right?). I've run the zilstat script and am happy to share that output with you if you wouldn't mind taking a look at it? I'm not sure I'm understanding its output correctly...

As far as recordsizes go, the evil tuning guide says this:

> Depending on workloads, the current ZFS implementation can, at times,
> cause much more I/O to be requested than other page-based file
> systems. If the throughput flowing toward the storage, as observed by
> iostat, nears the capacity of the channel linking the storage and the
> host, tuning down the zfs recordsize should improve performance. This
> tuning is dynamic, but only impacts new file creations. Existing files
> keep their old recordsize.

Will this tuning have an impact on my existing VMDK files? Can you kindly tell me more about this, how I can observe my current recordsize and play around with this setting if it will help? Will adjusting ZFS compression on the share hosting my VMDKs be of any help too? Compression is disabled on the ZFS share where my VMDKs are hosted.

This ZFS host hosts regular data shares in addition to the VMDKs. All user data on my VM guests that is subject to change is hosted on a ZFS share; only the OS and basic OS applications are saved to my VMDKs.

--
Joe Auty, NetMusician
NetMusician helps musicians, bands and artists create beautiful, professional, custom designed, career-essential websites that are easy to maintain and to integrate with popular social networks.
www.netmusician.org | joe at netmusician.org
Brandon High
2010-Jun-08 18:41 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
On Tue, Jun 8, 2010 at 11:27 AM, Joe Auty <joe at netmusician.org> wrote:
> I've also read this on a VMware forum, although I don't know if it is
> correct? This is in the context of me questioning why I don't seem to have
> these same load average problems running VirtualBox:
>
>> The problem with the VirtualBox comparison is that caching is known to be
>> broken in VirtualBox (it ignores cache flushes, which, by continuing to
>> cache, can "speed up" IO at the expense of data integrity or loss).

Check the link that I posted earlier, under "Responding to guest IDE/SATA flush requests". Setting IgnoreFlush to 0 will turn off the extra caching.

> I've actually never seen much, if any, iowait (%w in iostat output, right?).
> I've run the zilstat script and am happy to share that output with you if
> you wouldn't mind taking a look at it? I'm not sure I'm understanding its
> output correctly...

You'll see iowait on the VM, not on the zfs server.

> Will this tuning have an impact on my existing VMDK files? Can you kindly
> tell me more about this, how I can observe my current recordsize and play
> around with this setting if it will help? Will adjusting ZFS compression on
> the share hosting my VMDKs be of any help too?

No, your existing files will keep whatever recordsize they were created with. You can view or change the recordsize property the same as any other zfs property. You'll have to recreate the files to re-write them with a different recordsize (e.g. cp file.vmdk file.vmdk.foo && mv file.vmdk.foo file.vmdk).

> This ZFS host hosts regular data shares in addition to the VMDKs. All user
> data on my VM guests that is subject to change is hosted on a ZFS share;
> only the OS and basic OS applications are saved to my VMDKs.

The property is per dataset. If the vmdk files are in separate datasets (which I recommend) you can adjust the properties or take snapshots of each VM's data separately.

-B

--
Brandon High : bhigh at freaks.com
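P.S. The IgnoreFlush setting is another extradata key; for the first IDE disk it should be something like this (substitute your VM's name and disk number, use ahci instead of piix3ide for a SATA controller, and double-check against the manual page above):

  VBoxManage setextradata "MyVM" \
      "VBoxInternal/Devices/piix3ide/0/LUN#0/Config/IgnoreFlush" 0

With the value 0, VBox stops ignoring the guest's flush requests, so a flush from the guest actually reaches the host's storage.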
Joe Auty
2010-Jun-08 19:04 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
Brandon High wrote:
> Check the link that I posted earlier, under "Responding to guest IDE/SATA
> flush requests". Setting IgnoreFlush to 0 will turn off the extra caching.

Cool, so maybe this guy was going off of earlier information? Was there a time when there was no way to enable cache flushing in VirtualBox?

> You'll see iowait on the VM, not on the zfs server.

My mistake, yes, I see pretty significant iowait times on the host... Right now "iostat" is showing 9.30% wait times.

> No, your existing files will keep whatever recordsize they were created
> with. You can view or change the recordsize property the same as any other
> zfs property. You'll have to recreate the files to re-write them with a
> different recordsize.
>
> The property is per dataset. If the vmdk files are in separate datasets
> (which I recommend) you can adjust the properties or take snapshots of each
> VM's data separately.

Ahhh! Yes, my VMDKs are on a separate dataset, and the recordsize is set to 128k:

# zfs get recordsize nm/myshare
NAME        PROPERTY    VALUE  SOURCE
nm/myshare  recordsize  128K   default

Do you have a recommendation for a good size to start with for the dataset hosting VMDKs? Half of 128K? A third? In general, are large files better served with smaller recordsizes, whereas small files are better served with the 128k default?

--
Joe Auty, NetMusician
www.netmusician.org | joe at netmusician.org
Brandon High
2010-Jun-08 20:10 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
On Tue, Jun 8, 2010 at 12:04 PM, Joe Auty <joe at netmusician.org> wrote:
> Cool, so maybe this guy was going off of earlier information? Was there
> a time when there was no way to enable cache flushing in VirtualBox?

The default is to ignore cache flushes, so he was correct for the default setting. The IgnoreFlush setting has existed since 2.0 at least.

> My mistake, yes, I see pretty significant iowait times on the host... Right
> now "iostat" is showing 9.30% wait times.

That's not too bad, but not great. Here's output from a system at work:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.99    0.00    3.98   92.54    0.50    0.00

The problem is that io gets bursty, so you'll have good speeds for the most part, followed by some large waits. Small writes to the vmdk will have the worst performance, since the 128k block has to be read and written out with the change. Because your guest has /var on the vmdk, there are constant small writes going to the pool.

> Do you have a recommendation for a good size to start with for the dataset
> hosting VMDKs? Half of 128K? A third?

There are inherent tradeoffs in using smaller blocks, notably more overhead for checksums. zvols use an 8k volblocksize by default, which is probably a decent size.

> In general, are large files better served with smaller recordsizes, whereas
> small files are better served with the 128k default?

Files that have random small writes in the middle of the data will have poor performance: things such as database files, vmdk files, etc. Other than specific cases like what you've run into, you shouldn't ever need to adjust the recordsize.

-B

--
Brandon High : bhigh at freaks.com
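P.S. If you do try it, it's a one-line property change on your existing dataset; something like this (and remember it only applies to files written after the change, so the vmdks have to be copied/re-written to pick it up):

  zfs set recordsize=8k nm/myshare

A new dataset can also be created with the property up front, e.g. "zfs create -o recordsize=8k nm/vmdisks" (dataset name made up).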
Joe Auty
2010-Jun-08 22:00 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
I'm also noticing that I'm a little short on RAM. I have 6 x 320 GB drives and 4 GB of RAM. If the formula is POOL_SIZE/250, this would mean that I need at least 6.4 GB of RAM (with raidz the usable pool is roughly 5 x 320 GB = 1600 GB, and 1600 / 250 = 6.4).

What role does RAM play with queuing and caching and other things which might impact overall disk performance? How much more RAM should I get?
Ross Walker
2010-Jun-09 14:05 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
On Jun 8, 2010, at 1:33 PM, besson3c <joe at netmusician.org> wrote:
> Sure! The pool consists of 6 SATA drives configured as RAID-Z. There
> are no special read or write cache drives. This pool is shared to
> several VMs via NFS; these VMs manage email, web, and a Quickbooks
> server running on FreeBSD, Linux, and Windows.

Ok, well, RAIDZ is going to be a problem here. Each record is spread across the whole pool (each read/write will hit all drives in the pool), which has the side effect of making the total number of IOPS equal to the IOPS of the slowest drive in the pool. Since these are SATA drives, let's say the total number of IOPS will be 80, which is not good enough for what is a mostly random workload.

If it were a 6 drive pool of mirrors then it would be able to handle 240 IOPS write and up to 480 IOPS read (it can read from either side of a mirror).

I would probably rethink the setup. A ZIL will not buy you much here, and if your VM software is like VMware then each write over NFS will be marked FSYNC, which will force the lack of IOPS to the surface.

-Ross
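P.S. To make the layout concrete, a 6 drive pool of mirrors is three 2-way mirror vdevs, created along these lines (device names made up; note there is no in-place conversion from raidz, you'd have to back up, destroy and recreate the pool):

  zpool create tank \
      mirror c0t0d0 c0t1d0 \
      mirror c0t2d0 c0t3d0 \
      mirror c0t4d0 c0t5d0

Writes are striped across the three mirrors, which is where the roughly 3x random write IOPS over a single raidz vdev comes from.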
Travis Tabbal
2010-Jun-09 15:31 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
NFS writes on ZFS blow chunks performance-wise. The only way to increase the write speed is by using a slog; the problem is that a "proper" slog device (one that doesn't lose transactions) does not exist for a reasonable price. The least expensive SSD that will work is the Intel X25-E, and even then you have to disable the write cache, which kills performance. And if you lose transactions in the ZIL, you may as well not have one.

Switching to a pool configuration with mirrors might help some. You will still get hit with sync write penalties on NFS though. Before messing with that, try disabling the ZIL entirely and see if that's where your problems are. Note that running without a ZIL can cause you to lose about 30 seconds of uncommitted data, and if the server crashes without the clients rebooting, you can get corrupted data (from the client's perspective). However, it solved the performance issue for me. If that works, you can then decide how important the ZIL is to you.

Personally, I like things to be correct, but that doesn't help me if performance is in the toilet. In my case, the server is on a UPS and the clients aren't. And most of the clients use netboot anyway, so they will crash and have to be rebooted if the server goes down. So for me, the drawback is small while the performance gain is huge. That's not the case for everyone, and it's up to the admin to decide what they can live with. Thankfully, the next release of OpenSolaris will have the ability to set the ZIL on/off per filesystem.

Note that the ZIL only affects sync write speed, so if your workload isn't sync heavy, it might not matter in your case. However, with NFS in the mix, it probably is. The ZFS on-disk data state is not affected by ZIL on/off, so your pool's data IS safe. You might lose some data that a client THINKS is safely written, but the ZFS pool will come back properly on reboot. So the client will be wrong about what is and is not written, thus the possible "corruption" from the client perspective.

I run ZFS on 2 6-disk raidz2 arrays in the same pool and performance is very good locally. With the ZIL enabled, NFS performance was so bad it was near unusable. With it disabled, I can saturate the single gigabit link, and performance in the Linux VM (xVM) running on that server improved significantly, to near local speed, when using the NFS mounts to the main pool. My 5400 RPM drives were not up to the ZIL's needs, though they are plenty fast in general, and a working slog was out of budget for a home server.
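If I remember the Evil Tuning Guide right, disabling the ZIL for a test is a global switch on current builds, roughly like this (the affected filesystems have to be unmounted and remounted, or the box rebooted, before it takes effect):

  # permanently, in /etc/system (takes effect at next boot):
  set zfs:zil_disable = 1

  # or on a live system, followed by a remount of the NFS-shared filesystems:
  echo zil_disable/W0t1 | mdb -kw

Set it back to 0 (and remount again) once you've measured the difference.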
Edward Ned Harvey
2010-Jun-09 15:41 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-
> bounces at opensolaris.org] On Behalf Of besson3c
>
> I'm wondering if somebody can kindly direct me to a sort of newbie way
> of assessing whether my ZFS pool performance is a bottleneck that can
> be improved upon, and/or whether I ought to invest in a SSD ZIL
> mirrored pair? I'm a little confused by what the output of iostat,

There are a few generalities I can state, which may be of use:

* If you are serving NFS, then it's likely you're doing sync write operations, and therefore likely that a dedicated zil log device could benefit your write performance. To find out, you can disable your ZIL (requires dismounting & remounting the filesystem) temporarily and test performance with the ZIL disabled. If there is anything less than a huge performance gain, then there's no need for a dedicated log device.

* If you are doing large sequential read/write, then the performance of striping/mirroring/raidz are all comparable given similar numbers of usable disks. That is, specifically:
  o If you do a large sequential read, with 3 mirrors (6 disks) then you get 6x performance of a single disk.
  o If you do a large sequential read, with 7-disk raidz (capacity of 6 disks) then you get 6x performance of a single disk.
  o If you do a large sequential write, with 3 mirrors (6 disks) then you get 3x performance of a single disk.
  o If you do a large sequential write, with 7-disk raidz (capacity of 6 disks) then you get 6x performance of a single disk.

* So, for large sequential operations, the raidz would be cheaper and probably slightly faster.

* If you do small random operations, then striping/mirroring can vastly outperform raidz. Specifically:
  o If you do random reads, with 3 mirrors (6 disks) then you get 4x-5x performance of a single disk. (Assuming you have multiple threads or processes issuing those reads, or your read requests are queueable in any way.)
  o If you do random reads, with 7-disk raidz (capacity of 6 disks) you get about 50% faster than a single disk.
  o If you do random writes, with 3 mirrors, then you get about 2x performance of a single disk.
  o If you do random writes, with 7-disk raidz, you get about 50% faster than a single disk.

* So, for small operations, the striping/mirroring would certainly be faster.
Geoff Nordli
2010-Jun-09 16:20 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
> On Behalf Of Joe Auty
> Sent: Tuesday, June 08, 2010 11:27 AM
>
> I'd love to use VirtualBox, but right now it (3.2.2 commercial which I'm
> evaluating, I haven't been able to compile OSE on the CentOS 5.5 host yet) is
> giving me kernel panics on the host while starting up VMs which are obviously
> bothersome, so I'm exploring continuing to use VMware Server and seeing what I
> can do on the Solaris/ZFS side of things. I've also read this on a VMware forum,
> although I don't know if it is correct? This is in the context of me questioning
> why I don't seem to have these same load average problems running VirtualBox:

Hi Joe.

One thing about VBox is that they are rapidly adding new features, which causes some instability and regressions. Unless there is a real need for one of the new features in the 3.2 branch, I would recommend working with the 3.0 branch in a production environment. They will announce when they feel that 3.2 becomes production ready.

VirtualBox is a great type 2 hypervisor, and I can't believe how much it has improved over the last year.

Have a great day!

Geoff
Geoff Nordli
2010-Jun-09 16:20 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
Brandon High wrote:
> On Tue, Jun 8, 2010 at 10:33 AM, besson3c <joe at netmusician.org> wrote:
>
> What VM software are you using? There are a few knobs you can turn in VBox
> which will help with slow storage. See
> http://www.virtualbox.org/manual/ch12.html#id2662300 for instructions on
> reducing the flush interval.
>
> -B

Hi Brandon.

Have you played with the flush interval?

I am using iSCSI-based zvols, and I am thinking about not using the caching in VBox and instead relying on the comstar/zfs side.

What do you think?

Geoff
Garrett D''Amore
2010-Jun-09 17:13 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
You can hardly have too much. At least 8 GB, maybe 16, would be good.

The benefit will depend on your workload, but zfs and the buffer cache will use it all if you have a big enough read working set.

-- Garrett

Joe Auty <joe at netmusician.org> wrote:
> I'm also noticing that I'm a little short on RAM. I have 6 320 gig
> drives and 4 gig of RAM. If the formula is POOL_SIZE/250, this would
> mean that I need at least 6.4 gig of RAM.
>
> What role does RAM play with queuing and caching and other things which
> might impact overall disk performance? How much more RAM should I get?
Joe Auty
2010-Jun-09 17:46 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
Garrett D'Amore wrote:
> You can hardly have too much. At least 8 GB, maybe 16, would be good.
>
> The benefit will depend on your workload, but zfs and the buffer cache
> will use it all if you have a big enough read working set.

Could lack of RAM be contributing to some of my problems, do you think?

--
Joe Auty, NetMusician
www.netmusician.org | joe at netmusician.org
Brandon High
2010-Jun-09 18:49 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
On Wed, Jun 9, 2010 at 9:20 AM, Geoff Nordli <geoffn at grokworx.com> wrote:
> Have you played with the flush interval?
>
> I am using iSCSI-based zvols, and I am thinking about not using the caching
> in VBox and instead relying on the comstar/zfs side.
>
> What do you think?

If you care about your data, IgnoreFlush should always be set to 0 for all the drives. This ensures that a flush request from the guest actually writes data to disk.

FlushInterval is a little different, in that it limits how much buffered write data can pile up, so that the guest doesn't stall when the host finally does write it out. It's OK to let VBox cache some data if you have fast enough storage. If your storage is reasonably fast, you shouldn't need to touch FlushInterval.

As far as my experience goes, my zpool is an 8 disk raidz2 comprised of 5400 rpm drives, so it's definitely at the low end of the performance spectrum. The OpenSolaris machine is hosting 3 Linux guests in VirtualBox 3.0. Initially I was using disk images in a zfs filesystem. I was having trouble with IO stalling and guests remounting their disks read-only. Setting FlushInterval to 10MB (as recommended in the VBox manual) prevented the host from hanging, but disk performance was still poor.

I've moved to using raw disks mapped to zvols (/dev/zvol/rdsk) and removed the FlushInterval settings. The IO stalls that I encountered using image files went away.

-B

--
Brandon High : bhigh at freaks.com
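P.S. The zvol setup is straightforward, by the way; roughly this (sizes, dataset and file names made up, adjust to your layout):

  # create a 20 GB volume for the guest's disk
  zfs create -V 20G tank/vbox/guest1-disk0

  # wrap it in a raw-disk vmdk that VirtualBox can attach
  VBoxManage internalcommands createrawvmdk \
      -filename /vbox/guest1-disk0.vmdk \
      -rawdisk /dev/zvol/rdsk/tank/vbox/guest1-disk0

The resulting .vmdk is then attached to the VM like any other disk image.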
Edward Ned Harvey
2010-Jun-09 21:04 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-
> bounces at opensolaris.org] On Behalf Of Joe Auty
>
> I'm also noticing that I'm a little short on RAM. I have 6 320 gig
> drives and 4 gig of RAM. If the formula is POOL_SIZE/250, this would
> mean that I need at least 6.4 gig of RAM.
>
> What role does RAM play with queuing and caching and other things which
> might impact overall disk performance? How much more RAM should I get?

Excess ram accelerates everything. There is no such thing as a "rule of thumb."

The OS will cache everything it's read before, whenever it can (whenever the memory isn't requested for some other purpose), to avoid having to fetch it a 2nd time from disk. The OS will also buffer all writes in memory, and attempt to optimize for the type of storage available, before pushing it out to disk. (Sync writes will hit the ZIL, stay in ram, and then hit the disk again.) If you have compression or dedup enabled, these benefit enormously from additional ram.

Everything is faster with more ram. There is no limit, unless the total used disk in your system is smaller than the available ram in your system ... which seems very improbable. The more ram, the better. Choose how much money you're willing to spend, and spend that much on ram.
Kyle McDonald
2010-Jun-09 21:11 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
On 6/9/2010 5:04 PM, Edward Ned Harvey wrote:
> Everything is faster with more ram. There is no limit, unless the total
> used disk in your system is smaller than the available ram in your system
> ... which seems very improbable.

Off topic, but... When I managed a build/simulation farm for one of Sun's ASIC design teams, we had several 24 CPU machines with 96GB or 192GB of RAM and only 36GB or maybe 73GB of disk.

Probably a special case though. ;)

-Kyle
Bob Friesenhahn
2010-Jun-10 17:44 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
On Wed, 9 Jun 2010, Travis Tabbal wrote:
> NFS writes on ZFS blow chunks performance-wise. The only way to
> increase the write speed is by using a slog

The above statement is not quite true. RAID-style adaptor cards which contain battery backed RAM, or RAID arrays which include battery backed RAM, also help immensely.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
Bob Friesenhahn
2010-Jun-10 17:52 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
On Wed, 9 Jun 2010, Edward Ned Harvey wrote:
> disks. That is, specifically:
> o If you do a large sequential read, with 3 mirrors (6 disks) then you get
> 6x performance of a single disk.

Should say "up to 6x". Which disk in the pair will be read from is random, so you are unlikely to get the full 6x.

> o If you do a large sequential read, with 7-disk raidz (capacity of 6
> disks) then you get 6x performance of a single disk.

Probably should say "up to 6x" as well. This configuration is more sensitive to latency, and available disk IOPS becomes more critical.

> o If you do a large sequential write, with 3 mirrors (6 disks) then you
> get 3x performance of a single disk.

Also an "up to" type value. Perhaps you will only get 1.5X because of some I/O bottleneck between the CPU and the mirrored disks (i.e. two writes at once may cause I/O contention).

These rules of thumb are not terribly accurate. If performance is important, then there is no substitute for actual testing.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
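P.S. The testing can start very simply; something along these lines, run both locally on the server and from an NFS client, gives a first-order feel for sequential write throughput (the path is a placeholder, and make sure compression is off on the test dataset or /dev/zero will give wildly optimistic numbers):

  # write ~4 GB sequentially
  dd if=/dev/zero of=/tank/test/bigfile bs=1024k count=4096

  # watch the pool and the individual disks while it runs
  zpool iostat -v tank 5
  iostat -xn 5

Random and sync-heavy workloads need a real benchmark tool (or the actual application), since dd tells you nothing about IOPS.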