besson3c
2010-Jun-07 23:59 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
Hello, I'm wondering if somebody can kindly direct me to a sort of newbie way of assessing whether my ZFS pool performance is a bottleneck that can be improved upon, and/or whether I ought to invest in a mirrored pair of SSDs for the ZIL. I'm a little confused by what the output of iostat, fsstat, the zilstat script, and other diagnostic tools actually tells me, and I'm definitely not completely confident in what I think I do understand. I'd like to start over from square one with my understanding of all of this.

So, instead of my posting a bunch of numbers, could you please help me with some basic tactics and techniques for making these assessments? I have some reason to believe that there are performance problems, as the load on the machine writing to these ZFS NFS shares can get pretty high during heavy writing of small files. Throw in the ZFS queue parameters on top of all these other numbers and variables and I'm a little confused as to where best to start. It is also possible that the ZFS server is not the bottleneck here, but I would love to feel a little more confident in my assessments.

Thanks for your help! I expect that this conversation will get pretty technical and that's cool (that's what I want too), but hopefully this is enough to get the ball rolling!
Khyron
2010-Jun-08 06:33 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
It would be helpful if you posted more information about your configuration. Numbers *are* useful too, but minimally, describing your setup, use case, the hardware and other such facts would give people a place to start.

There are much brighter stars on this list than myself, but if you are sharing your ZFS dataset(s) via NFS with a heavy traffic load (particularly writes), a mirrored SLOG will probably be useful. (The ZIL is a component of every ZFS pool. A SLOG is a device, usually an SSD or a mirrored pair of SSDs, on which you can locate your ZIL for enhanced *synchronous* write performance.) Since NFS generates sync writes, that might be a win for you, but again it depends on a lot of factors.

Help us (or rather, the community) help you by providing real information and data.

On Mon, Jun 7, 2010 at 19:59, besson3c <joe at netmusician.org> wrote:

> Hello,
>
> I'm wondering if somebody can kindly direct me to a sort of newbie way of
> assessing whether my ZFS pool performance is a bottleneck that can be
> improved upon, and/or whether I ought to invest in a SSD ZIL mirrored pair?
> [...]

--
"You can choose your friends, you can choose the deals." - Equity Private
"If Linux is faster, it's a Solaris bug." - Phil Harman

Blog - http://whatderass.blogspot.com/
Twitter - @khyron4eva
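P.S. If a SLOG does turn out to help, attaching one to an existing pool is a one-liner; something along these lines (pool and device names here are made up, substitute your own):

  zpool add tank log mirror c2t0d0 c2t1d0

That adds the two devices as a mirrored dedicated log vdev to the pool "tank"; the existing data vdevs are untouched.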
besson3c
2010-Jun-08 17:33 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
Khyron wrote:
> It would be helpful if you posted more information about your configuration.
> Numbers *are* useful too, but minimally, describing your setup, use case, the
> hardware and other such facts would provide people a place to start. There are
> much brighter stars on this list than myself, but if you are sharing your ZFS
> dataset(s) via NFS with a heavy traffic load (particularly writes), a mirrored
> SLOG will probably be useful.

Sure! The pool consists of 6 SATA drives configured as RAID-Z. There are no special read or write cache drives. This pool is shared to several VMs via NFS; these VMs manage email, web, and a Quickbooks server running on FreeBSD, Linux, and Windows.

On heavy reads or writes (writes seem to be more problematic) the load averages on my VM host shoot up and overall performance bogs down. I suspect that I do need a mirrored SLOG, but I'm wondering what the best way is to go about assessing this so that I can be more certain? I'm also wondering what other sorts of things can be tweaked software-wise, on either the VM host (running CentOS) or the Solaris side, to give me a little more headroom. The thought has crossed my mind that a dedicated SLOG pair of SSDs might be overkill for my needs; this is not a huge business (yet :)

Thanks for your help!
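For reference, the sort of thing I've been watching so far is roughly this, along with the zilstat script (the pool here is called "nm"; adjust to taste):

  # pool layout and health
  zpool status nm
  # per-vdev throughput and IOPS, 5 second samples
  zpool iostat -v nm 5
  # per-device service times and %busy
  iostat -xn 5
  # ZFS-level operation counts
  fsstat zfs 5

I'm just not sure how to read what comes out of these in terms of "is the pool the bottleneck or not."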
Brandon High
2010-Jun-08 18:08 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
On Tue, Jun 8, 2010 at 10:33 AM, besson3c <joe at netmusician.org> wrote:
> On heavy reads or writes (writes seem to be more problematic) my load
> averages on my VM host shoot up and overall performance is bogged down. I
> suspect that I do need a mirrored SLOG, but I'm wondering what the best way is

The load that you're seeing is probably iowait. If that's the case, it's almost certainly the write speed of your pool. A raidz will be slow for your purposes, and adding a slog (dedicated log device) may help. There's been lots of discussion in the archives about how to determine if a log device will help, such as using zilstat or disabling the ZIL and testing.

You may want to set the recordsize smaller for the datasets that contain vmdk files as well. With the default recordsize of 128k, a 4k write by the VM host can result in 128k being read from and written to the dataset.

What VM software are you using? There are a few knobs you can turn in VBox which will help with slow storage. See http://www.virtualbox.org/manual/ch12.html#id2662300 for instructions on reducing the flush interval.

-B

--
Brandon High : bhigh at freaks.com
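P.S. From memory, the flush interval knob is an extradata key that looks something like this (substitute your VM's name and disk number, and take the exact key and value from the manual page above; the value is a byte count, so roughly 10 MB here):

  VBoxManage setextradata "MyVM" \
      "VBoxInternal/Devices/piix3ide/0/LUN#0/Config/FlushInterval" 10485760

That makes VBox flush its buffered writes every so many bytes instead of letting them pile up. For a SATA controller the key starts with "VBoxInternal/Devices/ahci/..." instead of piix3ide.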
Joe Auty
2010-Jun-08 18:27 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
Brandon High wrote:
> The load that you're seeing is probably iowait. If that's the case,
> it's almost certainly the write speed of your pool. A raidz will be
> slow for your purposes, and adding a slog may help.
>
> You may want to set the recordsize smaller for the datasets that
> contain vmdk files as well. With the default recordsize of 128k, a 4k
> write by the VM host can result in 128k being read from and written to
> the dataset.
>
> What VM software are you using? There are a few knobs you can turn in
> VBox which will help with slow storage. See
> http://www.virtualbox.org/manual/ch12.html#id2662300 for instructions
> on reducing the flush interval.

I'd love to use VirtualBox, but right now it (3.2.2 commercial, which I'm evaluating; I haven't been able to compile OSE on the CentOS 5.5 host yet) is giving me kernel panics on the host while starting up VMs, which are obviously bothersome, so I'm exploring continuing to use VMware Server and seeing what I can do on the Solaris/ZFS side of things. I've also read this on a VMware forum, although I don't know if it is correct? This is in the context of me questioning why I don't seem to have these same load average problems running VirtualBox:

> The problem with the VirtualBox comparison is that caching is known to be
> broken in VirtualBox (it ignores cache flushes, which, by continuing to
> cache, can "speed up" IO at the expense of data integrity or loss). This
> could be playing in your favor from a performance perspective, but it puts
> your data at risk. Disabling disk caching altogether would be a big hit on
> the VirtualBox side... Neither solution is ideal.

If this is incorrect and I can get VirtualBox working stably, I'm happy to switch to it. It has definitely performed better prior to my panics, and others on the internet seem to agree that it outperforms VMware products in general. I'm definitely not opposed to this idea.

I've actually never seen much, if any, iowait (%w in iostat output, right?). I've run the zilstat script and am happy to share that output with you if you wouldn't mind taking a look at it? I'm not sure I'm understanding its output correctly...

As far as recordsizes go, the evil tuning guide says this:

> Depending on workloads, the current ZFS implementation can, at times,
> cause much more I/O to be requested than other page-based file
> systems. If the throughput flowing toward the storage, as observed by
> iostat, nears the capacity of the channel linking the storage and the
> host, tuning down the zfs recordsize should improve performance. This
> tuning is dynamic, but only impacts new file creations. Existing files
> keep their old recordsize.

Will this tuning have an impact on my existing VMDK files? Can you kindly tell me more about this, how I can observe my current recordsize and play around with this setting if it will help? Will adjusting ZFS compression on the share hosting my VMDKs be of any help too? Compression is disabled on the ZFS share where my VMDKs are hosted.

This ZFS host hosts regular data shares in addition to the VMDKs. All user data on my VM guests that is subject to change is hosted on a ZFS share; only the OS and basic OS applications are saved to my VMDKs.

--
Joe Auty, NetMusician
NetMusician helps musicians, bands and artists create beautiful, professional, custom designed, career-essential websites that are easy to maintain and to integrate with popular social networks.
www.netmusician.org | joe at netmusician.org
Brandon High
2010-Jun-08 18:41 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
On Tue, Jun 8, 2010 at 11:27 AM, Joe Auty <joe at netmusician.org> wrote:
> I've also read this on a VMware forum, although I don't know if it is
> correct? This is in the context of me questioning why I don't seem to have
> these same load average problems running VirtualBox:
>
>> The problem with the VirtualBox comparison is that caching is known to be
>> broken in VirtualBox (it ignores cache flushes, which, by continuing to
>> cache, can "speed up" IO at the expense of data integrity or loss).

Check the link that I posted earlier, under "Responding to guest IDE/SATA flush requests". Setting IgnoreFlush to 0 will turn off the extra caching.

> I've actually never seen much, if any, iowait (%w in iostat output, right?).
> I've run the zilstat script and am happy to share that output with you if
> you wouldn't mind taking a look at it? I'm not sure I'm understanding its
> output correctly...

You'll see iowait on the VM, not on the zfs server.

> Will this tuning have an impact on my existing VMDK files? Can you kindly
> tell me more about this, how I can observe my current recordsize and play
> around with this setting if it will help? Will adjusting ZFS compression on
> the share hosting my VMDKs be of any help too?

No, your existing files will keep whatever recordsize they were created with. You can view or change the recordsize property the same as any other zfs property. You'll have to recreate the files to re-write them with a different recordsize (e.g. cp file.vmdk file.vmdk.foo && mv file.vmdk.foo file.vmdk).

> This ZFS host hosts regular data shares in addition to the VMDKs. All user
> data on my VM guests that is subject to change is hosted on a ZFS share;
> only the OS and basic OS applications are saved to my VMDKs.

The property is per dataset. If the vmdk files are in separate datasets (which I recommend) you can adjust the properties or take snapshots of each VM's data separately.

-B

--
Brandon High : bhigh at freaks.com
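P.S. The IgnoreFlush setting is another extradata key; for the first IDE disk it should be something like this (substitute your VM's name and disk number, use ahci instead of piix3ide for a SATA controller, and double-check against the manual page above):

  VBoxManage setextradata "MyVM" \
      "VBoxInternal/Devices/piix3ide/0/LUN#0/Config/IgnoreFlush" 0

With the value 0, VBox stops ignoring the guest's flush requests, so a flush from the guest actually reaches the host's storage.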
Joe Auty
2010-Jun-08 19:04 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
Brandon High wrote:
> Check the link that I posted earlier, under "Responding to guest IDE/SATA
> flush requests". Setting IgnoreFlush to 0 will turn off the extra caching.

Cool, so maybe this guy was going off of earlier information? Was there a time when there was no way to enable cache flushing in VirtualBox?

> You'll see iowait on the VM, not on the zfs server.

My mistake, yes, I see pretty significant iowait times on the host... Right now "iostat" is showing 9.30% wait times.

> No, your existing files will keep whatever recordsize they were created
> with. You can view or change the recordsize property the same as any other
> zfs property. You'll have to recreate the files to re-write them with a
> different recordsize.
>
> The property is per dataset. If the vmdk files are in separate datasets
> (which I recommend) you can adjust the properties or take snapshots of each
> VM's data separately.

Ahhh! Yes, my VMDKs are on a separate dataset, and the recordsize is set to 128k:

# zfs get recordsize nm/myshare
NAME        PROPERTY    VALUE  SOURCE
nm/myshare  recordsize  128K   default

Do you have a recommendation for a good size to start with for the dataset hosting VMDKs? Half of 128K? A third? In general, are large files better served with smaller recordsizes, whereas small files are better served with the 128k default?

--
Joe Auty, NetMusician
www.netmusician.org | joe at netmusician.org
Brandon High
2010-Jun-08 20:10 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
On Tue, Jun 8, 2010 at 12:04 PM, Joe Auty <joe at netmusician.org> wrote:
> Cool, so maybe this guy was going off of earlier information? Was there
> a time when there was no way to enable cache flushing in VirtualBox?

The default is to ignore cache flushes, so he was correct for the default setting. The IgnoreFlush setting has existed since 2.0 at least.

> My mistake, yes, I see pretty significant iowait times on the host... Right
> now "iostat" is showing 9.30% wait times.

That's not too bad, but not great. Here's output from a system at work:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.99    0.00    3.98   92.54    0.50    0.00

The problem is that io gets bursty, so you'll have good speeds for the most part, followed by some large waits. Small writes to the vmdk will have the worst performance, since the 128k block has to be read and written out with the change. Because your guest has /var on the vmdk, there are constant small writes going to the pool.

> Do you have a recommendation for a good size to start with for the dataset
> hosting VMDKs? Half of 128K? A third?

There are inherent tradeoffs in using smaller blocks, notably more overhead for checksums. zvols use an 8k volblocksize by default, which is probably a decent size.

> In general, are large files better served with smaller recordsizes, whereas
> small files are better served with the 128k default?

Files that have random small writes in the middle of the data will have poor performance: things such as database files, vmdk files, etc. Other than specific cases like what you've run into, you shouldn't ever need to adjust the recordsize.

-B

--
Brandon High : bhigh at freaks.com
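P.S. If you do try it, it's a one-line property change on your existing dataset; something like this (and remember it only applies to files written after the change, so the vmdks have to be copied/re-written to pick it up):

  zfs set recordsize=8k nm/myshare

A new dataset can also be created with the property up front, e.g. "zfs create -o recordsize=8k nm/vmdisks" (dataset name made up).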
Joe Auty
2010-Jun-08 22:00 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
I'm also noticing that I'm a little short on RAM. I have 6 x 320 GB drives and 4 GB of RAM. If the formula is POOL_SIZE/250, this would mean that I need at least 6.4 GB of RAM (with raidz the usable pool is roughly 5 x 320 GB = 1600 GB, and 1600 / 250 = 6.4).

What role does RAM play with queuing and caching and other things which might impact overall disk performance? How much more RAM should I get?
Ross Walker
2010-Jun-09 14:05 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
On Jun 8, 2010, at 1:33 PM, besson3c <joe at netmusician.org> wrote:
> Sure! The pool consists of 6 SATA drives configured as RAID-Z. There
> are no special read or write cache drives. This pool is shared to
> several VMs via NFS; these VMs manage email, web, and a Quickbooks
> server running on FreeBSD, Linux, and Windows.

Ok, well, RAIDZ is going to be a problem here. Each record is spread across the whole pool (each read/write will hit all drives in the pool), which has the side effect of making the total number of IOPS equal to the IOPS of the slowest drive in the pool. Since these are SATA drives, let's say the total number of IOPS will be 80, which is not good enough for what is a mostly random workload.

If it were a 6 drive pool of mirrors then it would be able to handle 240 IOPS write and up to 480 IOPS read (it can read from either side of a mirror).

I would probably rethink the setup. A ZIL will not buy you much here, and if your VM software is like VMware then each write over NFS will be marked FSYNC, which will force the lack of IOPS to the surface.

-Ross
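P.S. To make the layout concrete, a 6 drive pool of mirrors is three 2-way mirror vdevs, created along these lines (device names made up; note there is no in-place conversion from raidz, you'd have to back up, destroy and recreate the pool):

  zpool create tank \
      mirror c0t0d0 c0t1d0 \
      mirror c0t2d0 c0t3d0 \
      mirror c0t4d0 c0t5d0

Writes are striped across the three mirrors, which is where the roughly 3x random write IOPS over a single raidz vdev comes from.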
Travis Tabbal
2010-Jun-09 15:31 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
NFS writes on ZFS blow chunks performance-wise. The only way to increase the write speed is by using a slog; the problem is that a "proper" slog device (one that doesn't lose transactions) does not exist for a reasonable price. The least expensive SSD that will work is the Intel X25-E, and even then you have to disable the write cache, which kills performance. And if you lose transactions in the ZIL, you may as well not have one.

Switching to a pool configuration with mirrors might help some. You will still get hit with sync write penalties on NFS though. Before messing with that, try disabling the ZIL entirely and see if that's where your problems are. Note that running without a ZIL can cause you to lose about 30 seconds of uncommitted data, and if the server crashes without the clients rebooting, you can get corrupted data (from the client's perspective). However, it solved the performance issue for me. If that works, you can then decide how important the ZIL is to you.

Personally, I like things to be correct, but that doesn't help me if performance is in the toilet. In my case, the server is on a UPS and the clients aren't. And most of the clients use netboot anyway, so they will crash and have to be rebooted if the server goes down. So for me, the drawback is small while the performance gain is huge. That's not the case for everyone, and it's up to the admin to decide what they can live with. Thankfully, the next release of OpenSolaris will have the ability to set the ZIL on/off per filesystem.

Note that the ZIL only affects sync write speed, so if your workload isn't sync heavy, it might not matter in your case. However, with NFS in the mix, it probably is. The ZFS on-disk data state is not affected by ZIL on/off, so your pool's data IS safe. You might lose some data that a client THINKS is safely written, but the ZFS pool will come back properly on reboot. So the client will be wrong about what is and is not written, thus the possible "corruption" from the client perspective.

I run ZFS on 2 6-disk raidz2 arrays in the same pool and performance is very good locally. With the ZIL enabled, NFS performance was so bad it was near unusable. With it disabled, I can saturate the single gigabit link, and performance in the Linux VM (xVM) running on that server improved significantly, to near local speed, when using the NFS mounts to the main pool. My 5400 RPM drives were not up to the ZIL's needs, though they are plenty fast in general, and a working slog was out of budget for a home server.
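If I remember the Evil Tuning Guide right, disabling the ZIL for a test is a global switch on current builds, roughly like this (the affected filesystems have to be unmounted and remounted, or the box rebooted, before it takes effect):

  # permanently, in /etc/system (takes effect at next boot):
  set zfs:zil_disable = 1

  # or on a live system, followed by a remount of the NFS-shared filesystems:
  echo zil_disable/W0t1 | mdb -kw

Set it back to 0 (and remount again) once you've measured the difference.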
Edward Ned Harvey
2010-Jun-09 15:41 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-
> bounces at opensolaris.org] On Behalf Of besson3c
>
> I'm wondering if somebody can kindly direct me to a sort of newbie way
> of assessing whether my ZFS pool performance is a bottleneck that can
> be improved upon, and/or whether I ought to invest in a SSD ZIL
> mirrored pair? I'm a little confused by what the output of iostat,

There are a few generalities I can state, which may be of use:

* If you are serving NFS, then it's likely you're doing sync write operations, and therefore likely that a dedicated zil log device could benefit your write performance. To find out, you can disable your ZIL (requires dismounting & remounting the filesystem) temporarily and test performance with the ZIL disabled. If there is anything less than a huge performance gain, then there's no need for a dedicated log device.

* If you are doing large sequential read/write, then the performance of striping/mirroring/raidz are all comparable given similar numbers of usable disks. That is, specifically:
  o If you do a large sequential read, with 3 mirrors (6 disks) then you get 6x performance of a single disk.
  o If you do a large sequential read, with 7-disk raidz (capacity of 6 disks) then you get 6x performance of a single disk.
  o If you do a large sequential write, with 3 mirrors (6 disks) then you get 3x performance of a single disk.
  o If you do a large sequential write, with 7-disk raidz (capacity of 6 disks) then you get 6x performance of a single disk.

* So, for large sequential operations, the raidz would be cheaper and probably slightly faster.

* If you do small random operations, then striping/mirroring can vastly outperform raidz. Specifically:
  o If you do random reads, with 3 mirrors (6 disks) then you get 4x-5x performance of a single disk. (Assuming you have multiple threads or processes issuing those reads, or your read requests are queueable in any way.)
  o If you do random reads, with 7-disk raidz (capacity of 6 disks) you get about 50% faster than a single disk.
  o If you do random writes, with 3 mirrors, then you get about 2x performance of a single disk.
  o If you do random writes, with 7-disk raidz, you get about 50% faster than a single disk.

* So, for small operations, the striping/mirroring would certainly be faster.
Geoff Nordli
2010-Jun-09 16:20 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
> On Behalf Of Joe Auty
> Sent: Tuesday, June 08, 2010 11:27 AM
>
> I'd love to use VirtualBox, but right now it (3.2.2 commercial which I'm
> evaluating, I haven't been able to compile OSE on the CentOS 5.5 host yet) is
> giving me kernel panics on the host while starting up VMs which are obviously
> bothersome, so I'm exploring continuing to use VMware Server and seeing what I
> can do on the Solaris/ZFS side of things. I've also read this on a VMware forum,
> although I don't know if it is correct? This is in the context of me questioning
> why I don't seem to have these same load average problems running VirtualBox:

Hi Joe.

One thing about VBox is that they are rapidly adding new features, which causes some instability and regressions. Unless there is a real need for one of the new features in the 3.2 branch, I would recommend working with the 3.0 branch in a production environment. They will announce when they feel that 3.2 becomes production ready.

VirtualBox is a great type 2 hypervisor, and I can't believe how much it has improved over the last year.

Have a great day!

Geoff
Geoff Nordli
2010-Jun-09 16:20 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
Brandon High wrote:
> On Tue, Jun 8, 2010 at 10:33 AM, besson3c <joe at netmusician.org> wrote:
>
> What VM software are you using? There are a few knobs you can turn in VBox
> which will help with slow storage. See
> http://www.virtualbox.org/manual/ch12.html#id2662300 for instructions on
> reducing the flush interval.
>
> -B

Hi Brandon.

Have you played with the flush interval?

I am using iSCSI-based zvols, and I am thinking about not using the caching in VBox and instead relying on the comstar/zfs side.

What do you think?

Geoff
Garrett D''Amore
2010-Jun-09 17:13 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
You can hardly have too much. At least 8 GB, maybe 16, would be good.

The benefit will depend on your workload, but zfs and the buffer cache will use it all if you have a big enough read working set.

-- Garrett

Joe Auty <joe at netmusician.org> wrote:
> I'm also noticing that I'm a little short on RAM. I have 6 320 gig
> drives and 4 gig of RAM. If the formula is POOL_SIZE/250, this would
> mean that I need at least 6.4 gig of RAM.
>
> What role does RAM play with queuing and caching and other things which
> might impact overall disk performance? How much more RAM should I get?
Joe Auty
2010-Jun-09 17:46 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
Garrett D'Amore wrote:
> You can hardly have too much. At least 8 GB, maybe 16, would be good.
>
> The benefit will depend on your workload, but zfs and the buffer cache
> will use it all if you have a big enough read working set.

Could lack of RAM be contributing to some of my problems, do you think?

--
Joe Auty, NetMusician
www.netmusician.org | joe at netmusician.org
Brandon High
2010-Jun-09 18:49 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
On Wed, Jun 9, 2010 at 9:20 AM, Geoff Nordli <geoffn at grokworx.com> wrote:
> Have you played with the flush interval?
>
> I am using iSCSI-based zvols, and I am thinking about not using the caching
> in VBox and instead relying on the comstar/zfs side.
>
> What do you think?

If you care about your data, IgnoreFlush should always be set to 0 for all the drives. This ensures that a flush request from the guest actually writes data to disk.

FlushInterval is a little different, in that it limits how much buffered write data can pile up, so that the guest doesn't stall when the host finally does write it out. It's OK to let VBox cache some data if you have fast enough storage. If your storage is reasonably fast, you shouldn't need to touch FlushInterval.

As far as my experience goes, my zpool is an 8 disk raidz2 comprised of 5400 rpm drives, so it's definitely at the low end of the performance spectrum. The OpenSolaris machine is hosting 3 Linux guests in VirtualBox 3.0. Initially I was using disk images in a zfs filesystem. I was having trouble with IO stalling and guests remounting their disks read-only. Setting FlushInterval to 10MB (as recommended in the VBox manual) prevented the host from hanging, but disk performance was still poor.

I've moved to using raw disks mapped to zvols (/dev/zvol/rdsk) and removed the FlushInterval settings. The IO stalls that I encountered using image files went away.

-B

--
Brandon High : bhigh at freaks.com
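P.S. The zvol setup is straightforward, by the way; roughly this (sizes, dataset and file names made up, adjust to your layout):

  # create a 20 GB volume for the guest's disk
  zfs create -V 20G tank/vbox/guest1-disk0

  # wrap it in a raw-disk vmdk that VirtualBox can attach
  VBoxManage internalcommands createrawvmdk \
      -filename /vbox/guest1-disk0.vmdk \
      -rawdisk /dev/zvol/rdsk/tank/vbox/guest1-disk0

The resulting .vmdk is then attached to the VM like any other disk image.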
Edward Ned Harvey
2010-Jun-09 21:04 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-
> bounces at opensolaris.org] On Behalf Of Joe Auty
>
> I'm also noticing that I'm a little short on RAM. I have 6 320 gig
> drives and 4 gig of RAM. If the formula is POOL_SIZE/250, this would
> mean that I need at least 6.4 gig of RAM.
>
> What role does RAM play with queuing and caching and other things which
> might impact overall disk performance? How much more RAM should I get?

Excess ram accelerates everything. There is no such thing as a "rule of thumb."

The OS will cache everything it's read before, whenever it can (whenever the memory isn't requested for some other purpose), to avoid having to fetch it a 2nd time from disk. The OS will also buffer all writes in memory, and attempt to optimize for the type of storage available, before pushing it out to disk. (Sync writes will hit the ZIL, stay in ram, and then hit the disk again.) If you have compression or dedup enabled, these benefit enormously from additional ram.

Everything is faster with more ram. There is no limit, unless the total used disk in your system is smaller than the available ram in your system ... which seems very improbable. The more ram, the better. Choose how much money you're willing to spend, and spend that much on ram.
Kyle McDonald
2010-Jun-09 21:11 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
On 6/9/2010 5:04 PM, Edward Ned Harvey wrote:
> Everything is faster with more ram. There is no limit, unless the total
> used disk in your system is smaller than the available ram in your system
> ... which seems very improbable.

Off topic, but... When I managed a build/simulation farm for one of Sun's ASIC design teams, we had several 24 CPU machines with 96GB or 192GB of RAM and only 36GB or maybe 73GB of disk.

Probably a special case though. ;)

-Kyle
Bob Friesenhahn
2010-Jun-10 17:44 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
On Wed, 9 Jun 2010, Travis Tabbal wrote:
> NFS writes on ZFS blow chunks performance-wise. The only way to
> increase the write speed is by using a slog

The above statement is not quite true. RAID-style adaptor cards which contain battery backed RAM, or RAID arrays which include battery backed RAM, also help immensely.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
Bob Friesenhahn
2010-Jun-10 17:52 UTC
[zfs-discuss] General help with understanding ZFS performance bottlenecks
On Wed, 9 Jun 2010, Edward Ned Harvey wrote:
> disks. That is, specifically:
> o If you do a large sequential read, with 3 mirrors (6 disks) then you get
> 6x performance of a single disk.

Should say "up to 6x". Which disk in the pair will be read from is random, so you are unlikely to get the full 6x.

> o If you do a large sequential read, with 7-disk raidz (capacity of 6
> disks) then you get 6x performance of a single disk.

Probably should say "up to 6x" as well. This configuration is more sensitive to latency, and available disk IOPS becomes more critical.

> o If you do a large sequential write, with 3 mirrors (6 disks) then you
> get 3x performance of a single disk.

Also an "up to" type value. Perhaps you will only get 1.5X because of some I/O bottleneck between the CPU and the mirrored disks (i.e. two writes at once may cause I/O contention).

These rules of thumb are not terribly accurate. If performance is important, then there is no substitute for actual testing.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
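P.S. The testing can start very simply; something along these lines, run both locally on the server and from an NFS client, gives a first-order feel for sequential write throughput (the path is a placeholder, and make sure compression is off on the test dataset or /dev/zero will give wildly optimistic numbers):

  # write ~4 GB sequentially
  dd if=/dev/zero of=/tank/test/bigfile bs=1024k count=4096

  # watch the pool and the individual disks while it runs
  zpool iostat -v tank 5
  iostat -xn 5

Random and sync-heavy workloads need a real benchmark tool (or the actual application), since dd tells you nothing about IOPS.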