Stephan Budach
2010-Dec-11 15:48 UTC
[zfs-discuss] What performance to expect from mirror vdevs?
Hi,

on Friday I received two of my new FC RAIDs, which I intend to use as my new zpool devices. The devices are from CiDesign, type/model iR16FC4ER. They are FC RAIDs that also allow JBOD operation, which is what I chose. So I configured 16 RAID groups on each system and set the arrays up to present them on their FC channel one by one.

On my Sol11Expr host I created a zpool of mirror vdevs by selecting one disk from each array. This way I got a zpool that looks like this:

root at solaris11c:~# zpool status newObelixData
  pool: newObelixData
 state: ONLINE
  scan: resilvered 1K in 0h0m with 0 errors on Sat Dec 11 15:25:35 2010
config:

        NAME                       STATE     READ WRITE CKSUM
        newObelixData              ONLINE       0     0     0
          mirror-0                 ONLINE       0     0     0
            c9t2100001378AC02C7d0  ONLINE       0     0     0
            c9t2100001378AC0355d0  ONLINE       0     0     0
          mirror-1                 ONLINE       0     0     0
            c9t2100001378AC02C7d1  ONLINE       0     0     0
            c9t2100001378AC0355d1  ONLINE       0     0     0
          mirror-2                 ONLINE       0     0     0
            c9t2100001378AC02C7d2  ONLINE       0     0     0
            c9t2100001378AC0355d2  ONLINE       0     0     0
          mirror-3                 ONLINE       0     0     0
            c9t2100001378AC02C7d3  ONLINE       0     0     0
            c9t2100001378AC0355d3  ONLINE       0     0     0
          mirror-4                 ONLINE       0     0     0
            c9t2100001378AC02C7d4  ONLINE       0     0     0
            c9t2100001378AC0355d4  ONLINE       0     0     0
          mirror-5                 ONLINE       0     0     0
            c9t2100001378AC02C7d5  ONLINE       0     0     0
            c9t2100001378AC0355d5  ONLINE       0     0     0
          mirror-6                 ONLINE       0     0     0
            c9t2100001378AC02C7d6  ONLINE       0     0     0
            c9t2100001378AC0355d6  ONLINE       0     0     0
          mirror-7                 ONLINE       0     0     0
            c9t2100001378AC02C7d7  ONLINE       0     0     0
            c9t2100001378AC0355d7  ONLINE       0     0     0
          mirror-8                 ONLINE       0     0     0
            c9t2100001378AC02C7d8  ONLINE       0     0     0
            c9t2100001378AC0355d8  ONLINE       0     0     0
          mirror-9                 ONLINE       0     0     0
            c9t2100001378AC02C7d9  ONLINE       0     0     0
            c9t2100001378AC0355d9  ONLINE       0     0     0
          mirror-10                ONLINE       0     0     0
            c9t2100001378AC02C7d10 ONLINE       0     0     0
            c9t2100001378AC0355d10 ONLINE       0     0     0
          mirror-11                ONLINE       0     0     0
            c9t2100001378AC02C7d11 ONLINE       0     0     0
            c9t2100001378AC0355d11 ONLINE       0     0     0
          mirror-12                ONLINE       0     0     0
            c9t2100001378AC02C7d12 ONLINE       0     0     0
            c9t2100001378AC0355d12 ONLINE       0     0     0
          mirror-13                ONLINE       0     0     0
            c9t2100001378AC02C7d13 ONLINE       0     0     0
            c9t2100001378AC0355d13 ONLINE       0     0     0
          mirror-14                ONLINE       0     0     0
            c9t2100001378AC02C7d14 ONLINE       0     0     0
            c9t2100001378AC0355d14 ONLINE       0     0     0
          mirror-15                ONLINE       0     0     0
            c9t2100001378AC02C7d15 ONLINE       0     0     0
            c9t2100001378AC0355d15 ONLINE       0     0     0

errors: No known data errors

At first I disabled all write cache and read-ahead options for each RAID group on the arrays, since I wanted to give ZFS as much control over the drives as possible, but the performance was quite poor. I am running this zpool on a Sun Fire X4170 M2 with 32 GB of RAM, so I ran bonnie++ with -s 63356 -n 128 and got these results:

Sequential Output
  char:    51819
  block:   50602
  rewrite: 28090

Sequential Input:
  char:  62562
  block: 60979

Random seeks: 510  <- this seems really low to me, doesn't it?

Sequential Create:
  create: 27529
  read:   172287
  delete: 30522

Random Create:
  create: 25531
  read:   244977
  delete: 29423

Since I was curious what would happen if I enabled write cache and read-ahead on the RAID groups, I turned them on for all 32 devices and re-ran bonnie++. To my great dismay, ZFS now had a lot of random trouble with the drives: it would remove drives seemingly arbitrarily from the pool because they exceeded the error thresholds. On one run this happened to 4 drives from one FC RAID; on the next run 3 drives from the other array got removed from the pool.

I know that I'd better disable all "optimizations" on the RAID side, but the performance seems just too bad with these settings. Maybe running 16 mirrors in one zpool is not a good idea - but that seems more than unlikely to me.
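For reference, the pool was built along these lines and bonnie++ invoked roughly as shown. The zpool create ordering is reconstructed from the status output above, and the -d and -u arguments to bonnie++ are assumptions (the default mountpoint would be /newObelixData, and bonnie++ refuses to run as root without -u):

    zpool create newObelixData \
        mirror c9t2100001378AC02C7d0  c9t2100001378AC0355d0 \
        mirror c9t2100001378AC02C7d1  c9t2100001378AC0355d1 \
        mirror c9t2100001378AC02C7d2  c9t2100001378AC0355d2 \
        mirror c9t2100001378AC02C7d3  c9t2100001378AC0355d3 \
        mirror c9t2100001378AC02C7d4  c9t2100001378AC0355d4 \
        mirror c9t2100001378AC02C7d5  c9t2100001378AC0355d5 \
        mirror c9t2100001378AC02C7d6  c9t2100001378AC0355d6 \
        mirror c9t2100001378AC02C7d7  c9t2100001378AC0355d7 \
        mirror c9t2100001378AC02C7d8  c9t2100001378AC0355d8 \
        mirror c9t2100001378AC02C7d9  c9t2100001378AC0355d9 \
        mirror c9t2100001378AC02C7d10 c9t2100001378AC0355d10 \
        mirror c9t2100001378AC02C7d11 c9t2100001378AC0355d11 \
        mirror c9t2100001378AC02C7d12 c9t2100001378AC0355d12 \
        mirror c9t2100001378AC02C7d13 c9t2100001378AC0355d13 \
        mirror c9t2100001378AC02C7d14 c9t2100001378AC0355d14 \
        mirror c9t2100001378AC02C7d15 c9t2100001378AC0355d15

    # 63356 MB working set, -n 128 = 128*1024 files for the create/delete phases
    bonnie++ -d /newObelixData -s 63356 -n 128 -u root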
Is there anything else I can check?

Cheers,
budy
Bob Friesenhahn
2010-Dec-13 01:42 UTC
[zfs-discuss] What performance to expect from mirror vdevs?
On Sat, 11 Dec 2010, Stephan Budach wrote:
>
> At first I disabled all write cache and read-ahead options for each RAID
> group on the arrays, since I wanted to give ZFS as much control over the
> drives as possible, but the performance was quite poor. I am running this
> zpool on a Sun Fire X4170 M2 with 32 GB of RAM, so I ran bonnie++ with
> -s 63356 -n 128 and got these results:
>
> Sequential Output
>   char:    51819
>   block:   50602
>   rewrite: 28090

I am not very familiar with bonnie++ output. Does 51819 mean 51 MB/second? If so, that is perhaps one disk's worth of performance.

> Random seeks: 510  <- this seems really low to me, doesn't it?

It does seem a bit low. Everything depends on whether the "random seek" was satisfied from the ARC cache or from the underlying disk. You should be able to obtain at least the number of physical seeks available from half your total disks. For example, with 16 pairs, and if each disk can do 100 seeks per second, you should expect at least 16*100 random seeks per second. With ZFS mirroring and doing only read-seeks, you should expect to get up to 75% of the seek capability of all 32 disks combined.

> Since I was curious what would happen if I enabled write cache and
> read-ahead on the RAID groups, I turned them on for all 32 devices and
> re-ran bonnie++. To my great dismay, ZFS now had a lot of random trouble
> with the drives: it would remove drives seemingly arbitrarily from the pool
> because they exceeded the error thresholds. On one run this happened to 4
> drives from one FC RAID; on the next run 3 drives from the other array got
> removed from the pool.

Ungood. Note that with this many disks you should be able to swamp your Fibre Channel link, so the Fibre Channel should be the sequential I/O bottleneck. It may also be that your RAID array firmware/CPUs become severely overloaded.

> I know that I'd better disable all "optimizations" on the RAID side, but
> the performance seems just too bad with these settings. Maybe running 16
> mirrors in one zpool is not a good idea - but that seems more than
> unlikely to me.

16 mirrors in a zpool is a very good idea. Just keep in mind that this is a lot of I/O power and you might swamp your FC link and adapter card.

> Is there anything else I can check?

Check the output of iostat -xn 30 while bonnie++ is running. This may reveal an issue.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
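To make that arithmetic concrete, a back-of-the-envelope sketch; the ~100 random IOPS per spindle is an assumed ballpark for SATA disks, not a measured figure, and the iostat columns worth watching are asvc_t and %b:

    # 16 mirrors = 32 spindles, assumed ~100 random IOPS each
    echo "conservative floor (half the spindles): $((16 * 100)) seeks/s"
    echo "mirrored-read upper bound (75% of all): $((32 * 100 * 75 / 100)) seeks/s"

    # while bonnie++ runs: consistently high asvc_t or %b on individual
    # c9t...d* devices points at the arrays, while all devices pegged
    # together points more at the shared FC link or HBA
    iostat -xn 30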
Ian Collins
2010-Dec-13 01:54 UTC
[zfs-discuss] What performance to expect from mirror vdevs?
On 12/12/10 04:48 AM, Stephan Budach wrote:
> Hi,
>
> on Friday I received two of my new FC RAIDs, which I intend to use as my
> new zpool devices. The devices are from CiDesign, type/model iR16FC4ER.
> They are FC RAIDs that also allow JBOD operation, which is what I chose.
> So I configured 16 RAID groups on each system and set the arrays up to
> present them on their FC channel one by one.
>
> On my Sol11Expr host I created a zpool of mirror vdevs by selecting one
> disk from each array. This way I got a zpool that looks like this:
>
> At first I disabled all write cache and read-ahead options for each RAID
> group on the arrays, since I wanted to give ZFS as much control over the
> drives as possible, but the performance was quite poor. I am running this
> zpool on a Sun Fire X4170 M2 with 32 GB of RAM, so I ran bonnie++ with
> -s 63356 -n 128 and got these results:
>
> Sequential Output
>   char:    51819
>   block:   50602
>   rewrite: 28090
>
> Sequential Input:
>   char:  62562
>   block: 60979
>
> Random seeks: 510  <- this seems really low to me, doesn't it?
>
> Sequential Create:
>   create: 27529
>   read:   172287
>   delete: 30522
>
> Random Create:
>   create: 25531
>   read:   244977
>   delete: 29423
>

The closest I have by way of comparison is an old thumper with a stripe of 9 mirrors:

Sequential Output
  char:    206479
  block:   601102
  rewrite: 218089

Sequential Input:
  char:  138945
  block: 702598

Random seeks: 1970

Getting on for an order of magnitude better on I/O.

> Is there anything else I can check?
>
iostat was recommended elsewhere.

--
Ian.
Stephan Budach
2010-Dec-13 09:56 UTC
[zfs-discuss] What performance to expect from mirror vdevs?
Bob, Ian,

thanks for your input. It may be that the firmware on the RAID really got overloaded, and that may have had to do with the way the GUI works. I am now testing the same configuration on another host, where I can risk some lockups when running bonnie++.

I am able to set some options at the drive level, namely write cache and read ahead, as well as at the virtual drive level. Unfortunately the options at the virtual drive level carry the same names, and I had thought that setting them at the drive level when configuring a JBOD RAID group would also set them at the virtual disk level, but that didn't happen. ;)

So, odds are quite good that I overloaded the RAID controller with lots of virtual disks that had their cache set to write-through and read-ahead turned on.

ATM I have all options disabled at the drive level, the RAID group level and the virtual drive level. My current run of bonnie++ is of course not that satisfactory, and I wanted to ask you if it's safe to turn on at least the drive-level options, namely the write cache and the read ahead?

Thanks,
budy
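Independent of the RAID GUI, one way to see what the host-visible LUNs themselves report for their write cache is format's expert mode. This is only a sketch, and whether the Qsan firmware exposes or honours that mode page at all is an open question, so treat it as a sanity check:

    # run as root; expert mode adds a cache sub-menu for SCSI/FC LUNs
    format -e
    #   (select a device, e.g. c9t2100001378AC02C7d0, then:)
    #   format> cache
    #   cache> write_cache
    #   write_cache> display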
Bob Friesenhahn
2010-Dec-14 02:30 UTC
[zfs-discuss] What performance to expect from mirror vdevs?
On Mon, 13 Dec 2010, Stephan Budach wrote:
>
> My current run of bonnie++ is of course not that satisfactory, and I wanted
> to ask you if it's safe to turn on at least the drive-level options, namely
> the write cache and the read ahead?

Enabling the write cache is fine as long as it is non-volatile or is flushed to disk when ZFS requests it. ZFS will request a transaction-group flush on all disks before proceeding with the next batch of writes. The read-ahead might not be all that valuable in practice (and might cause a severe penalty) because it assumes a particular mode and timing of access which might not match how your system is actually used. Most usage scenarios are something other than what bonnie++ does.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
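If there is any doubt that those cache flushes are actually being issued, a quick host-side sanity check (a sketch; needs root) is to confirm that the global tunable which suppresses them is still at its default of 0:

    # zfs_nocacheflush = 1 would tell ZFS to skip the SYNCHRONIZE CACHE
    # commands entirely; on this host it should print 0
    echo "zfs_nocacheflush/D" | mdb -k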
Stephan Budach
2010-Dec-14 06:43 UTC
[zfs-discuss] What performance to expect from mirror vdevs?
On 14.12.2010 at 03:30, Bob Friesenhahn <bfriesen at simple.dallas.tx.us> wrote:

> On Mon, 13 Dec 2010, Stephan Budach wrote:
>>
>> My current run of bonnie++ is of course not that satisfactory, and I wanted
>> to ask you if it's safe to turn on at least the drive-level options, namely
>> the write cache and the read ahead?
>
> Enabling the write cache is fine as long as it is non-volatile or is flushed
> to disk when ZFS requests it. ZFS will request a transaction-group flush on
> all disks before proceeding with the next batch of writes. The read-ahead
> might not be all that valuable in practice (and might cause a severe
> penalty) because it assumes a particular mode and timing of access which
> might not match how your system is actually used. Most usage scenarios are
> something other than what bonnie++ does.

I know that bonnie++ does not generate the workload I will see on my server, but it reliably causes ZFS to kick drives out of the pool, which shouldn't happen, of course.

Actually, I suspect that the Qsan controller firmware, which is what is built into these RAIDs, has some issues when it has to deal with heavy random I/O.

I will now try my good old Infortrend systems and see if I can reproduce this issue with them as well.

Cheers,
budy
Stephan Budach
2010-Dec-14 16:32 UTC
[zfs-discuss] What performance to expect from mirror vdevs?
On 14.12.10 at 07:43, Stephan Budach wrote:
> On 14.12.2010 at 03:30, Bob Friesenhahn <bfriesen at simple.dallas.tx.us> wrote:
>
>> On Mon, 13 Dec 2010, Stephan Budach wrote:
>>> My current run of bonnie++ is of course not that satisfactory, and I
>>> wanted to ask you if it's safe to turn on at least the drive-level
>>> options, namely the write cache and the read ahead?
>> Enabling the write cache is fine as long as it is non-volatile or is
>> flushed to disk when ZFS requests it. ZFS will request a transaction-group
>> flush on all disks before proceeding with the next batch of writes. The
>> read-ahead might not be all that valuable in practice (and might cause a
>> severe penalty) because it assumes a particular mode and timing of access
>> which might not match how your system is actually used. Most usage
>> scenarios are something other than what bonnie++ does.
> I know that bonnie++ does not generate the workload I will see on my server,
> but it reliably causes ZFS to kick drives out of the pool, which shouldn't
> happen, of course.
>
> Actually, I suspect that the Qsan controller firmware, which is what is
> built into these RAIDs, has some issues when it has to deal with heavy
> random I/O.
>
> I will now try my good old Infortrend systems and see if I can reproduce
> this issue with them as well.

I just wanted to wrap this up. The current firmware 1.0.8x for the CiDesign iR16FC4ER has a severe bug which caused ZFS to kick out random disks and degrade the zpool. So I tried the older firmware 1.07, which doesn't have these issues and with which the 2x16 JBODs are running very well.

Since this is an FC-to-SATA2 RAID, I also had to tune the throttle parameter in qlc.conf, which gave a great performance boost - both 1 and 2 did a great job.

Now that this is solved, I can go ahead and transfer my data from my 2xRAID6 zpool onto these new devices.

Cheers,
budy
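Budy does not name the exact setting he changed; on Sun's qlc driver the knob usually meant by "throttle" is the execution-throttle entry in /kernel/drv/qlc.conf, so the following is only an illustrative sketch, and the value 32 is a guess rather than the one he used:

    # /kernel/drv/qlc.conf (excerpt) - hypothetical value for illustration
    execution-throttle=32;

    # then re-read the driver configuration, or reboot if the HBA is in use
    update_drv -f qlc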