Well, I've searched my brains out and I can't seem to find a reason for this.

I'm getting poor to mediocre performance from my new test storage device. I've got 24 1.5 TB disks with 2 SSDs configured as a ZIL log device, on an Areca RAID controller using the arcmsr driver. The host is a quad-core AMD box with 16 GB of RAM running OpenSolaris upgraded to snv_134.

The zpool has two 11-disk raidz2 vdevs, and I'm seeing anywhere between 1 MB/sec and 40 MB/sec in zpool iostat. On average, though, it's more like 5 MB/sec when I watch while actively doing some reads and writes. I know I should be getting better performance than this.

I'm new to OpenSolaris, but I've been using *nix systems for a long time, so if there's any more information I can provide, please let me know. Am I doing anything wrong with this configuration? Thanks in advance.
-- 
This message posted from opensolaris.org
On Fri, Jun 18, 2010 at 01:26:11AM -0700, artiepen wrote:
> Well, I've searched my brains out and I can't seem to find a reason for this.
>
> I'm getting poor to mediocre performance from my new test storage device. I've got 24 1.5 TB disks with 2 SSDs configured as a ZIL log device, on an Areca RAID controller using the arcmsr driver. The host is a quad-core AMD box with 16 GB of RAM running OpenSolaris upgraded to snv_134.
>
> The zpool has two 11-disk raidz2 vdevs, and I'm seeing anywhere between 1 MB/sec and 40 MB/sec in zpool iostat. On average, though, it's more like 5 MB/sec when I watch while actively doing some reads and writes. I know I should be getting better performance than this.

How are you measuring the performance?

Do you realize that raidz2 with that many disks in it will give you really poor random write performance?

-- Pasi
I am new to zfs, so I am still learning. I'm using zpool iostat to measure performance. Would you say that smaller raidz2 sets would give me more reliable and better performance? I'm willing to give it a shot...

On Fri, Jun 18, 2010 at 4:42 AM, Pasi Kärkkäinen <pasik at iki.fi> wrote:
> How are you measuring the performance?
>
> Do you realize that raidz2 with that many disks in it will give you really poor random write performance?
>
> -- Pasi

-- 
Curtis E. Combs Jr.
System Administrator Associate
University of Georgia
High Performance Computing Center
cecombs at uga.edu
Office: (706) 542-0186
Cell: (706) 206-7289
Gmail Chat: psynophile at gmail.com
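A minimal sketch of how the pool's throughput can be broken down while the test runs, assuming the pool is named zpool1 (the name used later in the thread):

    # Per-vdev breakdown every 5 seconds; shows whether both raidz2 vdevs are busy.
    zpool iostat -v zpool1 5
    # Per-device service times and %busy, useful for spotting one slow disk.
    iostat -xn 5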
I also have a dtrace script that I found that supposedly gives a more accurate reading. Usually, though, its output is very close to what zpool iostat says. Keep in mind this is a test environment; there's no production here, so I can create and destroy pools as much as I want to play around. I'm also still learning about dtrace.

On Fri, Jun 18, 2010 at 4:52 AM, Curtis E. Combs Jr. <cecombs at uga.edu> wrote:
> I am new to zfs, so I am still learning. I'm using zpool iostat to
> measure performance. Would you say that smaller raidz2 sets would give
> me more reliable and better performance? I'm willing to give it a
> shot...

-- 
Curtis E. Combs Jr.
System Administrator Associate
University of Georgia
High Performance Computing Center
cecombs at uga.edu
Office: (706) 542-0186
Cell: (706) 206-7289
Gmail Chat: psynophile at gmail.com
On Fri, Jun 18, 2010 at 4:42 AM, Pasi Kärkkäinen <pasik at iki.fi> wrote:
> How are you measuring the performance?
>
> Do you realize that raidz2 with that many disks in it will give you really poor random write performance?
>
> -- Pasi

I have a media server with 2 raidz2 vdevs, 10 drives wide, myself, without a separate ZIL (but with a 64 GB L2ARC). I can write to it at about 400 MB/s over the network, and scrubs show 600 MB/s. But it really depends on the type of IO you have: random IO across 2 vdevs will be REALLY slow (basically as slow as the slowest 2 drives in your pool). 40 MB/s might be right if it's random... though I'd still expect to see more.
On Fri, Jun 18, 2010 at 04:52:02AM -0400, Curtis E. Combs Jr. wrote:
> I am new to zfs, so I am still learning. I'm using zpool iostat to
> measure performance. Would you say that smaller raidz2 sets would give
> me more reliable and better performance? I'm willing to give it a
> shot...

Yes, more, smaller raid sets will give you better performance, since zfs distributes (stripes) data across all of them.

What's your IO pattern? Random writes? Sequential writes?

Basically, if you have 2x 11-disk raidz2 sets, you'll be limited to roughly the performance of 2 disks in the worst case of small random IO (the parity needs to be written, and that limits raidz/z2/z3 to the performance of a single disk per vdev). This is not really zfs specific at all; it's the same with any raid implementation.

-- Pasi
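To put rough numbers on that worst case: random-write throughput scales with the number of vdevs, since each raidz2 vdev behaves like roughly one disk for small random IO. A back-of-the-envelope sketch, assuming ~120 IOPS per 7200 RPM disk and 4 kB writes (both figures are assumptions, not measurements from this system):

    # Worst-case small-random-write estimate per hypothetical pool layout.
    awk 'BEGIN {
        iops = 120; io_kb = 4;                        # assumed per-disk IOPS and IO size
        print "2 x 11-disk raidz2:", 2 * iops * io_kb, "kB/s";
        print "4 x  6-disk raidz2:", 4 * iops * io_kb, "kB/s";
    }'

Either way, the worst case is far below what the same 24 spindles can do sequentially, which is why the IO pattern matters so much.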
40 MB/sec is the best it gets. Really, the average is 5. I see 4, 5, 2, and 6 almost 10x as often as I see 40 MB/sec; it only bumps up to 40 very rarely.

As far as random vs. sequential: correct me if I'm wrong, but if I used dd to make files from /dev/zero, wouldn't that be sequential? I measure with zpool iostat 2 in another ssh session while making files of various sizes.

This is a test system. I'm wondering, now, if I should just reconfigure with maybe 7 disks per set and add another spare. The general consensus seems to be that bigger raid sets = worse performance. I thought the opposite was true...
-- 
This message posted from opensolaris.org
On Fri, Jun 18, 2010 at 05:15:44AM -0400, Thomas Burgess wrote:
> I have a media server with 2 raidz2 vdevs, 10 drives wide, myself,
> without a separate ZIL (but with a 64 GB L2ARC). I can write to it at
> about 400 MB/s over the network, and scrubs show 600 MB/s. But it
> really depends on the type of IO you have: random IO across 2 vdevs
> will be REALLY slow (basically as slow as the slowest 2 drives in your
> pool). 40 MB/s might be right if it's random... though I'd still
> expect to see more.

A 7200 RPM SATA disk can do around 120 IOPS max (7200/60 = 120), so if you're doing 4 kB random IO you end up getting 4*120 = 480 kB/sec throughput max from a single disk (in the worst case).

40 MB/sec of random IO throughput using 4 kB IOs would be around 10240 IOPS... you'd need 85x SATA 7200 RPM disks in raid-0 (striping) for that :)

-- Pasi
Yes, and I apologize for the basic nature of these questions. Like I said, I'm pretty wet behind the ears with zfs. The MB/sec metric comes from dd, not zpool iostat; zpool iostat usually gives me units of K. I think I'll try smaller raid sets and come back to the thread. Thanks, all.
-- 
This message posted from opensolaris.org
On 06/18/10 09:21 PM, artiepen wrote:
> This is a test system. I'm wondering, now, if I should just reconfigure with maybe 7 disks per set and add another spare. The general consensus seems to be that bigger raid sets = worse performance. I thought the opposite was true...

No, wider vdevs give poor performance, not big pools. 3x 7-drive raidz will give you better performance than 2x 11 drives.

-- 
Ian.
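As a sketch of what that 3x7 suggestion could look like on this box, keeping raidz2 and using the two SSDs as a mirrored slog; the pool name and the cXtYdZ device names are placeholders, not the poster's actual devices:

    # Hypothetical layout: three 7-disk raidz2 vdevs, the leftover disks as spares.
    zpool create tank \
        raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 \
        raidz2 c1t7d0 c1t8d0 c1t9d0 c1t10d0 c1t11d0 c1t12d0 c1t13d0 \
        raidz2 c1t14d0 c1t15d0 c1t16d0 c1t17d0 c1t18d0 c1t19d0 c1t20d0 \
        spare c1t21d0 c1t22d0 c1t23d0 \
        log mirror c2t0d0 c2t1d0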
On Fri, Jun 18, 2010 at 02:21:15AM -0700, artiepen wrote:
> 40 MB/sec is the best it gets. Really, the average is 5. I see 4, 5, 2, and 6 almost 10x as often as I see 40 MB/sec; it only bumps up to 40 very rarely.
>
> As far as random vs. sequential: correct me if I'm wrong, but if I used dd to make files from /dev/zero, wouldn't that be sequential? I measure with zpool iostat 2 in another ssh session while making files of various sizes.

Yep, dd will generate sequential IO.

Did you specify a blocksize for dd (bs=1024k, for example)? By default dd uses tiny 512-byte IOs, which won't be very fast.

-- Pasi
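A concrete version of that test might look like the following; the /zpool1 mount point is an assumption, and the bs/count values are just one reasonable choice for a ~10 GB sequential write:

    # 1 MB blocks, ~10 GB total, watched from a second shell with zpool iostat.
    dd if=/dev/zero of=/zpool1/ddtest bs=1024k count=10240
    # In the second shell:
    zpool iostat zpool1 5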
Yeah. I did bs sizes from 8 to 512k with counts from 256 on up; I just added zeros to the count to test performance with larger files. I didn't notice any difference at all, either with the dtrace script or zpool iostat. Thanks for your help, btw.

On Fri, Jun 18, 2010 at 5:30 AM, Pasi Kärkkäinen <pasik at iki.fi> wrote:
> Yep, dd will generate sequential IO.
>
> Did you specify a blocksize for dd (bs=1024k, for example)? By default dd uses tiny 512-byte IOs, which won't be very fast.
>
> -- Pasi

-- 
Curtis E. Combs Jr.
System Administrator Associate
University of Georgia
High Performance Computing Center
cecombs at uga.edu
Office: (706) 542-0186
Cell: (706) 206-7289
Gmail Chat: psynophile at gmail.com
artiepen wrote:
> 40 MB/sec is the best it gets. Really, the average is 5. I see 4, 5, 2, and 6 almost 10x as often as I see 40 MB/sec; it only bumps up to 40 very rarely.
>
> This is a test system. I'm wondering, now, if I should just reconfigure with maybe 7 disks per set and add another spare. The general consensus seems to be that bigger raid sets = worse performance. I thought the opposite was true...

A quick test on a system with 21 1 TB SATA drives in a single RAIDZ2 group shows a performance of about 400 MB/s with a single dd, blocksize=1048576. Creating a 10 GB file with mkfile takes 25 seconds as well. So I'd say there is basically nothing wrong with the zpool configuration. Can you paste some "iostat -xn 1" output while your test is running?

--Arne
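One way to capture what Arne asks for, run alongside the write test; the file names and mount point here are just examples:

    # Log 60 one-second samples of per-device stats while the load runs.
    iostat -xn 1 60 > iostat-during-test.log &
    mkfile 10g /zpool1/testfile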
Sure. And hey, maybe I just need some context to know what's "normal" IO for the zpool. It just... feels... slow, sometimes. It's hard to explain. I attached a log of iostat -xn 1 taken while doing mkfile 10g testfile on the zpool, as well as your dd with the bs set really high. When I Ctrl-C'ed the dd it said 460 MB/sec... like I said, maybe I just need some context...

On Fri, Jun 18, 2010 at 5:36 AM, Arne Jansen <sensille at gmx.net> wrote:
> A quick test on a system with 21 1 TB SATA drives in a single RAIDZ2
> group shows a performance of about 400 MB/s with a single dd,
> blocksize=1048576. Creating a 10 GB file with mkfile takes 25 seconds
> as well. So I'd say there is basically nothing wrong with the zpool
> configuration. Can you paste some "iostat -xn 1" output while your
> test is running?
>
> --Arne

-- 
Curtis E. Combs Jr.
System Administrator Associate
University of Georgia
High Performance Computing Center
cecombs at uga.edu
Office: (706) 542-0186
Cell: (706) 206-7289
Gmail Chat: psynophile at gmail.com

[Attachment: tests.gz, application/x-gzip, 16939 bytes]
Curtis E. Combs Jr. wrote:
> Sure. And hey, maybe I just need some context to know what's "normal"
> IO for the zpool. It just... feels... slow, sometimes. It's hard to
> explain. I attached a log of iostat -xn 1 taken while doing mkfile 10g
> testfile on the zpool, as well as your dd with the bs set really high.
> When I Ctrl-C'ed the dd it said 460 MB/sec... like I said, maybe I
> just need some context...

These iostats don't match the creation of any large files. What are you doing there? It looks more like 512-byte random writes... Are you generating the load locally or remotely?
Um... I started 2 commands in 2 separate ssh sessions.

In ssh session one:
iostat -xn 1 > stats

In ssh session two:
mkfile 10g testfile

When the mkfile was finished I did the dd command, on the same zpool1 and zfs filesystem... that's it, really.

On Fri, Jun 18, 2010 at 6:06 AM, Arne Jansen <sensille at gmx.net> wrote:
> These iostats don't match the creation of any large files. What are
> you doing there? It looks more like 512-byte random writes... Are you
> generating the load locally or remotely?

-- 
Curtis E. Combs Jr.
System Administrator Associate
University of Georgia
High Performance Computing Center
cecombs at uga.edu
Office: (706) 542-0186
Cell: (706) 206-7289
Gmail Chat: psynophile at gmail.com
Curtis E. Combs Jr. wrote:
> Um... I started 2 commands in 2 separate ssh sessions.
>
> In ssh session one:
> iostat -xn 1 > stats
>
> In ssh session two:
> mkfile 10g testfile
>
> When the mkfile was finished I did the dd command, on the same zpool1
> and zfs filesystem... that's it, really.

No, this doesn't match. Did you enable compression or dedup?
Oh! Yes. Dedup. Not compression, but dedup, yes.

On Fri, Jun 18, 2010 at 6:30 AM, Arne Jansen <sensille at gmx.net> wrote:
> No, this doesn't match. Did you enable compression or dedup?

-- 
Curtis E. Combs Jr.
System Administrator Associate
University of Georgia
High Performance Computing Center
cecombs at uga.edu
Office: (706) 542-0186
Cell: (706) 206-7289
Gmail Chat: psynophile at gmail.com
artiepen <cecombs at uga.edu> wrote:
> 40 MB/sec is the best it gets. Really, the average is 5. I see 4, 5, 2, and 6 almost 10x as often as I see 40 MB/sec; it only bumps up to 40 very rarely.

I get read/write speeds of approx. 630 MB/s into ZFS on a SunFire X4540. It seems that you misconfigured the pool. You need to make sure that each RAID stripe is made of disks that are all on separate controllers that can do DMA independently.

Jörg

-- 
EMail: joerg at schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       js at cs.tu-berlin.de (uni)
       joerg.schilling at fokus.fraunhofer.de (work)
Blog: http://schily.blogspot.com/
URL: http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
Hi Curtis,

You might review the ZFS best practices info to help you determine the best pool configuration for your environment:

http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide

If you're considering using dedup, particularly on a 24T pool, then review the current known issues, described here:

http://hub.opensolaris.org/bin/view/Community+Group+zfs/dedup

Thanks,

Cindy

On 06/18/10 02:52, Curtis E. Combs Jr. wrote:
> I am new to zfs, so I am still learning. I'm using zpool iostat to
> measure performance. Would you say that smaller raidz2 sets would give
> me more reliable and better performance? I'm willing to give it a
> shot...
Thank you, all of you, for the super helpful responses; this is probably one of the most helpful forums I've been on. I've been working with ZFS on some SunFires for a little while now, in prod, and the testing environment with OpenSolaris is going really well. I love it. Nothing even comes close.

If you have time, I have one more question. We're going to try it now with 2 12-port Arecas. When I pop the controllers in and reconnect the drives, does ZFS have the intelligence to adjust if I use the same hard drives? Of course, it doesn't matter much; I can just destroy the pool and recreate it. I'm just curious whether that'd work.

Thanks again!
-- 
This message posted from opensolaris.org
If the device driver generates or fabricates device IDs, then moving devices around is probably okay. I recall that the Areca controllers are problematic when it comes to moving devices under pools. Maybe someone with first-hand experience can comment.

Consider exporting the pool first, moving the devices around, and then importing the pool. Moving devices under a pool is okay for testing, but in general I don't recommend moving devices around under live pools.

Thanks,

Cindy

On 06/18/10 14:29, artiepen wrote:
> If you have time, I have one more question. We're going to try it now
> with 2 12-port Arecas. When I pop the controllers in and reconnect the
> drives, does ZFS have the intelligence to adjust if I use the same
> hard drives? Of course, it doesn't matter much; I can just destroy the
> pool and recreate it. I'm just curious whether that'd work.
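In practice that export/import sequence might look like this (tank is a placeholder pool name):

    # Cleanly detach the pool before moving the disks to the new controllers.
    zpool export tank
    # After recabling, scan for importable pools; if the pool shows up intact:
    zpool import
    zpool import tank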
On Fri, Jun 18, 2010 at 1:52 AM, Curtis E. Combs Jr. <cecombs at uga.edu> wrote:
> I am new to zfs, so I am still learning. I'm using zpool iostat to
> measure performance. Would you say that smaller raidz2 sets would give
> me more reliable and better performance? I'm willing to give it a
> shot...

A ZFS pool is made up of vdevs. ZFS stripes the vdevs together to improve performance, similar in concept to how RAID0 works. The more vdevs in the pool, the better the performance will be.

A vdev is made up of one or more disks, depending on the type of vdev and the redundancy level that you want (cache, log, mirror, raidz1, raidz2, raidz3, etc.). Due to the algorithms used for raidz, the smaller your individual raidz vdevs (the fewer disks), the better the performance. IOW, a 6-disk raidz2 vdev will perform better than an 11-disk raidz2 vdev.

So you want your individual vdevs to be made up of as few physical disks as possible (for your size and redundancy requirements), and your pool to be made up of as many vdevs as possible.

-- 
Freddie Cash
fjwcash at gmail.com
On Fri, Jun 18, 2010 at 6:34 AM, Curtis E. Combs Jr. <cecombs at uga.edu> wrote:
> Oh! Yes. Dedup. Not compression, but dedup, yes.

Dedup may be your problem... it requires a lot of RAM and/or a decent L2ARC, from what I've been reading.
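A quick way to confirm dedup is in play, and how large the dedup table has grown, might be something like this; tank is a placeholder pool name and the zdb output format varies between builds:

    # Which datasets have dedup enabled, and what is the pool-wide ratio?
    zfs get -r dedup tank
    zpool get dedupratio tank
    # Dedup table (DDT) statistics; a table that no longer fits in RAM/L2ARC
    # will throttle writes badly.
    zdb -DD tank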
Sounds to me like something is wrong, as on my 20-disk backup machine with 20 1 TB disks in a single raidz2 vdev I get the following with dd on sequential reads/writes:

writes:

root at opensolaris: 11:36 AM :/data# dd bs=1M count=100000 if=/dev/zero of=./100gb.bin
100000+0 records in
100000+0 records out
104857600000 bytes (105 GB) copied, 233.257 s, 450 MB/s

reads:

root at opensolaris: 11:44 AM :/data# dd bs=1M if=./100gb.bin of=/dev/null
100000+0 records in
100000+0 records out
104857600000 bytes (105 GB) copied, 131.051 s, 800 MB/s

zpool iostat <pool> 10 gives me about the same values that dd gives me. Maybe you have a bad drive somewhere? Which Areca controller are you using? Maybe you can pull the SMART info off the drives from a Linux boot CD, as some of the controllers support that. Could be a bad drive somewhere.

On 06/18/2010 02:33 AM, Curtis E. Combs Jr. wrote:
> Yeah. I did bs sizes from 8 to 512k with counts from 256 on up; I just
> added zeros to the count to test performance with larger files. I
> didn't notice any difference at all, either with the dtrace script or
> zpool iostat. Thanks for your help, btw.
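If you do check SMART from a Linux boot CD, a smartmontools invocation might look roughly like the following; the -d areca,N passthrough syntax and the /dev/sg0 device node are assumptions that depend on the controller model and smartmontools version:

    # Drive 1 behind an Areca controller (drive numbering is controller-dependent).
    smartctl -a -d areca,1 /dev/sg0
    # A drive on a plain SATA port would simply be:
    smartctl -a /dev/sda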
Sandon Van Ness wrote:
> Sounds to me like something is wrong, as on my 20-disk backup machine
> with 20 1 TB disks in a single raidz2 vdev I get the following with dd
> on sequential reads/writes:
> [...]
> 104857600000 bytes (105 GB) copied, 233.257 s, 450 MB/s
> [...]
> Maybe you have a bad drive somewhere?

Didn't he say he already gets 400 MB/s from dd, but zpool iostat only shows a few MB/s? What does zpool iostat show, the value before or after dedup?

Curtis, to see if your physical setup is OK you should turn off dedup and measure again. Otherwise you only measure the power of your machine to dedup /dev/zero.

--Arne
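A minimal version of that re-test, assuming the pool/dataset is named zpool1 and mounted at /zpool1 (placeholder names); note that already-written blocks stay deduplicated, only new writes bypass the dedup table:

    # Disable dedup, then redo the sequential write on a fresh file.
    zfs set dedup=off zpool1
    dd if=/dev/zero of=/zpool1/nodedup.bin bs=1024k count=10240
    # Watch physical throughput in a second shell:
    zpool iostat zpool1 5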