I have a T2000 with a dual-port 4Gb HBA (QLE2462) and a 3510FC with one 2Gb/s controller attached to it. I am running Solaris 10 u3.

Every time I change the recordsize of the ZFS filesystem, the disk I/O improves (doubles) and stays like that for about 5 to 6 hours. Then it dies down. I increase the recordsize again and performance jumps back to double again. The main app is an Oracle database with an 8K block size.

I changed the ZFS recordsize from 8K to 16K and then to 32K, every 8 hours, which improved the disk I/O.

I wonder if there is any other ZFS parameter that I can change to keep the performance good, since I am running an older Solaris 10.

I have single-disk LUNs on the 3510 with MPxIO enabled on the T2000. Each disk has two paths (primary, primary) online per luxadm.

"zpool iostat 10" gives me only about 6MB/s max write bandwidth. I was hoping for a lot higher.

The battery on the 3510 is expired and I am waiting for a replacement.

Besides replacing the battery, what else can I do to improve the write bandwidth?

Does the expired battery directly affect Oracle's disk I/O? I thought Oracle would just write to ZFS and be done, and the zpool would then write through to the controller instead of writing back, since there is no battery.

The Sun storage guys found no other issue besides the battery.

Should disabling the ZIL improve performance? I won't try it until we get the battery, so as not to risk data loss during an outage.

--
Asif Iqbal
PGP Key: 0xE62693C5 KeyServer: pgp.mit.edu
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
On Thu, May 20, 2010 at 1:51 PM, Asif Iqbal <vadud3 at gmail.com> wrote:
> the battery on 3510 is expired and waiting for a replacement.
>
> does the battery expire directly affecting the oracle's disk IO? I
> thought oracle will just write to zfs and done.
> and zpool will then write-through to controller instead of write-back
> since no battery.

So my 3510 is essentially behaving like a 3510 JBOD, but why would that make the I/O bandwidth this low?

Here is some I/O data that makes the T2000/3510 setup look even worse:

http://pastebin.com/QeAKDbfj
On Thu, May 20, 2010 at 2:07 PM, Asif Iqbal <vadud3 at gmail.com> wrote:
> so my 3510 is essentially behaving like a 3510 jbod but why would that
> make the IO bandwidth this low?
>
> here are some iodata which make the t2000/3510 setup looks even worse
>
> http://pastebin.com/QeAKDbfj

With a raidz2 of 6 LDs (146GB, 15K rpm, 2Gb FC disks), I should expect a lot higher than 7MB/s write bandwidth, even when the 3510FC is acting as a JBOD in the absence of a battery.
Richard Elling
2010-May-21 00:34 UTC
[zfs-discuss] zfs recordsize change improves performance
On May 20, 2010, at 11:07 AM, Asif Iqbal wrote:
> so my 3510 is essentially behaving like a 3510 jbod but why would that
> make the IO bandwidth this low?

The application is not driving enough load to make the bandwidth higher. Why? Because it is an Oracle database and will be making sync writes by default. Since you do not have a working battery, those writes are taking 10-40ms each. Replace your battery.
 -- richard

--
ZFS and NexentaStor training, Rotterdam, July 13-15, 2010
http://nexenta-rotterdam.eventbrite.com/
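The latency argument above can be put into numbers. The following is a minimal sketch, assuming every sync write is one 8K database block and that a fixed number of writes are in flight at once; the function names and the 1ms figure for a working write-back cache are illustrative assumptions, not measurements from this system.

```python
def sync_write_throughput_mb_s(block_bytes, latency_ms, outstanding=1):
    """Throughput in MB/s (1 MB = 10**6 bytes) when each sync write of
    block_bytes takes latency_ms and `outstanding` writes are in flight."""
    writes_per_second = outstanding * 1000.0 / latency_ms
    return writes_per_second * block_bytes / 1e6


def outstanding_needed(target_mb_s, block_bytes, latency_ms):
    """How many concurrent sync writes are needed to reach target_mb_s."""
    return target_mb_s / sync_write_throughput_mb_s(block_bytes, latency_ms)


if __name__ == "__main__":
    block = 8 * 1024  # Oracle block size mentioned in the thread
    # 10 and 40 ms come from the reply above; 1 ms is an assumed latency
    # for a controller with a working write-back cache.
    for latency in (10, 40, 1):
        per_stream = sync_write_throughput_mb_s(block, latency)
        needed = outstanding_needed(6.0, block, latency)
        print("latency %2d ms: %.3f MB/s per stream, "
              "%.1f concurrent writes for 6 MB/s" % (latency, per_stream, needed))
```

Under these assumptions a single stream of 8K sync writes at 10-40ms each moves well under 1MB/s, so an observed 6MB/s already implies several writes in flight. Lower latency, not more raw bandwidth, is what raises the ceiling.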
On Thu, May 20, 2010 at 8:34 PM, Richard Elling <richard.elling at gmail.com> wrote:
> The application is not driving enough load to make the bandwidth be
> higher.

Do you know of a one-liner to test how high the write bandwidth can reach? I know a sequential read bandwidth test (dd) and a random read bandwidth test (find), but I do not know of one for writes.

> Why? Because it is an Oracle database and will be making
> sync writes, by default. Since you do not have a working battery, those
> writes are taking 10-40ms each.

It would be nice if there were a way to tell Oracle to let the zpool do the sync write instead.

> Replace your battery.

I was told by the app team that another similar setup (T2000 + 4-port HBA + 3510 + 6 LD raidz2) achieved twice the disk write I/O, and I checked: the battery is expired on that one too.

Unfortunately we failed over to this server to do maintenance on that setup, so there is no way to verify that observation. Heh. Oh well.
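For the write-side test asked about above, one possible approach is to time a run of fixed-size writes with and without a flush after each one. This is a minimal sketch, not a calibrated benchmark: the function name, block size, and counts are assumptions, and it uses `os.fsync` after each write as a stand-in for the synchronous writes a database issues.

```python
import os
import tempfile
import time


def write_bandwidth_mb_s(path, block_bytes=8192, count=500, sync_each=False):
    """Write `count` blocks of `block_bytes` to `path` and return MB/s.

    With sync_each=True every write is followed by fsync, which roughly
    mimics a latency-bound sync-write workload. With sync_each=False the
    data is flushed once at the end, which approximates streaming writes.
    """
    buf = os.urandom(block_bytes)  # random data, so compression cannot help
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    start = time.time()
    try:
        for _ in range(count):
            os.write(fd, buf)
            if sync_each:
                os.fsync(fd)
        os.fsync(fd)  # make sure buffered data reaches stable storage
        elapsed = time.time() - start
    finally:
        os.close(fd)
    return block_bytes * count / 1e6 / max(elapsed, 1e-9)


if __name__ == "__main__":
    # Scratch file in the temp directory; change `target` to a path on the
    # pool under test to measure that pool instead.
    fd, target = tempfile.mkstemp(suffix=".bwtest")
    os.close(fd)
    try:
        print("buffered: %.2f MB/s" % write_bandwidth_mb_s(target))
        print("sync:     %.2f MB/s" % write_bandwidth_mb_s(target, sync_each=True))
    finally:
        os.remove(target)
```

A large gap between the two numbers points at per-write latency rather than at the bandwidth of the disks or the FC link.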
On Thu, May 20, 2010 at 8:34 PM, Richard Elling <richard.elling at gmail.com> wrote:
> The application is not driving enough load to make the bandwidth be
> higher. Why? Because it is an Oracle database and will be making
> sync writes, by default. Since you do not have a working battery, those
> writes are taking 10-40ms each. Replace your battery.

Does that mean, in other words, that Oracle write I/O will be about 7MB/s if the zpool is made out of only JBODs? I am assuming the disk spec is 146GB, 15K rpm.
Richard Elling
2010-May-21 02:53 UTC
[zfs-discuss] zfs recordsize change improves performance
On May 20, 2010, at 7:09 PM, Asif Iqbal wrote:
>>>> I have a T2000 with a dual port 4gb hba (QLE2462) and a 3510FC with
>>>> one controller 2gb/s attached to it.
>>>> I am running sol 10 u3 .

I seem to have missed this on the first read-through. Solaris 10u3? Are you serious? That was released nearly 5 years ago. Has it been patched at all? If not, then I think you shouldn't expect the sort of performance you can get with a modern release.

>>>> every time I change the recordsize of the zfs fs the disk IO improves
>>>> (doubles) and stay like that for
>>>> about 5 to 6 hrs. Then it dies down. I increase the recordsize again
>>>> and performace jumps back to
>>>> double again. The main app is oracle database with 8K blocksize
>>>>
>>>> I changed the zfs recordsize to from 8K to 16K and then 32K every 8
>>>> hrs, which improved the disk IO

Yes. If the recordsize is greater than the database block size, then you will be doing more read/modify/write cycles, which will increase disk I/O rates but decrease overall performance and efficiency.

>>>> should disabling zil improve performance? I won't try it until we get
>>>> the battery so not to risk data loss
>>>> during outage.

If you disable the ZIL for locally run Oracle and you have an unscheduled outage, then it is highly probable that you will lose data.

> is that mean, in other words oracle write io will be about 7MB/s if
> zpool is made out of only jbods ? I am
> assuming the disks spec 146GB 15K rpm

How is the pool created? Send the output of "zpool status poolname". I can't tell definitively from the iostat, but it appears that you have quite a bit of read/modify/write activity.

You will not likely be bandwidth limited for Oracle. You are very likely to be latency limited. Until you get better latency, you won't see better application performance.
 -- richard
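The read/modify/write point above can be illustrated with a rough cost model. This is a minimal sketch, assuming each database update touches exactly one 8K block, ignoring raidz2 parity and metadata, and treating a record that is not already cached as needing a full read before it can be rewritten; `rmw_cost` and `amplification` are hypothetical helper names.

```python
def rmw_cost(recordsize, db_block, cached=False):
    """Return (bytes_read, bytes_written) on disk for updating one db_block.

    If the record is larger than the database block, the whole record is
    rewritten, and it must first be read unless it is already cached.
    If the record is no larger than the block, the update overwrites whole
    records and no read is needed.
    """
    if recordsize > db_block:
        bytes_read = 0 if cached else recordsize
        return bytes_read, recordsize
    return 0, db_block


def amplification(recordsize, db_block, cached=False):
    """Disk bytes moved per byte the database asked to write."""
    bytes_read, bytes_written = rmw_cost(recordsize, db_block, cached)
    return (bytes_read + bytes_written) / float(db_block)


if __name__ == "__main__":
    db_block = 8 * 1024
    for recordsize_kb in (8, 16, 32):
        amp = amplification(recordsize_kb * 1024, db_block)
        print("recordsize %2dK: %.0fx disk traffic per 8K update"
              % (recordsize_kb, amp))
```

In this model the disks move more bytes at 16K and 32K for the same database work, which matches the description of higher disk I/O rates with lower efficiency. Note also that a changed recordsize applies only to files created after the change, so existing datafiles keep their old block size.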
On Thu, May 20, 2010 at 10:53 PM, Richard Elling <richard.elling at gmail.com> wrote:
> I seemed to have missed this the first read-through. Solaris 10u3? Are
> you serious? That was released nearly 5 years ago. Has it been patched
> at all? If not, then I think you shouldn't expect the sort of performance you
> can get with a modern release.

We just patched the failover server/storage. This is next.

> Yes. If the recordsize is greater than the database block size, then
> you will be doing more read/modify/write cycles which will increase
> disk I/O rates, but decrease overall performance and efficiency.

Well, the application becomes happier with every upward change. That made me think the ZFS cache is getting flushed by this change.

> If you disable the ZIL for locally run Oracle and you have an unscheduled
> outage, then it is highly probable that you will lose data.

Yep. That is why I am not doing it until we replace the battery.

> How is the pool created? Send the output of "zpool status poolname"
> I can't tell definitively from the iostat, but it appears that you have quite
> a bit of read/modify/write activity.

bash-3.00# zpool status mypool
  pool: mypool
 state: ONLINE
 scrub: none requested
config:

        NAME                                       STATE     READ WRITE CKSUM
        mypool                                     ONLINE       0     0     0
          raidz2                                   ONLINE       0     0     0
            c4t600C0FF0000000000A77B02A06F84B00d0  ONLINE       0     0     0
            c4t600C0FF0000000000A77B02E7F2C8C00d0  ONLINE       0     0     0
            c4t600C0FF0000000000A77B05D232D4E00d0  ONLINE       0     0     0
            c4t600C0FF0000000000A77B07E236A7A00d0  ONLINE       0     0     0
            c4t600C0FF0000000000A77B07E6593C400d0  ONLINE       0     0     0
            c4t600C0FF0000000000A77B016E1C3A800d0  ONLINE       0     0     0

errors: No known data errors

> You will not likely be bandwidth limited for Oracle. You are very likely
> to be latency limited. Until you get better latency, you won't see better
> application performance.

OK. Like I mentioned in another thread, it would be nice if there were a way to tell Oracle not to sync write to disk but just to the zpool. But that will probably make Oracle angry.
Miles Nordin
2010-May-24 17:47 UTC
[zfs-discuss] zfs recordsize change improves performance
>>>>> "ai" == Asif Iqbal <vadud3 at gmail.com> writes:

    >> If you disable the ZIL for locally run Oracle and you have an
    >> unscheduled outage, then it is highly probable that you will
    >> lose data.

    ai> yep. that is why I am not doing it until we replace the
    ai> battery

No, wait please, you still need the ZIL to be on, even with the battery. Disabling the cache flush command is what the guide says is allowed and sometimes helpful for people who have NVRAMs, but disabling the cache flush command and disabling the ZIL are different things. Disabling the ZIL means the write can be cached in DRAM until the next txg flush and not issued to the disks at all. So even if you have a disk array with an NVRAM that effectively writes everything as if it were sync, the disk array will not even see the write until txg commit time with the ZIL disabled.

If you have working NVRAM, I think disabling the ZIL is likely not to give much speed-up, so if you are going to try disabling it, now, while your battery is dead, is the time to do it. Once the battery is fixed, theory says your testing will probably show things are just as fast with the ZIL enabled.

AIUI, if you disable the ZIL, the database should still come back in a crash-consistent state after a cord-yank, but it will be an older state than it should be. So if you have several RDBMSs behind some kind of tiered middleware, the different databases won't be in sync with each other, and you can lose integrity. If you have only one RDBMS, I think you will lose only durability through this monkey business, and integrity will survive. I'm not an expert of anything, but that's my understanding for now.
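The durability argument above can be made concrete with a toy model. This is a deliberately simplified sketch and not how ZFS is implemented: `ToyPool` and its methods are hypothetical names, and the model only captures the idea that acknowledged sync writes survive a crash either through the intent log or through a completed txg commit.

```python
class ToyPool:
    """Toy model of acknowledged sync writes, txg commits, and a crash."""

    def __init__(self, zil_enabled=True):
        self.zil_enabled = zil_enabled
        self.on_disk = []      # data made durable by a txg commit
        self.intent_log = []   # sync writes logged on stable storage
        self.in_memory = []    # dirty data waiting for the next txg commit

    def sync_write(self, record):
        """Accept a sync write and acknowledge it to the application."""
        self.in_memory.append(record)
        if self.zil_enabled:
            self.intent_log.append(record)  # durable before the ack
        return True

    def txg_commit(self):
        """Flush dirty data to disk; logged entries are no longer needed."""
        self.on_disk.extend(self.in_memory)
        self.in_memory.clear()
        self.intent_log.clear()

    def crash_and_recover(self):
        """Lose DRAM contents, then replay whatever the intent log holds."""
        self.in_memory.clear()
        self.on_disk.extend(self.intent_log)
        self.intent_log.clear()
        return list(self.on_disk)


def run(zil_enabled):
    pool = ToyPool(zil_enabled)
    pool.sync_write("t1")
    pool.txg_commit()
    pool.sync_write("t2")  # acknowledged, but no txg commit follows
    pool.sync_write("t3")
    return pool.crash_and_recover()


if __name__ == "__main__":
    print("ZIL enabled: ", run(True))
    print("ZIL disabled:", run(False))
```

In this model the pool with the ZIL disabled recovers to a consistent but older state: the transactions acknowledged after the last txg commit are gone, which is the loss of durability described above.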
On Thu, May 20, 2010 at 8:34 PM, Richard Elling <richard.elling at gmail.com> wrote:
> The application is not driving enough load to make the bandwidth be
> higher. Why? Because it is an Oracle database and will be making
> sync writes, by default. Since you do not have a working battery, those
> writes are taking 10-40ms each. Replace your battery.

I replaced the battery on the 3510 FC array:

sccli> show battery-status
 Upper Battery Type: 1
 Upper Battery Manufacturing Date: Tue Mar 30 00:00:00 2010
 Upper Battery Placed In Service: Wed Jun 2 19:49:34 2010
 Upper Battery Expiration Date: Sat Jun 2 07:49:34 2012
 Upper Battery Expiration Status: OK
sccli: retrieving battery status: error: not an existing target
---------------------------------------------------------------
 Upper Battery Hardware Status: OK
 Lower Battery Hardware Status: N/A

The active service time is still pretty high:

http://pastebin.com/TRs6UDqm

The current write cache policy is write-back:

sccli> show cache-parameters
 mode: write-through
 optimization: sequential
 sync-period: disabled
 current-global-write-policy: write-back