Hello,

Given the following configuration:

* Server with 12 SPARCVII CPUs and 96 GB of RAM
* ZFS used as the file system for Oracle data
* Oracle 10.2.0.4 with 1.7 TB of data and indexes
* 1800 concurrent users on PeopleSoft Financials
* 20,000 PeopleSoft transactions per day
* HDS USP1100 with LUNs striped across 6 parity groups (450x RAID 7+1), 48 disks in total
* 2x 4 Gbps FC with MPxIO

What is the best Oracle SGA size to avoid cache duplication between Oracle and ZFS?

Is it better to have a "small SGA + big ZFS ARC" or a "large SGA + small ZFS ARC"?

Which cache gives better overall performance?

Thanks in advance and best regards,
Javi
On Sep 24, 2009, at 10:30 AM, Javier Conde wrote:
> Is it better to have a "small SGA + big ZFS ARC" or "large SGA + small ZFS ARC"?

In general, it is better to cache closer to the consumer (the application).

You don't mention which version of Solaris or ZFS you are using. In later versions, the primarycache property allows you to control ARC usage on a per-dataset basis.
-- richard
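For reference, the per-dataset control Richard mentions looks like this (a sketch only; the pool and dataset names are invented for illustration, and the primarycache property requires a recent enough ZFS version):

```shell
# Cache only metadata (not file data) in the ARC for the datafile
# dataset, leaving data caching to Oracle's SGA.
# "dbpool/oradata" is a hypothetical dataset name.
zfs set primarycache=metadata dbpool/oradata

# Check the current setting:
zfs get primarycache dbpool/oradata

# Valid values are: all (the default), metadata, and none.
```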
Hi Richard,

Thanks for your reply. We are using Solaris 10 u6 and ZFS version 10.

Regards,
Javi
Richard Elling wrote:
> In general, it is better to cache closer to the consumer (application).

Hi,

Adding oracle-interest.

I would suggest some testing, but the standard recommendations to start with are: keep the ZFS recordsize equal to the database block size; keep the Oracle log writer on its own pool (a 128K recordsize is recommended for that one, I believe), since the log writer is an I/O limiting factor; and use the latest KUs (kernel updates) for Solaris, as they contain some critical fixes for ZFS/Oracle, e.g. 6775697.

A small SGA is not usually recommended, but of course a lot depends on the application layer as well. I can only say: test with the recommendations above and then deviate from there. Keeping the ZIL on a separate low-latency device might also help (again, only analysis can determine all that).
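Those starting points could be laid out roughly like this (a sketch; the pool, dataset, and device names are made up for illustration, and recordsize must be set before the datafiles are created):

```shell
# Match the ZFS recordsize to the Oracle db_block_size (8K by default)
# on the dataset holding the datafiles.
zfs create -o recordsize=8k dbpool/oradata

# Keep the redo logs (log writer) in their own pool, with the commonly
# recommended 128k recordsize. Device names here are examples only.
zpool create redopool c2t0d0 c2t1d0
zfs create -o recordsize=128k redopool/redo
```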
Then remember that even after all that, with a large SGA etc., performance can sometimes degrade; you might need to instruct Oracle to actually cache, via the ALTER TABLE ... CACHE command, for example. Getting familiar with Statspack/AWR will be a must here :-) as only an analysis from an Oracle point of view can really tell what is working.

Enda
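The two Oracle-side steps Enda mentions might look like this from sqlplus (a sketch only; the table name is hypothetical, and Statspack must already be installed under its usual PERFSTAT schema):

```shell
# Hint Oracle to keep a hot table's blocks in the buffer cache, then
# take a Statspack snapshot (run one before and one after a workload
# interval, then report on the pair).
sqlplus perfstat <<'EOF'
ALTER TABLE ps_ledger CACHE;   -- ps_ledger is an example table name
EXECUTE statspack.snap;
EOF
```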
Hi,

Definitely large SGA, small ARC. In fact, it's best to disable the ARC altogether for the Oracle filesystems.

Blocks in the db_cache (Oracle's cache) can be used "as is", while cached data from the ARC needs significant CPU processing before it's inserted back into the db_cache. Not to mention that blocks in the db_cache can remain dirty for longer periods, saving disk writes.

But definitely:
- separate redo disks (preferably a dedicated disk/pool)
- your ZFS filesystem recordsize needs to match the Oracle block size (8 KB by default)

With your configuration, and assuming nothing else (but the Oracle database server) runs on the system, a db_cache size in the 70 GiB range would be perfectly acceptable. Don't forget to set pga_aggregate_target to something reasonable too, like 20 GiB.

Christo Kutrovsky
Senior DBA, The Pythian Group
I blog at: www.pythian.com/news
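As a rough sanity check on those numbers (a sketch, assuming the 96 GB box runs nothing but the database; this is the thread's suggestion, not an official sizing formula):

```shell
# Memory budget for a dedicated 96 GB Oracle-on-ZFS host, using the
# figures suggested above (db_cache_size ~70 GiB, pga_aggregate_target
# ~20 GiB).
TOTAL_GB=96
DB_CACHE_GB=70   # Oracle buffer cache (db_cache_size)
PGA_GB=20        # pga_aggregate_target
LEFT=$((TOTAL_GB - DB_CACHE_GB - PGA_GB))
echo "Left for kernel, ARC, and server processes: ${LEFT} GB"
```

Six-odd GB of headroom is tight but workable if the ARC is capped or disabled for the Oracle datasets, as suggested above.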
Hi all,

There is no generic answer to: is it better to have a "small SGA + big ZFS ARC" or "large SGA + small ZFS ARC"?

We can answer: have a large enough SGA to get a good cache hit ratio (higher than 90% for OLTP), and have a few GB of ZFS ARC (not less than 500 MB; usually more than 16 GB is not useful). Then you have to tune.

We know that the ZFS cache helps database reads. The cache strategies of ZFS and Oracle are different, and usually they help each other. There is no reason to avoid caching the same data twice. Example: an Oracle query asks for a range scan on an index. ZFS detects the sequential reads and starts to prefetch, trying to cache the data that Oracle will probably ask for next. When Oracle asks, the data is cached twice. All the caches are dynamic.

The best known recordsizes for an OLTP environment are:

Dataset      Recordsize
Table Data   8K (db_block_size)
Index        8K (db_block_size)
Redo Logs    128K
Undo         128K
Temp         128K

We still recommend a distinct zpool for the redo logs.

Regards,
Alain Chéreau
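Alain's table translates to a dataset layout along these lines (a sketch; the pool and dataset names are invented, and recordsize must be set before the datafiles are created):

```shell
# One pool for data, a separate zpool for redo, per the table above.
zfs create -o recordsize=8k   dbpool/data    # tables (db_block_size)
zfs create -o recordsize=8k   dbpool/index   # indexes (db_block_size)
zfs create -o recordsize=128k dbpool/undo    # undo tablespace
zfs create -o recordsize=128k dbpool/temp    # temp tablespace
zfs create -o recordsize=128k redopool/redo  # redo logs, own zpool
```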
Been there, done that, got the tee shirt... A larger SGA will *always* be more efficient at servicing Oracle requests for blocks. You avoid going through all of Oracle's I/O code, and the lookup simply reduces to a hash.

http://blogs.sun.com/glennf/entry/where_do_you_cache_oracle

alain at sun wrote:
> There is no generic answer to: is it better to have a "small SGA + big ZFS ARC" or "large SGA + small ZFS ARC"?
On Sep 28, 2009, at 17:58, Glenn Fawcett wrote:
> Been there, done that, got the tee shirt... A larger SGA will *always* be more efficient at servicing Oracle requests for blocks.

Sounds like good advice when the system is single-purpose. But if that leads to memory being partitioned into multiple chunks (e.g. consolidation scenarios), then putting your beans in the unified filesystem cache should not be overlooked.

-r