I am starting to see a few issues with OCFS and 9.2.0.3 RAC on Red Hat Linux AS 2.1, and I am wondering whether anyone else is experiencing something similar. The two main issues:

1. OCFS read/write performance is far lower than read/write to a raw device. I can provide comparison numbers.

2. Writes to the shared disk on OCFS get locked up by one server. It doesn't even have to be an Oracle process: a plain file copy will lock the directory, and the other server trying to access the same directory on the shared disk just hangs (the OS process is in state 'D'). The only way to free this up is to unmount the shared disks from the other server. Even killing the hung process doesn't work (kill -9). 'strace -p' also hangs, and the process is evidently waiting on I/O.

Is anyone else experiencing this kind of issue?

Thanks
Karthik Ramadoss
Oracle Apps DBA
The ASU Group
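For readers who want to spot the hung processes described above: state 'D' is the kernel's uninterruptible sleep, and signals (including kill -9) are not delivered until the process leaves that state. The following is a minimal sketch, assuming a Linux /proc layout, that lists processes currently in state 'D'. The function names are illustrative and do not come from the thread.

```python
import os


def parse_stat_line(stat_line):
    """Return (pid, command, state) from the text of /proc/<pid>/stat."""
    # The command is wrapped in parentheses and may itself contain spaces
    # or parentheses, so split on the LAST closing parenthesis.
    left, right = stat_line.rsplit(")", 1)
    pid_text, command = left.split("(", 1)
    state = right.split()[0]
    return int(pid_text), command, state


def uninterruptible_processes(proc_root="/proc"):
    """List (pid, command) for every process whose state is 'D'."""
    found = []
    if not os.path.isdir(proc_root):
        return found  # not a Linux-style system; nothing to scan
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, command, state = parse_stat_line(handle.read())
        except (OSError, ValueError):
            continue  # the process exited while we were scanning
        if state == "D":
            found.append((pid, command))
    return found


if __name__ == "__main__":
    for pid, command in uninterruptible_processes():
        print(pid, command)
```

Running this on each node while the hang is in progress would show which processes are stuck. It does not show which lock they are waiting on.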
> a few pointers to the issues..
> 1. OCFS read/write performance is way lower than a read/write to a raw
> device.. i can give you some comparison numbers..

And I assume you are doing this with direct I/O?

> 2. Writes to shared disk with ocfs would get locked up by one server..
> it doesnt have to even be an oracle process.. just a file copy will
> lock up the directory and the other server trying to access the same
> directory in the shared disk will just hang (os process has a state
> 'D').... the only way to free
> this up is to unmount the shared disks from the other server.. even
> killing the
> process which is hanging doesnt work (kill -9).. 'strace -p' also hangs
> and obviously the process is waiting on io..

And can you reproduce this WITH Oracle?

The point is:

1- You should do direct I/O comparisons. In every benchmark we ever did, we were within at most 2% of raw, but you have to do I/O the way we expect Oracle to do it.

2- It's not a general-purpose filesystem, so regular file operations will not work.

I will post some performance results here. I suggest you go to http://oss.oracle.com/projects/oss, go to the docs tab, and read the different things that are on there, specifically the do's and don'ts. If you use it for what it's built for, it works just fine.

Anyway, let me post some results when I get around to it, and read the docs; that will help.
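To make the "direct I/O comparison" point concrete, here is a rough sketch of timing large sequential writes with and without O_DIRECT. It assumes a modern Linux system with Python 3, which is not what shipped with AS 2.1 in 2003. The function name, block size, and paths are illustrative assumptions, not anything from the OCFS documentation, and total_bytes should be a multiple of block_size.

```python
import mmap
import os
import time


def timed_write(path, total_bytes, block_size=1024 * 1024, direct=True):
    """Write total_bytes to path in block_size chunks and return MB/s."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    if direct:
        flags |= os.O_DIRECT  # Linux-only: bypass the page cache
    # An anonymous mmap is page-aligned, which O_DIRECT requires.
    block = mmap.mmap(-1, block_size)
    fd = os.open(path, flags, 0o600)
    try:
        start = time.perf_counter()
        written = 0
        while written < total_bytes:
            written += os.write(fd, block)
        os.fsync(fd)  # include the flush to disk in the measurement
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        block.close()
    return written / (1024 * 1024) / elapsed
```

A comparison would call timed_write on a file under the OCFS mount point and on a file elsewhere (both paths are whatever the local setup uses) with the same block size, and compare the two figures. Some filesystems reject O_DIRECT, in which case os.open raises an OSError.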
Wim,

Thanks for the reply. Here are a few observations.

* I am using direct_io for file copies.
* Yes, I can reproduce this with Oracle. A couple of instances where I can clearly tell:

  a. An RMAN restore (16 GB) on OCFS takes 6 hours, compared to 45 minutes on an ext3 filesystem. I know I shouldn't be comparing OCFS with ext3, but it still seems a lot slower. We don't have a raw disk setup to get an actual comparison, but I will use the benchmarks you post to compare.

  b. The hanging issue happens even when I create or resize a tablespace from Oracle. We first saw this issue with OCFS 1.0.9.9, after upgrading from 1.0.9.4, but even rolling back to 1.0.9.4 hasn't solved it.

Regards
Karthik Ramadoss
Oracle Apps DBA
The ASU Group

>>> Wim Coekaerts <wim.coekaerts@oracle.com> 11/11/2003 11:38:26 AM >>>
> 1. OCFS read/write performance is way lower than a read/write to a raw
> device.. i can give you some comparison numbers..

and I assume you are doing on directio's.

> 2. Writes to shared disk with ocfs would get locked up by one server..

and can you reproduce this WITH oracle ?
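The RMAN figures quoted in this message (16 GB in 6 hours on OCFS, 45 minutes on ext3) can be turned into average throughput with a little arithmetic. The sketch below assumes 1 GB = 1024 MB and that both restores moved the same 16 GB; the function name is illustrative.

```python
def throughput_mb_per_s(size_gb, seconds):
    """Average throughput in MB/s, treating 1 GB as 1024 MB."""
    return size_gb * 1024 / seconds


# Figures from the message: a 16 GB RMAN restore.
ocfs = throughput_mb_per_s(16, 6 * 3600)  # 6 hours on OCFS
ext3 = throughput_mb_per_s(16, 45 * 60)   # 45 minutes on ext3

print(f"OCFS restore: {ocfs:.2f} MB/s")
print(f"ext3 restore: {ext3:.2f} MB/s")
print(f"slowdown:     {ext3 / ocfs:.1f}x")
```

That works out to roughly 0.76 MB/s against 6.07 MB/s, an 8x difference.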
Jim,

I appreciate your response. I understand we are NOT supposed to put non-Oracle files on OCFS, and we are not. But Scenario #2 occurs even when I resize or create a tablespace from a sqlplus session in Oracle.

Sometimes the archiver process hangs in 'D' state and writes lots of the messages below to the alert log:

ARC1: Evaluating archive log 3 thread 2 sequence 3
ARC1: Unable to archive log 3 thread 2 sequence 3

We are using a common archive destination for both instances, and that is a directory on OCFS. I wonder whether that could be causing the lockup for this particular archive issue. But that still doesn't explain why any major change to a datafile, like resizing or creating one through Oracle, would hang.

With our 11.5.9 - 9.2.0.3 setup we don't have Oracle-Managed Files implemented; as you may know, 11.5.9 comes with almost 400 seeded datafiles.

Thanks,
Karthik

>>> "Fennacy, Jim" <jimf@HDCSI.com> 11/11/2003 11:47:37 AM >>>
I had the chance to talk with one of the OCFS developers recently. I have not experienced scenario #1, so I cannot speak to that.

But for scenario #2...

OCFS v1.9 does NOT support non-Oracle managed files on the OCFS partition. Scenario #2 you indicated will occur. Oracle's UTL_FILE package does not count as "Oracle managed files", so you cannot create/read/write files using the UTL_FILE package. I was told to NEVER put any files on OCFS. Let Oracle do all the work. At best there will be speed degradation; at worst the system will become unusable.

OCFS v2.0 will (1) support installing a shared Oracle Home on OCFS and (2) bring increased performance. But only Oracle-managed files can exist on OCFS.

OCFS v2.x will support any file on OCFS. This version will be released along with Oracle 10G, but may be able to work with Oracle 9i.

I was also told that the problem with non-Oracle managed files on OCFS v1.9 and v2.0 only affects Linux, technically. Windows does not have the same problem. But Oracle will not support it.

We are running OCFS on a 2-node Red Hat AS 2.1 cluster at two different sites. We had a large number of non-Oracle files on OCFS for a couple of months and experienced repeated problems. Since moving all the files off OCFS, the problems have disappeared.

Hope this helps.

-----Original Message-----
From: Karthik Ramadoss [mailto:krama@asugroup.com]
Sent: Nov 11, 2003 8:19 AM
To: ocfs-devel@oss.oracle.com; ocfs-users@oss.oracle.com
Subject: [Ocfs-users] ocfs issues with 9.2.0.3

_______________________________________________
Ocfs-users mailing list
Ocfs-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs-users
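When the archiver floods the alert log as described in this message, it can help to count how often each log/thread/sequence combination fails. This is a minimal sketch that assumes the alert log lines look exactly like the two ARC1 lines quoted above. The regular expression and function name are illustrative, and other Oracle versions may format these messages differently.

```python
import re
from collections import Counter

# Matches lines such as:
#   ARC1: Unable to archive log 3 thread 2 sequence 3
ARCHIVE_FAILURE = re.compile(
    r"^(ARC\d+): Unable to archive log (\d+) thread (\d+) sequence (\d+)"
)


def count_archive_failures(lines):
    """Count failures per (archiver, log, thread, sequence)."""
    counts = Counter()
    for line in lines:
        match = ARCHIVE_FAILURE.match(line.strip())
        if match:
            process, log, thread, sequence = match.groups()
            counts[(process, int(log), int(thread), int(sequence))] += 1
    return counts


# Sample input built from the messages quoted in the thread.
sample = [
    "ARC1: Evaluating archive log 3 thread 2 sequence 3",
    "ARC1: Unable to archive log 3 thread 2 sequence 3",
    "ARC1: Evaluating archive log 3 thread 2 sequence 3",
    "ARC1: Unable to archive log 3 thread 2 sequence 3",
]

for key, hits in count_archive_failures(sample).items():
    print(key, hits)
```

In practice the lines would come from the instance's alert log file rather than a hard-coded list.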
Wim,

> 1- which device driver, and are you using powerpath or securepath?
> we have had at least 3 customers that were pointing at us for
> performance and it ended up being the driver in most cases

We are using PowerPath.

> 2- how many filesystems (ocfs) do you have

Just one ocfs on 4 volumes carved out of the EMC SAN (u01, u02, u03, u04).

> 3- if you archive, do you archive to 1 filesystem from all nodes,
> into a separate directory or the same dir.

We are archiving from the 2 nodes to different directories in different volumes in ocfs (Node1 to /u02 and Node2 to /u04).

Thanks,
Karthik

On Tue, Nov 11, 2003 at 12:08:46PM -0500, Karthik Ramadoss wrote:
> We are using a common archive destination for both instances and
> thats a directory on ocfs. I wonder if that could be causing the lockup
> for this particular archive issue.
I would try without PowerPath (mount the /dev/sdXX devices directly). I would also try the qlogic driver that comes with Red Hat (if you are using the EMC/Dell-provided one) and test again. PowerPath hurts performance a lot with certain drivers. The qla driver that we have seen installed at customers was linked without highmem I/O and without varyio, which is just plain wrong.

Also, you should talk to Oracle Support, since this clearly is a real setup. Answers on this list come as time permits, so I don't want to set expectations that this will always get instant responses.

Also, if you run e.24 as the kernel, go to at least e.25.

So my advice:

* Red Hat e.25
* the qlogic driver from addon/qla2xxx/ from Red Hat, not anyone else's
* try directly on the device instead of emcpower

On Tue, Nov 11, 2003 at 01:34:10PM -0500, Karthik Ramadoss wrote:
> We are using Powerpath.
>
> Just one ocfs on 4 volumes carved out of EMC SAN (u01, u02, u03, u04).
>
> We are archiving from the 2 nodes to different directories in different
> volumes in ocfs. (Node1 to /u02 & Node2 to /u04).
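The "at least e.25" advice can be checked mechanically on each node. The sketch below assumes the kernel release string carries the errata level as "-e.NN" (for example "2.4.9-e.24", possibly with a suffix such as "smp" or "enterprise"). That format is an assumption about Red Hat AS 2.1 kernels, and the function names are illustrative.

```python
import platform
import re


def errata_level(release):
    """Extract NN from a release string containing '-e.NN', else None."""
    match = re.search(r"-e\.(\d+)", release)
    return int(match.group(1)) if match else None


def meets_minimum(release, minimum=25):
    """True if the release has an errata level of at least `minimum`."""
    level = errata_level(release)
    return level is not None and level >= minimum


if __name__ == "__main__":
    release = platform.release()  # same string that `uname -r` reports
    print(release, meets_minimum(release))
```

On a kernel whose release string does not follow that pattern, errata_level returns None and the check reports False, so the result is only meaningful on the kernels discussed here.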