A few weeks back we opened a TAR with Oracle support to determine whether an OCFS (1.0.9-12) + async configuration was considered supportable. At first a technician said yes, but I followed up with Wim's explanation of how that combination is potentially troublesome and inadequately tested. The technician double-checked, then confirmed that OCFS + async is considered risky. He did say that a later Oracle patch would address this.

Just a couple of days ago, in response to a performance-related TAR, another technician said that we should try turning async mode on. When we referred him to the previous TAR, he said that the problems have since been fixed.

I just noticed that OCFS 1.0.11 has been released (we skipped 1.0.10 due to the problems people were reporting). I was wondering if this version explicitly addresses the known concerns about async mode, or if the technician was misinformed. We do plan to test this release on a QA system, but I suspect that any problems would be non-obvious.

Any further info on this matter would be appreciated.

Derek
no that hasn't been solved yet.

I am not sure why we are still seeing some problems, actually, haven't had time to look at some of the later reported issues. normally the 1.0.9-x and 1.0.10/11 should work fine, as long as the redo logfiles are created in 1 chunk. there is absolutely no reason why anything else would fail. but I still have to double check.

On Wed, Mar 24, 2004 at 04:54:25PM -0800, Derek Suzuki wrote:
> I was wondering if this version explicitly addresses the known
> concerns about async mode, or if the technician was misinformed.
>
> Derek
> _______________________________________________
> Ocfs-users mailing list
> Ocfs-users@oss.oracle.com
> http://oss.oracle.com/mailman/listinfo/ocfs-users
The real issue is that asyncio + directio has one problem in the 2.4 kernel, i.e., non-contiguous ios. The probable reason it was not detected earlier is that raw, the only core kernel component using directio, performs only contiguous ios. This issue cannot be fixed in the 2.4 kernel without breaking binary compatibility. It has since been addressed in the 2.6 kernel.

As ocfs is a filesystem, it allows files to be non-contiguous. Also, as it is a "clustered" file system, it requires the ios to be O_DIRECT. So, well, we are affected by this kernel bug or limitation, depending on whom you talk to.

So, if anyone were to ask whether ocfs supports asyncio, the easier answer is no. However, if someone persisted, the answer is: only if your logfiles are contiguous. And as that gets into reading debugocfs outputs, the user has to determine whether the effort is worth the gain in performance.

Why only logfiles? Well, because Oracle performs large ios only to the logfiles. The ios to the datafiles are in smaller chunks.

Hope this helps.

Sunil
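To make the "only if your logfiles are contiguous" condition concrete, here is a minimal Python sketch of the check one would apply once a file's on-disk layout is known. It is an illustration only: the extent representation (a list of `(disk_offset, length)` pairs in file order) and the function names are assumptions made for this example, and it does not parse debugocfs output, whose format is not shown in this thread.

```python
from typing import Dict, List, Tuple

# Hypothetical representation (not a debugocfs format): each extent is
# (starting byte offset on disk, length in bytes), listed in file order.
Extent = Tuple[int, int]


def is_contiguous(extents: List[Extent]) -> bool:
    """Return True if the extents form one unbroken run on disk."""
    for (start, length), (next_start, _) in zip(extents, extents[1:]):
        # A gap or a jump between consecutive extents means the file
        # was not created in one chunk.
        if start + length != next_start:
            return False
    return True


def all_logfiles_contiguous(logfiles: Dict[str, List[Extent]]) -> bool:
    """Return True only if every logfile in the mapping is contiguous."""
    return all(is_contiguous(extents) for extents in logfiles.values())
```

A file made of a single extent, or of extents that follow one another on disk without gaps, passes the check; any file whose next extent starts elsewhere fails it, which is the case the messages above describe as risky with async mode on a 2.4 kernel.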
Okay, thanks for the update. We'll probably stay in sync mode for now because the system is still way faster than our old platform. I'm setting up a test cluster that we can use to try out the newer OCFS releases and perform some testing with async mode.

> -----Original Message-----
> From: Wim Coekaerts [mailto:wim.coekaerts@oracle.com]
> Sent: Wednesday, March 24, 2004 5:46 PM
> To: Derek Suzuki
> Cc: 'ocfs-users@oss.oracle.com'
> Subject: Re: [Ocfs-users] Follow up on async I/O question
>
> no that hasn't been solved yet.
Hi everybody,

We have been using OCFS for one year now (many TARs opened, many patches applied...) but we never worried about direct IO or async IO. I assume that we use async IO (the default behaviour for our system). What should we look at?

We use:
RedHat AS 2.1 (kernel errata 35)
OCFS 1.0.9-12
Oracle 9.2.0.4

Thank you for your help.

Regards,
Christian

-----Original Message-----
From: Derek Suzuki [mailto:DSuzuki@ziprealty.com]
Sent: Thursday, March 25, 2004 02:54
To: 'Wim Coekaerts'
Cc: 'ocfs-users@oss.oracle.com'
Subject: RE: [Ocfs-users] Follow up on async I/O question

Okay, thanks for the update. We'll probably stay in sync mode for now because the system is still way faster than our old platform.