Hi OCFS2 list :)

We are currently using XCP (the open-source XenServer) with mixed shared storage (some NFS and some OCFS2).

Everything works quite well, except that when there is a network interruption or a problem on the filer providing the OCFS2 iSCSI target, all the machines connected to it reboot after a certain amount of time instead of continuing to try to reconnect.

It seems this is due to OCFS2 fencing the node if it can't write to the storage for a certain amount of time.

With NFS we don't have this problem: if the NFS filer goes down, the host just waits until it comes back and then resumes operations.

Since we use OCFS2, all the nodes get rebooted, which means we have to restart each VM, and this takes a long time.

Is there a way to disable this OCFS2 behavior so that our hosts don't reboot automatically?

Thanks :)

Cheers,
Sébastien

Hi,

Yeah, we are using the o2cb cluster stack. Thanks for pointing out that we can use another cluster stack to manage it :) I'll look into that.

Thanks again.

Cheers,
Sébastien

On 05.12.2012 11:41, Lars Marowsky-Bree wrote:
> On 2012-12-05T10:47:58, Sébastien RICCIO <sr at swisscenter.com> wrote:
>
>> With nfs we don't have this problem, if the nfs filer goes down it just
>> waits until it comes back and resume the operations.
>>
>> Since we use ocfs2 all the nodes are now rebooted, that means we have to
>> restart each VM and this take a long time.
>>
>> Is there a way to disable that ocfs2 behavior so our hosts doesn't
>> reboot automatically ?
>
> If you're running OCFS2 with the in-kernel O2CB cluster stack, it uses
> heartbeating over the storage as the membership algorithm in a tightly
> coupled version with IO fencing. So, no.
>
> If you were using OCFS2 with, say, Pacemaker+Corosync, you could,
> because these make the cluster membership decisions based on the network
> layer.
>
> (With SBD as a fencing agent, you can still have fencing via shared
> storage, but it allows you to have up to three devices and also take the
> network quorum into account before suiciding.)
>
>
> Regards,
> Lars

If you run an OCFS2 file system in cluster mode, all nodes have to heartbeat to each other over both the network and the storage within a timeout value. You can increase the timeout values to tolerate long delays.

On 12/5/2012 1:47 AM, Sébastien RICCIO wrote:
> Is there a way to disable that ocfs2 behavior so our hosts doesn't
> reboot automatically ?
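
As a rough illustration of the timeout tuning mentioned in the reply above: with the o2cb stack, the disk heartbeat timeout is normally controlled by O2CB_HEARTBEAT_THRESHOLD, and the commonly documented relationship is timeout = (threshold - 1) * 2 seconds. The sketch below only does that arithmetic. The helper names are made up for this example, and the 120-second target is an arbitrary assumption, not a recommendation from this thread.

```python
import math

# Assumption: the o2cb disk heartbeat interval is 2 seconds, as described
# in the OCFS2 documentation. Verify this against your installed version.
HEARTBEAT_INTERVAL_SECONDS = 2


def heartbeat_threshold(timeout_seconds: int) -> int:
    """Smallest O2CB_HEARTBEAT_THRESHOLD that tolerates at least
    timeout_seconds of storage unavailability before a node self-fences."""
    if timeout_seconds <= 0:
        raise ValueError("timeout_seconds must be positive")
    return math.ceil(timeout_seconds / HEARTBEAT_INTERVAL_SECONDS) + 1


def disk_timeout_seconds(threshold: int) -> int:
    """Disk heartbeat timeout, in seconds, implied by a given threshold."""
    return (threshold - 1) * HEARTBEAT_INTERVAL_SECONDS


if __name__ == "__main__":
    # Hypothetical target: ride out a 120-second filer outage.
    print(f"O2CB_HEARTBEAT_THRESHOLD={heartbeat_threshold(120)}")
```

The resulting value would typically go into the o2cb configuration file (often /etc/sysconfig/o2cb or /etc/default/o2cb; the exact location on XCP is an assumption here). It has to be identical on every node, and the cluster stack needs a restart to pick it up. The network-side timeouts (for example O2CB_IDLE_TIMEOUT_MS) are separate settings and may need raising as well. A larger timeout only delays fencing; it does not disable it.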