thr3ads.net - Ocfs2 users - [Ocfs2-users] Avoid node reboot on timeout [Dec 2012]

Sébastien RICCIO

2012-Dec-05 09:47 UTC

[Ocfs2-users] Avoid node reboot on timeout

Hi OCFS2 list :)

We are currently using XCP (the open-source XenServer) with mixed shared
storage (some NFS and some OCFS2).
Everything works quite well, except that when we have a network
interruption or a problem on the filer providing the OCFS2 iSCSI target,
all the machines connected to it reboot after a certain amount of time
instead of continuing to try to reconnect.

This seems to be because OCFS2 fences a node when it can't write to
the storage for a certain amount of time.

With NFS we don't have this problem: if the NFS filer goes down, the host
just waits until it comes back and then resumes operations.

Since we started using OCFS2, all the nodes get rebooted, which means we
have to restart each VM, and this takes a long time.

Is there a way to disable that OCFS2 behavior so our hosts don't
reboot automatically?

Thanks :)

Cheers,
Sébastien

Sébastien RICCIO

2012-Dec-05 13:35 UTC

[Ocfs2-users] Avoid node reboot on timeout

Hi, yeah, we are using the o2cb cluster stack.

Thanks for pointing out that we can use another cluster stack to manage 
it :)

I'll look into that.

Thanks again.

Cheers,
Sébastien

On 05.12.2012 11:41, Lars Marowsky-Bree wrote:
> On 2012-12-05T10:47:58, Sébastien RICCIO <sr at swisscenter.com> wrote:
>
>> With NFS we don't have this problem: if the NFS filer goes down, the host
>> just waits until it comes back and then resumes operations.
>>
>> Since we started using OCFS2, all the nodes get rebooted, which means we
>> have to restart each VM, and this takes a long time.
>>
>> Is there a way to disable that OCFS2 behavior so our hosts don't
>> reboot automatically?
>
> If you're running OCFS2 with the in-kernel O2CB cluster stack, it uses
> heartbeating over the storage as the membership algorithm in a tightly
> coupled version with IO fencing. So, no.
>
> If you were using OCFS2 with, say, Pacemaker+Corosync, you could,
> because these make the cluster membership decisions based on the network
> layer.
>
> (With SBD as a fencing agent, you can still have fencing via shared
> storage, but it allows you to have up to three devices and also take the
> network quorum into account before suiciding.)
>
>
> Regards,
>      Lars
>

srinivas eeda

2012-Dec-05 17:56 UTC

[Ocfs2-users] Avoid node reboot on timeout

If you run the OCFS2 file system in cluster mode, every node has to
heartbeat to the others over both the network and the storage within a
timeout. You can increase the timeout values to tolerate long delays.
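
To give a rough idea of what raising the disk timeout involves, here is a
small sketch. It assumes the o2cb stack's usual convention, which is not
stated in this thread and should be checked against your OCFS2 version: the
disk heartbeat timeout is set through O2CB_HEARTBEAT_THRESHOLD in the o2cb
configuration file, and the effective timeout is (threshold - 1) * 2 seconds,
so the commonly cited default of 31 corresponds to 60 seconds. The function
names are made up for illustration.

```python
import math

# Assumption: o2cb writes one disk heartbeat every 2 seconds, and a node is
# declared dead after (O2CB_HEARTBEAT_THRESHOLD - 1) missed intervals.
HEARTBEAT_INTERVAL_S = 2


def timeout_for_threshold(threshold: int) -> int:
    """Disk heartbeat timeout in seconds for a given threshold value."""
    if threshold < 2:
        raise ValueError("threshold must be at least 2")
    return (threshold - 1) * HEARTBEAT_INTERVAL_S


def threshold_for_timeout(timeout_s: float) -> int:
    """Smallest threshold whose timeout is at least timeout_s seconds."""
    if timeout_s <= 0:
        raise ValueError("timeout must be positive")
    return math.ceil(timeout_s / HEARTBEAT_INTERVAL_S) + 1


if __name__ == "__main__":
    # Example: tolerate a storage outage of up to five minutes.
    wanted = 300
    threshold = threshold_for_timeout(wanted)
    print(f"O2CB_HEARTBEAT_THRESHOLD={threshold}")
    print(f"effective disk timeout: {timeout_for_threshold(threshold)} s")
```

The value would need to be identical on every node, and the cluster stack
restarted for it to take effect. The network side has its own setting
(O2CB_IDLE_TIMEOUT_MS in the same file). Keep in mind that a longer timeout
also means the cluster takes longer to notice a node that is really dead.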

On 12/5/2012 1:47 AM, Sébastien RICCIO wrote:
> Hi OCFS2 list :)
>
> We are currently using XCP (the open-source XenServer) with mixed shared
> storage (some NFS and some OCFS2).
> Everything works quite well, except that when we have a network
> interruption or a problem on the filer providing the OCFS2 iSCSI target,
> all the machines connected to it reboot after a certain amount of time
> instead of continuing to try to reconnect.
>
> This seems to be because OCFS2 fences a node when it can't write to
> the storage for a certain amount of time.
>
> With NFS we don't have this problem: if the NFS filer goes down, the host
> just waits until it comes back and then resumes operations.
>
> Since we started using OCFS2, all the nodes get rebooted, which means we
> have to restart each VM, and this takes a long time.
>
> Is there a way to disable that OCFS2 behavior so our hosts don't
> reboot automatically?
>
> Thanks :)
>
> Cheers,
> Sébastien
