A 7-second network idle timeout is very low. We default to 30 seconds,
and depending on your setup, you could easily increase it to 60 seconds.
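
As a rough sketch, assuming the stock Debian o2cb init script that reads
its settings from /etc/default/o2cb, the relevant entries might look like
the following. The values are the usual defaults rather than anything
tuned for your cluster, so treat them as a starting point:

```shell
# Sketch of /etc/default/o2cb entries (file location assumed for Debian;
# other distributions typically use /etc/sysconfig/o2cb).

# Time in ms a connection may stay idle before o2net shuts it down.
# 30000 is the default mentioned above; 60000 is also reasonable.
O2CB_IDLE_TIMEOUT_MS=30000

# Delay in ms before a keepalive packet is sent on an idle connection.
# Keep this well below the idle timeout.
O2CB_KEEPALIVE_DELAY_MS=2000
```

The timeouts should be identical on every node, and as far as I know they
can only be changed while the cluster is offline, so plan to stop the
cluster stack on all nodes first. Once the cluster is back online, the
active value should be visible in
/sys/kernel/config/cluster/<clustername>/idle_timeout_ms.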

On 04/05/2011 12:20 AM, Marc Kowal wrote:
> Hi all,
>
> we are currently running a three-node Moodle/Apache cluster with OCFS2
> as the upload directory. Everything is fine, but sometimes some nodes
> lose their connections.
>
> I get the following error on Node 2
>
> kernel: [555631.411454] o2net: connection to node node-03 (num 2) at xxx.196.20.20:7777 has been idle for 7.0 seconds, shutting it down.
>
> kernel: [555631.411482] (19959,0):o2net_idle_timer:1495 here are some times that might help debug the situation: (tmr 1301847991.990535 now 1301847998.990086 dr 1301847991.990489 adv 1301847991.990536:1301847991.990537 func (d672c340:502) 1301847983.930438:1301847983.930444)
>
> After that, Apache goes down and triggers some kernel errors.
>
> and Node 3:
>
> kernel: [555392.301334] o2net: no longer connected to node node-02 (num 1) at xxx.196.20.9:7777
>
> and is trying to reconnect FOR HOURS...
>
> Apache goes down here as well, causing the cluster to get stuck. I'm not
> able to stop ocfs2 or o2cb.
>
> All nodes are running:
> Debian Squeeze, 2.6.32-5-amd64 on a VMWare ESX Virtual Machine
>
> If you need any further information, please let me know. Thanks for any
> help.
> regards
>
> Marc