On Thursday night one of my Linux servers (RHEL v4, 2.6.9-55.ELsmp, ocfs2-2.6.9-55.ELsmp-1.2.5-1) rebooted twice: once at 21:46, then again at 22:26. The only clues I have so far are the messages below from OCFS, which appear immediately before both reboots. Can anyone shed some light on what they are trying to tell me?

Thanks,
M. Woods

Jun 14 21:46:43 finps03 kernel: o2net: connection to node finbkp.fnal.gov (num 0) at 131.225.xx.66:7777 has been idle for 10.0 seconds, shutting it down.
Jun 14 21:46:43 finps03 kernel: (0,0):o2net_idle_timer:1418 here are some times that might help debug the situation: (tmr 1181875593.301830 now 1181875603.299843 dr 1181875593.301811 adv 1181875593.301836:1181875593.301839 func (7da8978c:504) 1181875398.117144:1181875398.117184)
Jun 14 21:46:43 finps03 kernel: o2net: no longer connected to node finbkp.fnal.gov (num 0) at 131.225.xx.66:7777

Jun 14 22:22:37 finps03 kernel: o2net: connection to node finbkp.fnal.gov (num 0) at 131.225.xx.66:7777 has been idle for 10.0 seconds, shutting it down.
Jun 14 22:22:37 finps03 kernel: (0,0):o2net_idle_timer:1418 here are some times that might help debug the situation: (tmr 1181877747.709771 now 1181877757.708111 dr 1181877747.709759 adv 1181877747.709777:1181877747.709781 func (7d0b14e6:505) 1181877007.416795:1181877007.416800)
Jun 14 22:22:37 finps03 kernel: o2net: no longer connected to node finbkp.fnal.gov (num 0) at 131.225.xx.66:7777
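The timestamps in the o2net_idle_timer line can be turned into elapsed times, which makes the 10-second idle gap visible. The following is only a rough sketch: the function name idle_gaps is made up for illustration, the reading of the fields (tmr = when the idle timer was last armed, dr = last data received, func = last message handler start:stop) is an assumption and is not confirmed anywhere in this thread, and the "dr" value is assumed to be a single number that the mail client wrapped across two lines.

```python
import re
from decimal import Decimal

# First o2net_idle_timer line from the post above. The "dr" value is rejoined
# here on the assumption that it was split by mail line wrapping.
SAMPLE = (
    "(tmr 1181875593.301830 now 1181875603.299843 dr 1181875593.301811 "
    "adv 1181875593.301836:1181875593.301839 func (7da8978c:504) "
    "1181875398.117144:1181875398.117184)"
)

IDLE_RE = re.compile(
    r"tmr (?P<tmr>\d+\.\d+) now (?P<now>\d+\.\d+) dr (?P<dr>\d+\.\d+) "
    r"adv (?P<adv_start>\d+\.\d+):(?P<adv_stop>\d+\.\d+) "
    r"func \((?P<key>[0-9a-f]+):(?P<type>\d+)\) "
    r"(?P<func_start>\d+\.\d+):(?P<func_stop>\d+\.\d+)"
)


def idle_gaps(line):
    """Return seconds elapsed between 'now' and the other timestamps.

    Hypothetical helper; the field meanings in the key names are assumptions.
    Returns None if the line does not match the expected layout.
    """
    m = IDLE_RE.search(line)
    if m is None:
        return None
    # Decimal keeps the microsecond digits exact; floats would round them.
    t = {k: Decimal(v) for k, v in m.groupdict().items()
         if k not in ("key", "type")}
    return {
        "since_timer_set": t["now"] - t["tmr"],
        "since_data_ready": t["now"] - t["dr"],
        "since_last_func": t["now"] - t["func_stop"],
    }


gaps = idle_gaps(SAMPLE)
for name, seconds in gaps.items():
    print(name, seconds)
```

Under these assumptions, nothing arrived from the peer for just under 10 seconds before the connection was shut down, which matches the "idle for 10.0 seconds" message.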
Do you have netdump/netconsole set up? The actual reason for the reboot will only be caught in those logs.
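For readers who have no log host yet, here is a minimal sketch of a receiver for kernel messages sent as plain UDP text, which is how the mainline netconsole module delivers them. Several things here are assumptions: the function names are invented for illustration, port 6666 is assumed as the netconsole target port, and RHEL 4's netdump ships its own server, which this sketch does not replace or emulate.

```python
import socket


def open_listener(bind_addr="0.0.0.0", port=6666):
    """Open a UDP socket for incoming log datagrams.

    Port 6666 is an assumption (a commonly used netconsole target port).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_addr, port))
    return sock


def receive_messages(sock, count, timeout=5.0):
    """Collect up to `count` datagrams as (sender_ip, text) tuples.

    Stops early and returns what it has if nothing arrives within
    `timeout` seconds.
    """
    sock.settimeout(timeout)
    messages = []
    try:
        while len(messages) < count:
            data, addr = sock.recvfrom(65535)
            messages.append((addr[0], data.decode("utf-8", errors="replace")))
    except socket.timeout:
        pass
    return messages


# Example use on the log host (not run here):
#   sock = open_listener()
#   for sender, text in receive_messages(sock, count=100, timeout=60.0):
#       print(sender, text, end="")
```

Because the messages leave the crashing machine over the network as they are printed, a receiver like this can capture a panic or fencing message that never reaches the local syslog file.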
Marcus Alves Grando
2007-Jun-20 13:03 UTC
[Ocfs2-users] OCFS msgs then system reboots. - Help
These messages point to the network connection between the servers in your cluster. Check that the connection works and that no packets are being lost between the nodes. Check the firewall, switch, etc.

Regards,
Marcus Alves Grando
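As a first step in checking the link between nodes, a quick TCP probe of the o2net port can rule out a firewall or routing problem. This is a minimal sketch, not an OCFS2 tool: the function name can_connect is made up for illustration, and the host and port in the usage comment are simply taken from the log lines quoted in this thread.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Any failure (refused, unreachable, DNS error, timeout) returns False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example use from the node that logged the messages (not run here):
#   print(can_connect("finbkp.fnal.gov", 7777))
```

A successful connect only shows that the port is reachable at that moment. It does not prove that packets are never dropped, so intermittent loss still has to be checked with longer-running tests and the switch and interface error counters.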