Devender Narula
2010-May-28 16:21 UTC
[Ocfs2-users] OCFS2 initiating reboot on production machine.
Hi Team,

My OCFS2 1.4.1 installation, running on RHEL 5.4 with 11g Release 1 software, is rebooting the machine with the error messages below. Can you please suggest what is causing this and how we can fix it?

Please help me with this.

Regards,
Devender

----------------
/var/log/messages output:

May 24 02:10:49 ewhpbc3bl7 kernel: (26,0):o2hb_write_timeout:172 ERROR: Heartbeat write timeout to device sda1 after 60000 milliseconds
May 24 02:10:49 ewhpbc3bl7 kernel: (26,0):o2hb_stop_all_regions:1967 ERROR: stopping heartbeat on all active regions.
May 24 02:10:49 ewhpbc3bl7 kernel: ocfs2 is very sorry to be fencing this system by restarting
May 24 02:13:41 ewhpbc3bl7 syslogd 1.4.1: restart.
May 24 02:13:41 ewhpbc3bl7 kernel: klogd 1.4.1, log source = /proc/kmsg started.
May 24 02:13:41 ewhpbc3bl7 kernel: Linux version 2.6.18-164.11.1.el5 (mockbuild at ls20-bc2-13.build.redhat.com) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-46)) #1 SMP Wed Jan 6 13:26:04 EST 2010
May 24 02:13:41 ewhpbc3bl7 kernel: Command line: ro root=LABEL=/ acpi=off apm=off rhgb quiet
May 24 02:13:41 ewhpbc3bl7 kernel: BIOS-provided physical RAM map:
May 24 02:13:41 ewhpbc3bl7 kernel:  BIOS-e820: 0000000000010000 - 000000000009f400 (usable)
May 24 02:13:41 ewhpbc3bl7 kernel:  BIOS-e820: 000000000009f400 - 00000000000a0000 (reserved)
May 24 02:13:41 ewhpbc3bl7 kernel:  BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
May 24 02:13:41 ewhpbc3bl7 kernel:  BIOS-e820: 0000000000100000 - 00000000d762f000 (usable)
May 24 02:13:41 ewhpbc3bl7 kernel:  BIOS-e820: 00000000d762f000 - 00000000d763c000 (ACPI data)
May 24 02:13:41 ewhpbc3bl7 kernel:  BIOS-e820: 00000000d763c000 - 00000000d763d000 (usable)
May 24 02:13:41 ewhpbc3bl7 kernel:  BIOS-e820: 00000000d763d000 - 00000000dc000000 (reserved)
May 24 02:13:41 ewhpbc3bl7 kernel:  BIOS-e820: 00000000fec00000 - 00000000fee10000 (reserved)
May 24 02:13:41 ewhpbc3bl7 kernel:  BIOS-e820: 00000000ff800000 - 0000000100000000 (reserved)
May 24 02:13:41 ewhpbc3bl7 kernel:  BIOS-e820: 0000000100000000 - 0000000227fff000 (usable)
May 24 02:13:41 ewhpbc3bl7 kernel: DMI 2.6 present.
May 24 02:13:41 ewhpbc3bl7 kernel: No NUMA configuration found
May 24 02:13:41 ewhpbc3bl7 kernel: Faking a node at 0000000000000000-0000000227fff000
May 24 02:13:41 ewhpbc3bl7 kernel: Bootmem setup node 0 0000000000000000-0000000227fff000
May 24 02:13:41 ewhpbc3bl7 kernel: Memory for crash kernel (0x0 to 0x0) notwithin permissible range
May 24 02:13:41 ewhpbc3bl7 kernel: disabling kdump
May 24 02:13:41 ewhpbc3bl7 kernel: Intel MultiProcessor Specification v1.4
May 24 02:13:41 ewhpbc3bl7 kernel:     Virtual Wire compatibility mode.
May 24 02:13:41 ewhpbc3bl7 kernel: OEM ID: HP Product ID: PROLIANT APIC at: 0xFEE00000
May 24 02:13:41 ewhpbc3bl7 kernel: Processor #16 6:10 APIC version 20
May 24 02:13:41 ewhpbc3bl7 kernel: Processor #0 6:10 APIC version 20
May 24 02:13:41 ewhpbc3bl7 kernel: Processor #2 6:10 APIC version 20
May 24 02:13:41 ewhpbc3bl7 kernel: Processor #4 6:10 APIC version 20
May 24 02:13:41 ewhpbc3bl7 kernel: Processor #6 6:10 APIC version 20
May 24 02:13:41 ewhpbc3bl7 kernel: Processor #18 6:10 APIC version 20
May 24 02:13:41 ewhpbc3bl7 kernel: Processor #20 6:10 APIC version 20
May 24 02:13:41 ewhpbc3bl7 kernel: Processor #22 6:10 APIC version 20
May 24 02:13:41 ewhpbc3bl7 kernel: I/O APIC #8 Version 32 at 0xFEC00000.
May 24 02:13:41 ewhpbc3bl7 kernel: I/O APIC #0 Version 32 at 0xFEC80000.
May 24 02:13:41 ewhpbc3bl7 kernel: Setting APIC routing to clustered
Srinivas Eeda
2010-May-28 16:33 UTC
[Ocfs2-users] OCFS2 initiating reboot on production machine.
May 24 02:10:49 ewhpbc3bl7 kernel: (26,0):o2hb_write_timeout:172 ERROR: Heartbeat write timeout to device sda1 after 60000 milliseconds
May 24 02:10:49 ewhpbc3bl7 kernel: (26,0):o2hb_stop_all_regions:1967 ERROR: stopping heartbeat on all active regions.

This means the heartbeat write took longer than 60 seconds. You didn't paste the whole message, so I'm not sure which call took longer. Check your storage.
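To see how often the heartbeat has timed out before a fence, and what else was logged around that time, the relevant lines can be pulled out of the log first. The following is only a rough sketch in Python: the regular expression is modelled on the single o2hb_write_timeout line quoted in this thread, and the names TIMEOUT_RE and find_heartbeat_timeouts are made up for illustration, not part of OCFS2 or its tools.

```python
import re

# Assumption: the message layout matches the o2hb_write_timeout line quoted
# in this thread; other OCFS2 versions may word it differently.
TIMEOUT_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s[\d:]{8})\s(?P<host>\S+)\skernel:.*"
    r"o2hb_write_timeout:\d+ ERROR: Heartbeat write timeout to device "
    r"(?P<device>\S+) after (?P<ms>\d+) milliseconds"
)


def find_heartbeat_timeouts(lines):
    """Return one dict per heartbeat write-timeout line found in `lines`.

    `lines` is any iterable of log lines, for example an open file object.
    The function name is hypothetical and only used for this sketch.
    """
    events = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            events.append({
                "stamp": match.group("stamp"),
                "host": match.group("host"),
                "device": match.group("device"),
                # The log reports milliseconds; convert to seconds.
                "seconds": int(match.group("ms")) / 1000,
            })
    return events
```

Passing an open handle on /var/log/messages (the file named in the original post, usually readable only by root) to find_heartbeat_timeouts gives the timestamp, host, device and timeout of each event. Those timestamps can then be compared with storage, multipath or SCSI errors logged just before them.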