ohue.hidetoshi at jp.fujitsu.com
2009-Apr-15 07:29 UTC
[Ocfs-users] I'd like to stop the OCSF2_disk_heart_beat.
This is Hide. This is my first post.

I'd like to set up a cluster file system with OCFS2. There are two nodes, and a CompactFlash (CF) card is used as the shared disk. The CF card is guaranteed for up to 100,000 writes. However, that limit is exceeded in about two days, because the disk heartbeat runs every two seconds.

I tried the following changes to work around this, and they seem to work:
1. Increased O2HB_DEFAULT_DEAD_THRESHOLD, which is defined in heartbeat.h.
2. Defined skip_flag as a global variable.
3. When the number of connected nodes reaches two, dlmdomain.c turns skip_flag on.
4. While skip_flag is on, o2hb_do_disk_heartbeat is not executed.
5. When the number of connected nodes drops below two, dlmdomain.c turns skip_flag off.
6. While skip_flag is off, o2hb_do_disk_heartbeat is executed.

With these changes, o2hb_do_disk_heartbeat still always runs while only one node is active. I would like to avoid running it even then, to eliminate the write-frequency problem. However, if o2hb_do_disk_heartbeat is not executed, I cannot monitor whether the other node is active. So I'd like to know how I can monitor it without o2hb_do_disk_heartbeat.

One idea is to monitor over IP communication. Is there any code in tcp.c that would be useful for this? Do you have any other suggestions?

The versions are as follows:
Linux-2.6.21.7
ocfs2-tools-1.2.3

If you have any good ideas, please let me know.
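For readers following the numbers, the gating in steps 2 to 6 and the write-budget arithmetic can be modeled in a few lines of userspace C. This is only a rough sketch, not the actual kernel change: the function names update_skip_flag, should_do_disk_heartbeat and seconds_until_limit are hypothetical, and it assumes exactly one CF write per heartbeat tick.

```c
#include <assert.h>
#include <stdbool.h>

/* Figures taken from the post: one heartbeat every 2 seconds,
 * and a card rated for 100,000 writes. */
#define HB_INTERVAL_SEC 2L
#define CF_WRITE_LIMIT  100000L

/* Stand-in for the global skip flag from step 2. */
static bool skip_flag;

/* Steps 3 and 5: call whenever the number of connected nodes changes.
 * (Hypothetical helper; the post does this inside dlmdomain.c.) */
void update_skip_flag(int connected_nodes)
{
    skip_flag = (connected_nodes >= 2);
}

/* Steps 4 and 6: the heartbeat write runs only while the flag is off. */
bool should_do_disk_heartbeat(void)
{
    return !skip_flag;
}

/* Seconds until the write limit is reached, assuming every heartbeat
 * tick costs exactly one write (an assumption, not a measured value). */
long seconds_until_limit(long write_limit, long interval_sec)
{
    assert(write_limit >= 0 && interval_sec > 0);
    return write_limit * interval_sec;
}
```

With the figures above, seconds_until_limit(CF_WRITE_LIMIT, HB_INTERVAL_SEC) gives 200,000 seconds, roughly 2.3 days, which matches the "about two days" estimate. The model also shows the remaining gap: with a single connected node the flag is off, so the heartbeat writes continue.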
Sunil Mushran
2009-Apr-15 18:30 UTC
[Ocfs-users] I'd like to stop the OCSF2_disk_heart_beat.
The disk heartbeat is the node liveness detection component in o2cb (ocfs2's native cluster stack). If you want to run it like a local fs, say ext3, then you can use tunefs.ocfs2 to change the mount type to local. That will disable the hb thread.

But that's not what you want. If I am understanding you correctly, you want to run in clustered mode, but disable the hb thread only when one node is active.

If so, then the solution may be found in one of the userspace cluster stacks, pacemaker or cman. In 2.6.26 or so, we added support for userspace cluster stacks. Novell is about to ship SLES11 HA that will showcase ocfs2 with pacemaker (Novell's cluster stack). Sometime next year, we'll have the same on (rh)el6 - ocfs2 with cman.

These two cluster stacks are known to have a config setting that does not require disk heartbeat, or that at least allows users to set up one disk as the quorum disk and not have the hb on all disks. That may satisfy your requirements.

BTW, the correct mailing list is ocfs2-users at oss.oracle.com.

Sunil