Laurentiu Gosu
2011-Oct-18 21:05 UTC
[Ocfs2-users] Unable to stop cluster as heartbeat region still active
Hi,

I have a two-node OCFS2 cluster running UEK 2.6.32-100.0.19.el5, ocfs2console-1.6.3-2.el5, and ocfs2-tools-1.6.3-2.el5. Every time I run /etc/init.d/o2cb stop, it fails with this error:

Stopping O2CB cluster CLUSTER: Failed
Unable to stop cluster as heartbeat region still active

There is no active mount point. I tried to stop the heartbeat manually with "ocfs2_hb_ctl -K -d /dev/mapper/volgr1-lvol0 ocfs2" (after finding the refs number with "ocfs2_hb_ctl -I -d /dev/mapper/volgr1-lvol0"). But even when the refs number is zero, the "heartbeat region still active" error still occurs.

How can I fix this?

Thank you in advance.
Laurentiu.
Sunil Mushran
2011-Oct-18 21:12 UTC
[Ocfs2-users] Unable to stop cluster as heartbeat region still active
ls -lR /sys/kernel/config/cluster

What does this return?
Laurentiu Gosu
2011-Oct-18 21:14 UTC
[Ocfs2-users] Unable to stop cluster as heartbeat region still active
Here is the output:

ls -lR /sys/kernel/config/cluster
/sys/kernel/config/cluster:
total 0
drwxr-xr-x 4 root root 0 Oct 19 00:12 CLUSTER

/sys/kernel/config/cluster/CLUSTER:
total 0
-rw-r--r-- 1 root root 4096 Oct 19 00:12 fence_method
drwxr-xr-x 3 root root 0 Oct 19 00:12 heartbeat
-rw-r--r-- 1 root root 4096 Oct 19 00:12 idle_timeout_ms
-rw-r--r-- 1 root root 4096 Oct 19 00:12 keepalive_delay_ms
drwxr-xr-x 4 root root 0 Oct 11 20:23 node
-rw-r--r-- 1 root root 4096 Oct 19 00:12 reconnect_delay_ms

/sys/kernel/config/cluster/CLUSTER/heartbeat:
total 0
drwxr-xr-x 2 root root 0 Oct 19 00:12 918673F06F8F4ED188DDCE14F39945F6
-rw-r--r-- 1 root root 4096 Oct 19 00:12 dead_threshold

/sys/kernel/config/cluster/CLUSTER/heartbeat/918673F06F8F4ED188DDCE14F39945F6:
total 0
-rw-r--r-- 1 root root 4096 Oct 19 00:12 block_bytes
-rw-r--r-- 1 root root 4096 Oct 19 00:12 blocks
-rw-r--r-- 1 root root 4096 Oct 19 00:12 dev
-r--r--r-- 1 root root 4096 Oct 19 00:12 pid
-rw-r--r-- 1 root root 4096 Oct 19 00:12 start_block

/sys/kernel/config/cluster/CLUSTER/node:
total 0
drwxr-xr-x 2 root root 0 Oct 19 00:12 ro02xsrv001
drwxr-xr-x 2 root root 0 Oct 19 00:12 ro02xsrv002

/sys/kernel/config/cluster/CLUSTER/node/ro02xsrv001:
total 0
-rw-r--r-- 1 root root 4096 Oct 19 00:12 ipv4_address
-rw-r--r-- 1 root root 4096 Oct 19 00:12 ipv4_port
-rw-r--r-- 1 root root 4096 Oct 19 00:12 local
-rw-r--r-- 1 root root 4096 Oct 19 00:12 num

/sys/kernel/config/cluster/CLUSTER/node/ro02xsrv002:
total 0
-rw-r--r-- 1 root root 4096 Oct 19 00:12 ipv4_address
-rw-r--r-- 1 root root 4096 Oct 19 00:12 ipv4_port
-rw-r--r-- 1 root root 4096 Oct 19 00:12 local
-rw-r--r-- 1 root root 4096 Oct 19 00:12 num
Sunil Mushran
2011-Oct-18 21:17 UTC
[Ocfs2-users] Unable to stop cluster as heartbeat region still active
What does this return?
cat /sys/kernel/config/cluster/CLUSTER/heartbeat/918673F06F8F4ED188DDCE14F39945F6/dev

Also, do:
ls -lR /sys/kernel/debug/ocfs2
ls -lR /sys/kernel/debug/o2dlm
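The per-region files shown in the listing can also be read in one pass. What follows is only a minimal sketch, assuming the configfs layout shown in this thread (cluster/<name>/heartbeat/<region>/ with dev and pid files inside); the function name list_hb_regions and its optional root argument are made up for illustration and are not part of ocfs2-tools.

```shell
# list_hb_regions [ROOT]
# Print "<cluster> <region> <dev> <pid>" for every heartbeat region
# directory found under ROOT. ROOT defaults to the configfs path used
# in this thread; passing another path is only useful for testing.
list_hb_regions() {
    root="${1:-/sys/kernel/config/cluster}"
    # The trailing slash makes the glob match directories only, so
    # plain files such as dead_threshold are skipped.
    for region in "$root"/*/heartbeat/*/; do
        # If nothing matched, the pattern is left unexpanded; skip it.
        [ -d "$region" ] || continue
        region="${region%/}"
        cluster="${region%/heartbeat/*}"
        cluster="${cluster##*/}"
        dev="$(cat "$region/dev" 2>/dev/null)"
        pid="$(cat "$region/pid" 2>/dev/null)"
        # "?" marks a file that was missing or empty.
        printf '%s %s %s %s\n' "$cluster" "${region##*/}" "${dev:-?}" "${pid:-?}"
    done
}
```

Called without arguments (as root, on a node with o2cb loaded) it should print one line per region that is still registered, which is the same information as the cat command above plus the pid file.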
Laurentiu Gosu
2011-Oct-18 21:23 UTC
[Ocfs2-users] Unable to stop cluster as heartbeat region still active
Again the outputs:

cat /sys/kernel/config/cluster/CLUSTER/heartbeat/918673F06F8F4ED188DDCE14F39945F6/dev
dm-2
---> here should be volgr1-lvol0, I guess?

ls -lR /sys/kernel/debug/ocfs2
ls: /sys/kernel/debug/ocfs2: No such file or directory

ls -lR /sys/kernel/debug/o2dlm
ls: /sys/kernel/debug/o2dlm: No such file or directory

I think I have to enable debug first somehow?

Laurentiu.
Sunil Mushran
2011-Oct-18 21:27 UTC
[Ocfs2-users] Unable to stop cluster as heartbeat region still active
mount -t debugfs debugfs /sys/kernel/debug

Then list that dir.

Also, do:
ocfs2_hb_ctl -I -d /dev/dm-2

Be careful before killing. We want to be sure that dev is not mounted.
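The "be sure that dev is not mounted" check can be scripted before any kill is attempted. This is a rough sketch only: the helper name is_mounted is invented here, and it assumes the usual /proc/mounts format in which the first field of each line is the mounted device.

```shell
# is_mounted DEVICE [MOUNTS_FILE]
# Exit 0 if DEVICE is the first field of some line in MOUNTS_FILE,
# exit 1 otherwise. MOUNTS_FILE defaults to /proc/mounts.
is_mounted() {
    dev="$1"
    mounts="${2:-/proc/mounts}"
    # found stays unset (false) unless a line's device field matches.
    awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$mounts"
}
```

For example, "is_mounted /dev/mapper/volgr1-lvol0 || echo not mounted". The comparison is a plain string match, so a device-mapper volume should be checked under every name it may carry (here both the /dev/mapper/volgr1-lvol0 name and the dm-N name reported by the dev file).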
Laurentiu Gosu
2011-Oct-18 21:32 UTC
[Ocfs2-users] Unable to stop cluster as heartbeat region still active
ls -lR /sys/kernel/debug/ocfs2
/sys/kernel/debug/ocfs2:
total 0

ls -lR /sys/kernel/debug/o2dlm
/sys/kernel/debug/o2dlm:
total 0

ocfs2_hb_ctl -I -d /dev/dm-2
ocfs2_hb_ctl: Device name specified was not found while reading uuid

There is no /dev/dm-2 mounted.
Sunil Mushran
2011-Oct-18 21:37 UTC
[Ocfs2-users] Unable to stop cluster as heartbeat region still active
So it is not mounted. But we still have a hb thread because hb could not be stopped during umount. The reason for that could be the same that causes ocfs2_hb_ctl to fail.

Do:
mounted.ocfs2 -d
Laurentiu Gosu
2011-Oct-18 21:40 UTC
[Ocfs2-users] Unable to stop cluster as heartbeat region still active
mounted.ocfs2 -d
Device                    FS     Stack  UUID                              Label
/dev/mapper/volgr1-lvol0  ocfs2  o2cb   0C4AB55FE9314FA5A9F81652FDB9B22D  ocfs2

mounted.ocfs2 -f
Device                    FS     Nodes
/dev/mapper/volgr1-lvol0  ocfs2  ro02xsrv001

ro02xsrv001 = the other node in the cluster.

By the way, there is no /dev/dm-2:
ls /dev/dm-*
/dev/dm-0 /dev/dm-1
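The UUID column of the mounted.ocfs2 -d output can be compared against the heartbeat region directory name from configfs. The sketch below assumes the column layout shown in this message (device, FS, stack, UUID, label) and assumes that a region directory is named after the UUID of the volume it belongs to; the function names volume_uuids and region_has_volume are hypothetical helpers, not ocfs2-tools commands.

```shell
# volume_uuids
# Read text shaped like the "mounted.ocfs2 -d" output on stdin and
# print "<device> <uuid>" for each ocfs2 volume. The header line is
# skipped because its second field is "FS", not "ocfs2".
volume_uuids() {
    awk '$2 == "ocfs2" { print $1, $4 }'
}

# region_has_volume REGION_UUID
# Exit 0 if REGION_UUID equals the UUID of a volume listed on stdin,
# exit 1 otherwise.
region_has_volume() {
    volume_uuids | awk -v r="$1" '$2 == r { found = 1 } END { exit !found }'
}
```

Usage would look like "mounted.ocfs2 -d | region_has_volume 918673F06F8F4ED188DDCE14F39945F6 || echo no matching volume". With the values posted in this thread the two strings differ (918673F0... for the region, 0C4AB55F... for the volume), so the check would report no match.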
Sunil Mushran
2011-Oct-18 21:43 UTC
[Ocfs2-users] Unable to stop cluster as heartbeat region still active
ocfs2_hb_ctl -I -u 0C4AB55FE9314FA5A9F81652FDB9B22D
Laurentiu Gosu
2011-Oct-18 21:44 UTC
[Ocfs2-users] Unable to stop cluster as heartbeat region still active
ocfs2_hb_ctl -I -u 0C4AB55FE9314FA5A9F81652FDB9B22D
0C4AB55FE9314FA5A9F81652FDB9B22D: 0 refs
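The reference count reported in this message can also be read by a script rather than by eye. The following is a minimal sketch, assuming the output keeps the "UUID: N refs" shape shown in this thread; the function name hb_refs is hypothetical and not part of ocfs2-tools.

```shell
#!/bin/sh
# hb_refs: hypothetical helper, not part of ocfs2-tools.
# Reads lines shaped like "<UUID>: <N> refs" on stdin (the shape
# shown in this thread) and prints N. Any other line, such as an
# error message, produces no output.
hb_refs() {
    sed -n 's/^[0-9A-Fa-f]\{1,\}: \([0-9]\{1,\}\) refs$/\1/p'
}

# Usage on a cluster node (needs root and ocfs2-tools installed):
#   ocfs2_hb_ctl -I -u 0C4AB55FE9314FA5A9F81652FDB9B22D | hb_refs
```

An empty result means the line did not match, so a caller should treat "no output" as "unknown" rather than as zero references.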
Sunil Mushran
2011-Oct-18 21:50 UTC
[Ocfs2-users] Unable to stop cluster as heartbeat region still active
See if this cleans it up.
ocfs2_hb_ctl -K -u 0C4AB55FE9314FA5A9F81652FDB9B22D
Laurentiu Gosu
2011-Oct-18 21:52 UTC
[Ocfs2-users] Unable to stop cluster as heartbeat region still active
ocfs2_hb_ctl -K -u 0C4AB55FE9314FA5A9F81652FDB9B22D
ocfs2_hb_ctl: File not found by ocfs2_lookup while stopping heartbeat

No improvement :(
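Since stopping the heartbeat by volume UUID did not help, it may be worth listing which heartbeat regions configfs still holds and which device each one points at. The following is a minimal sketch, assuming the configfs layout shown in the ls -lR output earlier in this thread; the function name list_hb_regions is hypothetical and not part of ocfs2-tools.

```shell
#!/bin/sh
# list_hb_regions: hypothetical helper, not part of ocfs2-tools.
# $1 is the cluster directory in configfs; the default is the path
# that appears in this thread. Prints "<region> <dev>" per region.
list_hb_regions() {
    cluster_dir=${1:-/sys/kernel/config/cluster/CLUSTER}
    for region in "$cluster_dir"/heartbeat/*/; do
        # Skip the unexpanded glob when no region directory exists.
        [ -d "$region" ] || continue
        name=$(basename "$region")
        # The "dev" attribute holds the kernel device name, e.g. dm-2.
        dev=$(cat "$region/dev" 2>/dev/null)
        printf '%s %s\n' "$name" "${dev:-unknown}"
    done
}

# Usage on a cluster node:
#   list_hb_regions
#   mounted.ocfs2 -d    # compare region names with the UUID column
```

In the outputs posted in this thread, the only region listed is 918673F06F8F4ED188DDCE14F39945F6 with dev reading dm-2, while mounted.ocfs2 -d reports the volume UUID 0C4AB55FE9314FA5A9F81652FDB9B22D. Comparing the two lists is one way to spot a region that no longer matches any visible volume.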