Umarzuki Mochlis
2015-Mar-17 05:28 UTC
[Ocfs2-users] Reply: Re: What do I need so ocfs2 logical volume mounted during boot
So I added an nc check that waits until port 7777 is up and then starts
ocfs2 and o2cb, as below, which works:

    while ! nc -z 10.224.202.192 7777
    do
        sleep 5
    done

I'm not familiar with corosync/pacemaker. Do I need a fencing device for
2 nodes? Are there any configuration samples?

2015-03-17 13:14 GMT+08:00 Zhen Ren <zren at suse.com>:
> Yeah, it seems nodes in the same cluster cannot fully reach each other.
> This may make o2cb fail to manage nodes in low level, and then cause ocfs2
> not work.
>
> In my end,I use corosync/pacemaker cluster stack. Nodes in cluster must ssh
> each other without passwd.
> Hope it can help.
>
> Also,please add the mail list ;-).
>
> --
> Best regards,
> Zhen, Ren
> HA team, SUSE
>
>> There's one problem though.
>>
>> I tried on another node this method, but result is different. Its
>> network interface (em2) does not seem to be able to reach first node.
>>
>> Mar 17 12:54:23 app2 kernel: [ 21.952527] igb: em2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
>> Mar 17 12:54:23 app2 kernel: [ 21.952737] IPv6: ADDRCONF(NETDEV_CHANGE): em2: link becomes ready
>> Mar 17 12:54:28 app2 kernel: [ 26.756863] OCFS2 Node Manager 1.5.0
>> Mar 17 12:54:28 app2 kernel: [ 26.760481] OCFS2 DLM 1.5.0
>> Mar 17 12:54:28 app2 kernel: [ 26.761550] ocfs2: Registered cluster interface o2cb
>> Mar 17 12:54:28 app2 kernel: [ 26.770524] OCFS2 DLMFS 1.5.0
>> Mar 17 12:54:28 app2 kernel: [ 26.770625] OCFS2 User DLM kernel interface loaded
>> Mar 17 12:54:35 app2 kernel: [ 33.819716] o2net: Connection to node web1 (num 1) at 192.168.56.100:7777 shutdown, state 7
>> Mar 17 12:54:36 app2 kernel: [ 34.882978] OCFS2 1.5.0
>> Mar 17 12:54:38 app2 kernel: [ 36.822648] o2net: Connection to node web1 (num 1) at 192.168.56.100:7777 shutdown, state 7
>> Mar 17 12:54:41 app2 kernel: [ 39.825565] o2net: Connection to node web1 (num 1) at 192.168.56.100:7777 shutdown, state 7
>> Mar 17 12:54:44 app2 kernel: [ 42.828493] o2net: Connection to node web1 (num 1) at 192.168.56.100:7777 shutdown, state 7
>> Mar 17 12:54:47 app2 kernel: [ 45.831420] o2net: Connection to node web1 (num 1) at 192.168.56.100:7777 shutdown, state 7
>> Mar 17 12:54:50 app2 kernel: [ 48.834359] o2net: Connection to node web1 (num 1) at 192.168.56.100:7777 shutdown, state 7
>> Mar 17 12:54:53 app2 kernel: [ 51.837318] o2net: Connection to node web1 (num 1) at 192.168.56.100:7777 shutdown, state 7
>> Mar 17 12:54:56 app2 kernel: [ 54.840224] o2net: Connection to node web1 (num 1) at 192.168.56.100:7777 shutdown, state 7
>> Mar 17 12:54:59 app2 kernel: [ 57.843135] o2net: Connection to node web1 (num 1) at 192.168.56.100:7777 shutdown, state 7
>> Mar 17 12:55:01 app2 ntpdate[2490]: Can't find host ntp.ubuntu.com: Name or service not known (-2)
>> Mar 17 12:55:01 app2 ntpdate[2490]: no servers can be used, exiting
>> Mar 17 12:55:02 app2 kernel: [ 60.846069] o2net: Connection to node web1 (num 1) at 192.168.56.100:7777 shutdown, state 7
>> Mar 17 12:55:02 app2 kernel: [ 60.850050] o2net: No connection established with node 1 after 30.0 seconds, giving up.
>> Mar 17 12:55:20 app2 ntpdate[3496]: no server suitable for synchronization found
>> Mar 17 12:55:34 app2 kernel: [ 92.881306] o2net: No connection established with node 1 after 30.0 seconds, giving up.
>> Mar 17 12:55:36 app2 kernel: [ 95.179544] o2cb: This node could not connect to nodes: 1.
>> Mar 17 12:55:36 app2 kernel: [ 95.184416] o2cb: Cluster check failed. Fix errors before retrying.
>> Mar 17 12:55:36 app2 kernel: [ 95.189351] (mount.ocfs2,3465,3):ocfs2_dlm_init:3004 ERROR: status = -107
>> Mar 17 12:55:36 app2 kernel: [ 95.194479] (mount.ocfs2,3465,3):ocfs2_mount_volume:1881 ERROR: status = -107
>> Mar 17 12:55:36 app2 kernel: [ 95.199577] ocfs2: Unmounting device (252,2) on (node 0)
>> Mar 17 12:55:36 app2 kernel: [ 95.199589] (mount.ocfs2,3465,3):ocfs2_fill_super:1229 ERROR: status = -107
>>
>> 2015-03-17 12:29 GMT+08:00 Zhen Ren <zren at suse.com>:
>> > Congratulations!
>> > I also learned something from you. Thanks for your share.
>> >
>> > --
>> > Best regards,
>> > Zhen, Ren
>> > HA team, SUSE
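The nc polling loop in the message above has no upper bound, so boot would wait indefinitely if the peer never opens port 7777. A minimal sketch of a bounded variant follows. The helper name `wait_until`, the 120-second limit, and the `service` invocations in the commented usage are assumptions for illustration, not something taken from the thread.

```shell
#!/bin/sh
# Retry a probe command until it succeeds or the time budget is used up.
# Usage: wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, 1 once TIMEOUT_SECONDS of
# sleeping has elapsed without a success.
# (wait_until is a hypothetical helper, not part of ocfs2-tools.)
wait_until() {
    timeout=$1
    interval=$2
    shift 2
    elapsed=0
    until "$@"; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 0
}

# Example usage (address and port are from the message above; the
# timeout value and the service commands are assumptions):
# if wait_until 120 5 nc -z 10.224.202.192 7777; then
#     service o2cb start && service ocfs2 start
# else
#     echo "peer not reachable on port 7777, skipping OCFS2 start" >&2
# fi
```

Passing the probe as arguments keeps the retry logic separate from the check itself, so the same helper could wrap a different reachability test if nc is not available on the node.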
Zhen Ren
2015-Mar-17 05:43 UTC
[Ocfs2-users] Reply: Re: What do I need so ocfs2 logical volume mounted during boot
I don't think you need to become familiar with corosync/pacemaker, but the
pacemaker cluster stack can make this kind of task easier. Here is an
example specific to SUSE Linux:

https://www.suse.com/documentation/sle_ha/book_sleha/data/cha_ha_ocfs2.html

It may show another way to do your task, but I have no idea how to do that
on Ubuntu.

--
Best regards,
Zhen, Ren
HA team, SUSE

> so I put nc to check until port 7777 is up, then proceed with starting
> ocfs2 & o2cb like below, which works
>
> while ! nc -z 10.224.202.192 7777
> do
> sleep 5
> done
>
> I'm not familiar with corosync/pacemaker. Do I need fencing device for
> 2 nodes? Any configuration samples?