Hi Alain,
On 11/08/2010 11:08 PM, Alain.Moulle wrote:
> Hi,
>
> I have a problem on Fedora13 with releases :
> ocfs2 1.4.3-5.fc13.x86_64
> dlm_tool 3.0.17
>
> With a 3 nodes ocfs2 cluster, I can't mount FS on the three nodes at
> the same time
> but only on two nodes among the 3 nodes , whatever the two nodes are
> among the 3 nodes.
>
> The errors are :
> "(1475,0):o2net_connect_expired:1656 ERROR: no connection established
> with node 2 after 30.0 seconds, giving up and returning errors.
> (2175,0):dlm_request_join:1035 ERROR: status = -107
> (2175,0):dlm_try_to_join_domain:1209 ERROR: status = -107
> (2175,0):dlm_join_domain:1487 ERROR: status = -107
> (2175,0):dlm_register_domain:1753 ERROR: status = -107
> (2175,0):o2cb_cluster_connect:313 ERROR: status = -107
> (2175,0):ocfs2_dlm_init:2995 ERROR: status = -107
> (2175,0):ocfs2_mount_volume:1789 ERROR: status = -107
> ocfs2: Unmounting device (8,16) on (node 0)
> o2net: no longer connected to node selfxl-4 (num 0) at
> 10.197.189.204:7777
> o2net: connected to node selfxl-4 (num 0) at 10.197.189.204:7777
>
> It seems to be a lock management problem
> Is it an already known issue ?
> Is there an available patch ?
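It doesn't look like a dlm problem, but a network problem. ;)
Your first error is o2net_connect_expired, and the status = -107 in the
dlm errors that follow is -ENOTCONN (not connected), so the dlm join
fails only because the o2net connection was never established.
It seems that the 3rd node can't connect to node 2.
Could you please check the error messages on node 2?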
btw, I assume that cluster.conf is the same on all 3 nodes, and that
you can connect to port 7777 (which is used by ocfs2) on node 2 from
node 3.
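If it helps, here is a minimal sketch of that port check, assuming Python
is available on the nodes. The function name can_connect and the addresses
you pass on the command line are only illustrative, they are not part of
the ocfs2 tools. Run it on node 3 with the address of node 2 as listed in
cluster.conf.

```python
import socket
import sys


def can_connect(host, port=7777, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Hosts are taken from the command line, e.g. the address of node 2
    # exactly as it appears in cluster.conf (placeholder, supply your own).
    for host in sys.argv[1:]:
        state = "reachable" if can_connect(host) else "NOT reachable"
        print(f"{host}:7777 is {state}")
```

If the port is not reachable while o2cb is online on node 2, a firewall
rule on that node or a wrong address in cluster.conf would be the first
things to look at.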
Regards,
Tao