John Murphy
2009-Jun-10 16:23 UTC
[Ocfs2-users] mount.ocfs2: Transport endpoint is not connected while mounting
Hi All,

I have a cluster of four nodes and one will not join the cluster. I have two IPs on each node, one external and one internal. I have tried changing around the IPs in the /etc/ocfs2/cluster.conf and that helped - at least I recovered three of the machines. Any suggestions on where else to look?

Best Regards

John

(23359,2):o2net_connect_expired:1637 ERROR: no connection established with node 1 after 30.0 seconds, giving up and returning errors.
(23359,2):o2net_connect_expired:1637 ERROR: no connection established with node 2 after 30.0 seconds, giving up and returning errors.
(23359,2):o2net_connect_expired:1637 ERROR: no connection established with node 3 after 30.0 seconds, giving up and returning errors.
(23571,2):dlm_request_join:1033 ERROR: status = -107
(23571,2):dlm_try_to_join_domain:1207 ERROR: status = -107
(23571,2):dlm_join_domain:1485 ERROR: status = -107
(23571,2):dlm_register_domain:1732 ERROR: status = -107
(23571,2):ocfs2_dlm_init:2662 ERROR: status = -107
(23571,2):ocfs2_mount_volume:1251 ERROR: status = -107
ocfs2: Unmounting device (8,80) on (node 0)

--
John Murphy
Technical And Managing Director
MANDAC Ltd
Kandoy House
2 Fairview Strand
Dublin 3
p: +353 1 5143001
m: +353 85 711 6844
e: john.murphy at mandac.eu
w: www.mandac.eu
Bengtsson Anders
2009-Jun-10 16:45 UTC
[Ocfs2-users] mount.ocfs2: Transport endpoint is not connected while mounting
> ocfs2: Unmounting device (8,80) on (node 0)

Looks like you have the wrong node number in /etc/ocfs2/cluster.conf. Try changing it to something like node = 5, then restart o2cb and remount.

Best regards // Anders
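Before changing a node number, it may help to see what each machine currently has configured. The following is a minimal sketch, assuming the usual stanza layout of /etc/ocfs2/cluster.conf, in which each `node:` stanza carries `number`, `name` and `ip_address` keys. The helper name `list_nodes` and the `CONF` variable are hypothetical and only serve this illustration.

```shell
#!/bin/sh
# Sketch only: assumes the usual "node:" stanzas with number/name/ip_address keys.
CONF="${CONF:-/etc/ocfs2/cluster.conf}"

# Print "number name ip_address" for every node stanza in the given file.
list_nodes() {
    awk -F= '
        /^node:/    { in_node = 1; num = name = ip = ""; next }
        /^cluster:/ { in_node = 0; next }
        in_node && NF == 2 {
            key = $1; val = $2
            gsub(/[ \t]/, "", key); gsub(/[ \t]/, "", val)
            if (key == "number")     num = val
            if (key == "name")       name = val
            if (key == "ip_address") ip = val
            if (num != "" && name != "" && ip != "") {
                print num, name, ip
                num = name = ip = ""
            }
        }
    ' "$1"
}

# Only read the real file if it is present on this machine.
if [ -r "$CONF" ]; then
    list_nodes "$CONF"
fi
```

Running this on every node and comparing the output is one way to spot a node whose number or address differs from what its peers expect.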
Sunil Mushran
2009-Jun-10 17:18 UTC
[Ocfs2-users] mount.ocfs2: Transport endpoint is not connected while mounting
Ensure iptables is either off or has rules for the interconnect traffic.

Use tcpdump to see if packets are coming through. The connect request between two nodes is initiated when both of them first mount a common volume. Also, the connect request is always from the higher node number to the lower node number.
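A rough sketch of how that advice could be applied to the log in the original post, where node 0 could not reach nodes 1, 2 and 3. The interface name `eth1` and port `7777` are assumptions, not values from the thread, and the helpers `connect_direction` and `tcpdump_cmd` are hypothetical names. The script only prints the tcpdump command rather than running it, since capturing needs root.

```shell
#!/bin/sh
# Sketch only: interface name and port are assumptions, adjust to your setup.
IFACE="${IFACE:-eth1}"
PORT="${PORT:-7777}"

# The higher-numbered node opens the connection to the lower-numbered one.
connect_direction() {
    if [ "$1" -gt "$2" ]; then
        echo "node $1 connects to node $2"
    else
        echo "node $2 connects to node $1"
    fi
}

# Print (do not run) a tcpdump command that watches interconnect traffic.
tcpdump_cmd() {
    printf 'tcpdump -n -i %s tcp port %s\n' "$1" "$2"
}

# Node 0 is the lowest number, so each peer should be dialing in to it.
for peer in 1 2 3; do
    connect_direction 0 "$peer"
done
tcpdump_cmd "$IFACE" "$PORT"
```

If this reading is right, node 0 is the listener for all three peers, so running the printed tcpdump command on node 0 while mounting should show whether their connection attempts arrive at all.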
John Murphy
2009-Jun-11 09:24 UTC
[Ocfs2-users] mount.ocfs2: Transport endpoint is not connected while mounting
Hi All,

Many thanks for the replies. I am running four servers in high-availability and load-balancing mode using Piranha, which invokes iptables. What should I add to iptables to enable interconnect traffic?

TIA

John
Sunil Mushran
2009-Jun-11 14:33 UTC
[Ocfs2-users] mount.ocfs2: Transport endpoint is not connected while mounting
Add a rule to allow traffic on port 7777 (or whatever it is in cluster.conf) on the interconnect interface.
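One possible shape for such a rule, as a sketch under stated assumptions: the interconnect interface is taken to be `eth1` (hypothetical), the traffic is TCP, and the port is read from the `ip_port` key in cluster.conf with a fallback to 7777. The helper names `conf_port` and `o2net_rule` are made up for this example. The script prints the iptables command instead of executing it, so it can be reviewed before being run as root.

```shell
#!/bin/sh
# Sketch only: interface name is an assumption; the port comes from cluster.conf.
IFACE="${IFACE:-eth1}"
CONF="${CONF:-/etc/ocfs2/cluster.conf}"

# Read the first ip_port value from a cluster.conf-style file.
conf_port() {
    awk -F= '/ip_port/ { gsub(/[ \t]/, "", $2); print $2; exit }' "$1"
}

# Print the iptables command that accepts inbound TCP on the given port.
o2net_rule() {
    printf 'iptables -I INPUT -i %s -p tcp --dport %s -j ACCEPT\n' "$1" "$2"
}

# Fall back to 7777 when the file is missing or has no ip_port entry.
PORT=""
if [ -r "$CONF" ]; then
    PORT=$(conf_port "$CONF")
fi
PORT="${PORT:-7777}"

o2net_rule "$IFACE" "$PORT"
```

Inserting the rule at the top of INPUT is only one choice; where it belongs relative to the rules Piranha manages depends on the existing ruleset, and the rule would need to be saved to survive a restart of the firewall.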
John Murphy
2009-Jun-11 15:29 UTC
[Ocfs2-users] mount.ocfs2: Transport endpoint is not connected while mounting
Hi Sunil,

Many thanks,

John
John Murphy
2009-Jun-11 16:10 UTC
[Ocfs2-users] mount.ocfs2: Transport endpoint is not connected while mounting
Hi,

Does this mean I need to unmount all devices on the three working nodes in order to run /etc/init.d/o2cb offline on each one, followed by /etc/init.d/o2cb start?

Best Regards

John