tu.qiuping
2024-Sep-24 13:11 UTC
[Samba] [CTDB] Loop print disconnect, unable to establish tcplink
Thank you very much for your reply, Martin.

I am confident that the nodes files are identical on all nodes and contain no comment lines. If any node had a different nodes file, some nodes should log 'Refused connection from unknown node', but that message does not appear on any node.

All servers are physical machines, not virtual machines. However, I cannot confirm whether the network was abnormal or whether there were rogue ctdbd processes, because by the time I checked, the cluster was no longer in its original failed state.

I have thought about this for a long time and have also read the ctdb code, but I still have no idea. Can you identify any possible reasons?

Below are the full logs of the three nodes:

host-188-64: https://github.com/Tu-qiu/logs/blob/main/full-log-host-188-64
host-188-66: https://github.com/Tu-qiu/logs/blob/main/full-log-host-188-66
host-188-68: https://github.com/Tu-qiu/logs/blob/main/full-log-host-188-68

------------------ Original ------------------
From: Martin Schwenke <martin at meltin.net>
Date: Tue,Sep 24,2024 7:14 PM
To: tu.qiuping <tu.qiuping at qq.com>
Cc: samba <samba at lists.samba.org>
Subject: Re: [Samba] [CTDB] Loop print disconnect, unable to establish tcplink

Hi,

On Tue, 24 Sep 2024 17:13:17 +0800, "tu.qiuping via samba" wrote:

> My ctdb version is 4.17.7. My ctdb cluster configuration is correct and the cluster is healthy before operation.
>
> I want to add several new public IPs. After modifying the public_addresses file, I restarted the ctdb service on all nodes.
> However, two of the servers were unable to establish tcp connections and kept disconnecting in a loop.
>
> I can't understand why host-188-64 and host-188-66 have been unable to establish a tcp connection.
>
> [...]

Hmmm... not much information to go on.

* The nodes files are identical on all nodes, including comment lines?

* These are real machines? Nothing strange happening with virtual network bridges?

* This is very suspicious:

    2024-08-28T15:40:33.804428+08:00 host-188-64 ctdbd[1303012]: Tearing down connection to dead node :1

  It appears to be logged before the connection comes up! You're not accidentally injecting network errors, are you?

* Check for rogue ctdbd processes that might have been left behind on one of those 2 nodes? I don't know why they wouldn't be logging... but something strange is happening.

Sorry, no other ideas right now...

peace & happiness,
martin
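
The first question above (whether the nodes files are identical on every node, comment lines included) can be checked mechanically rather than by eye. Below is a minimal Python sketch of one way to do that. It is only an illustration, not part of ctdb: the function names `fingerprint` and `group_by_content` are made up for this example, the IP addresses are placeholders, and it assumes the contents of each node's nodes file have already been collected as text (for example by copying the files to one machine). Comment lines and line order are deliberately kept in the comparison; only trailing whitespace on each line is ignored.

```python
import hashlib
from collections import defaultdict


def fingerprint(nodes_text):
    """Hash a nodes file line by line.

    Comment lines and line order are preserved, so a commented-out
    entry on one node yields a different fingerprint. Trailing
    whitespace (including a stray carriage return) is ignored.
    """
    lines = [line.rstrip() for line in nodes_text.splitlines()]
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()


def group_by_content(files):
    """Group hostnames whose nodes files have the same fingerprint.

    `files` maps hostname -> nodes file contents. More than one group
    in the result means the files are not identical everywhere.
    """
    groups = defaultdict(list)
    for host, text in files.items():
        groups[fingerprint(text)].append(host)
    return sorted(sorted(hosts) for hosts in groups.values())


# Placeholder contents (hypothetical addresses, not taken from the logs).
example = {
    "host-188-64": "10.0.0.64\n10.0.0.66\n10.0.0.68\n",
    "host-188-66": "10.0.0.64\n10.0.0.66\n10.0.0.68\n",
    "host-188-68": "10.0.0.64\n#10.0.0.66\n10.0.0.68\n",
}

groups = group_by_content(example)
for hosts in groups:
    print("identical nodes file on:", ", ".join(hosts))
```

If every node ends up in a single group, the nodes files can be ruled out as the cause; if there is more than one group, the odd node out is the place to start looking.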