thr3ads.net - search: "ctdb

[ctdb]Unable to run startrecovery event(if mail content is encrypted, please see the attached file)

2018 Sep 05

1

...eat logs: 2018/09/04 04:35:06.414369 ctdbd[10129]: Recovery has started 2018/09/04 04:35:06.414944 ctdbd[10129]: connect() failed, errno=111 2018/09/04 04:35:06.415076 ctdbd[10129]: Unable to run startrecovery event node2: repeat logs: 2018/09/04 04:35:09.412368 ctdb-recoverd[9437]: ../ctdb/server/ctdb_recoverd.c:1267 Starting do_recovery 2018/09/04 04:35:09.412689 ctdb-recoverd[9437]: Already holding recovery lock 2018/09/04 04:35:09.412700 ctdb-recoverd[9437]: ../ctdb/server/ctdb_recoverd.c:1326 Recovery initiated due to problem with node 1 2018/09/04 04:35:09.412974 ctdb-recoverd[9437]: ../ctdb/server/...

Failed to start CTDB first time after install

2013 Apr 09

0

...557839872 2013/04/09 16:10:00.252688 [30575]: server/ctdb_daemon.c:182 Registered message handler for srvid=17654110539292344320 2013/04/09 16:10:00.252776 [30575]: server/ctdb_daemon.c:182 Registered message handler for srvid=18086737578496622592 2013/04/09 16:10:00.253577 [recoverd:30648]: server/ctdb_recoverd.c:3415 Initial recovery master set - forcing election 2013/04/09 16:10:00.253609 [recoverd:30648]: server/ctdb_recoverd.c:2521 Force an election 2013/04/09 16:10:00.253673 [30575]: Freeze priority 1 2013/04/09 16:10:00.253783 [30575]: Freeze priority 2 2013/04/09 16:10:00.253901 [30575]: Freeze pri...

[ctdb] Unable to take recovery lock - contention

2018 Feb 26

2

...2018/02/12 19:38:56.902874 ctdbd[6602]: CTDB_WAIT_UNTIL_RECOVERED 2018/02/12 19:38:57.885800 ctdb-recoverd[7182]: Election period ended 2018/02/12 19:38:57.886134 ctdb-recoverd[7182]: Node:1 was in recovery mode. Start recovery process 2018/02/12 19:38:57.886160 ctdb-recoverd[7182]: ../ctdb/server/ctdb_recoverd.c:1267 Starting do_recovery 2018/02/12 19:38:57.886187 ctdb-recoverd[7182]: Attempting to take recovery lock (/share-fs/export/ctdb/.ctdb/reclock) 2018/02/12 19:38:57.886243 ctdb-recoverd[7182]: Set cluster mutex helper to "/usr/libexec/ctdb/ctdb_mutex_fcntl_helper" 2018/02/12 19:38:57.89...

Re: [ctdb] Unable to take recovery lock - contention

2018 Feb 26

0

...2018/02/12 19:38:56.902874 ctdbd[6602]: CTDB_WAIT_UNTIL_RECOVERED 2018/02/12 19:38:57.885800 ctdb-recoverd[7182]: Election period ended 2018/02/12 19:38:57.886134 ctdb-recoverd[7182]: Node:1 was in recovery mode. Start recovery process 2018/02/12 19:38:57.886160 ctdb-recoverd[7182]: ../ctdb/server/ctdb_recoverd.c:1267 Starting do_recovery 2018/02/12 19:38:57.886187 ctdb-recoverd[7182]: Attempting to take recovery lock (/share-fs/export/ctdb/.ctdb/reclock) 2018/02/12 19:38:57.886243 ctdb-recoverd[7182]: Set cluster mutex helper to "/usr/libexec/ctdb/ctdb_mutex_fcntl_helper" 2018/02/12 19:38:57.89...

Re: [ctdb] Unable to take recovery lock - contention

2018 Feb 26

2

...2018/02/12 19:38:56.902874 ctdbd[6602]: CTDB_WAIT_UNTIL_RECOVERED 2018/02/12 19:38:57.885800 ctdb-recoverd[7182]: Election period ended 2018/02/12 19:38:57.886134 ctdb-recoverd[7182]: Node:1 was in recovery mode. Start recovery process 2018/02/12 19:38:57.886160 ctdb-recoverd[7182]: ../ctdb/server/ctdb_recoverd.c:1267 Starting do_recovery 2018/02/12 19:38:57.886187 ctdb-recoverd[7182]: Attempting to take recovery lock (/share-fs/export/ctdb/.ctdb/reclock) 2018/02/12 19:38:57.886243 ctdb-recoverd[7182]: Set cluster mutex helper to "/usr/libexec/ctdb/ctdb_mutex_fcntl_helper" 2018/02/12 19:38:57.89...

[ctdb] Unable to take recovery lock - contention

2018 Feb 26

0

...2018/02/12 19:38:56.902874 ctdbd[6602]: CTDB_WAIT_UNTIL_RECOVERED 2018/02/12 19:38:57.885800 ctdb-recoverd[7182]: Election period ended 2018/02/12 19:38:57.886134 ctdb-recoverd[7182]: Node:1 was in recovery mode. Start recovery process 2018/02/12 19:38:57.886160 ctdb-recoverd[7182]: ../ctdb/server/ctdb_recoverd.c:1267 Starting do_recovery 2018/02/12 19:38:57.886187 ctdb-recoverd[7182]: Attempting to take recovery lock (/share-fs/export/ctdb/.ctdb/reclock) 2018/02/12 19:38:57.886243 ctdb-recoverd[7182]: Set cluster mutex helper to "/usr/libexec/ctdb/ctdb_mutex_fcntl_helper" 2018/02/12 19:38:57.89...

smbstatus hang with CTDB 2.5.4 and Samba 4.1.13

2014 Oct 29

1

...b06a26d 2014/10/29 11:12:45.429592 [3932342]: Recovery daemon ping timeout. Count : 0 2014/10/29 11:12:45.430655 [3932342]: Handling event took 195 seconds! 2014/10/29 11:12:45.452636 [3932342]: pnn 1 Invalid reqid 220668 in ctdb_reply_control 2014/10/29 11:12:48.462334 [recoverd:6488266]: server/ctdb_recoverd.c:3990 Remote node:0 has different flags for node 1. It has 0x02 vs our 0x00 2014/10/29 11:12:48.462448 [recoverd:6488266]: Use flags 0x00 from local recmaster node for cluster update of node 1 flags 2014/10/29 11:12:48.483362 [3932342]: Freeze priority 1 2014/10/29 11:12:58.574548 [3932342]: /u...

ctdb split brain nodes doesn't see each other

2014 Jul 03

0

...pcode:80 dstnode:1 2014/07/03 16:10:39.962203 [33287]: client/ctdb_client.c:870 ctdb_control_recv failed 2014/07/03 16:10:39.962219 [33287]: Async operation failed with state 3, opcode:80 2014/07/03 16:10:39.962235 [33287]: Async wait failed - fail_count=1 2014/07/03 16:10:39.962251 [33287]: server/ctdb_recoverd.c:251 Failed to read node capabilities. 2014/07/03 16:10:39.962264 [33287]: server/ctdb_recoverd.c:3041 Unable to update node capabilities. 2014/07/03 16:11:00.984133 [33287]: client/ctdb_client.c:759 control timed out. reqid:67841 opcode:80 dstnode:1 2014/07/03 16:11:00.984201 [33287]: client/ctdb...

CTDB question about "shared file system"

2020 Aug 08

1

...CMASTER, but we (node 1) have - force an election ctdbd[1220]: Recovery mode set to ACTIVE ctdbd[1220]: This node (1) is now the recovery master ctdb-recoverd[1236]: Election period ended ctdb-recoverd[1236]: Node:1 was in recovery mode. Start recovery process ctdb-recoverd[1236]: ../../ctdb/server/ctdb_recoverd.c:1347 Starting do_recovery ctdb-recoverd[1236]: Attempting to take recovery lock (!/usr/local/bin/lockctl elect --endpoints REDACTED:2379 SM ctdbd[1220]: High RECLOCK latency 4.268180s for operation recd reclock ctdb-recoverd[1236]: Recovery lock taken successfully ctdb-recoverd[1236]: ../../ctdb/...

Is it possible to use quorum for CTDB to prevent split-brain and removing lockfile in the cluster file system

2012 May 24

0

...t the opportunity to build CTDB with split-brain prevention without a lockfile. Using quorum concepts to ban a node might be an option, and I made a small modification to the CTDB source code. The modification checks whether there are more than (nodemap->num)/2 connected nodes in main_loop of server/ctdb_recoverd.c. If not, it bans the node itself and logs an error "Node %u in the group without quorum". In server/ctdb_recoverd.c: static void main_loop(struct ctdb_context *ctdb, struct ctdb_recoverd *rec, TALLOC_CTX *mem_ctx) ... /* count how many active nodes there are */ rec->num_...
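
The majority test described in this snippet can be sketched in isolation. This is only an illustration of the "more than (nodemap->num)/2 connected nodes" rule, not the poster's patch and not CTDB code: the function name has_quorum and its two parameters are hypothetical, and the banning and logging steps are left out because the snippet is truncated before it shows them.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of CTDB): returns true when strictly more
 * than half of the configured nodes are currently connected.
 *
 * num_connected: nodes this node can currently reach, counting itself
 * num_nodes:     total nodes in the node map (nodemap->num in the post)
 */
bool has_quorum(uint32_t num_connected, uint32_t num_nodes)
{
	/* Integer division, matching the (nodemap->num)/2 threshold. */
	return num_connected > num_nodes / 2;
}
```

With a strict "more than half" rule, an evenly split cluster (for example 2 of 4 nodes on each side) leaves neither side with quorum, so both halves would ban themselves under the scheme the post describes.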

CTDB node stucks in " ctdb-eventd[13184]: 50.samba: samba not listening on TCP port 445"

2019 May 16

2

...peration failed with state 3, opcode:80 May 16 11:25:50 ctdb-lbn1 ctdb-recoverd[13293]: Async wait failed - fail_count=1 May 16 11:25:50 ctdb-lbn1 ctdb-recoverd[13293]: ../ctdb/server/ctdb_client.c:1931 Failed to read node capabilities. May 16 11:25:50 ctdb-lbn1 ctdb-recoverd[13293]: ../ctdb/server/ctdb_recoverd.c:370 Failed to get node capabilities May 16 11:25:50 ctdb-lbn1 ctdb-recoverd[13293]: ../ctdb/server/ctdb_recoverd.c:2757 Unable to update node capabilities. May 16 11:25:50 ctdb-lbn1 ctdb-recoverd[13293]: Initial interface fetched May 16 11:25:50 ctdb-lbn1 ctdb-recoverd[13293]: Trigger takeoverrun...

ctdb_client.c control timed out - banning nodes

2015 May 19

0

...nnected' 2015/05/18 14:24:39.235572 [13021]: ctdb_control error: 'node is disconnected' 2015/05/18 14:24:39.235583 [13021]: Async operation failed with ret=-1 res=-1 opcode=80 2015/05/18 14:24:39.235590 [13021]: Async wait failed - fail_count=1 2015/05/18 14:24:39.235598 [13021]: server/ctdb_recoverd.c:251 Failed to read node capabilities. 2015/05/18 14:24:39.235605 [13021]: server/ctdb_recoverd.c:3025 Unable to update node capabilities. 2015/05/18 14:24:59.405500 [12968]: Freeze priority 1 2015/05/18 14:25:03.257579 [12968]: Could not add client IP 155.198.17.133. This is not a public address....

CTDB and IPv6

2012 Jun 28

1

...8 10:54:43.313918 [ 1820]: Async operation failed with ret=0 res=1 opcode=0 2012/06/28 10:54:43.313929 [ 1820]: Async wait failed - fail_count=2 2012/06/28 10:54:43.313934 [ 1820]: server/ctdb_takeover.c:1517 Async control CTDB_CONTROL_TAKEOVER_IP failed 2012/06/28 10:54:43.313941 [ 1820]: server/ctdb_recoverd.c:1588 Unable to setup public takeover addresses 2012/06/28 10:54:44.316099 [ 1820]: Taking out recovery lock from recovery daemon 2012/06/28 10:54:44.316129 [ 1820]: Take the recovery lock 2012/06/28 10:54:44.317788 [ 1820]: Recovery lock taken successfully 2012/06/28 10:54:44.317839 [ 1820]: Re...

CTDB Path

2018 May 04

2

Hello, at this time I want to install a CTDB cluster with Samba 4.7.7 from source. I compiled Samba as follows: ./configure --with-cluster-support --with-shared-modules=idmap_rid,idmap_tdb2,idmap_ad The whole Samba environment is located in /usr/local/samba/. CTDB is located in /usr/local/samba/etc/ctdb. I guess right that the correct path of ctdbd.conf (node file, public address file

CTDB question about "shared file system"

2020 Aug 06

2

Very helpful. Thank you, Martin. I'd like to share the information below with you and solicit your fine feedback :-) I provide additional detail in case there is something else you feel strongly we should consider. We made some changes last night, let me share those with you. The error that is repeating itself and causing these failures is: Takeover run starting RELEASE_IP 10.200.1.230

CTDB Debug Help

2014 Feb 26

0

...eeze priority 1 2014/02/26 12:10:30.173525 [10027232]: Freeze priority 2 2014/02/26 12:10:30.173710 [10027232]: Freeze priority 3 2014/02/26 12:10:33.500359 [10027232]: Freeze priority 1 2014/02/26 12:10:33.639712 [10027232]: Freeze priority 2 2014/02/26 12:10:33.677081 [recoverd:11272374]: server/ctdb_recoverd.c:3699 Current recmaster node 0 does not have CAP_RECMASTER, but we (node 1) have - force an election 2014/02/26 12:10:33.677186 [10027232]: Freeze priority 1 2014/02/26 12:10:33.677336 [10027232]: Freeze priority 2 2014/02/26 12:10:33.677491 [10027232]: Freeze priority 3 2014/02/26 12:10:33.78964...

CTDB node stucks in " ctdb-eventd[13184]: 50.samba: samba not listening on TCP port 445"

2019 May 16

0

...opcode:80 > May 16 11:25:50 ctdb-lbn1 ctdb-recoverd[13293]: Async wait failed - > fail_count=1 > May 16 11:25:50 ctdb-lbn1 ctdb-recoverd[13293]: > ../ctdb/server/ctdb_client.c:1931 Failed to read node capabilities. > May 16 11:25:50 ctdb-lbn1 ctdb-recoverd[13293]: > ../ctdb/server/ctdb_recoverd.c:370 Failed to get node capabilities > May 16 11:25:50 ctdb-lbn1 ctdb-recoverd[13293]: > ../ctdb/server/ctdb_recoverd.c:2757 Unable to update node capabilities. > May 16 11:25:50 ctdb-lbn1 ctdb-recoverd[13293]: Initial interface fetched > May 16 11:25:50 ctdb-lbn1 ctdb-recoverd[13293]:...

CTDB / Samba4. Nodes don't become healthy on first startup

2011 Apr 27

1

...as recmaster. This was a fluke that I cannot reproduce. Typical log messages 2011/04/27 07:33:23.633970 [25303]: CTDB_WAIT_UNTIL_RECOVERED 2011/04/27 07:33:23.634002 [25303]: server/ctdb_monitor.c:232 generation is INVALID. Wait one more second 2011/04/27 07:33:23.695058 [recoverd:25365]: server/ctdb_recoverd.c:1812 Send election request to all active nodes 2011/04/27 07:33:24.196099 [recoverd:25365]: server/ctdb_recoverd.c:1812 Send election request to all active nodes So I have a number of questions 1) What data is CTDB actually managing in the case of SAMBA4? Presumably the temporary .tdb files tha...

CTDB Path

2018 May 07

2

...018/05/07 15:31:44.100002 ctdbd[2093]: CTDB_WAIT_UNTIL_RECOVERED 2018/05/07 15:31:44.103288 ctdb-recoverd[2160]: Election period ended 2018/05/07 15:31:44.105712 ctdb-recoverd[2160]: Node:1 was in recovery mode. Start recovery process 2018/05/07 15:31:44.105826 ctdb-recoverd[2160]: ../ctdb/server/ctdb_recoverd.c:1268 Starting do_recovery 2018/05/07 15:31:44.105909 ctdb-recoverd[2160]: ../ctdb/server/ctdb_recoverd.c:1327 Recovery initiated due to problem with node 0 2018/05/07 15:31:44.106152 ctdb-recoverd[2160]: ../ctdb/server/ctdb_recoverd.c:1352 Recovery - created remote databases 2018/05/07 15:31:4...

[ctdb]Unable to run startrecovery event(if mail contentis encrypted, please see the attached file)

2018 Sep 05

0

...ction period ended > 2018/09/04 04:29:55.469404 ctdb-recoverd[11302]: Node 2 has changed flags - now 0x8 was 0x0 > 2018/09/04 04:29:55.469475 ctdb-recoverd[11302]: Remote node 2 had flags 0x8, local had 0x0 - updating local > 2018/09/04 04:29:55.469514 ctdb-recoverd[11302]: ../ctdb/server/ctdb_recoverd.c:1267 Starting do_recovery > 2018/09/04 04:29:55.469525 ctdb-recoverd[11302]: Attempting to take recovery lock (/share-fs/export/ctdb/.ctdb/reclock) > 2018/09/04 04:29:55.563522 ctdb-recoverd[11302]: Unable to take recovery lock - contention > 2018/09/04 04:29:55.563573 ctdb-recoverd[1130...
