zhu.shangzhong at zte.com.cn
2018-Feb-26 09:26 UTC
[Samba] Reply: [ctdb] Unable to take recovery lock - contention
------------------ Original Message ------------------
From: 朱尚忠10137461
To: samba@lists.samba.org <samba@lists.samba.org>
Date: 2018-02-26 17:10
Subject: [ctdb] Unable to take recovery lock - contention

When ctdb starts, it logs "Unable to take recovery lock - contention" continuously.
In which cases is the "unable to take lock" error output?
Thanks!

The ctdb logs follow:
2018/02/12 19:38:51.147959 ctdbd[5615]: CTDB starting on node
2018/02/12 19:38:51.528921 ctdbd[6602]: Starting CTDBD (Version 4.6.10) as PID: 6602
2018/02/12 19:38:51.529060 ctdbd[6602]: Created PID file /run/ctdb/ctdbd.pid
2018/02/12 19:38:51.529120 ctdbd[6602]: Listening to ctdb socket /var/run/ctdb/ctdbd.socket
2018/02/12 19:38:51.529146 ctdbd[6602]: Set real-time scheduler priority
2018/02/12 19:38:51.648117 ctdbd[6602]: Starting event daemon /usr/libexec/ctdb/ctdb_eventd -e /etc/ctdb/events.d -s /var/run/ctdb/eventd.sock -P 6602 -l file:/var/log/log.ctdb -d NOTICE
2018/02/12 19:38:51.648390 ctdbd[6602]: connect() failed, errno=2
2018/02/12 19:38:51.693790 ctdb-eventd[6633]: listening on /var/run/ctdb/eventd.sock
2018/02/12 19:38:51.693893 ctdb-eventd[6633]: daemon started, pid=6633
2018/02/12 19:38:52.648474 ctdbd[6602]: Set runstate to INIT (1)
2018/02/12 19:38:54.505780 ctdbd[6602]: PNN is 1
2018/02/12 19:38:54.574993 ctdbd[6602]: Vacuuming is disabled for persistent database ctdb.tdb
2018/02/12 19:38:54.576297 ctdbd[6602]: Attached to database '/var/lib/ctdb/persistent/ctdb.tdb.1' with flags 0x400
2018/02/12 19:38:54.576322 ctdbd[6602]: Ignoring persistent database 'ctdb.tdb.2'
2018/02/12 19:38:54.576339 ctdbd[6602]: Ignoring persistent database 'ctdb.tdb.0'
2018/02/12 19:38:54.576364 ctdbd[6602]: Freeze db: ctdb.tdb
2018/02/12 19:38:54.576393 ctdbd[6602]: Set lock helper to "/usr/libexec/ctdb/ctdb_lock_helper"
2018/02/12 19:38:54.579527 ctdbd[6602]: Set runstate to SETUP (2)
2018/02/12 19:38:54.881828 ctdbd[6602]: Keepalive monitoring has been started
2018/02/12 19:38:54.881873 ctdbd[6602]: Set runstate to FIRST_RECOVERY (3)
2018/02/12 19:38:54.882020 ctdb-recoverd[7182]: monitor_cluster starting
2018/02/12 19:38:54.882620 ctdb-recoverd[7182]: Initial recovery master set - forcing election
2018/02/12 19:38:54.882702 ctdbd[6602]: This node (1) is now the recovery master
2018/02/12 19:38:55.882735 ctdbd[6602]: CTDB_WAIT_UNTIL_RECOVERED
2018/02/12 19:38:56.902874 ctdbd[6602]: CTDB_WAIT_UNTIL_RECOVERED
2018/02/12 19:38:57.885800 ctdb-recoverd[7182]: Election period ended
2018/02/12 19:38:57.886134 ctdb-recoverd[7182]: Node:1 was in recovery mode. Start recovery process
2018/02/12 19:38:57.886160 ctdb-recoverd[7182]: ../ctdb/server/ctdb_recoverd.c:1267 Starting do_recovery
2018/02/12 19:38:57.886187 ctdb-recoverd[7182]: Attempting to take recovery lock (/share-fs/export/ctdb/.ctdb/reclock)
2018/02/12 19:38:57.886243 ctdb-recoverd[7182]: Set cluster mutex helper to "/usr/libexec/ctdb/ctdb_mutex_fcntl_helper"
2018/02/12 19:38:57.899722 ctdb-recoverd[7182]: Unable to take recovery lock - contention
2018/02/12 19:38:57.899763 ctdb-recoverd[7182]: Unable to get recovery lock - retrying recovery
2018/02/12 19:38:57.903138 ctdbd[6602]: CTDB_WAIT_UNTIL_RECOVERED
2018/02/12 19:38:58.887310 ctdb-recoverd[7182]: ../ctdb/server/ctdb_recoverd.c:1267 Starting do_recovery
2018/02/12 19:38:58.887353 ctdb-recoverd[7182]: Attempting to take recovery lock (/share-fs/export/ctdb/.ctdb/reclock)
2018/02/12 19:38:58.893531 ctdb-recoverd[7182]: Unable to take recovery lock - contention
2018/02/12 19:38:58.893571 ctdb-recoverd[7182]: Unable to get recovery lock - retrying recovery
2018/02/12 19:38:58.903314 ctdbd[6602]: CTDB_WAIT_UNTIL_RECOVERED
2018/02/12 19:38:59.891024 ctdb-recoverd[7182]: ../ctdb/server/ctdb_recoverd.c:1267 Starting do_recovery
2018/02/12 19:38:59.891080 ctdb-recoverd[7182]: Attempting to take recovery lock (/share-fs/export/ctdb/.ctdb/reclock)
2018/02/12 19:38:59.898336 ctdb-recoverd[7182]: Unable to take recovery lock - contention
2018/02/12 19:38:59.898397 ctdb-recoverd[7182]: Unable to get recovery lock - retrying recovery
2018/02/12 19:38:59.904710 ctdbd[6602]: CTDB_WAIT_UNTIL_RECOVERED
2018/02/12 19:39:00.893673 ctdb-recoverd[7182]: ../ctdb/server/ctdb_recoverd.c:1267 Starting do_recovery
2018/02/12 19:39:00.893741 ctdb-recoverd[7182]: Attempting to take recovery lock (/share-fs/export/ctdb/.ctdb/reclock)
2018/02/12 19:39:00.901094 ctdb-recoverd[7182]: Unable to take recovery lock - contention
2018/02/12 19:39:00.901152 ctdb-recoverd[7182]: Unable to get recovery lock - retrying recovery
2018/02/12 19:39:00.911007 ctdbd[6602]: CTDB_WAIT_UNTIL_RECOVERED
2018/02/12 19:39:01.895044 ctdb-recoverd[7182]: ../ctdb/server/ctdb_recoverd.c:1267 Starting do_recovery
2018/02/12 19:39:01.895106 ctdb-recoverd[7182]: Attempting to take recovery lock (/share-fs/export/ctdb/.ctdb/reclock)
2018/02/12 19:39:01.902379 ctdb-recoverd[7182]: Unable to take recovery lock - contention
2018/02/12 19:39:01.902451 ctdb-recoverd[7182]: Unable to get recovery lock - retrying recovery
2018/02/12 19:39:01.912054 ctdbd[6602]: CTDB_WAIT_UNTIL_RECOVERED
2018/02/12 19:39:02.896539 ctdb-recoverd[7182]: ../ctdb/server/ctdb_recoverd.c:1267 Starting do_recovery
2018/02/12 19:39:02.896597 ctdb-recoverd[7182]: Attempting to take recovery lock (/share-fs/export/ctdb/.ctdb/reclock)
2018/02/12 19:39:02.904674 ctdb-recoverd[7182]: Unable to take recovery lock - contention
2018/02/12 19:39:02.904736 ctdb-recoverd[7182]: Unable to get recovery lock - retrying recovery
2018/02/12 19:39:02.912896 ctdbd[6602]: CTDB_WAIT_UNTIL_RECOVERED
2018/02/12 19:39:03.898495 ctdb-recoverd[7182]: ../ctdb/server/ctdb_recoverd.c:1267 Starting do_recovery
2018/02/12 19:39:03.898548 ctdb-recoverd[7182]: Attempting to take recovery lock (/share-fs/export/ctdb/.ctdb/reclock)
2018/02/12 19:39:03.904876 ctdb-recoverd[7182]: Unable to take recovery lock - contention
2018/02/12 19:39:03.904929 ctdb-recoverd[7182]: Unable to get recovery lock - retrying recovery
2018/02/12 19:39:03.913736 ctdbd[6602]: CTDB_WAIT_UNTIL_RECOVERED
2018/02/12 19:39:04.899872 ctdb-recoverd[7182]: ../ctdb/server/ctdb_recoverd.c:1267 Starting do_recovery
2018/02/12 19:39:04.899928 ctdb-recoverd[7182]: Attempting to take recovery lock (/share-fs/export/ctdb/.ctdb/reclock)
2018/02/12 19:39:04.907784 ctdb-recoverd[7182]: Unable to take recovery lock - contention
2018/02/12 19:39:04.907837 ctdb-recoverd[7182]: Unable to get recovery lock - retrying recovery
2018/02/12 19:39:04.914048 ctdbd[6602]: CTDB_WAIT_UNTIL_RECOVERED
Harry Jede
2018-Feb-26 14:08 UTC
[Samba] Reply: [ctdb] Unable to take recovery lock - contention
On Monday, 26 February 2018, 17:26:06 CET, zhu.shangzhong--- via samba wrote:

Decoded base64-encoded body with some Chinese characters:

[...]

--
Regards
Harry Jede
Martin Schwenke
2018-Feb-26 22:27 UTC
[Samba] Reply: [ctdb] Unable to take recovery lock - contention
[Thanks Harry!]

On Monday, 26 February 2018, 17:26:06 CET, zhu.shangzhong--- via samba wrote:

> ------------------ Original Message ------------------
> From: 朱尚忠10137461
> To: samba at lists.samba.org <samba at lists.samba.org>
> Date: 2018-02-26 17:10
> Subject: [ctdb] Unable to take recovery lock - contention
> When the ctdb is starting, the "Unable to take recovery lock - contention" log will be output all the time.
> Which cases will the "unable to take lock" error be output?
> Thanks!
>
> The following the ctdb logs:
> 2018/02/12 19:38:51.147959 ctdbd[5615]: CTDB starting on node
> 2018/02/12 19:38:51.528921 ctdbd[6602]: Starting CTDBD (Version 4.6.10) as PID: 6602
> 2018/02/12 19:38:51.529060 ctdbd[6602]: Created PID file /run/ctdb/ctdbd.pid
> 2018/02/12 19:38:51.529120 ctdbd[6602]: Listening to ctdb socket /var/run/ctdb/ctdbd.socket
> 2018/02/12 19:38:51.529146 ctdbd[6602]: Set real-time scheduler priority
> 2018/02/12 19:38:51.648117 ctdbd[6602]: Starting event daemon /usr/libexec/ctdb/ctdb_eventd -e /etc/ctdb/events.d -s /var/run/ctdb/eventd.sock -P 6602 -l file:/var/log/log.ctdb -d NOTICE
> 2018/02/12 19:38:51.648390 ctdbd[6602]: connect() failed, errno=2
> 2018/02/12 19:38:51.693790 ctdb-eventd[6633]: listening on /var/run/ctdb/eventd.sock
> 2018/02/12 19:38:51.693893 ctdb-eventd[6633]: daemon started, pid=6633
> 2018/02/12 19:38:52.648474 ctdbd[6602]: Set runstate to INIT (1)
> 2018/02/12 19:38:54.505780 ctdbd[6602]: PNN is 1
> 2018/02/12 19:38:54.574993 ctdbd[6602]: Vacuuming is disabled for persistent database ctdb.tdb
> 2018/02/12 19:38:54.576297 ctdbd[6602]: Attached to database '/var/lib/ctdb/persistent/ctdb.tdb.1' with flags 0x400
> 2018/02/12 19:38:54.576322 ctdbd[6602]: Ignoring persistent database 'ctdb.tdb.2'
> 2018/02/12 19:38:54.576339 ctdbd[6602]: Ignoring persistent database 'ctdb.tdb.0'
> 2018/02/12 19:38:54.576364 ctdbd[6602]: Freeze db: ctdb.tdb
> 2018/02/12 19:38:54.576393 ctdbd[6602]: Set lock helper to "/usr/libexec/ctdb/ctdb_lock_helper"
> 2018/02/12 19:38:54.579527 ctdbd[6602]: Set runstate to SETUP (2)
> 2018/02/12 19:38:54.881828 ctdbd[6602]: Keepalive monitoring has been started
> 2018/02/12 19:38:54.881873 ctdbd[6602]: Set runstate to FIRST_RECOVERY (3)
> 2018/02/12 19:38:54.882020 ctdb-recoverd[7182]: monitor_cluster starting
> 2018/02/12 19:38:54.882620 ctdb-recoverd[7182]: Initial recovery master set - forcing election
> 2018/02/12 19:38:54.882702 ctdbd[6602]: This node (1) is now the recovery master
> 2018/02/12 19:38:55.882735 ctdbd[6602]: CTDB_WAIT_UNTIL_RECOVERED
> 2018/02/12 19:38:56.902874 ctdbd[6602]: CTDB_WAIT_UNTIL_RECOVERED
> 2018/02/12 19:38:57.885800 ctdb-recoverd[7182]: Election period ended
> 2018/02/12 19:38:57.886134 ctdb-recoverd[7182]: Node:1 was in recovery mode. Start recovery process
> 2018/02/12 19:38:57.886160 ctdb-recoverd[7182]: ../ctdb/server/ctdb_recoverd.c:1267 Starting do_recovery
> 2018/02/12 19:38:57.886187 ctdb-recoverd[7182]: Attempting to take recovery lock (/share-fs/export/ctdb/.ctdb/reclock)
> 2018/02/12 19:38:57.886243 ctdb-recoverd[7182]: Set cluster mutex helper to "/usr/libexec/ctdb/ctdb_mutex_fcntl_helper"
> 2018/02/12 19:38:57.899722 ctdb-recoverd[7182]: Unable to take recovery lock - contention
> 2018/02/12 19:38:57.899763 ctdb-recoverd[7182]: Unable to get recovery lock - retrying recovery
> [...]

First, I would check that the recovery lock file actually exists, to make sure the error message is sane. For example, using:

  ls -l /share-fs/export/ctdb/.ctdb/reclock

If the file doesn't exist, does the directory exist?

If the file does exist, the next step would be to see what processes have that file open (and potentially locked). Try:

  fuser -v /share-fs/export/ctdb/.ctdb/reclock

You should find that a /usr/libexec/ctdb/ctdb_mutex_fcntl_helper process has it open. This process should exit if its parent (CTDB recovery daemon) goes away, so check what the parent process is.
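
As a rough illustration of the "check what the parent process is" step, here is a minimal Python sketch that reads the parent PID and the parent's command name from /proc. It assumes a Linux /proc filesystem; the function name parent_of is made up for this example and is not part of CTDB.

```python
import os


def _read_status(pid):
    """Parse /proc/<pid>/status into a dict of field name -> value."""
    fields = {}
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            key, _, value = line.partition(":")
            fields[key] = value.strip()
    return fields


def parent_of(pid):
    """Return (parent PID, parent command name) for pid.

    Hypothetical helper for illustration only. The name is None when
    the parent PID is 0, i.e. there is no visible parent.
    """
    ppid = int(_read_status(pid)["PPid"])
    name = _read_status(ppid)["Name"] if ppid else None
    return ppid, name


if __name__ == "__main__":
    # Demonstrated on this script itself; in practice pass the PID
    # that fuser reports for the helper process.
    print(parent_of(os.getpid()))
```

For the PID that fuser reports for the helper, one would expect the parent to be the CTDB recovery daemon; any other parent would suggest a stale helper.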

You can also use

  ls -i /share-fs/export/ctdb/.ctdb/reclock

to determine the inode of the file and then something like this to determine if a process has the file locked:

  awk -v inode=<INODE#> '$6 ~ ".*:.*:" inode { print $5 }' /proc/locks

If nothing has the file locked then there might be something weird about your cluster filesystem. Try running the helper by hand under strace and seeing what fails:

  strace /usr/libexec/ctdb/ctdb_mutex_fcntl_helper /share-fs/export/ctdb/.ctdb/reclock

Good luck!

peace & happiness,
martin
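
For readers who prefer a script to the one-liners above, the inode lookup and the lock test can be sketched in Python roughly as follows. This is only a sketch under assumptions: the function names and the locks_file parameter are invented for this example, and it assumes the helper takes a POSIX fcntl lock (as its name suggests), so that a non-blocking lock attempt failing with EACCES or EAGAIN corresponds to "contention". Note that can_take_lock briefly takes the lock when it is free, so it is probably best kept away from a healthy running cluster.

```python
import errno
import fcntl
import os


def lock_holders(path, locks_file="/proc/locks"):
    """Return the PIDs that locks_file lists as locking path's inode."""
    inode = str(os.stat(path).st_ino)
    pids = []
    with open(locks_file) as f:
        for line in f:
            fields = line.split()
            if len(fields) > 1 and fields[1] == "->":
                del fields[1]  # blocked waiters carry an extra "->" marker
            # fields[4] is awk's $5 (PID); fields[5] is awk's $6
            # (MAJOR:MINOR:INODE). Like the awk command, only the inode
            # is compared, not the device.
            if len(fields) >= 6 and fields[5].split(":")[-1] == inode:
                pids.append(int(fields[4]))
    return pids


def can_take_lock(path):
    """Try a non-blocking exclusive fcntl lock on the first byte of path.

    Returns True if the lock was free (it is released again when the
    descriptor is closed) and False if another process holds it.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 1, 0, os.SEEK_SET)
        return True
    except OSError as e:
        if e.errno in (errno.EACCES, errno.EAGAIN):
            return False  # lock held elsewhere, i.e. contention
        raise
    finally:
        os.close(fd)
```

Calling lock_holders("/share-fs/export/ctdb/.ctdb/reclock") on the affected node would list local lock holders. On a cluster filesystem, a lock held on another node may not show up in the local /proc/locks, in which case can_take_lock could return False while lock_holders returns an empty list.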