On Tue, Oct 7, 2014 at 11:27 PM, Min Wai Chan <dcmwai at gmail.com> wrote:
> Hello all,
>
> I have a CTDB issue and I'm not sure where to start...
> I followed this guide by Steve, which is nice.
> The difference in my setup is that I don't have DRBD running...
>
What version of CTDB are you running?
>
> Also, I have 4 x GE ports, all connected to a switch with different IPs
> but on the same subnet.
> This is supposed to load balance the traffic, as under the Samba DNS they
> all have the same name.
>
> I'm only planning to use 3 of them for CTDB.
>
>
> My 2 VMs have direct access to a shared vmdk which is mounted as raw
> ocfs2, so it should not be a disk issue.
> Below are some of my configuration and logs.
>
>
Is OCFS2 configured properly and working correctly?
> Please advise if you know where I should be looking...
>
> */etc/conf.d/net*
> config_enp2s0="192.168.11.32/24"
> config_enp2s2="null"
> config_enp2s3="192.168.12.34"
> config_enp2s4="null"
>
> dns_domain="kl01.amtb-m.org.my"
> dns_servers="192.168.11.20 192.168.11.24"
> dns_search="kl01.amtb-m.org.my"
>
> routes_enp2s0="default via 192.168.11.10"
> routes_enp2s2="default via 192.168.11.10"
> routes_enp2s3="default via 192.168.11.10"
> routes_enp2s4="default via 192.168.11.10"
>
> */etc/conf.d/ctdb*
> CTDB_RECOVERY_LOCK=/amtb/.ctdb.lock
> CTDB_PUBLIC_ADDRESSES=/etc/ctdb/public_addresses
> CTDB_MANAGES_SAMBA=yes
> CTDB_SAMBA_SKIP_SHARE_CHECK=yes
> CTDB_NFS_SKIP_SHARE_CHECK=yes
> CTDB_MANAGES_WINBIND=yes
> CTDB_MANAGES_VSFTPD=no
> CTDB_MANAGES_ISCSI=no
> CTDB_MANAGES_NFS=no
> CTDB_MANAGES_HTTPD=no
> CTDB_SYSLOG=yes
> CTDB_DEBUGLEVEL=NOTICE
> CTDB_INIT_STYLE=
> CTDB_SERVICE_SMB=samba
> CTDB_SERVICE_WINBIND=winbind
> CTDB_NODES=/etc/ctdb/nodes
> CTDB_NOTIFY_SCRIPT=/etc/ctdb/notify.sh
> CTDB_DBDIR=/var/lib/ctdb
> CTDB_DBDIR_PERSISTENT=/var/lib/ctdb/persistent
> CTDB_EVENT_SCRIPT_DIR=/etc/ctdb/events.d
> CTDB_SOCKET=/var/lib/ctdb/ctdb.socket
> CTDB_TRANSPORT="tcp"
> CTDB_MONITOR_FREE_MEMORY=100
> CTDB_START_AS_DISABLED="yes"
> CTDB_CAPABILITY_RECMASTER=yes
> CTDB_CAPABILITY_LMASTER=yes
> NATGW_PUBLIC_IP=
> NATGW_PUBLIC_IFACE=
> NATGW_DEFAULT_GATEWAY=
> NATGW_PRIVATE_IFACE=
> NATGW_PRIVATE_NETWORK=
> NATGW_NODES=/etc/ctdb/natgw_nodes
> CTDB_LOGFILE=/var/log/messages
> CTDB_DEBUGLEVEL=2
> CTDB_OPTIONS=
>
> */etc/ctdb/nodes*
> 192.168.12.30
> 192.168.12.34
>
> */etc/ctdb/public_addresses*
> 192.168.11.29/24 enp2s2
> 192.168.11.33/24 enp2s2
> 192.168.11.30/24 enp2s3
> 192.168.11.34/24 enp2s3
> 192.168.11.31/24 enp2s4
> 192.168.11.35/24 enp2s4
>
>
> And my log
>
> 2014/08/27 03:00:13.846218 [set_recmode: 8939]: ctdb_recovery_lock: Got
> recovery lock on '/amtb/.ctdb.lock'
> 2014/08/27 03:00:13.846322 [set_recmode: 8939]: ERROR: recovery lock file
> /amtb/.ctdb.lock not locked when recovering!
> 2014/08/27 03:00:14.563975 [ 4411]: server/ctdb_monitor.c:293 in recovery.
> Wait one more second
> 2014/08/27 03:00:14.583565 [ 4411]: Freeze priority 1
> 2014/08/27 03:00:14.585431 [ 4411]: Freeze priority 2
> 2014/08/27 03:00:14.587089 [ 4411]: Freeze priority 3
> 2014/08/27 03:00:14.589524 [ 4411]: server/ctdb_recover.c:988 startrecovery
> eventscript has been invoked
> 2014/08/27 03:00:14.845167 [ 4411]: server/ctdb_recover.c:612 Recovery mode
> set to NORMAL
> 2014/08/27 03:00:14.845236 [ 4411]: Thawing priority 1
> 2014/08/27 03:00:14.845254 [ 4411]: Release freeze handler for prio 1
> 2014/08/27 03:00:14.845295 [ 4411]: Thawing priority 2
> 2014/08/27 03:00:14.845311 [ 4411]: Release freeze handler for prio 2
> 2014/08/27 03:00:14.845342 [ 4411]: Thawing priority 3
> 2014/08/27 03:00:14.845357 [ 4411]: Release freeze handler for prio 3
> 2014/08/27 03:00:14.846202 [set_recmode: 9003]: ctdb_recovery_lock: Got
> recovery lock on '/amtb/.ctdb.lock'
> 2014/08/27 03:00:14.846325 [set_recmode: 9003]: ERROR: recovery lock file
> /amtb/.ctdb.lock not locked when recovering!
> 2014/08/27 03:00:15.564794 [ 4411]: server/ctdb_monitor.c:293 in recovery.
> Wait one more second
> 2014/08/27 03:00:15.584563 [ 4411]: Freeze priority 1
> 2014/08/27 03:00:15.586399 [ 4411]: Freeze priority 2
> 2014/08/27 03:00:15.587946 [ 4411]: Freeze priority 3
> 2014/08/27 03:00:15.589802 [ 4411]: server/ctdb_recover.c:988 startrecovery
> eventscript has been invoked
> 2014/08/27 03:00:15.849306 [ 4411]: server/ctdb_recover.c:612 Recovery mode
> set to NORMAL
> 2014/08/27 03:00:15.849395 [ 4411]: Thawing priority 1
> 2014/08/27 03:00:15.849416 [ 4411]: Release freeze handler for prio 1
> 2014/08/27 03:00:15.849460 [ 4411]: Thawing priority 2
> 2014/08/27 03:00:15.849477 [ 4411]: Release freeze handler for prio 2
> 2014/08/27 03:00:15.849510 [ 4411]: Thawing priority 3
> 2014/08/27 03:00:15.849525 [ 4411]: Release freeze handler for prio 3
> 2014/08/27 03:00:15.850398 [set_recmode: 9073]: ctdb_recovery_lock: Got
> recovery lock on '/amtb/.ctdb.lock'
> 2014/08/27 03:00:15.850519 [set_recmode: 9073]: ERROR: recovery lock file
> /amtb/.ctdb.lock not locked when recovering!
> 2014/08/27 03:00:16.565054 [ 4411]: server/ctdb_monitor.c:293 in recovery.
> Wait one more second
> 2014/08/27 03:00:16.585483 [ 4411]: Freeze priority 1
> 2014/08/27 03:00:16.587275 [ 4411]: Freeze priority 2
> 2014/08/27 03:00:16.588665 [ 4411]: Freeze priority 3
> 2014/08/27 03:00:16.590517 [ 4411]: server/ctdb_recover.c:988 startrecovery
> eventscript has been invoked
>
This is not the complete log. Do you have the logs from the beginning? Also,
what about the logs from the other node? Without complete logs from both
nodes, it's difficult to figure out what's going wrong.
From the error, it appears that the cluster file system is not configured
correctly and the recovery lock file can be locked from both nodes
simultaneously.
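
One way to check this independently of CTDB is a small lock test run on
both nodes at the same time. The sketch below is only an illustration and
assumes the recovery lock is an ordinary POSIX fcntl() byte-range lock; the
try_lock() helper and the scratch file name are hypothetical, and it is
safer to point it at a scratch file on the shared mount than at the real
.ctdb.lock. If the second node also reports that it got the lock while the
first node is still holding it, fcntl locking is not coherent across the
cluster file system.

```python
import errno
import fcntl
import os
import sys
import time


def try_lock(path):
    """Try to take an exclusive, non-blocking fcntl lock on byte 0 of path.

    Returns the open file descriptor if the lock was acquired (the lock is
    held until the descriptor is closed), or None if another process
    already holds a conflicting lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Python's lockf() wraps fcntl(2) record locks: lock 1 byte
        # starting at offset 0 from the beginning of the file.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 1, 0, os.SEEK_SET)
    except OSError as exc:
        os.close(fd)
        if exc.errno in (errno.EACCES, errno.EAGAIN):
            return None
        raise
    return fd


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python3 locktest.py /amtb/.locktest
    fd = try_lock(sys.argv[1])
    if fd is None:
        print("lock is held elsewhere (expected on the second node)")
    else:
        print("got the lock, holding it for 60 seconds")
        time.sleep(60)
        os.close(fd)
```

Start it on the first node, then start it on the second node within the 60
seconds. On a correctly configured cluster file system only the first node
should get the lock.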
Amitay.