Hi,
We are using CTDB version 1.0.77, and yesterday we saw an instance of a node
running into issues and banning itself to recover (as listed below):
node1:
2009/07/29 23:23:37.748251 [22371]: Banning node 0 for 300 seconds
2009/07/29 23:23:37.748263 [22371]: self ban - lowering our election priority
2009/07/29 23:23:37.748503 [22275]: This node has been banned - forcing freeze and recovery
Now the other nodes in the CTDB cluster receive the ban message, and even
though the banned PNN does not match their own, they ban themselves and
go into recovery mode as well. I guess this is not supposed to happen?
node2 (should not ban itself):
2009/07/29 23:23:37.748659 [19905]: Got a ban request for pnn:0 but our pnn is 1. Ignoring ban request
2009/07/29 23:23:37.748994 [19776]: This node has been banned - forcing freeze and recovery
node3 (should not ban itself):
2009/07/29 23:23:37.748506 [19892]: Got a ban request for pnn:0 but our pnn is 2. Ignoring ban request
2009/07/29 23:23:37.749575 [19750]: This node has been banned - forcing freeze and recovery
Existing Version 1.0.77: ctdb-1.0.77/ctdb_monitor.c
241    if ((node->flags & NODE_FLAGS_BANNED) && !(c->old_flags & NODE_FLAGS_BANNED)) {
242        /* make sure we are frozen */
243        DEBUG(DEBUG_NOTICE,("This node has been banned - forcing freeze and recovery\n"));
--
I see a condition added in the "ban algorithm" in the latest 1.0.88 to
ensure the banned node's PNN matches this node's own PNN
((node->pnn == ctdb->pnn)):
--
Version 1.0.88:
311    /* if we have become banned, we should go into recovery mode */
312    if ((node->flags & NODE_FLAGS_BANNED) && !(c->old_flags & NODE_FLAGS_BANNED) && (node->pnn == ctdb->pnn)) {
313        /* make sure we are frozen */
314        DEBUG(DEBUG_NOTICE,("This node has been banned - forcing freeze and recovery\n"));
Can you please confirm whether upgrading to 1.0.88 would fix this issue, so
that one node getting banned no longer causes the other nodes to ban
themselves unnecessarily?
Thanks,
-Tim