We've only seen this error once; it just occurred. We were having domain join
issues.

Enter ******@*******.local's password:
../../source3/lib/dbwrap/dbwrap_ctdb.c:855 ERROR: new_seqnum[10] != old_seqnum[8] + (0 or 1) after failed TRANS3_COMMIT - this should not happen!
secrets_store_JoinCtx: dbwrap_transaction_commit() failed for SOMDEV
libnet_join_joindomain_store_secrets: secrets_store_JoinCtx() failed NT_STATUS_INTERNAL_DB_ERROR
Failed to join domain: This machine is not currently joined to a domain.

What does this mean? Is there a simple remedy?

--
BOB BUCK
SENIOR PLATFORM SOFTWARE ENGINEER
SKIDMORE, OWINGS & MERRILL
7 WORLD TRADE CENTER
250 GREENWICH STREET
NEW YORK, NY 10007
T (212) 298-9624
ROBERT.BUCK at SOM.COM
Hi Bob,

On Fri, 23 Oct 2020 16:40:10 -0400, Robert Buck via samba
<samba at lists.samba.org> wrote:

> We've only seen this error once; it just occurred. We were having domain join
> issues.
>
> Enter ******@*******.local's password:
> ../../source3/lib/dbwrap/dbwrap_ctdb.c:855 ERROR: new_seqnum[10] != old_seqnum[8] + (0 or 1) after failed TRANS3_COMMIT - this should not happen!
> secrets_store_JoinCtx: dbwrap_transaction_commit() failed for SOMDEV
> libnet_join_joindomain_store_secrets: secrets_store_JoinCtx() failed NT_STATUS_INTERNAL_DB_ERROR
> Failed to join domain: This machine is not currently joined to a domain.
>
> What does this mean? Is there a simple remedy?

A transaction attempting to commit changes to a persistent (and replicated) database (in this case secrets.tdb) failed. The most likely reason is that there was a database recovery. The recovery should either:

1. throw away the changes (because the attempt to commit occurred too late), resulting in no change to the sequence number; or

2. implicitly commit the changes (because they were already stored on 1 or more nodes), resulting in the sequence number being incremented.

This could happen due to a bug... but the code looks good.

The other possibility is what I'll call an "implicit split brain", because I'm too tired to check whether there's a better term for this. This can happen most easily with a 2-node cluster, but you'll be able to extrapolate to see how it can occur with more nodes.

Nodes A & B are running.

1. secrets.tdb has been updated several times, so that its sequence number is N.

2. Node A is shut down.

3. secrets.tdb is updated 5 times, so the sequence number on node B is N+5.

4. Node B is shut down.

The persistent databases, which use sequence numbers, are now partitioned. There are 2 sets of databases that aren't connected via recovery.

5. Node A is brought up.

6. secrets.tdb is updated 3 times, so the sequence number on node A is N+3.

7. While attempting the next secrets.tdb update, node B comes up.

8. Recovery, during the attempted commit, uses the database from node B, with sequence number N+5... so the sequence number increases by 2.

Checking the logs would obviously tell you which nodes were up/down when, and whether this is a reasonable explanation.

I would have to do some reading to properly remember options for how to design replicated databases to avoid this. I think that one option mentioned in "Designing Data-Intensive Applications" is to use timestamps instead of sequence numbers... but I guess then you might have issues with timestamp granularity. I think quorum can also help here (need, say, 2 of 3 nodes active to make progress). Raft might also avoid this, for similar reasons. I need more time to re-read things I've read before... but am happy to take advice... :-)

The summary is that the persistent databases in CTDB use sequence numbers, so you need to ensure that one node in the cluster is up at all times. If you flip-flop between nodes, with intervening downtime, then you can get unexpected results.

peace & happiness,
martin
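[To make the sequence-number arithmetic in Martin's walkthrough concrete, here is a minimal standalone C sketch. It is not Samba's dbwrap code; the helper name seqnum_is_sane and the starting value N = 5 are made up for illustration, chosen so that the numbers reproduce the new_seqnum[10] / old_seqnum[8] pair from the original error.]

    /* Minimal sketch (not Samba code) of the sanity check behind:
     *   ERROR: new_seqnum[10] != old_seqnum[8] + (0 or 1) after failed TRANS3_COMMIT
     * plus the two-node "implicit split brain" scenario described above. */
    #include <stdint.h>
    #include <stdio.h>

    /* After a failed TRANS3_COMMIT the recovery must either have discarded
     * the transaction (seqnum unchanged) or implicitly committed it
     * (seqnum incremented by exactly one).  Anything else means recovery
     * pulled in a database copy from a divergent history. */
    static int seqnum_is_sane(uint64_t old_seqnum, uint64_t new_seqnum)
    {
        return new_seqnum == old_seqnum || new_seqnum == old_seqnum + 1;
    }

    int main(void)
    {
        uint64_t n = 5;               /* seqnum N while A and B are both up */

        /* Node A goes down; node B takes 5 more updates, then shuts down.  */
        uint64_t node_b = n + 5;      /* N+5 = 10                           */

        /* Node A comes back alone and takes 3 updates of its own.          */
        uint64_t node_a = n + 3;      /* N+3 = 8                            */

        /* Node A starts another commit; node B rejoins and a recovery runs.
         * Recovery keeps the copy with the highest seqnum, i.e. node B's.  */
        uint64_t old_seqnum = node_a; /* 8  */
        uint64_t new_seqnum = node_b; /* 10 */

        if (!seqnum_is_sane(old_seqnum, new_seqnum)) {
            printf("ERROR: new_seqnum[%ju] != old_seqnum[%ju] + (0 or 1) "
                   "after failed TRANS3_COMMIT\n",
                   (uintmax_t)new_seqnum, (uintmax_t)old_seqnum);
        }
        return 0;
    }

[Running the sketch prints the same seqnum pair as the error Bob reported, which is why the flip-flop of nodes with intervening downtime is a plausible explanation.]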
Great helpful info as always, Martin, thanks!

--
BOB BUCK
SENIOR PLATFORM SOFTWARE ENGINEER
SKIDMORE, OWINGS & MERRILL
7 WORLD TRADE CENTER
250 GREENWICH STREET
NEW YORK, NY 10007
T (212) 298-9624
ROBERT.BUCK at SOM.COM