Erik Jacobson
2020-Apr-04 15:42 UTC
[Gluster-users] gnfs split brain when 1 server in 3x1 down (high load) - help request
I had a co-worker look through this with me (Scott Titus). He has a more
analytical mind than I do. Here is what he said, with some edits by me.
My edits were formatting and adjusting some words. So we were hoping
that, given this analysis, the community could let us know if it raises
any red flags that would lead to a solution to the problem (whether it
be setup, settings, or code). If needed, I can get Scott to work with me
and dig further but it was starting to get painful where Scott stopped.
Scott's words (edited):
(all backtraces match - at least up to the point I'm concerned with at this
time)
The error was generated inside afr_inode_refresh_done() and passed into
afr_txn_refresh_done(): afr_inode_refresh_done()'s call frame has 'error=0',
while afr_txn_refresh_done()'s call frame has 'err=5'.
#0  afr_read_txn_refresh_done (frame=0x7ffc949cf7c8, this=0x7fff640137b0,
    err=5) at afr-read-txn.c:281
#1  0x00007fff68901fdb in afr_txn_refresh_done (
    frame=frame@entry=0x7ffc949cf7c8, this=this@entry=0x7fff640137b0, err=5,
    err@entry=0) at afr-common.c:1223
#2  0x00007fff689022b3 in afr_inode_refresh_done (
    frame=frame@entry=0x7ffc949cf7c8, this=this@entry=0x7fff640137b0, error=0)
    at afr-common.c:1295
#3  0x00007fff6890f3fb in afr_inode_refresh_subvol_cbk (frame=0x7ffc949cf7c8,
    cookie=<optimized out>, this=0x7fff640137b0, op_ret=<optimized out>,
    op_errno=<optimized out>, buf=buf@entry=0x7ffd53ffdaa0,
    xdata=0x7ffd3c6764f8, par=0x7ffd53ffdb40) at afr-common.c:1333
Within afr_inode_refresh_done(), there are only two ways an error can be
generated: it is set to EINVAL explicitly, or it results from a failed
afr_has_quorum() check. Since EINVAL is 22, not 5, the quorum test must have
failed.
Within the afr_has_quorum() conditional, the error could be set
from afr_final_errno() or afr_quorum_errno(). Digging reveals that
afr_quorum_errno() just returns ENOTCONN, which is 107, so that is not it.
This leaves us with afr_final_errno() returning the error.
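The process of elimination above can be written down as a small C sketch.
This is only an illustration of the reasoning, not Gluster code: the enum
and the function name classify_refresh_error() are hypothetical, and the
sketch assumes the three error sources named in the analysis are the only
ones.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical labels for the places the analysis says the error can
 * originate; these are not identifiers from the Gluster source. */
enum err_source {
    SRC_NONE,          /* no error */
    SRC_EINVAL_ASSIGN, /* explicit EINVAL in afr_inode_refresh_done() */
    SRC_QUORUM_ERRNO,  /* afr_quorum_errno(), which returns ENOTCONN */
    SRC_FINAL_ERRNO    /* afr_final_errno(), taken from child replies */
};

/* Classify an observed errno by process of elimination.
 * Note: a child reply could in principle also carry EINVAL or ENOTCONN,
 * so only the last branch is conclusive. */
static enum err_source classify_refresh_error(int observed)
{
    assert(observed >= 0); /* errno values are non-negative */

    if (observed == 0)
        return SRC_NONE;
    if (observed == EINVAL)
        return SRC_EINVAL_ASSIGN;
    if (observed == ENOTCONN)
        return SRC_QUORUM_ERRNO;
    return SRC_FINAL_ERRNO; /* anything else, including EIO */
}
```

Calling classify_refresh_error(EIO) lands in the last branch, which matches
the conclusion that the err=5 seen in the backtrace came out of
afr_final_errno().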
(Scott provided me with source code with pieces bolded but I don't think
you need that).
afr_final_errno() iterates through the 'children', looking for
valid errors within the replies for the transaction (refresh transaction?).
The function returns the highest valued error, which must be EIO (value of 5)
in this case.
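As a rough model of that description, here is a minimal C sketch. The
struct reply type and final_errno_sketch() are hypothetical stand-ins, not
the AFR structures, and the "highest value wins" rule is taken from the
description above; the real function may rank errors differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one child's reply. */
struct reply {
    int valid;    /* non-zero once this child has answered */
    int op_ret;   /* negative on failure */
    int op_errno; /* errno reported by the child when op_ret < 0 */
};

/* Scan the replies and return the highest-valued errno among valid,
 * failed replies, or 0 if no valid reply failed. */
static int final_errno_sketch(const struct reply *replies, size_t child_count)
{
    int result = 0;

    assert(replies != NULL || child_count == 0);

    for (size_t i = 0; i < child_count; i++) {
        if (!replies[i].valid)
            continue; /* this child never answered */
        if (replies[i].op_ret >= 0)
            continue; /* this child succeeded */
        if (replies[i].op_errno > result)
            result = replies[i].op_errno;
    }
    return result;
}
```

In this model, a result of 5 means at least one valid reply failed with EIO
and no valid reply failed with a larger errno value.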
I have not looked into how or what would set the error value in the
replies array; since this is a distributed system, the error could have been
generated on another server. Unless this path needs to be investigated, I'd
rather not get mired in finding which iteration (value of 'i') has the
error and which system or thread added it to the reply.
Any suggested next steps?
>
> On 01/04/20 8:57 am, Erik Jacobson wrote:
> > Here are some back traces. They make my head hurt. Maybe you can suggest
> > something else to try next? In the morning I'll try to unwind this
> > myself too in the source code but I suspect it will be tough for me.
> >
> >
> > (gdb) break xlators/cluster/afr/src/afr-read-txn.c:280 if err == 5
> > Breakpoint 1 at 0x7fff688e057b: file afr-read-txn.c, line 281.
> > (gdb) continue
> > Continuing.
> > [Switching to Thread 0x7ffecffff700 (LWP 50175)]
> >
> > Thread 15 "glfs_epoll007" hit Breakpoint 1, afr_read_txn_refresh_done (
> >     frame=0x7fff48325d78, this=0x7fff640137b0, err=5) at afr-read-txn.c:281
> > 281 if (err) {
> > (gdb) bt
> > #0  afr_read_txn_refresh_done (frame=0x7fff48325d78, this=0x7fff640137b0,
> >     err=5) at afr-read-txn.c:281
> > #1  0x00007fff68901fdb in afr_txn_refresh_done (
> >     frame=frame@entry=0x7fff48325d78, this=this@entry=0x7fff640137b0, err=5,
> >     err@entry=0) at afr-common.c:1223
> > #2  0x00007fff689022b3 in afr_inode_refresh_done (
> >     frame=frame@entry=0x7fff48325d78, this=this@entry=0x7fff640137b0, error=0)
> >     at afr-common.c:1295
> Hmm, afr_inode_refresh_done() is called with error=0 and by the time we
> reach afr_txn_refresh_done(), it becomes 5 (i.e. EIO).
> So afr_inode_refresh_done() is changing it to 5. Maybe you can put
> breakpoints/ log messages in afr_inode_refresh_done() at the places where
> error is getting changed and see where the assignment happens.
>
>
> Regards,
> Ravi
Ravishankar N
2020-Apr-05 08:35 UTC
[Gluster-users] gnfs split brain when 1 server in 3x1 down (high load) - help request
On 04/04/20 9:12 pm, Erik Jacobson wrote:
> This leaves us with afr_quorum_errno() returning the error.
>
> afr_final_errno() iterates through the 'children', looking for
> valid errors within the replies for the transaction (refresh transaction?).
> The function returns the highest valued error, which must be EIO (value of 5)
> in this case.
>
> I have not looked into how or what would set the error value in the
> replies array,

The error numbers that you see in the replies array in afr_final_errno() are
set in afr_inode_refresh_subvol_cbk(). During inode refresh (which is
essentially a lookup), AFR sends the lookup request to all its connected
children, and the replies from each one of them are captured in
afr_inode_refresh_subvol_cbk(). So adding a log here can identify if we got
EIO from any of its children. See attached patch for an example.

After we hear from all children, afr_inode_refresh_subvol_cbk() then calls
afr_inode_refresh_done()-->afr_txn_refresh_done()-->afr_read_txn_refresh_done().
But you already know this flow now.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: log.patch
Type: text/x-patch
Size: 840 bytes
Desc: not available
URL: <http://lists.gluster.org/pipermail/gluster-users/attachments/20200405/fbd75665/attachment.bin>
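For readers without the attachment, the flow described in this reply can be
modelled in a few lines of standalone C. This is not the content of
log.patch and does not use the AFR structures or Gluster's logging calls:
struct refresh_state, refresh_init(), refresh_subvol_cbk() and MAX_CHILDREN
are all hypothetical names, and fprintf() stands in for whatever logging the
real patch uses.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

#define MAX_CHILDREN 3 /* assumed replica count for this sketch */

/* Hypothetical, simplified refresh state; not the AFR structures. */
struct refresh_state {
    int op_errno[MAX_CHILDREN]; /* errno per child, 0 on success */
    int pending;                /* replies still outstanding */
    int eio_children;           /* how many children reported EIO */
};

/* Prepare for a refresh sent to 'connected' children. */
static void refresh_init(struct refresh_state *st, int connected)
{
    assert(st != NULL);
    if (connected < 0)
        connected = 0;
    if (connected > MAX_CHILDREN)
        connected = MAX_CHILDREN;
    for (int i = 0; i < MAX_CHILDREN; i++)
        st->op_errno[i] = 0;
    st->pending = connected;
    st->eio_children = 0;
}

/* Per-child callback: record the reply, log an EIO, and return 1 once
 * the last outstanding reply has arrived (the point at which the "done"
 * handler would be called). Returns 0 otherwise. */
static int refresh_subvol_cbk(struct refresh_state *st, int child,
                              int op_ret, int op_errno)
{
    assert(st != NULL);
    if (child < 0 || child >= MAX_CHILDREN || st->pending <= 0)
        return 0;

    st->op_errno[child] = (op_ret < 0) ? op_errno : 0;
    if (op_ret < 0 && op_errno == EIO) {
        st->eio_children++;
        fprintf(stderr, "refresh: child %d returned EIO\n", child);
    }

    st->pending--;
    return st->pending == 0;
}
```

The log line sits in the per-child callback because that is the only place
where the child index and its individual errno are both still known; by the
time the "done" handler runs, the replies have been merged into one value.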