Erik Jacobson
2020-Apr-04 15:42 UTC
[Gluster-users] gnfs split brain when 1 server in 3x1 down (high load) - help request
I had a co-worker (Scott Titus) look through this with me. He has a more analytical mind than I do. Here is what he said, with some edits by me; my edits were limited to formatting and adjusting some words.

We were hoping that, given this analysis, the community could let us know if it raises any red flags that would lead to a solution to the problem (whether it be setup, settings, or code). If needed, I can get Scott to work with me and dig further, but it was starting to get painful where Scott stopped.

Scott's words (edited):

(All backtraces match, at least up to the point I'm concerned with at this time.)

The error was passed from afr_inode_refresh_done() into afr_txn_refresh_done(), as afr_inode_refresh_done()'s call frame has 'error=0' while afr_txn_refresh_done() has 'err=5' in its call frame.

#0  afr_read_txn_refresh_done (frame=0x7ffc949cf7c8, this=0x7fff640137b0, err=5) at afr-read-txn.c:281
#1  0x00007fff68901fdb in afr_txn_refresh_done (frame=frame@entry=0x7ffc949cf7c8, this=this@entry=0x7fff640137b0, err=5, err@entry=0) at afr-common.c:1223
#2  0x00007fff689022b3 in afr_inode_refresh_done (frame=frame@entry=0x7ffc949cf7c8, this=this@entry=0x7fff640137b0, error=0) at afr-common.c:1295
#3  0x00007fff6890f3fb in afr_inode_refresh_subvol_cbk (frame=0x7ffc949cf7c8, cookie=<optimized out>, this=0x7fff640137b0, op_ret=<optimized out>, op_errno=<optimized out>, buf=buf@entry=0x7ffd53ffdaa0, xdata=0x7ffd3c6764f8, par=0x7ffd53ffdb40) at afr-common.c:1333

Within afr_inode_refresh_done(), there are only two ways it can generate an error: by setting it to EINVAL, or as the result of a failure status from afr_has_quorum(). Since EINVAL is 22, not 5, the quorum test must have failed.

Within the afr_has_quorum() conditional, an error could be set from afr_final_errno() or afr_quorum_errno(). Digging reveals that afr_quorum_errno() just returns ENOTCONN, which is 107, so that is not it. This leaves us with afr_final_errno() returning the error.
(Scott provided me with source code with pieces bolded, but I don't think you need that.)

afr_final_errno() iterates through the 'children', looking for valid errors within the replies for the transaction (refresh transaction?). The function returns the highest-valued error, which must be EIO (value of 5) in this case.

I have not looked into how or what would set the error value in the replies array; since this is a distributed system, the error could have been generated on another server. Unless this path needs to be investigated, I'd rather not get mired in finding which iteration (value of 'i') has the error and which system or thread added the error to the reply.

Any suggested next steps?

>
> On 01/04/20 8:57 am, Erik Jacobson wrote:
> > Here are some back traces. They make my head hurt. Maybe you can suggest
> > something else to try next? In the morning I'll try to unwind this
> > myself too in the source code but I suspect it will be tough for me.
> >
> >
> > (gdb) break xlators/cluster/afr/src/afr-read-txn.c:280 if err == 5
> > Breakpoint 1 at 0x7fff688e057b: file afr-read-txn.c, line 281.
> > (gdb) continue
> > Continuing.
> > [Switching to Thread 0x7ffecffff700 (LWP 50175)]
> >
> > Thread 15 "glfs_epoll007" hit Breakpoint 1, afr_read_txn_refresh_done (
> > frame=0x7fff48325d78, this=0x7fff640137b0, err=5) at afr-read-txn.c:281
> > 281 if (err) {
> > (gdb) bt
> > #0 afr_read_txn_refresh_done (frame=0x7fff48325d78, this=0x7fff640137b0,
> > err=5) at afr-read-txn.c:281
> > #1 0x00007fff68901fdb in afr_txn_refresh_done (
> > frame=frame@entry=0x7fff48325d78, this=this@entry=0x7fff640137b0, err=5,
> > err@entry=0) at afr-common.c:1223
> > #2 0x00007fff689022b3 in afr_inode_refresh_done (
> > frame=frame@entry=0x7fff48325d78, this=this@entry=0x7fff640137b0, error=0)
> > at afr-common.c:1295
> Hmm, afr_inode_refresh_done() is called with error=0 and by the time we
> reach afr_txn_refresh_done(), it becomes 5 (i.e. EIO).
> So afr_inode_refresh_done() is changing it to 5. Maybe you can put
> breakpoints/log messages in afr_inode_refresh_done() at the places where
> error is getting changed and see where the assignment happens.
>
>
> Regards,
> Ravi
Ravishankar N
2020-Apr-05 08:35 UTC
[Gluster-users] gnfs split brain when 1 server in 3x1 down (high load) - help request
On 04/04/20 9:12 pm, Erik Jacobson wrote:
> This leaves us with afr_final_errno() returning the error.
>
> afr_final_errno() iterates through the 'children', looking for
> valid errors within the replies for the transaction (refresh transaction?).
> The function returns the highest valued error, which must be EIO (value of 5)
> in this case.
>
> I have not looked into how or what would set the error value in the
> replies array,

The error numbers that you see in the replies array in afr_final_errno() are set in afr_inode_refresh_subvol_cbk(). During inode refresh (which is essentially a lookup), AFR sends the lookup request to all its connected children, and the replies from each of them are captured in afr_inode_refresh_subvol_cbk(). So adding a log there can identify whether we got EIO from any of the children. See the attached patch for an example.

After we hear from all children, afr_inode_refresh_subvol_cbk() calls afr_inode_refresh_done()-->afr_txn_refresh_done()-->afr_read_txn_refresh_done(). But you already know this flow now.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: log.patch
Type: text/x-patch
Size: 840 bytes
Desc: not available
URL: <http://lists.gluster.org/pipermail/gluster-users/attachments/20200405/fbd75665/attachment.bin>