Ravishankar N
2020-Apr-15 08:35 UTC
[Gluster-users] gnfs split brain when 1 server in 3x1 down (high load) - help request
On 10/04/20 2:06 am, Erik Jacobson wrote:
> Once again thanks for sticking with us. Here is a reply from Scott
> Titus. If you have something for us to try, we'd love it. The code had
> your patch applied when gdb was run:
>
> Here is the addr2line output for those addresses. Very interesting
> command, of which I was not aware.
>
> [root@leader3 ~]# addr2line -f -e /usr/lib64/glusterfs/7.2/xlator/cluster/afr.so 0x6f735
> afr_lookup_metadata_heal_check
> afr-common.c:2803
> [root@leader3 ~]# addr2line -f -e /usr/lib64/glusterfs/7.2/xlator/cluster/afr.so 0x6f0b9
> afr_lookup_done
> afr-common.c:2455
> [root@leader3 ~]# addr2line -f -e /usr/lib64/glusterfs/7.2/xlator/cluster/afr.so 0x5c701
> afr_inode_event_gen_reset
> afr-common.c:755

Right, so afr_lookup_done() is resetting the event gen to zero. This
looks like a race between the lookup and inode refresh code paths. We
made some changes to the event generation logic in AFR. Can you apply
the attached patch and see if it fixes the split-brain issue? It should
apply cleanly on glusterfs-7.4.

Thanks,
Ravi

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-afr-mark-pending-xattrs-as-a-part-of-metadata-heal.patch
Type: text/x-patch
Size: 3813 bytes
Desc: not available
URL: <http://lists.gluster.org/pipermail/gluster-users/attachments/20200415/404b33fd/attachment.bin>
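A side note on the addr2line usage above: addr2line accepts several
addresses in one invocation, so the three lookups can be collapsed into
a single command. A minimal sketch, reusing the same afr.so path and
the offsets from Scott's output:

    addr2line -f -e /usr/lib64/glusterfs/7.2/xlator/cluster/afr.so \
        0x6f735 0x6f0b9 0x5c701

This prints the function name and file:line pair for each offset, in
the order the offsets were given.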
Ravishankar N
2020-Apr-15 08:39 UTC
[Gluster-users] gnfs split brain when 1 server in 3x1 down (high load) - help request
Attached the wrong patch by mistake in my previous mail. Sending the
correct one now.

-Ravi

On 15/04/20 2:05 pm, Ravishankar N wrote:
> On 10/04/20 2:06 am, Erik Jacobson wrote:
>> Once again thanks for sticking with us. Here is a reply from Scott
>> Titus. If you have something for us to try, we'd love it. The code
>> had your patch applied when gdb was run:
>>
>> [...]
>
> Right, so afr_lookup_done() is resetting the event gen to zero. This
> looks like a race between the lookup and inode refresh code paths. We
> made some changes to the event generation logic in AFR. Can you apply
> the attached patch and see if it fixes the split-brain issue? It
> should apply cleanly on glusterfs-7.4.
>
> Thanks,
> Ravi

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-afr-event-gen-changes.patch
Type: text/x-patch
Size: 10509 bytes
Desc: not available
URL: <http://lists.gluster.org/pipermail/gluster-users/attachments/20200415/22c551da/attachment.bin>
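For anyone following along who wants to test this: the 0001- prefix
suggests the attachment is git format-patch output, so it can be
applied either with git am in a glusterfs checkout or with plain patch
against an unpacked source tree. A rough sketch; the branch name and
paths below are illustrative, not taken from the thread:

    # in a git clone of glusterfs, on the release-7 branch
    git am 0001-afr-event-gen-changes.patch

    # or against an unpacked glusterfs-7.4 source tree
    cd glusterfs-7.4
    patch -p1 < /path/to/0001-afr-event-gen-changes.patch

followed by the usual rebuild and reinstall of the afr xlator.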