Zhou, Cynthia (NSB - CN/Hangzhou)
2017-Sep-28 06:41 UTC
[Gluster-users] after hard reboot, split-brain happened, but nothing showed in gluster voluem heal info command !
The version I am using is glusterfs 3.6.9

Best regards,
Cynthia
MBB SM HETRAN SW3 MATRIX
Storage
Mobile: +86 (0)18657188311

From: Karthik Subrahmanya [mailto:ksubrahm at redhat.com]
Sent: Thursday, September 28, 2017 2:37 PM
To: Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com>
Cc: Gluster-users at gluster.org; gluster-devel at gluster.org
Subject: Re: [Gluster-users] after hard reboot, split-brain happened, but nothing showed in gluster voluem heal info command !

On Thu, Sep 28, 2017 at 11:41 AM, Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com> wrote:

> Hi,
>
> Thanks for the reply!
>
> I've checked [1]. But the problem is that nothing is shown by the command "gluster volume heal <volume-name> info", so these split-brain files can only be detected when an application tries to access them.
>
> I can see the gfid mismatch for those split-brain entries in the mount log. However, nothing shows up in the shd log; the shd does not know about those entries, because there is nothing in the indices/xattrop directory.

I guess the entry was there before and then got cleared by one of the heal processes, either client side or server side. I wanted to check that by examining the logs.

By the way, which version of gluster are you running?

> The logs are not available right now. When the issue is reproduced, I will provide them to you. Thanks!

Ok.

From: Karthik Subrahmanya [mailto:ksubrahm at redhat.com]
Sent: Thursday, September 28, 2017 2:02 PM
To: Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com>
Cc: Gluster-users at gluster.org; gluster-devel at gluster.org
Subject: Re: [Gluster-users] after hard reboot, split-brain happened, but nothing showed in gluster voluem heal info command !
Hi,

To resolve the gfid split-brain you can follow the steps at [1].

Since the pending markers are not set on the files, they do not show up in heal info.
To debug this issue, I need some more data from you. Could you provide the following?

1. volume info
2. mount log
3. brick logs
4. shd log

May I also know which version of gluster you are running? From the info you have provided, it looks like an old version. If it is, it would be great if you could upgrade to one of the latest supported releases.

[1] http://docs.gluster.org/en/latest/Troubleshooting/split-brain/#fixing-directory-entry-split-brain

Thanks & Regards,
Karthik

On Wed, Sep 27, 2017 at 9:42 AM, Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com> wrote:

Hi gluster experts,

I have run into a tough "split-brain" problem. Sometimes, after a hard reboot, we find some files in split-brain. However, neither the files nor their parent directory show up in the output of "gluster volume heal <volume-name> info", and there is no entry in the .glusterfs/indices/xattrop directory either. Can you help shed some light on this issue? Thanks!

Following is some info from our environment.

Checking from the sn-0 client, nothing is shown as being in split-brain!
[root at sn-0:/mnt/bricks/services/brick/netserv/ethip]
# gluster v heal services info
Brick sn-0:/mnt/bricks/services/brick/
Number of entries: 0

Brick sn-1:/mnt/bricks/services/brick/
Number of entries: 0

[root at sn-0:/mnt/bricks/services/brick/netserv/ethip]
# gluster v heal services info split-brain
Gathering list of split brain entries on volume services has been successful

Brick sn-0.local:/mnt/bricks/services/brick
Number of entries: 0

Brick sn-1.local:/mnt/bricks/services/brick
Number of entries: 0

[root at sn-0:/mnt/bricks/services/brick/netserv/ethip]
# ls -l /mnt/services/netserv/ethip/
ls: cannot access '/mnt/services/netserv/ethip/sn-2': Input/output error
ls: cannot access '/mnt/services/netserv/ethip/mn-1': Input/output error
total 3
-rw-r--r-- 1 root root 144 Sep 26 20:35 as-0
-rw-r--r-- 1 root root 144 Sep 26 20:35 as-1
-rw-r--r-- 1 root root 145 Sep 26 20:35 as-2
-rw-r--r-- 1 root root 237 Sep 26 20:36 mn-0
-????????? ? ?    ?      ?            ? mn-1
-rw-r--r-- 1 root root  73 Sep 26 20:35 sn-0
-rw-r--r-- 1 root root  73 Sep 26 20:35 sn-1
-????????? ? ?    ?      ?            ? sn-2
[root at sn-0:/mnt/bricks/services/brick/netserv/ethip]

Checking from glusterfs server side, the gfid of mn-1 on sn-0 and sn-1 is different

[SN-0]

[root at sn-0:/mnt/bricks/services/brick/.glusterfs/53/a3]
# getfattr -m . -d -e hex /mnt/bricks/services/brick/netserv/ethip
getfattr: Removing leading '/' from absolute path names
# file: mnt/bricks/services/brick/netserv/ethip
trusted.gfid=0xee71d19ac0f84f60b11eb42a083644e4
trusted.glusterfs.dht=0x000000010000000000000000ffffffff

[root at sn-0:/mnt/bricks/services/brick/netserv/ethip]
# getfattr -m . -d -e hex mn-1
# file: mn-1
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.services-client-0=0x000000000000000000000000
trusted.afr.services-client-1=0x000000000000000000000000
trusted.gfid=0x53a33f437464475486f31c4e44d83afd

[root at sn-0:/mnt/bricks/services/brick/netserv/ethip]
# stat mn-1
  File: mn-1
  Size: 237    Blocks: 16    IO Block: 4096    regular file
Device: fd51h/64849d    Inode: 2536    Links: 2
Access: (0644/-rw-r--r--)  Uid: ( 0/ root)  Gid: ( 0/ root)
Access: 2017-09-26 20:30:25.679000000 +0300
Modify: 2017-09-26 20:30:24.604000000 +0300
Change: 2017-09-26 20:30:24.610000000 +0300
 Birth: -

[root at sn-0:/mnt/bricks/services/brick/.glusterfs/indices/xattrop]
# ls
xattrop-63f8bbcb-7fa6-4fc8-b721-675a05de0ab3

[root at sn-0:/mnt/bricks/services/brick/.glusterfs/53/a3]
# ls
53a33f43-7464-4754-86f3-1c4e44d83afd

[root at sn-0:/mnt/bricks/services/brick/.glusterfs/53/a3]
# stat 53a33f43-7464-4754-86f3-1c4e44d83afd
  File: 53a33f43-7464-4754-86f3-1c4e44d83afd
  Size: 237    Blocks: 16    IO Block: 4096    regular file
Device: fd51h/64849d    Inode: 2536    Links: 2
Access: (0644/-rw-r--r--)  Uid: ( 0/ root)  Gid: ( 0/ root)
Access: 2017-09-26 20:30:25.679000000 +0300
Modify: 2017-09-26 20:30:24.604000000 +0300
Change: 2017-09-26 20:30:24.610000000 +0300
 Birth: -

[SN-1]

[root at sn-1:/mnt/bricks/services/brick/.glusterfs/f7/f1]
# getfattr -m . -d -e hex /mnt/bricks/services/brick/netserv/ethip
getfattr: Removing leading '/' from absolute path names
# file: mnt/bricks/services/brick/netserv/ethip
trusted.gfid=0xee71d19ac0f84f60b11eb42a083644e4
trusted.glusterfs.dht=0x000000010000000000000000ffffffff

[root at sn-1:/mnt/bricks/services/brick/netserv/ethip]
# getfattr -m . -d -e hex mn-1
# file: mn-1
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.services-client-0=0x000000000000000000000000
trusted.afr.services-client-1=0x000000000000000000000000
trusted.gfid=0xf7f10f980acc4041a015e48018571d4a

[root at sn-1:/mnt/bricks/services/brick/netserv/ethip]
# stat mn-1
  File: mn-1
  Size: 237    Blocks: 16    IO Block: 4096    regular file
Device: fd41h/64833d    Inode: 2608    Links: 2
Access: (0644/-rw-r--r--)  Uid: ( 0/ root)  Gid: ( 0/ root)
Access: 2017-09-26 20:31:48.231000000 +0300
Modify: 2017-09-26 20:31:46.872000000 +0300
Change: 2017-09-26 20:31:46.875000000 +0300
 Birth: -

[root at sn-1:/mnt/bricks/services/brick/.glusterfs/indices/xattrop]
# ls
xattrop-240713ea-eda3-4914-a55d-7dd4aed724ed

[root at sn-1:/mnt/bricks/services/brick/.glusterfs/f7/f1]
# stat f7f10f98-0acc-4041-a015-e48018571d4a
  File: f7f10f98-0acc-4041-a015-e48018571d4a
  Size: 237    Blocks: 16    IO Block: 4096    regular file
Device: fd41h/64833d    Inode: 2608    Links: 2
Access: (0644/-rw-r--r--)  Uid: ( 0/ root)  Gid: ( 0/ root)
Access: 2017-09-26 20:31:48.231000000 +0300
Modify: 2017-09-26 20:31:46.872000000 +0300
Change: 2017-09-26 20:31:46.875000000 +0300
 Birth: -

_______________________________________________
Gluster-users mailing list
Gluster-users at gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users
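Since heal info reports nothing here, one way to spot this kind of gfid mismatch is to compare the trusted.gfid xattr of the same file across bricks. The following is only a minimal sketch, not a Gluster tool: the helper names (parse_getfattr, gfid_backend_path, find_gfid_mismatch) are hypothetical, and it assumes the "getfattr -m . -d -e hex" output has already been collected from each brick as text. The sample dumps are the mn-1 values quoted in this thread.

```python
import uuid


def parse_getfattr(dump):
    """Parse `getfattr -d -e hex` output into {xattr name: hex digits without 0x}."""
    attrs = {}
    for line in dump.splitlines():
        line = line.strip()
        # Skip blank lines and the "# file: ..." header lines.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        attrs[name] = value[2:] if value.startswith("0x") else value
    return attrs


def gfid_backend_path(gfid_hex):
    """Map a gfid to its hard link under the brick's .glusterfs directory."""
    gfid = str(uuid.UUID(hex=gfid_hex))
    return ".glusterfs/{}/{}/{}".format(gfid[:2], gfid[2:4], gfid)


def find_gfid_mismatch(dumps_by_brick):
    """Return {brick: gfid} if the bricks disagree on trusted.gfid, else {}."""
    gfids = {
        brick: parse_getfattr(dump).get("trusted.gfid")
        for brick, dump in dumps_by_brick.items()
    }
    return gfids if len(set(gfids.values())) > 1 else {}


# mn-1 xattrs as posted in this thread (hypothetical variable names).
SN0_DUMP = """# file: mn-1
trusted.afr.dirty=0x000000000000000000000000
trusted.gfid=0x53a33f437464475486f31c4e44d83afd
"""
SN1_DUMP = """# file: mn-1
trusted.afr.dirty=0x000000000000000000000000
trusted.gfid=0xf7f10f980acc4041a015e48018571d4a
"""

if __name__ == "__main__":
    mismatch = find_gfid_mismatch({"sn-0": SN0_DUMP, "sn-1": SN1_DUMP})
    for brick, gfid in sorted(mismatch.items()):
        print(brick, gfid, gfid_backend_path(gfid))
```

For the sn-0 copy this yields .glusterfs/53/a3/53a33f43-7464-4754-86f3-1c4e44d83afd, which matches the directory listed in the session above.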
Karthik Subrahmanya
2017-Sep-28 07:16 UTC
[Gluster-users] after hard reboot, split-brain happened, but nothing showed in gluster voluem heal info command !
On Thu, Sep 28, 2017 at 12:11 PM, Zhou, Cynthia (NSB - CN/Hangzhou) <cynthia.zhou at nokia-sbell.com> wrote:

> The version I am using is glusterfs 3.6.9

This is a very old version, which is EOL. It would be great if you could upgrade to one of the supported versions (3.10 or 3.12). They have many new features, bug fixes and performance improvements. If you could try to reproduce the issue on one of those versions, that would be very helpful.

Regards,
Karthik
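Karthik's earlier remark that no pending markers are set on the files can be checked against the xattr dumps in the first message. The sketch below assumes the usual AFR changelog layout, in which each trusted.afr.* value is 12 bytes holding three big-endian 32-bit counters for pending data, metadata and entry operations. That layout is stated here as an assumption, and the helper names (decode_afr, has_pending_markers) are hypothetical.

```python
import struct


def decode_afr(hex_value):
    """Decode a trusted.afr.* value into its three pending counters.

    Assumes 12 bytes: data, metadata, entry, each a big-endian uint32.
    """
    digits = hex_value[2:] if hex_value.startswith("0x") else hex_value
    raw = bytes.fromhex(digits)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got {}".format(len(raw)))
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


def has_pending_markers(xattrs):
    """True if any trusted.afr.* xattr carries a non-zero counter."""
    return any(
        any(decode_afr(value).values())
        for name, value in xattrs.items()
        if name.startswith("trusted.afr.")
    )


# The mn-1 xattrs posted in this thread (identical on sn-0 and sn-1).
MN1_XATTRS = {
    "trusted.afr.dirty": "0x000000000000000000000000",
    "trusted.afr.services-client-0": "0x000000000000000000000000",
    "trusted.afr.services-client-1": "0x000000000000000000000000",
}

if __name__ == "__main__":
    for name, value in sorted(MN1_XATTRS.items()):
        print(name, decode_afr(value))
    print("pending markers set:", has_pending_markers(MN1_XATTRS))
```

All counters decode to zero on both bricks, which is consistent with the files not being listed by heal info even though their gfids differ.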