"/1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids" is a binary file. Here is the output of gluster volume info: -------------------------------------------------------------------------------------- [root at ovirt-node03 ~]# gluster volume info Volume Name: RaidVolB Type: Replicate Volume ID: e952fd41-45bf-42d9-b494-8e0195cb9756 Status: Started Number of Bricks: 1 x 2 = 2 Transport-type: tcp Bricks: Brick1: ovirt-node03.example.local:/raidvol/volb/brick Brick2: ovirt-node04.example.local:/raidvol/volb/brick Options Reconfigured: storage.owner-gid: 36 storage.owner-uid: 36 network.remote-dio: enable cluster.eager-lock: enable performance.stat-prefetch: off performance.io-cache: off performance.read-ahead: off performance.quick-read: off auth.allow: * user.cifs: disable nfs.disable: on [root at ovirt-node04 ~]# gluster volume info Volume Name: RaidVolB Type: Replicate Volume ID: e952fd41-45bf-42d9-b494-8e0195cb9756 Status: Started Number of Bricks: 1 x 2 = 2 Transport-type: tcp Bricks: Brick1: ovirt-node03.example.local:/raidvol/volb/brick Brick2: ovirt-node04.example.local:/raidvol/volb/brick Options Reconfigured: nfs.disable: on user.cifs: disable auth.allow: * performance.quick-read: off performance.read-ahead: off performance.io-cache: off performance.stat-prefetch: off cluster.eager-lock: enable network.remote-dio: enable storage.owner-uid: 36 storage.owner-gid: 36 Here is the getfattr command in node03 and node 04: -------------------------------------------------------------------------------------- getfattr -d -m . 
-e hex /raidvol/volb/brick//1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids getfattr: Removing leading '/' from absolute path names # file: raidvol/volb/brick//1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids trusted.afr.RaidVolB-client-0=0x000000000000000000000000 trusted.afr.RaidVolB-client-1=0x000000000000000000000000 trusted.afr.dirty=0x000000000000000000000000 trusted.gfid=0x1c15d0cb1cca4627841c395f7b712f73 [root at ovirt-node04 ~]# getfattr -d -m . -e hex /raidvol/volb/brick//1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids getfattr: Removing leading '/' from absolute path names # file: raidvol/volb/brick//1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids trusted.afr.RaidVolB-client-0=0x000000000000000000000000 trusted.afr.RaidVolB-client-1=0x000000000000000000000000 trusted.afr.dirty=0x000000000000000000000000 trusted.gfid=0x1c15d0cb1cca4627841c395f7b712f73 Am i supposed to run those commands on the mounted brick?: -------------------------------------------------------------------------------------- 127.0.0.1:RaidVolB on /rhev/data-center/mnt/glusterSD/127.0.0.1:RaidVolB type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072) At the very beginning i thought i removed the file with "rm /raidvol/volb/brick//1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids" hoping gluster would then fix itself somehow :) It was gone but it seems to be here again. Dunno if this is any help. 
Here is gluster volume heal RaidVolB info on both nodes: -------------------------------------------------------------------------------------- [root at ovirt-node03 ~]# gluster volume heal RaidVolB info Brick ovirt-node03.example.local:/raidvol/volb/brick/ Number of entries: 0 Brick ovirt-node04.example.local:/raidvol/volb/brick/ Number of entries: 0 [root at ovirt-node04 ~]# gluster volume heal RaidVolB info Brick ovirt-node03.example.local:/raidvol/volb/brick/ Number of entries: 0 Brick ovirt-node04.example.local:/raidvol/volb/brick/ Number of entries: 0 Thanks a lot, Mario On Wed, Jan 28, 2015 at 4:57 PM, Ravishankar N <ravishankar at redhat.com> wrote:> > On 01/28/2015 08:34 PM, Ml Ml wrote: >> >> Hello Ravi, >> >> thanks a lot for your reply. >> >> The Data on ovirt-node03 is the one which i want. >> >> Here are the infos collected by following the howto: >> >> https://github.com/GlusterFS/glusterfs/blob/master/doc/debugging/split-brain.md >> >> >> >> [root at ovirt-node03 ~]# gluster volume heal RaidVolB info split-brain >> Gathering list of split brain entries on volume RaidVolB has been >> successful >> >> Brick ovirt-node03.example.local:/raidvol/volb/brick >> Number of entries: 0 >> >> Brick ovirt-node04.example.local:/raidvol/volb/brick >> Number of entries: 14 >> at path on brick >> ----------------------------------- >> 2015-01-27 17:33:00 <gfid:1c15d0cb-1cca-4627-841c-395f7b712f73> >> 2015-01-27 17:34:01 <gfid:1c15d0cb-1cca-4627-841c-395f7b712f73> >> 2015-01-27 17:35:04 <gfid:1c15d0cb-1cca-4627-841c-395f7b712f73> >> 2015-01-27 17:36:05 <gfid:cd411b57-6078-4f3c-80d1-0ac1455186a6>/ids >> 2015-01-27 17:37:06 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids >> 2015-01-27 17:37:07 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids >> 2015-01-27 17:38:08 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids >> 2015-01-27 17:38:21 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids >> 2015-01-27 17:39:22 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids >> 2015-01-27 
17:40:23 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids >> 2015-01-27 17:41:24 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids >> 2015-01-27 17:42:25 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids >> 2015-01-27 17:43:26 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids >> 2015-01-27 17:44:27 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids >> >> [root at ovirt-node03 ~]# gluster volume heal RaidVolB info >> Brick ovirt-node03.example.local:/raidvol/volb/brick/ >> Number of entries: 0 >> >> Brick ovirt-node04.example.local:/raidvol/volb/brick/ >> Number of entries: 0 > > > Hi Mario, > Is "/1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids" a file or a directory? > Whatever it is, it should be shown in the output of heal info /heal info > split-brain command of both nodes. But I see it being listed only under > node03. > Also, heal info is showing zero entries for both nodes which is strange. > > Are node03 and node04 bricks of the same replica pair? Can you share > 'gluster volume info` of RaidVolB? > How did you infer that there is a split-brain? Does accessing the file(s) > from the mount give input/output error? > >> >> [root at ovirt-node03 ~]# getfattr -d -m . -e hex >> /raidvol/volb/brick/1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids >> getfattr: Removing leading '/' from absolute path names >> # file: raidvol/volb/brick/1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids >> trusted.afr.RaidVolB-client-0=0x000000000000000000000000 >> trusted.afr.RaidVolB-client-1=0x000000000000000000000000 >> trusted.afr.dirty=0x000000000000000000000000 >> trusted.gfid=0x1c15d0cb1cca4627841c395f7b712f73 > > What is the getfattr output of this file on the other brick? The afr > specific xattrs being all zeros certainly don't indicate the possibility of > a split-brain > >> >> The "Resetting the relevant changelogs to resolve the split-brain: " >> part of the howto is now a little complictaed. Do i have a data or >> meta split brain now? 
>> I guess i have a data split brain in my case, right? >> >> What are my next setfattr commands nowin my case if i want to keep the >> data from node03? >> >> Thanks a lot! >> >> Mario >> >> >> On Wed, Jan 28, 2015 at 9:44 AM, Ravishankar N <ravishankar at redhat.com> >> wrote: >>> >>> On 01/28/2015 02:02 PM, Ml Ml wrote: >>>> >>>> I want to either take the file from node03 or node04. i really don?t >>>>> >>>>> mind. Can i not just tell gluster that it should use one node as the >>>>> ?current? one? >>> >>> Policy based split-brain resolution [1] which does just that, has been >>> merged in master and should be available in glusterfs 3.7. >>> For the moment, you would have to modify the xattrs on the one of the >>> bricks >>> and trigger heal. You can see >>> >>> https://github.com/GlusterFS/glusterfs/blob/master/doc/debugging/split-brain.md >>> on how to do it. >>> >>> Hope this helps, >>> Ravi >>> >>> [1] http://review.gluster.org/#/c/9377/ >>> >
Ravishankar N
2015-Jan-29 04:19 UTC
[Gluster-users] ... i was able to produce a split brain...
On 01/28/2015 10:58 PM, Ml Ml wrote:
> "/1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids" is a binary file.
>
> Here is the output of gluster volume info:
> --------------------------------------------------------------------------------------
>
> [root@ovirt-node03 ~]# gluster volume info
>
> Volume Name: RaidVolB
> Type: Replicate
> Volume ID: e952fd41-45bf-42d9-b494-8e0195cb9756
> Status: Started
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: ovirt-node03.example.local:/raidvol/volb/brick
> Brick2: ovirt-node04.example.local:/raidvol/volb/brick
> Options Reconfigured:
> storage.owner-gid: 36
> storage.owner-uid: 36
> network.remote-dio: enable
> cluster.eager-lock: enable
> performance.stat-prefetch: off
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> auth.allow: *
> user.cifs: disable
> nfs.disable: on
>
> [root@ovirt-node04 ~]# gluster volume info
>
> Volume Name: RaidVolB
> Type: Replicate
> Volume ID: e952fd41-45bf-42d9-b494-8e0195cb9756
> Status: Started
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: ovirt-node03.example.local:/raidvol/volb/brick
> Brick2: ovirt-node04.example.local:/raidvol/volb/brick
> Options Reconfigured:
> nfs.disable: on
> user.cifs: disable
> auth.allow: *
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> cluster.eager-lock: enable
> network.remote-dio: enable
> storage.owner-uid: 36
> storage.owner-gid: 36
>
> Here is the getfattr command on node03 and node04:
> --------------------------------------------------------------------------------------
>
> getfattr -d -m . -e hex /raidvol/volb/brick//1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
> getfattr: Removing leading '/' from absolute path names
> # file: raidvol/volb/brick//1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
> trusted.afr.RaidVolB-client-0=0x000000000000000000000000
> trusted.afr.RaidVolB-client-1=0x000000000000000000000000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.gfid=0x1c15d0cb1cca4627841c395f7b712f73
>
> [root@ovirt-node04 ~]# getfattr -d -m . -e hex /raidvol/volb/brick//1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
> getfattr: Removing leading '/' from absolute path names
> # file: raidvol/volb/brick//1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
> trusted.afr.RaidVolB-client-0=0x000000000000000000000000
> trusted.afr.RaidVolB-client-1=0x000000000000000000000000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.gfid=0x1c15d0cb1cca4627841c395f7b712f73

These xattrs seem to indicate there is no split-brain for the file; heal info also shows 0 entries on both bricks. Are you getting an I/O error when you read "1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids" from the mount? If yes, is there a difference in file size on both nodes? How about the contents (check if the md5sum is the same)?

> Am i supposed to run those commands on the mounted brick?:
> --------------------------------------------------------------------------------------
> 127.0.0.1:RaidVolB on /rhev/data-center/mnt/glusterSD/127.0.0.1:RaidVolB type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)
>
> At the very beginning i thought i removed the file with
> "rm /raidvol/volb/brick//1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids"
> hoping gluster would then fix itself somehow :)
> It was gone but it seems to be here again. Dunno if this is any help.
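[Editor's note, not part of the original mail:] each trusted.afr.<volume>-client-N value shown above packs three big-endian 32-bit counters — pending data, metadata, and entry operations that this brick blames on the other copy; all-zero means no accusations, hence no split-brain. A small illustrative bash helper (`decode_afr` is a made-up name, not a Gluster tool) that splits such a hex value:

```shell
#!/bin/bash
# Decode a trusted.afr changelog value into its three big-endian 32-bit
# counters: pending data, metadata and entry operations blamed on the
# other replica. All three being zero means this brick accuses no one.
decode_afr() {
    local hex=${1#0x}                                   # strip leading "0x"
    local data=$((16#$(printf '%s' "$hex" | cut -c1-8)))
    local meta=$((16#$(printf '%s' "$hex" | cut -c9-16)))
    local entry=$((16#$(printf '%s' "$hex" | cut -c17-24)))
    echo "data=$data metadata=$meta entry=$entry"
}

# The value seen on both bricks in the thread:
decode_afr 0x000000000000000000000000   # prints: data=0 metadata=0 entry=0
```

A genuine data split-brain would show a nonzero first counter on each brick's xattr for the *other* brick's client index.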
>
> Here is gluster volume heal RaidVolB info on both nodes:
> --------------------------------------------------------------------------------------
>
> [root@ovirt-node03 ~]# gluster volume heal RaidVolB info
> Brick ovirt-node03.example.local:/raidvol/volb/brick/
> Number of entries: 0
>
> Brick ovirt-node04.example.local:/raidvol/volb/brick/
> Number of entries: 0
>
> [root@ovirt-node04 ~]# gluster volume heal RaidVolB info
> Brick ovirt-node03.example.local:/raidvol/volb/brick/
> Number of entries: 0
>
> Brick ovirt-node04.example.local:/raidvol/volb/brick/
> Number of entries: 0
>
> Thanks a lot,
> Mario
>
> On Wed, Jan 28, 2015 at 4:57 PM, Ravishankar N <ravishankar at redhat.com> wrote:
>> On 01/28/2015 08:34 PM, Ml Ml wrote:
>>> Hello Ravi,
>>>
>>> thanks a lot for your reply.
>>>
>>> The data on ovirt-node03 is the one which I want.
>>>
>>> Here are the infos collected by following the howto:
>>>
>>> https://github.com/GlusterFS/glusterfs/blob/master/doc/debugging/split-brain.md
>>>
>>> [root@ovirt-node03 ~]# gluster volume heal RaidVolB info split-brain
>>> Gathering list of split brain entries on volume RaidVolB has been successful
>>>
>>> Brick ovirt-node03.example.local:/raidvol/volb/brick
>>> Number of entries: 0
>>>
>>> Brick ovirt-node04.example.local:/raidvol/volb/brick
>>> Number of entries: 14
>>> at                   path on brick
>>> -----------------------------------
>>> 2015-01-27 17:33:00 <gfid:1c15d0cb-1cca-4627-841c-395f7b712f73>
>>> 2015-01-27 17:34:01 <gfid:1c15d0cb-1cca-4627-841c-395f7b712f73>
>>> 2015-01-27 17:35:04 <gfid:1c15d0cb-1cca-4627-841c-395f7b712f73>
>>> 2015-01-27 17:36:05 <gfid:cd411b57-6078-4f3c-80d1-0ac1455186a6>/ids
>>> 2015-01-27 17:37:06 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>>> 2015-01-27 17:37:07 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>>> 2015-01-27 17:38:08 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>>> 2015-01-27 17:38:21 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>>> 2015-01-27 17:39:22 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>>> 2015-01-27 17:40:23 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>>> 2015-01-27 17:41:24 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>>> 2015-01-27 17:42:25 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>>> 2015-01-27 17:43:26 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>>> 2015-01-27 17:44:27 /1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>>>
>>> [root@ovirt-node03 ~]# gluster volume heal RaidVolB info
>>> Brick ovirt-node03.example.local:/raidvol/volb/brick/
>>> Number of entries: 0
>>>
>>> Brick ovirt-node04.example.local:/raidvol/volb/brick/
>>> Number of entries: 0
>>
>> Hi Mario,
>> Is "/1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids" a file or a directory?
>> Whatever it is, it should be shown in the output of the heal info / heal info split-brain
>> command on both nodes. But I see it being listed only under node03.
>> Also, heal info is showing zero entries for both nodes, which is strange.
>>
>> Are node03 and node04 bricks of the same replica pair? Can you share
>> 'gluster volume info' of RaidVolB?
>> How did you infer that there is a split-brain? Does accessing the file(s)
>> from the mount give an input/output error?
>>
>>> [root@ovirt-node03 ~]# getfattr -d -m . -e hex /raidvol/volb/brick/1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>>> getfattr: Removing leading '/' from absolute path names
>>> # file: raidvol/volb/brick/1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
>>> trusted.afr.RaidVolB-client-0=0x000000000000000000000000
>>> trusted.afr.RaidVolB-client-1=0x000000000000000000000000
>>> trusted.afr.dirty=0x000000000000000000000000
>>> trusted.gfid=0x1c15d0cb1cca4627841c395f7b712f73
>>
>> What is the getfattr output of this file on the other brick? The afr-specific
>> xattrs being all zeros certainly don't indicate the possibility of a split-brain.
>>
>>> The "Resetting the relevant changelogs to resolve the split-brain:" part
>>> of the howto is now a little complicated. Do I have a data or a metadata
>>> split-brain now?
>>> I guess I have a data split-brain in my case, right?
>>>
>>> What are my next setfattr commands now in my case if I want to keep the
>>> data from node03?
>>>
>>> Thanks a lot!
>>>
>>> Mario
>>>
>>> On Wed, Jan 28, 2015 at 9:44 AM, Ravishankar N <ravishankar at redhat.com> wrote:
>>>> On 01/28/2015 02:02 PM, Ml Ml wrote:
>>>>> I want to either take the file from node03 or node04. I really don't
>>>>> mind. Can I not just tell gluster that it should use one node as the
>>>>> 'current' one?
>>>> Policy-based split-brain resolution [1], which does just that, has been
>>>> merged in master and should be available in glusterfs 3.7.
>>>> For the moment, you would have to modify the xattrs on one of the bricks
>>>> and trigger a heal. See
>>>> https://github.com/GlusterFS/glusterfs/blob/master/doc/debugging/split-brain.md
>>>> on how to do it.
>>>>
>>>> Hope this helps,
>>>> Ravi
>>>>
>>>> [1] http://review.gluster.org/#/c/9377/
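[Editor's note, not part of the original mail:] for the size/md5sum comparison Ravi asks for above, one could gather a "size checksum" record for the file on each node and compare them. The helper names (`fingerprint`, `records_match`) and the record format are illustrative, not from the thread:

```shell
#!/bin/bash
# Hypothetical helpers for comparing the two brick copies of a file.
# Run fingerprint on EACH node against the brick path, e.g.
#   fingerprint /raidvol/volb/brick/1701d5ae-6a44-4374-8b29-61c699da870b/dom_md/ids
fingerprint() {
    # Emit one "size md5" record for the local copy.
    printf '%s %s\n' "$(stat -c %s "$1")" "$(md5sum "$1" | cut -d' ' -f1)"
}

records_match() {
    # Compare the record gathered on node03 with the one from node04.
    if [ "$1" = "$2" ]; then echo "copies agree"; else echo "copies differ"; fi
}
```

If the copies really differ and the mount returns an I/O error, the manual fix from the linked split-brain doc would apply (assuming node03 holds the good copy: on node04's brick, reset the changelog blaming node03 with `setfattr -n trusted.afr.RaidVolB-client-0 -v 0x000000000000000000000000 <file>`, then run `gluster volume heal RaidVolB`). With all-zero xattrs and matching checksums, there may simply be nothing to heal.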