Alexandru Coseru
2013-Dec-09 13:51 UTC
Gluster - replica - Unable to self-heal contents of '/' (possible split-brain)
Hello,

I'm trying to build a replica volume on two servers: blade6 and blade7. (A third peer, blade1, is in the pool but holds no bricks.) The volume looks healthy, but I cannot mount it over NFS.

Here are some logs:

[root@blade6 stor1]# df -h
/dev/mapper/gluster_stor1  882G  200M  837G   1% /gluster/stor1

[root@blade7 stor1]# df -h
/dev/mapper/gluster_fast   846G  158G  646G  20% /gluster/stor_fast
/dev/mapper/gluster_stor1  882G   72M  837G   1% /gluster/stor1

[root@blade6 stor1]# pwd
/gluster/stor1
[root@blade6 stor1]# ls -lh
total 0

[root@blade7 stor1]# pwd
/gluster/stor1
[root@blade7 stor1]# ls -lh
total 0

[root@blade6 stor1]# gluster volume info

Volume Name: stor_fast
Type: Distribute
Volume ID: ad82b554-8ff0-4903-be32-f8dcb9420f31
Status: Started
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: blade7.xen:/gluster/stor_fast
Options Reconfigured:
nfs.port: 2049

Volume Name: stor1
Type: Replicate
Volume ID: 6bd88164-86c2-40f6-9846-b21e90303e73
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: blade7.xen:/gluster/stor1
Brick2: blade6.xen:/gluster/stor1
Options Reconfigured:
nfs.port: 2049

[root@blade7 stor1]# gluster volume info

Volume Name: stor_fast
Type: Distribute
Volume ID: ad82b554-8ff0-4903-be32-f8dcb9420f31
Status: Started
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: blade7.xen:/gluster/stor_fast
Options Reconfigured:
nfs.port: 2049

Volume Name: stor1
Type: Replicate
Volume ID: 6bd88164-86c2-40f6-9846-b21e90303e73
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: blade7.xen:/gluster/stor1
Brick2: blade6.xen:/gluster/stor1
Options Reconfigured:
nfs.port: 2049

[root@blade6 stor1]# gluster volume status
Status of volume: stor_fast
Gluster process                                Port   Online  Pid
------------------------------------------------------------------------------
Brick blade7.xen:/gluster/stor_fast            49152  Y       1742
NFS Server on localhost                        2049   Y       20074
NFS Server on blade1.xen                       2049   Y       22255
NFS Server on blade7.xen                       2049   Y       7574

There are no active volume tasks

Status of volume: stor1
Gluster process                                Port   Online  Pid
------------------------------------------------------------------------------
Brick blade7.xen:/gluster/stor1                49154  Y       7562
Brick blade6.xen:/gluster/stor1                49154  Y       20053
NFS Server on localhost                        2049   Y       20074
Self-heal Daemon on localhost                  N/A    Y       20079
NFS Server on blade1.xen                       2049   Y       22255
Self-heal Daemon on blade1.xen                 N/A    Y       22260
NFS Server on blade7.xen                       2049   Y       7574
Self-heal Daemon on blade7.xen                 N/A    Y       7578

There are no active volume tasks

[root@blade7 stor1]# gluster volume status
Status of volume: stor_fast
Gluster process                                Port   Online  Pid
------------------------------------------------------------------------------
Brick blade7.xen:/gluster/stor_fast            49152  Y       1742
NFS Server on localhost                        2049   Y       7574
NFS Server on blade6.xen                       2049   Y       20074
NFS Server on blade1.xen                       2049   Y       22255

There are no active volume tasks

Status of volume: stor1
Gluster process                                Port   Online  Pid
------------------------------------------------------------------------------
Brick blade7.xen:/gluster/stor1                49154  Y       7562
Brick blade6.xen:/gluster/stor1                49154  Y       20053
NFS Server on localhost                        2049   Y       7574
Self-heal Daemon on localhost                  N/A    Y       7578
NFS Server on blade1.xen                       2049   Y       22255
Self-heal Daemon on blade1.xen                 N/A    Y       22260
NFS Server on blade6.xen                       2049   Y       20074
Self-heal Daemon on blade6.xen                 N/A    Y       20079

There are no active volume tasks

[root@blade6 stor1]# gluster peer status
Number of Peers: 2

Hostname: blade1.xen
Port: 24007
Uuid: 194a57a7-cb0e-43de-a042-0ac4026fd07b
State: Peer in Cluster (Connected)

Hostname: blade7.xen
Port: 24007
Uuid: 574eb256-30d2-4639-803e-73d905835139
State: Peer in Cluster (Connected)

[root@blade7 stor1]# gluster peer status
Number of Peers: 2

Hostname: blade6.xen
Port: 24007
Uuid: a65cadad-ef79-4821-be41-5649fb204f3e
State: Peer in Cluster (Connected)

Hostname: blade1.xen
Uuid: 194a57a7-cb0e-43de-a042-0ac4026fd07b
State: Peer in Cluster (Connected)

[root@blade6 stor1]# gluster volume heal stor1 info
Gathering Heal info on volume stor1 has been successful

Brick blade7.xen:/gluster/stor1
Number of entries: 0

Brick blade6.xen:/gluster/stor1
Number of entries: 0

[root@blade7 stor1]# gluster volume heal stor1 info
Gathering Heal info on volume stor1 has been successful

Brick blade7.xen:/gluster/stor1
Number of entries: 0

Brick blade6.xen:/gluster/stor1
Number of entries: 0

When I try to mount the volume over NFS, I get the following errors:

[2013-12-09 13:20:52.066978] E [afr-self-heal-common.c:197:afr_sh_print_split_brain_log] 0-stor1-replicate-0: Unable to self-heal contents of '/' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix: [ [ 0 2 ] [ 2 0 ] ]
[2013-12-09 13:20:52.067386] E [afr-self-heal-common.c:2212:afr_self_heal_completion_cbk] 0-stor1-replicate-0: background meta-data self-heal failed on /
[2013-12-09 13:20:52.067452] E [mount3.c:290:mnt3svc_lookup_mount_cbk] 0-nfs: error=Input/output error
[2013-12-09 13:20:53.092039] E [afr-self-heal-common.c:197:afr_sh_print_split_brain_log] 0-stor1-replicate-0: Unable to self-heal contents of '/' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix: [ [ 0 2 ] [ 2 0 ] ]
[2013-12-09 13:20:53.092497] E [afr-self-heal-common.c:2212:afr_self_heal_completion_cbk] 0-stor1-replicate-0: background meta-data self-heal failed on /
[2013-12-09 13:20:53.092559] E [mount3.c:290:mnt3svc_lookup_mount_cbk] 0-nfs: error=Input/output error
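If I'm reading the pending matrix [ [ 0 2 ] [ 2 0 ] ] correctly, each brick is blaming the other for pending metadata changes on '/', so AFR cannot pick a source copy and the NFS mount of the root fails with Input/output error. Here is what I plan to check and, if it confirms the split-brain, the fix I pieced together for a root-directory metadata split-brain. This is only a sketch: I'm assuming the usual trusted.afr.<volume>-client-<N> xattr naming, with client-0 = Brick1 (blade7) and client-1 = Brick2 (blade6) taken from the brick order in 'gluster volume info'.

# Dump the AFR changelog xattrs on the brick root of both servers:
[root@blade6 stor1]# getfattr -d -m . -e hex /gluster/stor1
[root@blade7 stor1]# getfattr -d -m . -e hex /gluster/stor1

# If blade6 shows a non-zero trusted.afr.stor1-client-0 and blade7 shows a
# non-zero trusted.afr.stor1-client-1, the two bricks accuse each other.
# To keep blade7's copy of the root metadata, clear blade6's accusation
# against blade7 and re-trigger the heal:
[root@blade6 stor1]# setfattr -n trusted.afr.stor1-client-0 -v 0x000000000000000000000000 /gluster/stor1
[root@blade6 stor1]# gluster volume heal stor1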
What am I doing wrong?

PS: Volume stor_fast works like a charm.

Best Regards,
Alexandru