Hi all,

I have 4 glusterd servers running a single glusterfs volume. The volume was created with the gluster command line, with no changes from the defaults, and the same machines all mount the volume using the native glusterfs client:

[root@localhost ~]# gluster volume create datastore replica 2 transport tcp \
    192.168.253.1:/glusterfs/primary 192.168.253.3:/glusterfs/secondary \
    192.168.253.2:/glusterfs/primary 192.168.253.4:/glusterfs/secondary \
    192.168.253.3:/glusterfs/primary 192.168.253.1:/glusterfs/secondary \
    192.168.253.4:/glusterfs/primary 192.168.253.2:/glusterfs/secondary

[root@localhost ~]# cat /etc/fstab
...
/dev/cciss/c0d0p6         /glusterfs/primary    ext4       defaults,noatime  1 2
/dev/cciss/c0d1p6         /glusterfs/secondary  ext4       defaults,noatime  1 2
192.168.253.1:/datastore  /mnt/datastore        glusterfs  defaults,_netdev  0 0

[root@localhost ~]# gluster volume info

Volume Name: datastore
Type: Distributed-Replicate
Status: Started
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: 192.168.253.1:/glusterfs/primary
Brick2: 192.168.253.3:/glusterfs/secondary
Brick3: 192.168.253.2:/glusterfs/primary
Brick4: 192.168.253.4:/glusterfs/secondary
Brick5: 192.168.253.3:/glusterfs/primary
Brick6: 192.168.253.1:/glusterfs/secondary
Brick7: 192.168.253.4:/glusterfs/primary
Brick8: 192.168.253.2:/glusterfs/secondary

The platform is not yet carrying production data and I have been testing the redundancy of the setup (pulling cables, etc.). All of my servers are now logging messages like the following every minute or so:

[2010-11-11 14:18:49.636327] I [afr-common.c:672:afr_lookup_done] datastore-replicate-0: split brain detected during lookup of /.
[2010-11-11 14:18:49.636388] I [afr-common.c:716:afr_lookup_done] datastore-replicate-0: background meta-data data self-heal triggered. path: /
[2010-11-11 14:18:49.636863] E [afr-self-heal-metadata.c:524:afr_sh_metadata_fix] datastore-replicate-0: Unable to self-heal permissions/ownership of '/' (possible split-brain). Please fix the file on all backend volumes
[2010-11-11 14:18:49.637080] I [afr-self-heal-common.c:1526:afr_self_heal_completion_cbk] datastore-replicate-0: background meta-data data self-heal completed on /
[2010-11-11 14:18:49.637561] I [afr-common.c:672:afr_lookup_done] datastore-replicate-0: split brain detected during lookup of /.
[2010-11-11 14:18:49.637588] I [afr-common.c:716:afr_lookup_done] datastore-replicate-0: background meta-data data self-heal triggered. path: /
[2010-11-11 14:18:49.638064] E [afr-self-heal-metadata.c:524:afr_sh_metadata_fix] datastore-replicate-0: Unable to self-heal permissions/ownership of '/' (possible split-brain). Please fix the file on all backend volumes
[2010-11-11 14:18:49.638265] I [afr-self-heal-common.c:1526:afr_self_heal_completion_cbk] datastore-replicate-0: background meta-data data self-heal completed on /

Can anyone tell me what I need to do to fix this?

Thanks,
Aaron
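
P.S. In case it helps with diagnosis, the only checks I can think to run are along these lines, comparing the ownership/permissions and the AFR changelog attributes of the brick root directories on each server. This is a sketch only: it assumes getfattr from the attr package is available, and trusted.afr is my understanding of the attribute prefix the replicate translator uses.

    # compare owner, group and octal mode of each brick root
    stat -c '%U:%G %a %n' /glusterfs/primary /glusterfs/secondary
    # dump the AFR changelog attributes (assumed trusted.afr prefix) in hex
    getfattr -d -m trusted.afr -e hex /glusterfs/primary /glusterfs/secondary

If the uid/gid/mode or the trusted.afr values differ between the two bricks of a replica pair, is it enough to chown/chmod the stale brick root to match its partner, or is there more to it than that?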