maokx at sina.com
2015-May-26 03:18 UTC
[Gluster-users] split-brain sanlock ids automation
Hi all,

I want to automate the resolution of split-brain on the sanlock ids file. The split-brain is caused by network interruptions.

My current manual fix is:

    find /data/data1/ -samefile /data/data1/3b9e1af1-f452-4ed4-9cbf-a38d8f7a395e/dom_md/ids -print -delete

The problem is triggered by sanlock, for example:

    sanlock add_lockspace -s lockspace1:1:/dom_md/ids :0

Now I want to automate this fix. My log is:

[root at www ~]# tail -f /var/log/glusterfs/rhev-data-center-mnt-glusterSD-192.168.7.246\:_pool01.log
[2015-05-25 06:37:33.613126] W [page.c:991:__ioc_page_error] 0-pool01-io-cache: page error for page = 0x7f4b9801d2d0 & waitq = 0x7f4b9801dd70
[2015-05-25 06:37:33.613157] W [fuse-bridge.c:2089:fuse_readv_cbk] 0-glusterfs-fuse: 246963: READ => -1 (Input/output error)
[2015-05-25 06:37:37.422720] E [afr-self-heal-common.c:197:afr_sh_print_split_brain_log] 0-pool01-replicate-0: Unable to self-heal contents of '/3b9e1af1-f452-4ed4-9cbf-a38d8f7a395e/dom_md/ids' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix: [ [ 0 4 ] [ 4 0 ] ]
[2015-05-25 06:37:37.422981] E [afr-self-heal-common.c:2212:afr_self_heal_completion_cbk] 0-pool01-replicate-0: background data self-heal failed on /3b9e1af1-f452-4ed4-9cbf-a38d8f7a395e/dom_md/ids
[2015-05-25 06:37:37.423231] W [page.c:991:__ioc_page_error] 0-pool01-io-cache: page error for page = 0x7f4b9800c520 & waitq = 0x7f4b9801aff0
[2015-05-25 06:37:37.423259] W [fuse-bridge.c:2089:fuse_readv_cbk] 0-glusterfs-fuse: 246978: READ => -1 (Input/output error)
[2015-05-25 06:37:43.650389] E [afr-self-heal-common.c:197:afr_sh_print_split_brain_log] 0-pool01-replicate-0: Unable to self-heal contents of '/3b9e1af1-f452-4ed4-9cbf-a38d8f7a395e/dom_md/ids' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix: [ [ 0 4 ] [ 4 0 ] ]
[2015-05-25 06:37:43.650740] E [afr-self-heal-common.c:2212:afr_self_heal_completion_cbk] 0-pool01-replicate-0: background data self-heal failed on /3b9e1af1-f452-4ed4-9cbf-a38d8f7a395e/dom_md/ids
[2015-05-25 06:37:43.650994] W [page.c:991:__ioc_page_error] 0-pool01-io-cache: page error for page = 0x7f4b9802adc0 & waitq = 0x7f4b98017e50
[2015-05-25 06:37:43.651021] W [fuse-bridge.c:2089:fuse_readv_cbk] 0-glusterfs-fuse: 246997: READ => -1 (Input/output error)
[2015-05-25 06:37:50.931622] E [afr-self-heal-common.c:197:afr_sh_print_split_brain_log] 0-pool01-replicate-0: Unable to self-heal contents of '/3b9e1af1-f452-4ed4-9cbf-a38d8f7a395e/dom_md/ids' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix: [ [ 0 4 ] [ 4 0 ] ]
[2015-05-25 06:37:50.931906] E [afr-self-heal-common.c:2212:afr_self_heal_completion_cbk] 0-pool01-replicate-0: background data self-heal failed on /3b9e1af1-f452-4ed4-9cbf-a38d8f7a395e/dom_md/ids
[2015-05-25 06:37:50.932211] W [page.c:991:__ioc_page_error] 0-pool01-io-cache: page error for page = 0x7f4b980211e0 & waitq = 0x7f4b980065e0
[2015-05-25 06:37:50.932240] W [fuse-bridge.c:2089:fuse_readv_cbk] 0-glusterfs-fuse: 247012: READ => -1 (Input/output error)
[2015-05-25 06:37:53.688445] E [afr-self-heal-common.c:197:afr_sh_print_split_brain_log] 0-pool01-replicate-0: Unable to self-heal contents of '/3b9e1af1-f452-4ed4-9cbf-a38d8f7a395e/dom_md/ids' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix: [ [ 0 4 ] [ 4 0 ] ]
[2015-05-25 06:37:53.688821] E [afr-self-heal-common.c:2212:afr_self_heal_completion_cbk] 0-pool01-replicate-0: background data self-heal failed on /3b9e1af1-f452-4ed4-9cbf-a38d8f7a395e/dom_md/ids
[2015-05-25 06:37:53.689128] W [page.c:991:__ioc_page_error] 0-pool01-io-cache: page error for page = 0x7f4b9800acb0 & waitq = 0x7f4b9802b4d0
[2015-05-25 06:37:53.689152] W [fuse-bridge.c:2089:fuse_readv_cbk] 0-glusterfs-fuse: 247031: READ => -1 (Input/output error)

My environment is shown in the attached image.

maokx at sina.com
-------------- next part --------------
A non-text attachment was scrubbed...
Name: wpsE1CA.tmp.jpg
Type: image/jpeg
Size: 82928 bytes
Desc: not available
URL: <http://www.gluster.org/pipermail/gluster-users/attachments/20150526/9737b2c0/attachment.jpg>
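As a starting point for automating the manual fix above, the following is a minimal sketch and not a tested procedure. The function names split_brain_paths and brick_links are hypothetical helpers, and /data/data1 is assumed to be the brick directory because it appears in the find command. The first helper pulls the affected paths out of the client log. The second lists every hard link of a file on a brick, which is what -samefile matches (on a Gluster brick a regular file normally has a second link under the .glusterfs directory). It only prints and deletes nothing.

```shell
#!/bin/sh
# Sketch only: the helper names below are hypothetical, not part of GlusterFS.

# Read a glusterfs client log on stdin and print each distinct path that
# AFR reported as "possible split-brain".
split_brain_paths() {
    sed -n "s/.*Unable to self-heal contents of '\([^']*\)' (possible split-brain).*/\1/p" | sort -u
}

# Print every hard link of one file under a brick directory: the file itself
# and any other name that shares its inode. Prints only, deletes nothing.
# $1 = brick directory (assumed, e.g. /data/data1)
# $2 = path as it appears in the log, relative to the volume root
brick_links() {
    find "${1%/}" -samefile "${1%/}/${2#/}" -print
}
```

Possible usage: run `split_brain_paths < /var/log/glusterfs/<client log>` on the client, then `brick_links /data/data1 <path>` on the brick whose copy is to be discarded. Adding -delete to the find call would reproduce the manual fix. As the log message says, that should happen on all but the preferred subvolume, so choosing the brick remains a manual decision in this sketch.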
On 05/26/2015 08:48 AM, maokx at sina.com wrote:
> Hi all,
> I want to automate the resolution of split-brain on the sanlock ids file.
> The split-brain is caused by network interruptions.
> My current manual fix is:
> find /data/data1/ -samefile
> /data/data1/3b9e1af1-f452-4ed4-9cbf-a38d8f7a395e/dom_md/ids -print -delete
> The problem is triggered by sanlock, for example: sanlock add_lockspace -s
> lockspace1:1:/dom_md/ids :0
> Now I want to automate this fix.

You can use the gluster CLI commands (glusterfs 3.6 onwards) or the get/setfattr commands from the mount (glusterfs 3.7 onwards) to heal files that are in split-brain. Usage can be found at
https://github.com/gluster/glusterfs/blob/master/doc/features/heal-info-and-split-brain-resolution.md

If you are using oVirt (and hence the sanlock file) with gluster, you can achieve better split-brain protection by using replica 3 volumes with cluster.quorum-type set to auto.

Hope that helps,
Ravi
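The CLI route mentioned in the reply could be scripted roughly as follows. This is a hedged sketch, not a verified procedure. The volume name pool01 is inferred from the log prefix 0-pool01-replicate-0, the brick 192.168.7.246:/data/data1 is a guess pieced together from the mount name and the find command, and run, resolve_split_brain and set_auto_quorum are hypothetical wrapper names. The gluster subcommands themselves are the ones described in the linked split-brain resolution document. By default the script only prints the commands (DRY_RUN=1) so they can be reviewed before anything is executed.

```shell
#!/bin/sh
# Sketch only: wrapper names are hypothetical; review the printed commands
# before running with DRY_RUN=0.

# Print the command when DRY_RUN is 1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# List files in split-brain, then heal one file using the chosen brick as
# the source of truth.
# $1 = volume name, $2 = source brick as host:/path, $3 = file path relative
# to the volume root (as printed in the client log).
resolve_split_brain() {
    run gluster volume heal "$1" info split-brain
    run gluster volume heal "$1" split-brain source-brick "$2" "$3"
}

# Enable the quorum setting suggested in the reply (intended for replica 3).
set_auto_quorum() {
    run gluster volume set "$1" cluster.quorum-type auto
}
```

For example, `resolve_split_brain pool01 192.168.7.246:/data/data1 /3b9e1af1-f452-4ed4-9cbf-a38d8f7a395e/dom_md/ids` prints the two heal commands for the ids file. Which brick holds the good copy still has to be decided by the administrator. The script does not attempt to choose it.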