PEPONNET, Cyril (Cyril)
2014-Apr-17 23:37 UTC
[Gluster-users] Conflicting entries for symlinks between bricks (Trusted.gfid not consistent)
Hi gluster people!

I would like some help regarding an issue we have with our early production GlusterFS setup.

Our topology: 2 bricks in replicate mode.

[root@myBrick1 /]# cat /etc/redhat-release
CentOS release 6.5 (Final)
[root@myBrick1 /]# glusterfs --version
glusterfs 3.4.2 built on Jan 3 2014 12:38:05
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2013 Red Hat, Inc. <http://www.redhat.com/>

[root@myBrick1 /]# gluster volume info

Volume Name: myVol
Type: Replicate
Volume ID: 58f5d775-acb5-416d-bee6-5209f7b20363
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: myBrick1.company.lan:/export/raid/myVol
Brick2: myBrick2.company.lan:/export/raid/myVol
Options Reconfigured:
nfs.enable-ino32: on

The issue:

We powered down a brick (myBrick1) for hardware maintenance. When we powered it back up, issues started with some files (symlinks, in fact); auto-healing does not seem to work for all of them.

Let's take a look at one faulty symlink.

Using fuse.glusterfs (sometimes it works, sometimes not):

[root@myBrick2 /]# mount
...
myBrick2.company.lan:/myVol on /images type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)
...

[root@myBrick2 /]# stat /images/myProject1/2.1_stale/current
  File: `/images/myProject1/2.1_stale/current' -> `current-59a77422'
  Size: 16        Blocks: 0        IO Block: 131072  symbolic link
Device: 13h/19d   Inode: 11422905275486058235  Links: 1
Access: (0777/lrwxrwxrwx)  Uid: ( 499/ testlab)   Gid: ( 499/ testlab)
Access: 2014-04-17 14:05:54.488238322 -0700
Modify: 2014-04-16 19:46:05.033299589 -0700
Change: 2014-04-17 14:05:54.487238322 -0700

[root@myBrick2 /]# stat /images/myProject1/2.1_stale/current
stat: cannot stat `/images/myProject1/2.1_stale/current': Input/output error

I typed the two commands above only a few seconds apart.

Let's try on the other brick:

[root@myBrick1 ~]# mount
...
myBrick1.company.lan:/myVol on /images type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)
...

[root@myBrick1 ~]# stat /images/myProject1/2.1_stale/current
stat: cannot stat `/images/myProject1/2.1_stale/current': Input/output error

On this one it always fails (myBrick1 is the server we powered back up after maintenance).
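A quick throwaway loop (nothing gluster-specific, just hammering the same path as above) makes the intermittent behaviour easy to see:

# Stat the same symlink repeatedly through the FUSE mount.
# On myBrick2 the result alternates between success and EIO;
# on myBrick1 it fails every time.
for i in $(seq 1 10); do
    stat /images/myProject1/2.1_stale/current > /dev/null 2>&1 \
        && echo "attempt $i: ok" \
        || echo "attempt $i: failed"
    sleep 2
done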
Using NFS: it never works (tested against both bricks).

[root@station-localdomain myProject1]# mount
...
myBrick1:/myVol on /images type nfs (rw,relatime,vers=3,rsize=8192,wsize=8192,namlen=255,hard,proto=tcp,timeo=14,retrans=2,sec=sys,mountaddr=10.0.0.57,mountvers=3,mountport=38465,mountproto=tcp,local_lock=none,addr=10.0.0.57)
...

[root@station-localdomain myProject1]# ls 2.1_stale
ls: cannot access 2.1_stale: Input/output error

In both cases, here are the logs:

==> /var/log/glusterfs/glustershd.log <==
[2014-04-17 10:20:25.861003] I [afr-self-heal-entry.c:2253:afr_sh_entry_fix] 0-myVol-replicate-0: <gfid:fcbbe770-6388-4d74-a78a-7939b17e36aa>: Performing conservative merge
[2014-04-17 10:20:25.895143] I [afr-self-heal-entry.c:2253:afr_sh_entry_fix] 0-myVol-replicate-0: <gfid:ae058719-61de-47de-82dc-6cb8a3d80afe>: Performing conservative merge
[2014-04-17 10:20:25.949176] I [afr-self-heal-entry.c:2253:afr_sh_entry_fix] 0-myVol-replicate-0: <gfid:868e3eb7-03e6-4b6b-a75a-16b31bdf8a10>: Performing conservative merge
[2014-04-17 10:20:25.995289] I [afr-self-heal-entry.c:2253:afr_sh_entry_fix] 0-myVol-replicate-0: <gfid:115efb83-2154-4f9d-8c70-a31f476db110>: Performing conservative merge
[2014-04-17 10:20:26.013995] I [afr-self-heal-entry.c:2253:afr_sh_entry_fix] 0-myVol-replicate-0: <gfid:0982330a-2e08-4b97-9ea5-cf991d295e41>: Performing conservative merge
[2014-04-17 10:20:26.050693] I [afr-self-heal-entry.c:2253:afr_sh_entry_fix] 0-myVol-replicate-0: <gfid:3a15b54b-b92c-4ed5-875e-1af0a3b94e0c>: Performing conservative merge

==> /var/log/glusterfs/usr-global.log <==
[2014-04-17 10:20:38.281705] I [afr-self-heal-entry.c:2253:afr_sh_entry_fix] 0-myVol-replicate-0: /images/myProject1/2.1_stale: Performing conservative merge
[2014-04-17 10:20:38.286986] W [afr-common.c:1505:afr_conflicting_iattrs] 0-myVol-replicate-0: /images/myProject1/2.1_stale/latest_s: gfid differs on subvolume 1
[2014-04-17 10:20:38.287030] E [afr-self-heal-common.c:1433:afr_sh_common_lookup_cbk] 0-myVol-replicate-0: Conflicting entries for /images/myProject1/2.1_stale/latest_s
[2014-04-17 10:20:38.287169] W [afr-common.c:1505:afr_conflicting_iattrs] 0-myVol-replicate-0: /images/myProject1/2.1_stale/latest_b: gfid differs on subvolume 1
[2014-04-17 10:20:38.287202] E [afr-self-heal-common.c:1433:afr_sh_common_lookup_cbk] 0-myVol-replicate-0: Conflicting entries for /images/myProject1/2.1_stale/latest_b
[2014-04-17 10:20:38.287280] W [afr-common.c:1505:afr_conflicting_iattrs] 0-myVol-replicate-0: /images/myProject1/2.1_stale/latest_n: gfid differs on subvolume 1
[2014-04-17 10:20:38.287308] E [afr-self-heal-common.c:1433:afr_sh_common_lookup_cbk] 0-myVol-replicate-0: Conflicting entries for /images/myProject1/2.1_stale/latest_n
[2014-04-17 10:20:38.287506] W [afr-common.c:1505:afr_conflicting_iattrs] 0-myVol-replicate-0: /images/myProject1/2.1_stale/current: gfid differs on subvolume 1
[2014-04-17 10:20:38.287538] E [afr-self-heal-common.c:1433:afr_sh_common_lookup_cbk] 0-myVol-replicate-0: Conflicting entries for /images/myProject1/2.1_stale/current
[2014-04-17 10:20:38.311222] W [afr-common.c:1505:afr_conflicting_iattrs] 0-myVol-replicate-0: /images/myProject1/2.1_stale/latest_s: gfid differs on subvolume 0
[2014-04-17 10:20:38.311277] E [afr-self-heal-common.c:1433:afr_sh_common_lookup_cbk] 0-myVol-replicate-0: Conflicting entries for /images/myProject1/2.1_stale/latest_s
[2014-04-17 10:20:38.311345] W [afr-common.c:1505:afr_conflicting_iattrs] 0-myVol-replicate-0: /images/myProject1/2.1_stale/latest_b: gfid differs on subvolume 0
[2014-04-17 10:20:38.311385] E [afr-self-heal-common.c:1433:afr_sh_common_lookup_cbk] 0-myVol-replicate-0: Conflicting entries for /images/myProject1/2.1_stale/latest_b
[2014-04-17 10:20:38.311473] W [afr-common.c:1505:afr_conflicting_iattrs] 0-myVol-replicate-0: /images/myProject1/2.1_stale/current: gfid differs on subvolume 0
[2014-04-17 10:20:38.311502] E [afr-self-heal-common.c:1433:afr_sh_common_lookup_cbk] 0-myVol-replicate-0: Conflicting entries for /images/myProject1/2.1_stale/current
[2014-04-17 10:20:38.332110] W [afr-common.c:1505:afr_conflicting_iattrs] 0-myVol-replicate-0: /images/myProject1/2.1_stale/latest_n: gfid differs on subvolume 1
[2014-04-17 10:20:38.332149] E [afr-self-heal-common.c:1433:afr_sh_common_lookup_cbk] 0-myVol-replicate-0: Conflicting entries for /images/myProject1/2.1_stale/latest_n
[2014-04-17 10:20:38.332845] E [afr-self-heal-common.c:2212:afr_self_heal_completion_cbk] 0-myVol-replicate-0: background entry self-heal failed on /images/myProject1/2.1_stale
[2014-04-17 10:20:41.447911] I [afr-self-heal-entry.c:2253:afr_sh_entry_fix] 0-myVol-replicate-0: /images/myProject1/2.1_stale: Performing conservative merge
[2014-04-17 10:20:41.453950] W [afr-common.c:1505:afr_conflicting_iattrs] 0-myVol-replicate-0: /images/myProject1/2.1_stale/latest_s: gfid differs on subvolume 1
[2014-04-17 10:20:41.453998] E [afr-self-heal-common.c:1433:afr_sh_common_lookup_cbk] 0-myVol-replicate-0: Conflicting entries for /images/myProject1/2.1_stale/latest_s
[2014-04-17 10:20:41.454135] W [afr-common.c:1505:afr_conflicting_iattrs] 0-myVol-replicate-0: /images/myProject1/2.1_stale/latest_b: gfid differs on subvolume 1
[2014-04-17 10:20:41.454163] E [afr-self-heal-common.c:1433:afr_sh_common_lookup_cbk] 0-myVol-replicate-0: Conflicting entries for /images/myProject1/2.1_stale/latest_b
[2014-04-17 10:20:41.454237] W [afr-common.c:1505:afr_conflicting_iattrs] 0-myVol-replicate-0: /images/myProject1/2.1_stale/latest_n: gfid differs on subvolume 1
[2014-04-17 10:20:41.454263] E [afr-self-heal-common.c:1433:afr_sh_common_lookup_cbk] 0-myVol-replicate-0: Conflicting entries for /images/myProject1/2.1_stale/latest_n
[2014-04-17 10:20:41.454385] W [afr-common.c:1505:afr_conflicting_iattrs] 0-myVol-replicate-0: /images/myProject1/2.1_stale/current: gfid differs on subvolume 1
[2014-04-17 10:20:41.454413] E [afr-self-heal-common.c:1433:afr_sh_common_lookup_cbk] 0-myVol-replicate-0: Conflicting entries for /images/myProject1/2.1_stale/current
[2014-04-17 10:20:41.479015] W [afr-common.c:1505:afr_conflicting_iattrs] 0-myVol-replicate-0: /images/myProject1/2.1_stale/latest_s: gfid differs on subvolume 0
[2014-04-17 10:20:41.479063] E [afr-self-heal-common.c:1433:afr_sh_common_lookup_cbk] 0-myVol-replicate-0: Conflicting entries for /images/myProject1/2.1_stale/latest_s
[2014-04-17 10:20:41.479149] W [afr-common.c:1505:afr_conflicting_iattrs] 0-myVol-replicate-0: /images/myProject1/2.1_stale/latest_b: gfid differs on subvolume 0
[2014-04-17 10:20:41.479177] E [afr-self-heal-common.c:1433:afr_sh_common_lookup_cbk] 0-myVol-replicate-0: Conflicting entries for /images/myProject1/2.1_stale/latest_b
[2014-04-17 10:20:41.479252] W [afr-common.c:1505:afr_conflicting_iattrs] 0-myVol-replicate-0: /images/myProject1/2.1_stale/current: gfid differs on subvolume 0
[2014-04-17 10:20:41.479279] E [afr-self-heal-common.c:1433:afr_sh_common_lookup_cbk] 0-myVol-replicate-0: Conflicting entries for /images/myProject1/2.1_stale/current
[2014-04-17 10:20:41.499291] W [afr-common.c:1505:afr_conflicting_iattrs] 0-myVol-replicate-0: /images/myProject1/2.1_stale/latest_n: gfid differs on subvolume 1
[2014-04-17 10:20:41.499333] E [afr-self-heal-common.c:1433:afr_sh_common_lookup_cbk] 0-myVol-replicate-0: Conflicting entries for /images/myProject1/2.1_stale/latest_n
[2014-04-17 10:20:41.499995] E [afr-self-heal-common.c:2212:afr_self_heal_completion_cbk] 0-myVol-replicate-0: background entry self-heal failed on /images/myProject1/2.1_stale
[2014-04-17 10:20:43.149818] I [afr-self-heal-entry.c:2253:afr_sh_entry_fix] 0-myVol-replicate-0: /images/myProject1/2.1_stale: Performing conservative merge
[2014-04-17 10:20:43.155127] W [afr-common.c:1505:afr_conflicting_iattrs] 0-myVol-replicate-0: /images/myProject1/2.1_stale/latest_s: gfid differs on subvolume 1
[2014-04-17 10:20:43.155185] E [afr-self-heal-common.c:1433:afr_sh_common_lookup_cbk] 0-myVol-replicate-0: Conflicting entries for /images/myProject1/2.1_stale/latest_s
[2014-04-17 10:20:43.155308] W [afr-common.c:1505:afr_conflicting_iattrs] 0-myVol-replicate-0: /images/myProject1/2.1_stale/latest_b: gfid differs on subvolume 0
[2014-04-17 10:20:43.155346] E [afr-self-heal-common.c:1433:afr_sh_common_lookup_cbk] 0-myVol-replicate-0: Conflicting entries for /images/myProject1/2.1_stale/latest_b
[2014-04-17 10:20:43.155441] W [afr-common.c:1505:afr_conflicting_iattrs] 0-myVol-replicate-0: /images/myProject1/2.1_stale/latest_n: gfid differs on subvolume 0
[2014-04-17 10:20:43.155477] E [afr-self-heal-common.c:1433:afr_sh_common_lookup_cbk] 0-myVol-replicate-0: Conflicting entries for /images/myProject1/2.1_stale/latest_n
[2014-04-17 10:20:43.155628] W [afr-common.c:1505:afr_conflicting_iattrs] 0-myVol-replicate-0: /images/myProject1/2.1_stale/current: gfid differs on subvolume 1
[2014-04-17 10:20:43.155660] E [afr-self-heal-common.c:1433:afr_sh_common_lookup_cbk] 0-myVol-replicate-0: Conflicting entries for /images/myProject1/2.1_stale/current
[2014-04-17 10:20:43.180271] W [afr-common.c:1505:afr_conflicting_iattrs] 0-myVol-replicate-0: /images/myProject1/2.1_stale/latest_s: gfid differs on subvolume 0
[2014-04-17 10:20:43.180324] E [afr-self-heal-common.c:1433:afr_sh_common_lookup_cbk] 0-myVol-replicate-0: Conflicting entries for /images/myProject1/2.1_stale/latest_s
[2014-04-17 10:20:43.180425] W [afr-common.c:1505:afr_conflicting_iattrs] 0-myVol-replicate-0: /images/myProject1/2.1_stale/latest_b: gfid differs on subvolume 0
[2014-04-17 10:20:43.180455] E [afr-self-heal-common.c:1433:afr_sh_common_lookup_cbk] 0-myVol-replicate-0: Conflicting entries for /images/myProject1/2.1_stale/latest_b
[2014-04-17 10:20:43.180545] W [afr-common.c:1505:afr_conflicting_iattrs] 0-myVol-replicate-0: /images/myProject1/2.1_stale/current: gfid differs on subvolume 0
[2014-04-17 10:20:43.180578] E [afr-self-heal-common.c:1433:afr_sh_common_lookup_cbk] 0-myVol-replicate-0: Conflicting entries for /images/myProject1/2.1_stale/current
[2014-04-17 10:20:43.201070] W [afr-common.c:1505:afr_conflicting_iattrs] 0-myVol-replicate-0: /images/myProject1/2.1_stale/latest_n: gfid differs on subvolume 1
[2014-04-17 10:20:43.201112] E [afr-self-heal-common.c:1433:afr_sh_common_lookup_cbk] 0-myVol-replicate-0: Conflicting entries for /images/myProject1/2.1_stale/latest_n
[2014-04-17 10:20:43.201788] E [afr-self-heal-common.c:2212:afr_self_heal_completion_cbk] 0-myVol-replicate-0: background entry self-heal failed on /images/myProject1/2.1_stale
[2014-04-17 10:20:44.646242] I [afr-self-heal-entry.c:2253:afr_sh_entry_fix] 0-myVol-replicate-0: /images/myProject1/2.1_stale: Performing conservative merge
[2014-04-17 10:20:44.652027] W [afr-common.c:1505:afr_conflicting_iattrs] 0-myVol-replicate-0: /images/myProject1/2.1_stale/latest_s: gfid differs on subvolume 1
[2014-04-17 10:20:44.652072] E [afr-self-heal-common.c:1433:afr_sh_common_lookup_cbk] 0-myVol-replicate-0: Conflicting entries for /images/myProject1/2.1_stale/latest_s
[2014-04-17 10:20:44.652207] W [afr-common.c:1505:afr_conflicting_iattrs] 0-myVol-replicate-0: /images/myProject1/2.1_stale/latest_b: gfid differs on subvolume 1
[2014-04-17 10:20:44.652239] E [afr-self-heal-common.c:1433:afr_sh_common_lookup_cbk] 0-myVol-replicate-0: Conflicting entries for /images/myProject1/2.1_stale/latest_b
[2014-04-17 10:20:44.652341] W [afr-common.c:1505:afr_conflicting_iattrs] 0-myVol-replicate-0: /images/myProject1/2.1_stale/latest_n: gfid differs on subvolume 1
[2014-04-17 10:20:44.652372] E [afr-self-heal-common.c:1433:afr_sh_common_lookup_cbk] 0-myVol-replicate-0: Conflicting entries for /images/myProject1/2.1_stale/latest_n
[2014-04-17 10:20:44.652518] W [afr-common.c:1505:afr_conflicting_iattrs] 0-myVol-replicate-0: /images/myProject1/2.1_stale/current: gfid differs on subvolume 1
[2014-04-17 10:20:44.652550] E [afr-self-heal-common.c:1433:afr_sh_common_lookup_cbk] 0-myVol-replicate-0: Conflicting entries for /images/myProject1/2.1_stale/current
[2014-04-17 10:20:44.676929] W [afr-common.c:1505:afr_conflicting_iattrs] 0-myVol-replicate-0: /images/myProject1/2.1_stale/latest_s: gfid differs on subvolume 0
[2014-04-17 10:20:44.676973] E [afr-self-heal-common.c:1433:afr_sh_common_lookup_cbk] 0-myVol-replicate-0: Conflicting entries for /images/myProject1/2.1_stale/latest_s
[2014-04-17 10:20:44.677062] W [afr-common.c:1505:afr_conflicting_iattrs] 0-myVol-replicate-0: /images/myProject1/2.1_stale/latest_b: gfid differs on subvolume 0
[2014-04-17 10:20:44.677107] E [afr-self-heal-common.c:1433:afr_sh_common_lookup_cbk] 0-myVol-replicate-0: Conflicting entries for /images/myProject1/2.1_stale/latest_b
[2014-04-17 10:20:44.677196] W [afr-common.c:1505:afr_conflicting_iattrs] 0-myVol-replicate-0: /images/myProject1/2.1_stale/current: gfid differs on subvolume 0
[2014-04-17 10:20:44.677225] E [afr-self-heal-common.c:1433:afr_sh_common_lookup_cbk] 0-myVol-replicate-0: Conflicting entries for /images/myProject1/2.1_stale/current
[2014-04-17 10:20:44.698071] W [afr-common.c:1505:afr_conflicting_iattrs] 0-myVol-replicate-0: /images/myProject1/2.1_stale/latest_n: gfid differs on subvolume 1
[2014-04-17 10:20:44.698113] E [afr-self-heal-common.c:1433:afr_sh_common_lookup_cbk] 0-myVol-replicate-0: Conflicting entries for /images/myProject1/2.1_stale/latest_n
[2014-04-17 10:20:44.698816] E [afr-self-heal-common.c:2212:afr_self_heal_completion_cbk] 0-myVol-replicate-0: background entry self-heal failed on /images/myProject1/2.1_stale
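As an aside, a quick one-liner pulls the distinct affected paths out of those logs (a throwaway sketch that just assumes the exact log format shown above):

# List the unique paths flagged by the self-heal lookup callback.
grep 'Conflicting entries for' /var/log/glusterfs/usr-global.log \
    | awk '{print $NF}' | sort -u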
So let's take a deeper look.

Any split-brain?

[root@myBrick1 glusterfs]# gluster volume heal myVol info split-brain
Gathering Heal info on volume myVol has been successful

Brick myBrick1.company.lan:/export/raid/myVol
Number of entries: 0

Brick myBrick2.company.lan:/export/raid/myVol
Number of entries: 0

Nope.

Any heal-failed?

[root@myBrick1 glusterfs]# gluster volume heal myVol info heal-failed | wc -l
380

Plenty... how many of them are unique?

[root@myBrick1 glusterfs]# gluster volume heal myVol info heal-failed | cut -d " " -f 3 | sort -u | wc -l
18

Digging into it, here are the failing entries from gluster volume heal myVol info heal-failed:

<gfid:0982330a-2e08-4b97-9ea5-cf991d295e41>
<gfid:29337848-ffad-413b-91b1-7bd062b8c939>
<gfid:3140c4f6-d95c-41bb-93c4-18a644497160>
<gfid:5dd03e08-c9b6-4315-b6ae-efcb45558f18>
<gfid:75db102b-99f4-4852-98b6-d43e39c3ccb6>
<gfid:9c193529-75bf-4f81-bbd6-95a952d646dd>
<gfid:a21c5f72-6d05-4c56-a34f-fcbef48374da>
<gfid:f6660583-c8a7-4d4a-88d8-1138fc1030f5>
/images/myProject3/2.1
/images/myProject3/2.1_stale
/images/myProject2/2.1
/images/myProject2/2.1_stale
/images/myProject1/2.1
/images/myProject1/2.1_stale

Let's compare their xattrs. getfattr on the entries above:

myBrick1:

/export/raid/myVol/images/myProject3/2.1 -> trusted.gfid=0x2d18a6f72a894f20a260478b5a9602be
/export/raid/myVol/images/myProject3/2.1_stale -> trusted.gfid=0x29337848ffad413b91b17bd062b8c939
/export/raid/myVol/images/myProject2/2.1 -> trusted.gfid=0x04cdfe8bb83b4b27b42153df913b5181
/export/raid/myVol/images/myProject2/2.1_stale -> trusted.gfid=0x5dd03e08c9b64315b6aeefcb45558f18
/export/raid/myVol/images/myProject1/2.1 -> trusted.gfid=0xca8fedea8ad64612a33db75ea1ca4421
/export/raid/myVol/images/myProject1/2.1_stale -> trusted.gfid=0xa21c5f726d054c56a34ffcbef48374da

myBrick2:

/export/raid/myVol/images/myProject3/2.1 -> trusted.gfid=0x2d18a6f72a894f20a260478b5a9602be
/export/raid/myVol/images/myProject3/2.1_stale -> trusted.gfid=0x29337848ffad413b91b17bd062b8c939
/export/raid/myVol/images/myProject2/2.1 -> trusted.gfid=0x04cdfe8bb83b4b27b42153df913b5181
/export/raid/myVol/images/myProject2/2.1_stale -> trusted.gfid=0x5dd03e08c9b64315b6aeefcb45558f18
/export/raid/myVol/images/myProject1/2.1 -> trusted.gfid=0xca8fedea8ad64612a33db75ea1ca4421
/export/raid/myVol/images/myProject1/2.1_stale -> trusted.gfid=0xa21c5f726d054c56a34ffcbef48374da

Damn, they seem fine. As a cross-check I also ran md5sum: all files are identical on both bricks.
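For reference, each gfid was read with something along these lines (-h makes getfattr operate on the symlink itself rather than its target):

# Dump the trusted.gfid xattr of a brick-side entry, hex-encoded.
getfattr -h -n trusted.gfid -e hex \
    /export/raid/myVol/images/myProject1/2.1_stale/current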
Let's check the symlinks now.

myBrick1:

/export/raid/myVol/images/myProject3/2.1_stale/current xattr=trusted.gfid=0x95b01ba94e0c482eacf51ebb20c1cba1
/export/raid/myVol/images/myProject3/2.1_stale/latest_b xattr=trusted.gfid=0x5c0165dfe5c84c7ea076731065292135
/export/raid/myVol/images/myProject3/2.1_stale/latest_n xattr=trusted.gfid=0xab4de5d630084cde891ac65a7904f6d0
/export/raid/myVol/images/myProject3/2.1_stale/latest_s xattr=trusted.gfid=0x4c38edac7d4c4e5e8ad8ff44a164f7b8
/export/raid/myVol/images/myProject2/2.1_stale/current xattr=trusted.gfid=0x946ce9e7224f4fc581d817a1ebcec087
/export/raid/myVol/images/myProject2/2.1_stale/latest_b xattr=trusted.gfid=0x8fb0020e97a54be0953e4786c6933f86
/export/raid/myVol/images/myProject2/2.1_stale/latest_n xattr=trusted.gfid=0x8de1551a69f244e3bb0a61cbaba57414
/export/raid/myVol/images/myProject2/2.1_stale/latest_s xattr=trusted.gfid=0xd11e42f54a2944a68ee7fd1f544539a9
/export/raid/myVol/images/myProject1/2.1_stale/current xattr=trusted.gfid=0x1ce5eac809694d83b983023efaea0f64
/export/raid/myVol/images/myProject1/2.1_stale/latest_b xattr=trusted.gfid=0xcc25cfdf98f749caaf259c76fe1b85b1
/export/raid/myVol/images/myProject1/2.1_stale/latest_n xattr=trusted.gfid=0x3889f78789a14b388e9fb5caa2231cc7
/export/raid/myVol/images/myProject1/2.1_stale/latest_s xattr=trusted.gfid=0x2f9901e45c5a47d282faaf65c675cf48

myBrick2:

/export/raid/myVol/images/myProject3/2.1_stale/current xattr=trusted.gfid=0xb6d0a17f397b4922a0ac0e3d740ca8c7
/export/raid/myVol/images/myProject3/2.1_stale/latest_b xattr=trusted.gfid=0xc80750eca11b40f48feb99ea6cd07799
/export/raid/myVol/images/myProject3/2.1_stale/latest_n xattr=trusted.gfid=0x57bb90d64bd74c29ab30d844b33528b7
/export/raid/myVol/images/myProject3/2.1_stale/latest_s xattr=trusted.gfid=0x47f94619468a419d98011d8e67a43068
/export/raid/myVol/images/myProject2/2.1_stale/current xattr=trusted.gfid=0xcda7ba2331e6489f95c524a17ae179bf
/export/raid/myVol/images/myProject2/2.1_stale/latest_b xattr=trusted.gfid=0xc36205fa8d2c49bda64001d667aab8a6
/export/raid/myVol/images/myProject2/2.1_stale/latest_n xattr=trusted.gfid=0xf71b8f6a4ff14e75951b47ae40817b70
/export/raid/myVol/images/myProject2/2.1_stale/latest_s xattr=trusted.gfid=0xcba23dbde7b14b63acc43d287a1b527e
/export/raid/myVol/images/myProject1/2.1_stale/current xattr=trusted.gfid=0x82434b00c17d4d7d88cff71ba4d8c10d
/export/raid/myVol/images/myProject1/2.1_stale/latest_b xattr=trusted.gfid=0xbba1934c4f7240bab705bd0548fbdc22
/export/raid/myVol/images/myProject1/2.1_stale/latest_n xattr=trusted.gfid=0x408ef2e2c1c5454fb0b920065eebab2f
/export/raid/myVol/images/myProject1/2.1_stale/latest_s xattr=trusted.gfid=0x9fc2992ad16c4186a0d36ef93fea73f1

Damn, trusted.gfid is NOT consistent between the bricks... how did this happen?

While myBrick1 was coming back up, automated jobs were writing to the other brick through NFS; those jobs copy a bunch of files and update the symlinks above.

So there is definitely a replication issue... but only with symlinks?

So here are the questions:

1) Can we still use GlusterFS for read/write operations while adding a brick back (new, or an old one that was down for maintenance, for example)? That's a key point in our deployment for scalability and flexibility.
2) How can I recover or delete the conflicting gfid entries? (The symlinks themselves are the same on both sides; only the xattr differs.)

Thanks a lot for your help

Cyril
Joe Julian
2014-Apr-19 16:42 UTC
[Gluster-users] Conflicting entries for symlinks between bricks (Trusted.gfid not consistent)
What would really help is a clear list of steps to reproduce this issue. It sounds like a bug, but I can't repro.

In your questions you ask, in relation to adding or removing bricks, whether you can continue to read and write. My understanding is that you're not doing that (gluster volume (add|remove)-brick) but rather just shutting a server down. If my understanding is correct, then yes: you should be able to continue normal operation.

Repairing this issue is the same as healing split-brain. The easiest way is to use splitmount [1] to delete one of the conflicting copies.

[1] https://forge.gluster.org/splitmount
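If you'd rather clean up directly on a brick, the usual gfid-mismatch recipe looks roughly like this. Treat it as a sketch, not a verified procedure: the .glusterfs path is derived from the gfid hex value (the first two byte pairs become the two directory levels), and the example gfid below is the one from your myBrick2 listing.

# On the brick whose copy you want to discard (run against the brick
# path, NOT the client mount):
BRICK=/export/raid/myVol
LINK=$BRICK/images/myProject1/2.1_stale/current

# 1. Read the symlink's own gfid (-h: don't dereference the symlink).
getfattr -h -n trusted.gfid -e hex "$LINK"
# -> trusted.gfid=0x82434b00c17d4d7d88cff71ba4d8c10d

# 2. Remove its .glusterfs backing entry, if present (0x82 43 4b00...
#    maps to .glusterfs/82/43/<uuid-with-dashes>), then the symlink.
rm -f "$BRICK/.glusterfs/82/43/82434b00-c17d-4d7d-88cf-f71ba4d8c10d"
rm -f "$LINK"

# 3. Trigger a heal so the surviving copy is replicated back.
gluster volume heal myVol full

Stat the path from a client mount afterwards to confirm the entry healed cleanly.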