Dear All, I figured it out, it appeared to be the exact same issue as described here; https://lists.gluster.org/pipermail/gluster-users/2018-March/033785.html Another subvolume also had the shard file, only were all 0 bytes and had the dht.linkto for reference; [root at lease-04 ovirt-backbone-2]# getfattr -d -m . -e hex .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 # file: .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000 trusted.gfid=0x298147e49f9748b2baf1c8fff897244d trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030 trusted.glusterfs.dht.linkto=0x6f766972742d6261636b626f6e652d322d7265706c69636174652d3100 [root at lease-04 ovirt-backbone-2]# getfattr -d -m . -e hex .glusterfs/29/81/298147e4-9f97-48b2-baf1-c8fff897244d # file: .glusterfs/29/81/298147e4-9f97-48b2-baf1-c8fff897244d security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000 trusted.gfid=0x298147e49f9748b2baf1c8fff897244d trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030 trusted.glusterfs.dht.linkto=0x6f766972742d6261636b626f6e652d322d7265706c69636174652d3100 [root at lease-04 ovirt-backbone-2]# stat .glusterfs/29/81/298147e4-9f97-48b2-baf1-c8fff897244d File: ?.glusterfs/29/81/298147e4-9f97-48b2-baf1-c8fff897244d? Size: 0 Blocks: 0 IO Block: 4096 regular empty file Device: fd01h/64769d Inode: 1918631406 Links: 2 Access: (1000/---------T) Uid: ( 0/ root) Gid: ( 0/ root) Context: system_u:object_r:etc_runtime_t:s0 Access: 2018-12-17 21:43:36.405735296 +0000 Modify: 2018-12-17 21:43:36.405735296 +0000 Change: 2018-12-17 21:43:36.405735296 +0000 Birth: - removing the shard file and glusterfs file from each node resolved the issue. I also found this thread; https://lists.gluster.org/pipermail/gluster-users/2018-December/035460.html Maybe he suffers from the same issue. Best Olaf Op wo 19 dec. 2018 om 21:56 schreef Olaf Buitelaar <olaf.buitelaar at gmail.com>:> Dear All, > > It appears i've a stale file in one of the volumes, on 2 files. These > files are qemu images (1 raw and 1 qcow2). > I'll just focus on 1 file since the situation on the other seems the same. > > The VM get's paused more or less directly after being booted with error; > [2018-12-18 14:05:05.275713] E [MSGID: 133010] > [shard.c:1724:shard_common_lookup_shards_cbk] 0-ovirt-backbone-2-shard: > Lookup on shard 51500 failed. Base file gfid > f28cabcb-d169-41fc-a633-9bef4c4a8e40 [Stale file handle] > > investigating the shard; > > #on the arbiter node: > > [root at lease-05 ovirt-backbone-2]# getfattr -n glusterfs.gfid.string > /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425 > getfattr: Removing leading '/' from absolute path names > # file: > mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425 > glusterfs.gfid.string="f28cabcb-d169-41fc-a633-9bef4c4a8e40" > > [root at lease-05 ovirt-backbone-2]# getfattr -d -m . 
-e hex > .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 > # file: .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 > > security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000 > trusted.afr.dirty=0x000000000000000000000000 > trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0 > > trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030 > > [root at lease-05 ovirt-backbone-2]# getfattr -d -m . -e hex > .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 > # file: .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 > > security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000 > trusted.afr.dirty=0x000000000000000000000000 > trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0 > > trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030 > > [root at lease-05 ovirt-backbone-2]# stat > .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 > File: ?.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0? > Size: 0 Blocks: 0 IO Block: 4096 regular empty > file > Device: fd01h/64769d Inode: 537277306 Links: 2 > Access: (0660/-rw-rw----) Uid: ( 0/ root) Gid: ( 0/ root) > Context: system_u:object_r:etc_runtime_t:s0 > Access: 2018-12-17 21:43:36.361984810 +0000 > Modify: 2018-12-17 21:43:36.361984810 +0000 > Change: 2018-12-18 20:55:29.908647417 +0000 > Birth: - > > [root at lease-05 ovirt-backbone-2]# find . -inum 537277306 > ./.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 > ./.shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 > > #on the data nodes: > > [root at lease-08 ~]# getfattr -n glusterfs.gfid.string > /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425 > getfattr: Removing leading '/' from absolute path names > # file: > mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425 > glusterfs.gfid.string="f28cabcb-d169-41fc-a633-9bef4c4a8e40" > > [root at lease-08 ovirt-backbone-2]# getfattr -d -m . -e hex > .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 > # file: .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 > > security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000 > trusted.afr.dirty=0x000000000000000000000000 > trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0 > > trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030 > > [root at lease-08 ovirt-backbone-2]# getfattr -d -m . 
-e hex > .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 > # file: .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 > > security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000 > trusted.afr.dirty=0x000000000000000000000000 > trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0 > > trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030 > > [root at lease-08 ovirt-backbone-2]# stat > .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 > File: ?.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0? > Size: 2166784 Blocks: 4128 IO Block: 4096 regular file > Device: fd03h/64771d Inode: 12893624759 Links: 3 > Access: (0660/-rw-rw----) Uid: ( 0/ root) Gid: ( 0/ root) > Context: system_u:object_r:etc_runtime_t:s0 > Access: 2018-12-18 18:52:38.070776585 +0000 > Modify: 2018-12-17 21:43:36.388054443 +0000 > Change: 2018-12-18 21:01:47.810506528 +0000 > Birth: - > > [root at lease-08 ovirt-backbone-2]# find . -inum 12893624759 > ./.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 > ./.shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 > > =======================> > [root at lease-11 ovirt-backbone-2]# getfattr -n glusterfs.gfid.string > /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425 > getfattr: Removing leading '/' from absolute path names > # file: > mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425 > glusterfs.gfid.string="f28cabcb-d169-41fc-a633-9bef4c4a8e40" > > [root at lease-11 ovirt-backbone-2]# getfattr -d -m . -e hex > .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 > # file: .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 > > security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000 > trusted.afr.dirty=0x000000000000000000000000 > trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0 > > trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030 > > [root at lease-11 ovirt-backbone-2]# getfattr -d -m . -e hex > .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 > # file: .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 > > security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000 > trusted.afr.dirty=0x000000000000000000000000 > trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0 > > trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030 > > [root at lease-11 ovirt-backbone-2]# stat > .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 > File: ?.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0? > Size: 2166784 Blocks: 4128 IO Block: 4096 regular file > Device: fd03h/64771d Inode: 12956094809 Links: 3 > Access: (0660/-rw-rw----) Uid: ( 0/ root) Gid: ( 0/ root) > Context: system_u:object_r:etc_runtime_t:s0 > Access: 2018-12-18 20:11:53.595208449 +0000 > Modify: 2018-12-17 21:43:36.391580259 +0000 > Change: 2018-12-18 19:19:25.888055392 +0000 > Birth: - > > [root at lease-11 ovirt-backbone-2]# find . 
-inum 12956094809 > ./.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 > ./.shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 > > ===============> > I don't really see any inconsistencies, except the dates on the stat. > However this is only after i tried moving the file out of the volumes to > force a heal, which does happen on the data nodes, but not on the arbiter > node. Before that they were also the same. > I've also compared the file > ./.shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 on the 2 nodes and they > are exactly the same. > > Things i've further tried; > - gluster v heal ovirt-backbone-2 full => gluster v heal ovirt-backbone-2 > info reports 0 entries on all nodes > > - stop each glusterd and glusterfsd, pause around 40sec and start them > again on each node, 1 at a time, waiting for the heal to recover before > moving to the next node > > - force a heal by stopping glusterd on a node and perform these steps; > mkdir /mnt/ovirt-backbone-2/trigger > rmdir /mnt/ovirt-backbone-2/trigger > setfattr -n trusted.non-existent-key -v abc /mnt/ovirt-backbone-2/ > setfattr -x trusted.non-existent-key /mnt/ovirt-backbone-2/ > start glusterd > > - gluster volume rebalance ovirt-backbone-2 start => success > > Whats further interesting is that according the mount log, the volume is > in split-brain; > [2018-12-18 10:06:04.606870] E [MSGID: 108008] > [afr-read-txn.c:90:afr_read_txn_refresh_done] > 0-ovirt-backbone-2-replicate-2: Failing FSTAT on gfid > 2a57d87d-fe49-4034-919b-fdb79531bf68: split-brain observed. [Input/output > error] > [2018-12-18 10:06:04.606908] E [MSGID: 133014] > [shard.c:1248:shard_common_stat_cbk] 0-ovirt-backbone-2-shard: stat failed: > 2a57d87d-fe49-4034-919b-fdb79531bf68 [Input/output error] > [2018-12-18 10:06:04.606927] W [fuse-bridge.c:871:fuse_attr_cbk] > 0-glusterfs-fuse: 428090: FSTAT() > /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids => -1 (Input/output error) > [2018-12-18 10:06:05.107729] E [MSGID: 108008] > [afr-read-txn.c:90:afr_read_txn_refresh_done] > 0-ovirt-backbone-2-replicate-2: Failing FSTAT on gfid > 2a57d87d-fe49-4034-919b-fdb79531bf68: split-brain observed. [Input/output > error] > [2018-12-18 10:06:05.107770] E [MSGID: 133014] > [shard.c:1248:shard_common_stat_cbk] 0-ovirt-backbone-2-shard: stat failed: > 2a57d87d-fe49-4034-919b-fdb79531bf68 [Input/output error] > [2018-12-18 10:06:05.107791] W [fuse-bridge.c:871:fuse_attr_cbk] > 0-glusterfs-fuse: 428091: FSTAT() > /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids => -1 (Input/output error) > [2018-12-18 10:06:05.537244] I [MSGID: 108006] > [afr-common.c:5494:afr_local_init] 0-ovirt-backbone-2-replicate-1: no > subvolumes up > [2018-12-18 10:06:05.538523] E [MSGID: 108008] > [afr-read-txn.c:90:afr_read_txn_refresh_done] > 0-ovirt-backbone-2-replicate-2: Failing STAT on gfid > 00000000-0000-0000-0000-000000000001: split-brain observed. [Input/output > error] > [2018-12-18 10:06:05.538685] I [MSGID: 108006] > [afr-common.c:5494:afr_local_init] 0-ovirt-backbone-2-replicate-1: no > subvolumes up > [2018-12-18 10:06:05.538794] I [MSGID: 108006] > [afr-common.c:5494:afr_local_init] 0-ovirt-backbone-2-replicate-1: no > subvolumes up > [2018-12-18 10:06:05.539342] I [MSGID: 109063] > [dht-layout.c:716:dht_layout_normalize] 0-ovirt-backbone-2-dht: Found > anomalies in /b1c2c949-aef4-4aec-999b-b179efeef732 (gfid > 8c8598ce-1a52-418e-a7b4-435fee34bae8). 
Holes=2 overlaps=0 > [2018-12-18 10:06:05.539372] W [MSGID: 109005] > [dht-selfheal.c:2158:dht_selfheal_directory] 0-ovirt-backbone-2-dht: > Directory selfheal failed: 2 subvolumes down.Not fixing. path > /b1c2c949-aef4-4aec-999b-b179efeef732, gfid > 8c8598ce-1a52-418e-a7b4-435fee34bae8 > [2018-12-18 10:06:05.539694] I [MSGID: 108006] > [afr-common.c:5494:afr_local_init] 0-ovirt-backbone-2-replicate-1: no > subvolumes up > [2018-12-18 10:06:05.540652] I [MSGID: 108006] > [afr-common.c:5494:afr_local_init] 0-ovirt-backbone-2-replicate-1: no > subvolumes up > [2018-12-18 10:06:05.608612] E [MSGID: 108008] > [afr-read-txn.c:90:afr_read_txn_refresh_done] > 0-ovirt-backbone-2-replicate-2: Failing FSTAT on gfid > 2a57d87d-fe49-4034-919b-fdb79531bf68: split-brain observed. [Input/output > error] > [2018-12-18 10:06:05.608657] E [MSGID: 133014] > [shard.c:1248:shard_common_stat_cbk] 0-ovirt-backbone-2-shard: stat failed: > 2a57d87d-fe49-4034-919b-fdb79531bf68 [Input/output error] > [2018-12-18 10:06:05.608672] W [fuse-bridge.c:871:fuse_attr_cbk] > 0-glusterfs-fuse: 428096: FSTAT() > /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids => -1 (Input/output error) > [2018-12-18 10:06:06.109339] E [MSGID: 108008] > [afr-read-txn.c:90:afr_read_txn_refresh_done] > 0-ovirt-backbone-2-replicate-2: Failing FSTAT on gfid > 2a57d87d-fe49-4034-919b-fdb79531bf68: split-brain observed. [Input/output > error] > [2018-12-18 10:06:06.109378] E [MSGID: 133014] > [shard.c:1248:shard_common_stat_cbk] 0-ovirt-backbone-2-shard: stat failed: > 2a57d87d-fe49-4034-919b-fdb79531bf68 [Input/output error] > [2018-12-18 10:06:06.109399] W [fuse-bridge.c:871:fuse_attr_cbk] > 0-glusterfs-fuse: 428097: FSTAT() > /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids => -1 (Input/output error) > > #note i'm able to see ; /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids > [root at lease-11 ovirt-backbone-2]# stat > /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids > File: > ?/mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids? > Size: 1048576 Blocks: 2048 IO Block: 131072 regular file > Device: 41h/65d Inode: 10492258721813610344 Links: 1 > Access: (0660/-rw-rw----) Uid: ( 36/ vdsm) Gid: ( 36/ kvm) > Context: system_u:object_r:fusefs_t:s0 > Access: 2018-12-19 20:07:39.917573869 +0000 > Modify: 2018-12-19 20:07:39.928573917 +0000 > Change: 2018-12-19 20:07:39.929573921 +0000 > Birth: - > > however checking: gluster v heal ovirt-backbone-2 info split-brain > reports no entries. > > I've also tried mounting the qemu image, and this works fine, i'm able to > see all contents; > losetup /dev/loop0 > /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425 > kpartx -a /dev/loop0 > vgscan > vgchange -ay slave-data > mkdir /mnt/slv01 > mount /dev/mapper/slave--data-lvol0 /mnt/slv01/ > > Possible causes for this issue; > 1. the machine "lease-11" suffered from a faulty RAM module (ECC), which > halted the machine and causes an invalid state. (this machine also hosts > other volumes, with similar configurations, which report no issue) > 2. after the RAM module was replaced, the VM using the backing qemu image, > was restored from a backup (the backup was file based within the VM on a > different directory). This is because some files were corrupted. The > backup/recovery obviously causes extra IO, possible introducing race > conditions? The machine did run for about 12h without issues, and in total > for about 36h. > 3. 
since only the client (maybe only gfapi?) reports errors, something is > broken there? > > The volume info; > root at lease-06 ~# gluster v info ovirt-backbone-2 > > Volume Name: ovirt-backbone-2 > Type: Distributed-Replicate > Volume ID: 85702d35-62c8-4c8c-930d-46f455a8af28 > Status: Started > Snapshot Count: 0 > Number of Bricks: 3 x (2 + 1) = 9 > Transport-type: tcp > Bricks: > Brick1: 10.32.9.7:/data/gfs/bricks/brick1/ovirt-backbone-2 > Brick2: 10.32.9.3:/data/gfs/bricks/brick1/ovirt-backbone-2 > Brick3: 10.32.9.4:/data/gfs/bricks/bricka/ovirt-backbone-2 (arbiter) > Brick4: 10.32.9.8:/data0/gfs/bricks/brick1/ovirt-backbone-2 > Brick5: 10.32.9.21:/data0/gfs/bricks/brick1/ovirt-backbone-2 > Brick6: 10.32.9.5:/data/gfs/bricks/bricka/ovirt-backbone-2 (arbiter) > Brick7: 10.32.9.9:/data0/gfs/bricks/brick1/ovirt-backbone-2 > Brick8: 10.32.9.20:/data0/gfs/bricks/brick1/ovirt-backbone-2 > Brick9: 10.32.9.6:/data/gfs/bricks/bricka/ovirt-backbone-2 (arbiter) > Options Reconfigured: > nfs.disable: on > transport.address-family: inet > performance.quick-read: off > performance.read-ahead: off > performance.io-cache: off > performance.low-prio-threads: 32 > network.remote-dio: enable > cluster.eager-lock: enable > cluster.quorum-type: auto > cluster.server-quorum-type: server > cluster.data-self-heal-algorithm: full > cluster.locking-scheme: granular > cluster.shd-max-threads: 8 > cluster.shd-wait-qlength: 10000 > features.shard: on > user.cifs: off > storage.owner-uid: 36 > storage.owner-gid: 36 > features.shard-block-size: 64MB > performance.write-behind-window-size: 512MB > performance.cache-size: 384MB > cluster.brick-multiplex: on > > The volume status; > root at lease-06 ~# gluster v status ovirt-backbone-2 > Status of volume: ovirt-backbone-2 > Gluster process TCP Port RDMA Port Online > Pid > > ------------------------------------------------------------------------------ > Brick 10.32.9.7:/data/gfs/bricks/brick1/ovi > rt-backbone-2 49152 0 Y > 7727 > Brick 10.32.9.3:/data/gfs/bricks/brick1/ovi > rt-backbone-2 49152 0 Y > 12620 > Brick 10.32.9.4:/data/gfs/bricks/bricka/ovi > rt-backbone-2 49152 0 Y > 8794 > Brick 10.32.9.8:/data0/gfs/bricks/brick1/ov > irt-backbone-2 49161 0 Y > 22333 > Brick 10.32.9.21:/data0/gfs/bricks/brick1/o > virt-backbone-2 49152 0 Y > 15030 > Brick 10.32.9.5:/data/gfs/bricks/bricka/ovi > rt-backbone-2 49166 0 Y > 24592 > Brick 10.32.9.9:/data0/gfs/bricks/brick1/ov > irt-backbone-2 49153 0 Y > 20148 > Brick 10.32.9.20:/data0/gfs/bricks/brick1/o > virt-backbone-2 49154 0 Y > 15413 > Brick 10.32.9.6:/data/gfs/bricks/bricka/ovi > rt-backbone-2 49152 0 Y > 43120 > Self-heal Daemon on localhost N/A N/A Y > 44587 > Self-heal Daemon on 10.201.0.2 N/A N/A Y > 8401 > Self-heal Daemon on 10.201.0.5 N/A N/A Y > 11038 > Self-heal Daemon on 10.201.0.8 N/A N/A Y > 9513 > Self-heal Daemon on 10.32.9.4 N/A N/A Y > 23736 > Self-heal Daemon on 10.32.9.20 N/A N/A Y > 2738 > Self-heal Daemon on 10.32.9.3 N/A N/A Y > 25598 > Self-heal Daemon on 10.32.9.5 N/A N/A Y > 511 > Self-heal Daemon on 10.32.9.9 N/A N/A Y > 23357 > Self-heal Daemon on 10.32.9.8 N/A N/A Y > 15225 > Self-heal Daemon on 10.32.9.7 N/A N/A Y > 25781 > Self-heal Daemon on 10.32.9.21 N/A N/A Y > 5034 > > Task Status of Volume ovirt-backbone-2 > > ------------------------------------------------------------------------------ > Task : Rebalance > ID : 6dfbac43-0125-4568-9ac3-a2c453faaa3d > Status : completed > > gluster version is @3.12.15 and cluster.op-version=31202 > > =======================> > It would be nice to know 
if it's possible to mark the files as not stale
> or if I should investigate other things.
> Or should we consider this volume lost?
> Also, checking the code at
> https://github.com/gluster/glusterfs/blob/master/xlators/features/shard/src/shard.c
> it seems the functions have shifted quite a bit (line 1724 vs. 2243), so maybe
> it's fixed in a newer version?
> Any thoughts are welcome.
>
> Thanks Olaf
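For anyone following along: the check described at the start of this message boils down to a few commands on the affected brick. What follows is only a rough sketch, reusing the brick path, shard name and .glusterfs path from the examples in this thread; it is not an official GlusterFS procedure, so confirm the signature (0 bytes, mode ---------T, the trusted.glusterfs.dht.linkto xattr, two hard links) on every brick of the affected subvolume before removing anything.

# on a brick root, e.g. /data/gfs/bricks/brick1/ovirt-backbone-2 (path taken from this thread)
cd /data/gfs/bricks/brick1/ovirt-backbone-2
f=.shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
stat -c '%s %A %h' "$f"                               # a stale linkto copy shows: 0 ---------T 2
getfattr -n trusted.glusterfs.dht.linkto -e hex "$f"  # set only on the linkto copy
find .glusterfs -inum "$(stat -c %i "$f")"            # its hard link under .glusterfs
# only when all of the above match on each affected brick, remove both links, as was done here:
# rm "$f" .glusterfs/29/81/298147e4-9f97-48b2-baf1-c8fff897244d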
Dear All,

Till now a selected group of VMs still seems to produce new stale files and gets paused because of this.
I've not updated gluster recently, however I did change the op-version from 31200 to 31202 about a week before this issue arose.
Looking at the .shard directory, I have 100,000+ files sharing the same characteristics as the stale files found so far: they all have the sticky bit set (file permissions ---------T), are 0 KB in size, and have the trusted.glusterfs.dht.linkto attribute.
These files range from long ago (the beginning of the year) till now, which makes me suspect this was lying dormant for some time and somehow recently surfaced.
Checking other sub-volumes, they also contain 0 KB files in the .shard directory, but those don't have the sticky bit or the linkto attribute.

Does anybody else experience this issue? Could this be a bug or an environmental issue?

Also, I wonder if there is any tool or gluster command to clean all stale file handles?
Otherwise I'm planning to make a simple bash script which iterates over the .shard dir, checks each file for the above mentioned criteria, and (re)moves the file and the corresponding .glusterfs file.
If there are other criteria needed to identify a stale file handle, I would like to hear that. And whether this is a viable and safe operation to do, of course.

Thanks Olaf
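The clean-up is, at this point in the thread, only described in prose (Olaf's actual version is in the gist linked in the next message). Below is a minimal sketch of what such a script could look like, purely to illustrate the stated criteria; the brick-root argument and the quarantine directory are assumptions, and anything like this should be reviewed and dry-run before it touches a production brick.

#!/bin/bash
# Sketch only -- not the script from the gist below. Criteria from this thread:
# regular file directly under .shard, 0 bytes, sticky bit set (---------T),
# and carrying the trusted.glusterfs.dht.linkto xattr.
BRICK=${1:?usage: $0 /path/to/brick/root}
QUARANTINE="$BRICK/.stale-quarantine"   # assumption: move instead of delete, to allow rollback
mkdir -p "$QUARANTINE"
find "$BRICK/.shard" -maxdepth 1 -type f -size 0 -perm -1000 | while read -r f; do
    # skip files that don't carry the linkto xattr
    getfattr --absolute-names -n trusted.glusterfs.dht.linkto "$f" >/dev/null 2>&1 || continue
    # locate the corresponding hard link under .glusterfs via the shared inode number
    gfid_link=$(find "$BRICK/.glusterfs" -inum "$(stat -c %i "$f")" 2>/dev/null | head -n 1)
    echo "stale candidate: $f (gfid link: ${gfid_link:-not found})"
    mv "$f" "$QUARANTINE"/
    [ -n "$gfid_link" ] && mv "$gfid_link" "$QUARANTINE"/
done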
Dear All,

The bash script I'm planning to run can be found here:
https://gist.github.com/olafbuitelaar/ff6fe9d4ab39696d9ad6ca689cc89986
It would be nice to receive some feedback from the community before I actually run the clean-up of all the stale file handles.

Thanks Olaf
-e hex >> .glusterfs/29/81/298147e4-9f97-48b2-baf1-c8fff897244d >> # file: .glusterfs/29/81/298147e4-9f97-48b2-baf1-c8fff897244d >> >> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000 >> trusted.gfid=0x298147e49f9748b2baf1c8fff897244d >> >> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030 >> >> trusted.glusterfs.dht.linkto=0x6f766972742d6261636b626f6e652d322d7265706c69636174652d3100 >> >> [root at lease-04 ovirt-backbone-2]# stat >> .glusterfs/29/81/298147e4-9f97-48b2-baf1-c8fff897244d >> File: ?.glusterfs/29/81/298147e4-9f97-48b2-baf1-c8fff897244d? >> Size: 0 Blocks: 0 IO Block: 4096 regular empty >> file >> Device: fd01h/64769d Inode: 1918631406 Links: 2 >> Access: (1000/---------T) Uid: ( 0/ root) Gid: ( 0/ root) >> Context: system_u:object_r:etc_runtime_t:s0 >> Access: 2018-12-17 21:43:36.405735296 +0000 >> Modify: 2018-12-17 21:43:36.405735296 +0000 >> Change: 2018-12-17 21:43:36.405735296 +0000 >> Birth: - >> >> removing the shard file and glusterfs file from each node resolved the >> issue. >> >> I also found this thread; >> https://lists.gluster.org/pipermail/gluster-users/2018-December/035460.html >> Maybe he suffers from the same issue. >> >> Best Olaf >> >> >> Op wo 19 dec. 2018 om 21:56 schreef Olaf Buitelaar < >> olaf.buitelaar at gmail.com>: >> >>> Dear All, >>> >>> It appears i've a stale file in one of the volumes, on 2 files. These >>> files are qemu images (1 raw and 1 qcow2). >>> I'll just focus on 1 file since the situation on the other seems the >>> same. >>> >>> The VM get's paused more or less directly after being booted with error; >>> [2018-12-18 14:05:05.275713] E [MSGID: 133010] >>> [shard.c:1724:shard_common_lookup_shards_cbk] 0-ovirt-backbone-2-shard: >>> Lookup on shard 51500 failed. Base file gfid >>> f28cabcb-d169-41fc-a633-9bef4c4a8e40 [Stale file handle] >>> >>> investigating the shard; >>> >>> #on the arbiter node: >>> >>> [root at lease-05 ovirt-backbone-2]# getfattr -n glusterfs.gfid.string >>> /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425 >>> getfattr: Removing leading '/' from absolute path names >>> # file: >>> mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425 >>> glusterfs.gfid.string="f28cabcb-d169-41fc-a633-9bef4c4a8e40" >>> >>> [root at lease-05 ovirt-backbone-2]# getfattr -d -m . -e hex >>> .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 >>> # file: .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 >>> >>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000 >>> trusted.afr.dirty=0x000000000000000000000000 >>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0 >>> >>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030 >>> >>> [root at lease-05 ovirt-backbone-2]# getfattr -d -m . 
-e hex >>> .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>> # file: .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>> >>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000 >>> trusted.afr.dirty=0x000000000000000000000000 >>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0 >>> >>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030 >>> >>> [root at lease-05 ovirt-backbone-2]# stat >>> .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>> File: ?.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0? >>> Size: 0 Blocks: 0 IO Block: 4096 regular >>> empty file >>> Device: fd01h/64769d Inode: 537277306 Links: 2 >>> Access: (0660/-rw-rw----) Uid: ( 0/ root) Gid: ( 0/ root) >>> Context: system_u:object_r:etc_runtime_t:s0 >>> Access: 2018-12-17 21:43:36.361984810 +0000 >>> Modify: 2018-12-17 21:43:36.361984810 +0000 >>> Change: 2018-12-18 20:55:29.908647417 +0000 >>> Birth: - >>> >>> [root at lease-05 ovirt-backbone-2]# find . -inum 537277306 >>> ./.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>> ./.shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 >>> >>> #on the data nodes: >>> >>> [root at lease-08 ~]# getfattr -n glusterfs.gfid.string >>> /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425 >>> getfattr: Removing leading '/' from absolute path names >>> # file: >>> mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425 >>> glusterfs.gfid.string="f28cabcb-d169-41fc-a633-9bef4c4a8e40" >>> >>> [root at lease-08 ovirt-backbone-2]# getfattr -d -m . -e hex >>> .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 >>> # file: .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 >>> >>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000 >>> trusted.afr.dirty=0x000000000000000000000000 >>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0 >>> >>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030 >>> >>> [root at lease-08 ovirt-backbone-2]# getfattr -d -m . -e hex >>> .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>> # file: .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>> >>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000 >>> trusted.afr.dirty=0x000000000000000000000000 >>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0 >>> >>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030 >>> >>> [root at lease-08 ovirt-backbone-2]# stat >>> .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>> File: ?.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0? 
>>> Size: 2166784 Blocks: 4128 IO Block: 4096 regular file >>> Device: fd03h/64771d Inode: 12893624759 Links: 3 >>> Access: (0660/-rw-rw----) Uid: ( 0/ root) Gid: ( 0/ root) >>> Context: system_u:object_r:etc_runtime_t:s0 >>> Access: 2018-12-18 18:52:38.070776585 +0000 >>> Modify: 2018-12-17 21:43:36.388054443 +0000 >>> Change: 2018-12-18 21:01:47.810506528 +0000 >>> Birth: - >>> >>> [root at lease-08 ovirt-backbone-2]# find . -inum 12893624759 >>> ./.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>> ./.shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 >>> >>> =======================>>> >>> [root at lease-11 ovirt-backbone-2]# getfattr -n glusterfs.gfid.string >>> /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425 >>> getfattr: Removing leading '/' from absolute path names >>> # file: >>> mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425 >>> glusterfs.gfid.string="f28cabcb-d169-41fc-a633-9bef4c4a8e40" >>> >>> [root at lease-11 ovirt-backbone-2]# getfattr -d -m . -e hex >>> .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 >>> # file: .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 >>> >>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000 >>> trusted.afr.dirty=0x000000000000000000000000 >>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0 >>> >>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030 >>> >>> [root at lease-11 ovirt-backbone-2]# getfattr -d -m . -e hex >>> .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>> # file: .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>> >>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000 >>> trusted.afr.dirty=0x000000000000000000000000 >>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0 >>> >>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030 >>> >>> [root at lease-11 ovirt-backbone-2]# stat >>> .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>> File: ?.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0? >>> Size: 2166784 Blocks: 4128 IO Block: 4096 regular file >>> Device: fd03h/64771d Inode: 12956094809 Links: 3 >>> Access: (0660/-rw-rw----) Uid: ( 0/ root) Gid: ( 0/ root) >>> Context: system_u:object_r:etc_runtime_t:s0 >>> Access: 2018-12-18 20:11:53.595208449 +0000 >>> Modify: 2018-12-17 21:43:36.391580259 +0000 >>> Change: 2018-12-18 19:19:25.888055392 +0000 >>> Birth: - >>> >>> [root at lease-11 ovirt-backbone-2]# find . -inum 12956094809 >>> ./.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>> ./.shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 >>> >>> ===============>>> >>> I don't really see any inconsistencies, except the dates on the stat. >>> However this is only after i tried moving the file out of the volumes to >>> force a heal, which does happen on the data nodes, but not on the arbiter >>> node. Before that they were also the same. >>> I've also compared the file >>> ./.shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 on the 2 nodes and they >>> are exactly the same. 
>>> >>> Things i've further tried; >>> - gluster v heal ovirt-backbone-2 full => gluster v heal >>> ovirt-backbone-2 info reports 0 entries on all nodes >>> >>> - stop each glusterd and glusterfsd, pause around 40sec and start them >>> again on each node, 1 at a time, waiting for the heal to recover before >>> moving to the next node >>> >>> - force a heal by stopping glusterd on a node and perform these steps; >>> mkdir /mnt/ovirt-backbone-2/trigger >>> rmdir /mnt/ovirt-backbone-2/trigger >>> setfattr -n trusted.non-existent-key -v abc /mnt/ovirt-backbone-2/ >>> setfattr -x trusted.non-existent-key /mnt/ovirt-backbone-2/ >>> start glusterd >>> >>> - gluster volume rebalance ovirt-backbone-2 start => success >>> >>> Whats further interesting is that according the mount log, the volume is >>> in split-brain; >>> [2018-12-18 10:06:04.606870] E [MSGID: 108008] >>> [afr-read-txn.c:90:afr_read_txn_refresh_done] >>> 0-ovirt-backbone-2-replicate-2: Failing FSTAT on gfid >>> 2a57d87d-fe49-4034-919b-fdb79531bf68: split-brain observed. [Input/output >>> error] >>> [2018-12-18 10:06:04.606908] E [MSGID: 133014] >>> [shard.c:1248:shard_common_stat_cbk] 0-ovirt-backbone-2-shard: stat failed: >>> 2a57d87d-fe49-4034-919b-fdb79531bf68 [Input/output error] >>> [2018-12-18 10:06:04.606927] W [fuse-bridge.c:871:fuse_attr_cbk] >>> 0-glusterfs-fuse: 428090: FSTAT() >>> /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids => -1 (Input/output error) >>> [2018-12-18 10:06:05.107729] E [MSGID: 108008] >>> [afr-read-txn.c:90:afr_read_txn_refresh_done] >>> 0-ovirt-backbone-2-replicate-2: Failing FSTAT on gfid >>> 2a57d87d-fe49-4034-919b-fdb79531bf68: split-brain observed. [Input/output >>> error] >>> [2018-12-18 10:06:05.107770] E [MSGID: 133014] >>> [shard.c:1248:shard_common_stat_cbk] 0-ovirt-backbone-2-shard: stat failed: >>> 2a57d87d-fe49-4034-919b-fdb79531bf68 [Input/output error] >>> [2018-12-18 10:06:05.107791] W [fuse-bridge.c:871:fuse_attr_cbk] >>> 0-glusterfs-fuse: 428091: FSTAT() >>> /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids => -1 (Input/output error) >>> [2018-12-18 10:06:05.537244] I [MSGID: 108006] >>> [afr-common.c:5494:afr_local_init] 0-ovirt-backbone-2-replicate-1: no >>> subvolumes up >>> [2018-12-18 10:06:05.538523] E [MSGID: 108008] >>> [afr-read-txn.c:90:afr_read_txn_refresh_done] >>> 0-ovirt-backbone-2-replicate-2: Failing STAT on gfid >>> 00000000-0000-0000-0000-000000000001: split-brain observed. [Input/output >>> error] >>> [2018-12-18 10:06:05.538685] I [MSGID: 108006] >>> [afr-common.c:5494:afr_local_init] 0-ovirt-backbone-2-replicate-1: no >>> subvolumes up >>> [2018-12-18 10:06:05.538794] I [MSGID: 108006] >>> [afr-common.c:5494:afr_local_init] 0-ovirt-backbone-2-replicate-1: no >>> subvolumes up >>> [2018-12-18 10:06:05.539342] I [MSGID: 109063] >>> [dht-layout.c:716:dht_layout_normalize] 0-ovirt-backbone-2-dht: Found >>> anomalies in /b1c2c949-aef4-4aec-999b-b179efeef732 (gfid >>> 8c8598ce-1a52-418e-a7b4-435fee34bae8). Holes=2 overlaps=0 >>> [2018-12-18 10:06:05.539372] W [MSGID: 109005] >>> [dht-selfheal.c:2158:dht_selfheal_directory] 0-ovirt-backbone-2-dht: >>> Directory selfheal failed: 2 subvolumes down.Not fixing. 
>>> path /b1c2c949-aef4-4aec-999b-b179efeef732, gfid 8c8598ce-1a52-418e-a7b4-435fee34bae8
>>> [2018-12-18 10:06:05.539694] I [MSGID: 108006] [afr-common.c:5494:afr_local_init] 0-ovirt-backbone-2-replicate-1: no subvolumes up
>>> [2018-12-18 10:06:05.540652] I [MSGID: 108006] [afr-common.c:5494:afr_local_init] 0-ovirt-backbone-2-replicate-1: no subvolumes up
>>> [2018-12-18 10:06:05.608612] E [MSGID: 108008] [afr-read-txn.c:90:afr_read_txn_refresh_done] 0-ovirt-backbone-2-replicate-2: Failing FSTAT on gfid 2a57d87d-fe49-4034-919b-fdb79531bf68: split-brain observed. [Input/output error]
>>> [2018-12-18 10:06:05.608657] E [MSGID: 133014] [shard.c:1248:shard_common_stat_cbk] 0-ovirt-backbone-2-shard: stat failed: 2a57d87d-fe49-4034-919b-fdb79531bf68 [Input/output error]
>>> [2018-12-18 10:06:05.608672] W [fuse-bridge.c:871:fuse_attr_cbk] 0-glusterfs-fuse: 428096: FSTAT() /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids => -1 (Input/output error)
>>> [2018-12-18 10:06:06.109339] E [MSGID: 108008] [afr-read-txn.c:90:afr_read_txn_refresh_done] 0-ovirt-backbone-2-replicate-2: Failing FSTAT on gfid 2a57d87d-fe49-4034-919b-fdb79531bf68: split-brain observed. [Input/output error]
>>> [2018-12-18 10:06:06.109378] E [MSGID: 133014] [shard.c:1248:shard_common_stat_cbk] 0-ovirt-backbone-2-shard: stat failed: 2a57d87d-fe49-4034-919b-fdb79531bf68 [Input/output error]
>>> [2018-12-18 10:06:06.109399] W [fuse-bridge.c:871:fuse_attr_cbk] 0-glusterfs-fuse: 428097: FSTAT() /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids => -1 (Input/output error)
>>>
>>> # note: I'm able to see /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids
>>> [root at lease-11 ovirt-backbone-2]# stat /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids
>>>   File: ‘/mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids’
>>>   Size: 1048576         Blocks: 2048       IO Block: 131072 regular file
>>> Device: 41h/65d  Inode: 10492258721813610344  Links: 1
>>> Access: (0660/-rw-rw----)  Uid: (   36/    vdsm)   Gid: (   36/     kvm)
>>> Context: system_u:object_r:fusefs_t:s0
>>> Access: 2018-12-19 20:07:39.917573869 +0000
>>> Modify: 2018-12-19 20:07:39.928573917 +0000
>>> Change: 2018-12-19 20:07:39.929573921 +0000
>>>  Birth: -
>>>
>>> However, checking gluster v heal ovirt-backbone-2 info split-brain reports no entries.
>>>
>>> I've also tried mounting the qemu image, and this works fine; I'm able to see all contents:
>>>   losetup /dev/loop0 /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425
>>>   kpartx -a /dev/loop0
>>>   vgscan
>>>   vgchange -ay slave-data
>>>   mkdir /mnt/slv01
>>>   mount /dev/mapper/slave--data-lvol0 /mnt/slv01/
>>>
>>> Possible causes for this issue:
>>> 1. the machine "lease-11" suffered from a faulty RAM module (ECC), which halted the machine and caused an invalid state. (this machine also hosts other volumes, with similar configurations, which report no issue)
>>> 2. after the RAM module was replaced, the VM using the backing qemu image was restored from a backup (the backup was file-based, within the VM on a different directory), because some files were corrupted. The backup/recovery obviously causes extra IO, possibly introducing race conditions? The machine did run for about 12h without issues, and in total for about 36h.
>>> 3. since only the client (maybe only gfapi?)
>>> reports errors, something is broken there?
>>>
>>> The volume info:
>>> root at lease-06 ~# gluster v info ovirt-backbone-2
>>>
>>> Volume Name: ovirt-backbone-2
>>> Type: Distributed-Replicate
>>> Volume ID: 85702d35-62c8-4c8c-930d-46f455a8af28
>>> Status: Started
>>> Snapshot Count: 0
>>> Number of Bricks: 3 x (2 + 1) = 9
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: 10.32.9.7:/data/gfs/bricks/brick1/ovirt-backbone-2
>>> Brick2: 10.32.9.3:/data/gfs/bricks/brick1/ovirt-backbone-2
>>> Brick3: 10.32.9.4:/data/gfs/bricks/bricka/ovirt-backbone-2 (arbiter)
>>> Brick4: 10.32.9.8:/data0/gfs/bricks/brick1/ovirt-backbone-2
>>> Brick5: 10.32.9.21:/data0/gfs/bricks/brick1/ovirt-backbone-2
>>> Brick6: 10.32.9.5:/data/gfs/bricks/bricka/ovirt-backbone-2 (arbiter)
>>> Brick7: 10.32.9.9:/data0/gfs/bricks/brick1/ovirt-backbone-2
>>> Brick8: 10.32.9.20:/data0/gfs/bricks/brick1/ovirt-backbone-2
>>> Brick9: 10.32.9.6:/data/gfs/bricks/bricka/ovirt-backbone-2 (arbiter)
>>> Options Reconfigured:
>>> nfs.disable: on
>>> transport.address-family: inet
>>> performance.quick-read: off
>>> performance.read-ahead: off
>>> performance.io-cache: off
>>> performance.low-prio-threads: 32
>>> network.remote-dio: enable
>>> cluster.eager-lock: enable
>>> cluster.quorum-type: auto
>>> cluster.server-quorum-type: server
>>> cluster.data-self-heal-algorithm: full
>>> cluster.locking-scheme: granular
>>> cluster.shd-max-threads: 8
>>> cluster.shd-wait-qlength: 10000
>>> features.shard: on
>>> user.cifs: off
>>> storage.owner-uid: 36
>>> storage.owner-gid: 36
>>> features.shard-block-size: 64MB
>>> performance.write-behind-window-size: 512MB
>>> performance.cache-size: 384MB
>>> cluster.brick-multiplex: on
>>>
>>> The volume status:
>>> root at lease-06 ~# gluster v status ovirt-backbone-2
>>> Status of volume: ovirt-backbone-2
>>> Gluster process                                              TCP Port  RDMA Port  Online  Pid
>>> ------------------------------------------------------------------------------
>>> Brick 10.32.9.7:/data/gfs/bricks/brick1/ovirt-backbone-2     49152     0          Y       7727
>>> Brick 10.32.9.3:/data/gfs/bricks/brick1/ovirt-backbone-2     49152     0          Y       12620
>>> Brick 10.32.9.4:/data/gfs/bricks/bricka/ovirt-backbone-2     49152     0          Y       8794
>>> Brick 10.32.9.8:/data0/gfs/bricks/brick1/ovirt-backbone-2    49161     0          Y       22333
>>> Brick 10.32.9.21:/data0/gfs/bricks/brick1/ovirt-backbone-2   49152     0          Y       15030
>>> Brick 10.32.9.5:/data/gfs/bricks/bricka/ovirt-backbone-2     49166     0          Y       24592
>>> Brick 10.32.9.9:/data0/gfs/bricks/brick1/ovirt-backbone-2    49153     0          Y       20148
>>> Brick 10.32.9.20:/data0/gfs/bricks/brick1/ovirt-backbone-2   49154     0          Y       15413
>>> Brick 10.32.9.6:/data/gfs/bricks/bricka/ovirt-backbone-2     49152     0          Y       43120
>>> Self-heal Daemon on localhost                                N/A       N/A        Y       44587
>>> Self-heal Daemon on 10.201.0.2                               N/A       N/A        Y       8401
>>> Self-heal Daemon on 10.201.0.5                               N/A       N/A        Y       11038
>>> Self-heal Daemon on 10.201.0.8                               N/A       N/A        Y       9513
>>> Self-heal Daemon on 10.32.9.4                                N/A       N/A        Y       23736
>>> Self-heal Daemon on 10.32.9.20                               N/A       N/A        Y       2738
>>> Self-heal Daemon on 10.32.9.3                                N/A       N/A        Y       25598
>>> Self-heal Daemon on 10.32.9.5                                N/A       N/A        Y       511
>>> Self-heal Daemon on 10.32.9.9                                N/A       N/A        Y       23357
>>> Self-heal Daemon on 10.32.9.8                                N/A       N/A        Y       15225
>>> Self-heal Daemon on 10.32.9.7                                N/A       N/A        Y       25781
>>> Self-heal Daemon on 10.32.9.21                               N/A       N/A        Y       5034
>>>
>>> Task Status of Volume ovirt-backbone-2
>>> ------------------------------------------------------------------------------
>>> Task                 : Rebalance
>>> ID                   : 6dfbac43-0125-4568-9ac3-a2c453faaa3d
>>> Status               : completed
>>>
>>> gluster version is at 3.12.15 and cluster.op-version=31202
>>>
>>> =======================
>>>
>>> It would be nice to know if it's possible to mark the files as not stale,
>>> or if I should investigate other things? Or should we consider this volume lost?
>>> Also, checking the code at
>>> https://github.com/gluster/glusterfs/blob/master/xlators/features/shard/src/shard.c
>>> it seems the functions shifted quite a bit (line 1724 vs. 2243), so maybe
>>> it's fixed in a future version?
>>> Any thoughts are welcome.
>>>
>>> Thanks Olaf
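For the gfid that keeps appearing in the mount log above (2a57d87d-fe49-4034-919b-fdb79531bf68), a quick per-brick check can show whether the backing file actually exists and whether it is only a DHT linkto placeholder. A minimal sketch, assuming it is run from a brick root such as /data/gfs/bricks/brick1/ovirt-backbone-2 (path and gfid are taken from the thread; adjust as needed):

#!/bin/bash
# Sketch: given a gfid, check its presence and nature on the local brick.
GFID="2a57d87d-fe49-4034-919b-fdb79531bf68"
GPATH=".glusterfs/${GFID:0:2}/${GFID:2:2}/${GFID}"

if [ ! -e "$GPATH" ]; then
    echo "gfid $GFID is not present on this brick"
    exit 1
fi
stat -c 'mode=%A size=%s links=%h' "$GPATH"
# a dht.linkto xattr marks a placeholder whose data lives on another subvolume
getfattr -n trusted.glusterfs.dht.linkto -e hex "$GPATH" 2>/dev/null \
    && echo "-> linkto placeholder only"
# the hard links reveal under which path(s) this gfid is reachable on the brick
find . -samefile "$GPATH" 2>/dev/null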
Nithya Balachandran
2019-Jan-02 13:20 UTC
[Gluster-users] [Stale file handle] in shard volume
On Mon, 31 Dec 2018 at 01:27, Olaf Buitelaar <olaf.buitelaar at gmail.com> wrote:

> Dear All,
>
> till now a selected group of VMs still seems to produce new stale files
> and gets paused due to this.
> I've not updated gluster recently, however I did change the op version
> from 31200 to 31202 about a week before this issue arose.
> Looking at the .shard directory, I've 100.000+ files sharing the same
> characteristics as the stale files found till now: they all have the
> sticky bit set, e.g. file permissions ---------T, are 0kb in size, and
> have the trusted.glusterfs.dht.linkto attribute.

These are internal files used by gluster and do not necessarily mean they
are stale. They "point" to data files which may be on different bricks
(same name, gfid etc. but no linkto xattr and no ----T permissions).

> These files range from long ago (the beginning of the year) till now,
> which makes me suspect this was lying dormant for some time and somehow
> recently surfaced.
> Checking other sub-volumes, they also contain 0kb files in the .shard
> directory, but those don't have the sticky bit and the linkto attribute.
>
> Does anybody else experience this issue? Could this be a bug or an
> environmental issue?

These are most likely valid files - please do not delete them without
double-checking. Stale file handle errors show up when a file with a
specified gfid is not found. You will need to debug the files for which
you see this error by checking the bricks to see if they actually exist.

> Also I wonder if there is any tool or gluster command to clean all stale
> file handles?
> Otherwise I'm planning to make a simple bash script, which iterates over
> the .shard dir, checks each file for the above mentioned criteria, and
> (re)moves the file and the corresponding .glusterfs file.
> If there are other criteria needed to identify a stale file handle, I
> would like to hear that.
> If this is a viable and safe operation to do, of course.
>
> Thanks Olaf
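For reference, a minimal sketch of the kind of audit Olaf describes: it only lists shard files that look like DHT linkto placeholders (sticky bit, 0 bytes, trusted.glusterfs.dht.linkto set) together with the matching .glusterfs entry, and deliberately removes nothing, per the caution above. The brick-root argument and the output format are assumptions; adjust to the actual layout.

#!/bin/bash
# Sketch: report (do not delete) shard files that look like dht linkto
# placeholders on one brick, plus the .glusterfs path for the same gfid.
BRICK="${1:?usage: $0 /path/to/brick/root}"
cd "$BRICK" || exit 1

find .shard -maxdepth 1 -type f -size 0 -perm -1000 | while read -r f; do
    # only report entries that actually carry the dht.linkto xattr
    if getfattr -n trusted.glusterfs.dht.linkto "$f" >/dev/null 2>&1; then
        gfid_hex=$(getfattr -n trusted.gfid -e hex "$f" 2>/dev/null \
                   | awk -F= '/^trusted.gfid=/{print substr($2,3)}')
        g="${gfid_hex:0:8}-${gfid_hex:8:4}-${gfid_hex:12:4}-${gfid_hex:16:4}-${gfid_hex:20:12}"
        echo "$f  gfid=$g  glusterfs-link=.glusterfs/${g:0:2}/${g:2:2}/$g"
    fi
done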