Nithya Balachandran
2019-Jan-04 06:50 UTC
[Gluster-users] [Stale file handle] in shard volume
Adding Krutika. On Wed, 2 Jan 2019 at 20:56, Olaf Buitelaar <olaf.buitelaar at gmail.com> wrote:> Hi Nithya, > > Thank you for your reply. > > the VM's using the gluster volumes keeps on getting paused/stopped on > errors like these; > [2019-01-02 02:33:44.469132] E [MSGID: 133010] > [shard.c:1724:shard_common_lookup_shards_cbk] 0-ovirt-kube-shard: Lookup on > shard 101487 failed. Base file gfid = a38d64bc-a28b-4ee1-a0bb-f919e7a1022c > [Stale file handle] > [2019-01-02 02:33:44.563288] E [MSGID: 133010] > [shard.c:1724:shard_common_lookup_shards_cbk] 0-ovirt-kube-shard: Lookup on > shard 101488 failed. Base file gfid = a38d64bc-a28b-4ee1-a0bb-f919e7a1022c > [Stale file handle] > > Krutika, Can you take a look at this?> > What i'm trying to find out, if i can purge all gluster volumes from all > possible stale file handles (and hopefully find a method to prevent this in > the future), so the VM's can start running stable again. > For this i need to know when the "shard_common_lookup_shards_cbk" function > considers a file as stale. > The statement; "Stale file handle errors show up when a file with a > specified gfid is not found." doesn't seem to cover it all, as i've shown > in earlier mails the shard file and glusterfs/xx/xx/uuid file do both > exist, and have the same inode. > If the criteria i'm using aren't correct, could you please tell me which > criteria i should use to determine if a file is stale or not? > these criteria are just based observations i made, moving the stale files > manually. After removing them i was able to start the VM again..until some > time later it hangs on another stale shard file unfortunate. > > Thanks Olaf > > Op wo 2 jan. 2019 om 14:20 schreef Nithya Balachandran < > nbalacha at redhat.com>: > >> >> >> On Mon, 31 Dec 2018 at 01:27, Olaf Buitelaar <olaf.buitelaar at gmail.com> >> wrote: >> >>> Dear All, >>> >>> till now a selected group of VM's still seem to produce new stale file's >>> and getting paused due to this. >>> I've not updated gluster recently, however i did change the op version >>> from 31200 to 31202 about a week before this issue arose. >>> Looking at the .shard directory, i've 100.000+ files sharing the same >>> characteristics as a stale file. which are found till now, >>> they all have the sticky bit set, e.g. file permissions; ---------T. are >>> 0kb in size, and have the trusted.glusterfs.dht.linkto attribute. >>> >> >> These are internal files used by gluster and do not necessarily mean they >> are stale. They "point" to data files which may be on different bricks >> (same name, gfid etc but no linkto xattr and no ----T permissions). >> >> >>> These files range from long a go (beginning of the year) till now. Which >>> makes me suspect this was laying dormant for some time now..and somehow >>> recently surfaced. >>> Checking other sub-volumes they contain also 0kb files in the .shard >>> directory, but don't have the sticky bit and the linkto attribute. >>> >>> Does anybody else experience this issue? Could this be a bug or an >>> environmental issue? >>> >> These are most likely valid files- please do not delete them without >> double-checking. >> >> Stale file handle errors show up when a file with a specified gfid is not >> found. You will need to debug the files for which you see this error by >> checking the bricks to see if they actually exist. >> >>> >>> Also i wonder if there is any tool or gluster command to clean all stale >>> file handles? >>> Otherwise i'm planning to make a simple bash script, which iterates over >>> the .shard dir, checks each file for the above mentioned criteria, and >>> (re)moves the file and the corresponding .glusterfs file. >>> If there are other criteria needed to identify a stale file handle, i >>> would like to hear that. >>> If this is a viable and safe operation to do of course. >>> >>> Thanks Olaf >>> >>> >>> >>> Op do 20 dec. 2018 om 13:43 schreef Olaf Buitelaar < >>> olaf.buitelaar at gmail.com>: >>> >>>> Dear All, >>>> >>>> I figured it out, it appeared to be the exact same issue as described >>>> here; >>>> lists.gluster.org/pipermail/gluster-users/2018-March/033785.html >>>> Another subvolume also had the shard file, only were all 0 bytes and >>>> had the dht.linkto >>>> >>>> for reference; >>>> [root at lease-04 ovirt-backbone-2]# getfattr -d -m . -e hex >>>> .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 >>>> # file: .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 >>>> >>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000 >>>> trusted.gfid=0x298147e49f9748b2baf1c8fff897244d >>>> >>>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030 >>>> >>>> trusted.glusterfs.dht.linkto=0x6f766972742d6261636b626f6e652d322d7265706c69636174652d3100 >>>> >>>> [root at lease-04 ovirt-backbone-2]# getfattr -d -m . -e hex >>>> .glusterfs/29/81/298147e4-9f97-48b2-baf1-c8fff897244d >>>> # file: .glusterfs/29/81/298147e4-9f97-48b2-baf1-c8fff897244d >>>> >>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000 >>>> trusted.gfid=0x298147e49f9748b2baf1c8fff897244d >>>> >>>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030 >>>> >>>> trusted.glusterfs.dht.linkto=0x6f766972742d6261636b626f6e652d322d7265706c69636174652d3100 >>>> >>>> [root at lease-04 ovirt-backbone-2]# stat >>>> .glusterfs/29/81/298147e4-9f97-48b2-baf1-c8fff897244d >>>> File: ?.glusterfs/29/81/298147e4-9f97-48b2-baf1-c8fff897244d? >>>> Size: 0 Blocks: 0 IO Block: 4096 regular >>>> empty file >>>> Device: fd01h/64769d Inode: 1918631406 Links: 2 >>>> Access: (1000/---------T) Uid: ( 0/ root) Gid: ( 0/ root) >>>> Context: system_u:object_r:etc_runtime_t:s0 >>>> Access: 2018-12-17 21:43:36.405735296 +0000 >>>> Modify: 2018-12-17 21:43:36.405735296 +0000 >>>> Change: 2018-12-17 21:43:36.405735296 +0000 >>>> Birth: - >>>> >>>> removing the shard file and glusterfs file from each node resolved the >>>> issue. >>>> >>>> I also found this thread; >>>> lists.gluster.org/pipermail/gluster-users/2018-December/035460.html >>>> Maybe he suffers from the same issue. >>>> >>>> Best Olaf >>>> >>>> >>>> Op wo 19 dec. 2018 om 21:56 schreef Olaf Buitelaar < >>>> olaf.buitelaar at gmail.com>: >>>> >>>>> Dear All, >>>>> >>>>> It appears i've a stale file in one of the volumes, on 2 files. These >>>>> files are qemu images (1 raw and 1 qcow2). >>>>> I'll just focus on 1 file since the situation on the other seems the >>>>> same. >>>>> >>>>> The VM get's paused more or less directly after being booted with >>>>> error; >>>>> [2018-12-18 14:05:05.275713] E [MSGID: 133010] >>>>> [shard.c:1724:shard_common_lookup_shards_cbk] 0-ovirt-backbone-2-shard: >>>>> Lookup on shard 51500 failed. Base file gfid >>>>> f28cabcb-d169-41fc-a633-9bef4c4a8e40 [Stale file handle] >>>>> >>>>> investigating the shard; >>>>> >>>>> #on the arbiter node: >>>>> >>>>> [root at lease-05 ovirt-backbone-2]# getfattr -n glusterfs.gfid.string >>>>> /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425 >>>>> getfattr: Removing leading '/' from absolute path names >>>>> # file: >>>>> mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425 >>>>> glusterfs.gfid.string="f28cabcb-d169-41fc-a633-9bef4c4a8e40" >>>>> >>>>> [root at lease-05 ovirt-backbone-2]# getfattr -d -m . -e hex >>>>> .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 >>>>> # file: .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 >>>>> >>>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000 >>>>> trusted.afr.dirty=0x000000000000000000000000 >>>>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0 >>>>> >>>>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030 >>>>> >>>>> [root at lease-05 ovirt-backbone-2]# getfattr -d -m . -e hex >>>>> .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>>>> # file: .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>>>> >>>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000 >>>>> trusted.afr.dirty=0x000000000000000000000000 >>>>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0 >>>>> >>>>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030 >>>>> >>>>> [root at lease-05 ovirt-backbone-2]# stat >>>>> .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>>>> File: ?.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0? >>>>> Size: 0 Blocks: 0 IO Block: 4096 regular >>>>> empty file >>>>> Device: fd01h/64769d Inode: 537277306 Links: 2 >>>>> Access: (0660/-rw-rw----) Uid: ( 0/ root) Gid: ( 0/ >>>>> root) >>>>> Context: system_u:object_r:etc_runtime_t:s0 >>>>> Access: 2018-12-17 21:43:36.361984810 +0000 >>>>> Modify: 2018-12-17 21:43:36.361984810 +0000 >>>>> Change: 2018-12-18 20:55:29.908647417 +0000 >>>>> Birth: - >>>>> >>>>> [root at lease-05 ovirt-backbone-2]# find . -inum 537277306 >>>>> ./.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>>>> ./.shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 >>>>> >>>>> #on the data nodes: >>>>> >>>>> [root at lease-08 ~]# getfattr -n glusterfs.gfid.string >>>>> /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425 >>>>> getfattr: Removing leading '/' from absolute path names >>>>> # file: >>>>> mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425 >>>>> glusterfs.gfid.string="f28cabcb-d169-41fc-a633-9bef4c4a8e40" >>>>> >>>>> [root at lease-08 ovirt-backbone-2]# getfattr -d -m . -e hex >>>>> .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 >>>>> # file: .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 >>>>> >>>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000 >>>>> trusted.afr.dirty=0x000000000000000000000000 >>>>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0 >>>>> >>>>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030 >>>>> >>>>> [root at lease-08 ovirt-backbone-2]# getfattr -d -m . -e hex >>>>> .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>>>> # file: .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>>>> >>>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000 >>>>> trusted.afr.dirty=0x000000000000000000000000 >>>>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0 >>>>> >>>>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030 >>>>> >>>>> [root at lease-08 ovirt-backbone-2]# stat >>>>> .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>>>> File: ?.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0? >>>>> Size: 2166784 Blocks: 4128 IO Block: 4096 regular >>>>> file >>>>> Device: fd03h/64771d Inode: 12893624759 Links: 3 >>>>> Access: (0660/-rw-rw----) Uid: ( 0/ root) Gid: ( 0/ >>>>> root) >>>>> Context: system_u:object_r:etc_runtime_t:s0 >>>>> Access: 2018-12-18 18:52:38.070776585 +0000 >>>>> Modify: 2018-12-17 21:43:36.388054443 +0000 >>>>> Change: 2018-12-18 21:01:47.810506528 +0000 >>>>> Birth: - >>>>> >>>>> [root at lease-08 ovirt-backbone-2]# find . -inum 12893624759 >>>>> ./.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>>>> ./.shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 >>>>> >>>>> =======================>>>>> >>>>> [root at lease-11 ovirt-backbone-2]# getfattr -n glusterfs.gfid.string >>>>> /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425 >>>>> getfattr: Removing leading '/' from absolute path names >>>>> # file: >>>>> mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425 >>>>> glusterfs.gfid.string="f28cabcb-d169-41fc-a633-9bef4c4a8e40" >>>>> >>>>> [root at lease-11 ovirt-backbone-2]# getfattr -d -m . -e hex >>>>> .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 >>>>> # file: .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 >>>>> >>>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000 >>>>> trusted.afr.dirty=0x000000000000000000000000 >>>>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0 >>>>> >>>>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030 >>>>> >>>>> [root at lease-11 ovirt-backbone-2]# getfattr -d -m . -e hex >>>>> .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>>>> # file: .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>>>> >>>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000 >>>>> trusted.afr.dirty=0x000000000000000000000000 >>>>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0 >>>>> >>>>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030 >>>>> >>>>> [root at lease-11 ovirt-backbone-2]# stat >>>>> .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>>>> File: ?.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0? >>>>> Size: 2166784 Blocks: 4128 IO Block: 4096 regular >>>>> file >>>>> Device: fd03h/64771d Inode: 12956094809 Links: 3 >>>>> Access: (0660/-rw-rw----) Uid: ( 0/ root) Gid: ( 0/ >>>>> root) >>>>> Context: system_u:object_r:etc_runtime_t:s0 >>>>> Access: 2018-12-18 20:11:53.595208449 +0000 >>>>> Modify: 2018-12-17 21:43:36.391580259 +0000 >>>>> Change: 2018-12-18 19:19:25.888055392 +0000 >>>>> Birth: - >>>>> >>>>> [root at lease-11 ovirt-backbone-2]# find . -inum 12956094809 >>>>> ./.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>>>> ./.shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 >>>>> >>>>> ===============>>>>> >>>>> I don't really see any inconsistencies, except the dates on the stat. >>>>> However this is only after i tried moving the file out of the volumes to >>>>> force a heal, which does happen on the data nodes, but not on the arbiter >>>>> node. Before that they were also the same. >>>>> I've also compared the file >>>>> ./.shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 on the 2 nodes and they >>>>> are exactly the same. >>>>> >>>>> Things i've further tried; >>>>> - gluster v heal ovirt-backbone-2 full => gluster v heal >>>>> ovirt-backbone-2 info reports 0 entries on all nodes >>>>> >>>>> - stop each glusterd and glusterfsd, pause around 40sec and start them >>>>> again on each node, 1 at a time, waiting for the heal to recover before >>>>> moving to the next node >>>>> >>>>> - force a heal by stopping glusterd on a node and perform these steps; >>>>> mkdir /mnt/ovirt-backbone-2/trigger >>>>> rmdir /mnt/ovirt-backbone-2/trigger >>>>> setfattr -n trusted.non-existent-key -v abc /mnt/ovirt-backbone-2/ >>>>> setfattr -x trusted.non-existent-key /mnt/ovirt-backbone-2/ >>>>> start glusterd >>>>> >>>>> - gluster volume rebalance ovirt-backbone-2 start => success >>>>> >>>>> Whats further interesting is that according the mount log, the volume >>>>> is in split-brain; >>>>> [2018-12-18 10:06:04.606870] E [MSGID: 108008] >>>>> [afr-read-txn.c:90:afr_read_txn_refresh_done] >>>>> 0-ovirt-backbone-2-replicate-2: Failing FSTAT on gfid >>>>> 2a57d87d-fe49-4034-919b-fdb79531bf68: split-brain observed. [Input/output >>>>> error] >>>>> [2018-12-18 10:06:04.606908] E [MSGID: 133014] >>>>> [shard.c:1248:shard_common_stat_cbk] 0-ovirt-backbone-2-shard: stat failed: >>>>> 2a57d87d-fe49-4034-919b-fdb79531bf68 [Input/output error] >>>>> [2018-12-18 10:06:04.606927] W [fuse-bridge.c:871:fuse_attr_cbk] >>>>> 0-glusterfs-fuse: 428090: FSTAT() >>>>> /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids => -1 (Input/output error) >>>>> [2018-12-18 10:06:05.107729] E [MSGID: 108008] >>>>> [afr-read-txn.c:90:afr_read_txn_refresh_done] >>>>> 0-ovirt-backbone-2-replicate-2: Failing FSTAT on gfid >>>>> 2a57d87d-fe49-4034-919b-fdb79531bf68: split-brain observed. [Input/output >>>>> error] >>>>> [2018-12-18 10:06:05.107770] E [MSGID: 133014] >>>>> [shard.c:1248:shard_common_stat_cbk] 0-ovirt-backbone-2-shard: stat failed: >>>>> 2a57d87d-fe49-4034-919b-fdb79531bf68 [Input/output error] >>>>> [2018-12-18 10:06:05.107791] W [fuse-bridge.c:871:fuse_attr_cbk] >>>>> 0-glusterfs-fuse: 428091: FSTAT() >>>>> /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids => -1 (Input/output error) >>>>> [2018-12-18 10:06:05.537244] I [MSGID: 108006] >>>>> [afr-common.c:5494:afr_local_init] 0-ovirt-backbone-2-replicate-1: no >>>>> subvolumes up >>>>> [2018-12-18 10:06:05.538523] E [MSGID: 108008] >>>>> [afr-read-txn.c:90:afr_read_txn_refresh_done] >>>>> 0-ovirt-backbone-2-replicate-2: Failing STAT on gfid >>>>> 00000000-0000-0000-0000-000000000001: split-brain observed. [Input/output >>>>> error] >>>>> [2018-12-18 10:06:05.538685] I [MSGID: 108006] >>>>> [afr-common.c:5494:afr_local_init] 0-ovirt-backbone-2-replicate-1: no >>>>> subvolumes up >>>>> [2018-12-18 10:06:05.538794] I [MSGID: 108006] >>>>> [afr-common.c:5494:afr_local_init] 0-ovirt-backbone-2-replicate-1: no >>>>> subvolumes up >>>>> [2018-12-18 10:06:05.539342] I [MSGID: 109063] >>>>> [dht-layout.c:716:dht_layout_normalize] 0-ovirt-backbone-2-dht: Found >>>>> anomalies in /b1c2c949-aef4-4aec-999b-b179efeef732 (gfid >>>>> 8c8598ce-1a52-418e-a7b4-435fee34bae8). Holes=2 overlaps=0 >>>>> [2018-12-18 10:06:05.539372] W [MSGID: 109005] >>>>> [dht-selfheal.c:2158:dht_selfheal_directory] 0-ovirt-backbone-2-dht: >>>>> Directory selfheal failed: 2 subvolumes down.Not fixing. path >>>>> /b1c2c949-aef4-4aec-999b-b179efeef732, gfid >>>>> 8c8598ce-1a52-418e-a7b4-435fee34bae8 >>>>> [2018-12-18 10:06:05.539694] I [MSGID: 108006] >>>>> [afr-common.c:5494:afr_local_init] 0-ovirt-backbone-2-replicate-1: no >>>>> subvolumes up >>>>> [2018-12-18 10:06:05.540652] I [MSGID: 108006] >>>>> [afr-common.c:5494:afr_local_init] 0-ovirt-backbone-2-replicate-1: no >>>>> subvolumes up >>>>> [2018-12-18 10:06:05.608612] E [MSGID: 108008] >>>>> [afr-read-txn.c:90:afr_read_txn_refresh_done] >>>>> 0-ovirt-backbone-2-replicate-2: Failing FSTAT on gfid >>>>> 2a57d87d-fe49-4034-919b-fdb79531bf68: split-brain observed. [Input/output >>>>> error] >>>>> [2018-12-18 10:06:05.608657] E [MSGID: 133014] >>>>> [shard.c:1248:shard_common_stat_cbk] 0-ovirt-backbone-2-shard: stat failed: >>>>> 2a57d87d-fe49-4034-919b-fdb79531bf68 [Input/output error] >>>>> [2018-12-18 10:06:05.608672] W [fuse-bridge.c:871:fuse_attr_cbk] >>>>> 0-glusterfs-fuse: 428096: FSTAT() >>>>> /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids => -1 (Input/output error) >>>>> [2018-12-18 10:06:06.109339] E [MSGID: 108008] >>>>> [afr-read-txn.c:90:afr_read_txn_refresh_done] >>>>> 0-ovirt-backbone-2-replicate-2: Failing FSTAT on gfid >>>>> 2a57d87d-fe49-4034-919b-fdb79531bf68: split-brain observed. [Input/output >>>>> error] >>>>> [2018-12-18 10:06:06.109378] E [MSGID: 133014] >>>>> [shard.c:1248:shard_common_stat_cbk] 0-ovirt-backbone-2-shard: stat failed: >>>>> 2a57d87d-fe49-4034-919b-fdb79531bf68 [Input/output error] >>>>> [2018-12-18 10:06:06.109399] W [fuse-bridge.c:871:fuse_attr_cbk] >>>>> 0-glusterfs-fuse: 428097: FSTAT() >>>>> /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids => -1 (Input/output error) >>>>> >>>>> #note i'm able to see ; >>>>> /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids >>>>> [root at lease-11 ovirt-backbone-2]# stat >>>>> /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids >>>>> File: >>>>> ?/mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids? >>>>> Size: 1048576 Blocks: 2048 IO Block: 131072 regular >>>>> file >>>>> Device: 41h/65d Inode: 10492258721813610344 Links: 1 >>>>> Access: (0660/-rw-rw----) Uid: ( 36/ vdsm) Gid: ( 36/ >>>>> kvm) >>>>> Context: system_u:object_r:fusefs_t:s0 >>>>> Access: 2018-12-19 20:07:39.917573869 +0000 >>>>> Modify: 2018-12-19 20:07:39.928573917 +0000 >>>>> Change: 2018-12-19 20:07:39.929573921 +0000 >>>>> Birth: - >>>>> >>>>> however checking: gluster v heal ovirt-backbone-2 info split-brain >>>>> reports no entries. >>>>> >>>>> I've also tried mounting the qemu image, and this works fine, i'm able >>>>> to see all contents; >>>>> losetup /dev/loop0 >>>>> /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425 >>>>> kpartx -a /dev/loop0 >>>>> vgscan >>>>> vgchange -ay slave-data >>>>> mkdir /mnt/slv01 >>>>> mount /dev/mapper/slave--data-lvol0 /mnt/slv01/ >>>>> >>>>> Possible causes for this issue; >>>>> 1. the machine "lease-11" suffered from a faulty RAM module (ECC), >>>>> which halted the machine and causes an invalid state. (this machine also >>>>> hosts other volumes, with similar configurations, which report no issue) >>>>> 2. after the RAM module was replaced, the VM using the backing qemu >>>>> image, was restored from a backup (the backup was file based within the VM >>>>> on a different directory). This is because some files were corrupted. The >>>>> backup/recovery obviously causes extra IO, possible introducing race >>>>> conditions? The machine did run for about 12h without issues, and in total >>>>> for about 36h. >>>>> 3. since only the client (maybe only gfapi?) reports errors, something >>>>> is broken there? >>>>> >>>>> The volume info; >>>>> root at lease-06 ~# gluster v info ovirt-backbone-2 >>>>> >>>>> Volume Name: ovirt-backbone-2 >>>>> Type: Distributed-Replicate >>>>> Volume ID: 85702d35-62c8-4c8c-930d-46f455a8af28 >>>>> Status: Started >>>>> Snapshot Count: 0 >>>>> Number of Bricks: 3 x (2 + 1) = 9 >>>>> Transport-type: tcp >>>>> Bricks: >>>>> Brick1: 10.32.9.7:/data/gfs/bricks/brick1/ovirt-backbone-2 >>>>> Brick2: 10.32.9.3:/data/gfs/bricks/brick1/ovirt-backbone-2 >>>>> Brick3: 10.32.9.4:/data/gfs/bricks/bricka/ovirt-backbone-2 (arbiter) >>>>> Brick4: 10.32.9.8:/data0/gfs/bricks/brick1/ovirt-backbone-2 >>>>> Brick5: 10.32.9.21:/data0/gfs/bricks/brick1/ovirt-backbone-2 >>>>> Brick6: 10.32.9.5:/data/gfs/bricks/bricka/ovirt-backbone-2 (arbiter) >>>>> Brick7: 10.32.9.9:/data0/gfs/bricks/brick1/ovirt-backbone-2 >>>>> Brick8: 10.32.9.20:/data0/gfs/bricks/brick1/ovirt-backbone-2 >>>>> Brick9: 10.32.9.6:/data/gfs/bricks/bricka/ovirt-backbone-2 (arbiter) >>>>> Options Reconfigured: >>>>> nfs.disable: on >>>>> transport.address-family: inet >>>>> performance.quick-read: off >>>>> performance.read-ahead: off >>>>> performance.io-cache: off >>>>> performance.low-prio-threads: 32 >>>>> network.remote-dio: enable >>>>> cluster.eager-lock: enable >>>>> cluster.quorum-type: auto >>>>> cluster.server-quorum-type: server >>>>> cluster.data-self-heal-algorithm: full >>>>> cluster.locking-scheme: granular >>>>> cluster.shd-max-threads: 8 >>>>> cluster.shd-wait-qlength: 10000 >>>>> features.shard: on >>>>> user.cifs: off >>>>> storage.owner-uid: 36 >>>>> storage.owner-gid: 36 >>>>> features.shard-block-size: 64MB >>>>> performance.write-behind-window-size: 512MB >>>>> performance.cache-size: 384MB >>>>> cluster.brick-multiplex: on >>>>> >>>>> The volume status; >>>>> root at lease-06 ~# gluster v status ovirt-backbone-2 >>>>> Status of volume: ovirt-backbone-2 >>>>> Gluster process TCP Port RDMA Port >>>>> Online Pid >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> Brick 10.32.9.7:/data/gfs/bricks/brick1/ovi >>>>> rt-backbone-2 49152 0 >>>>> Y 7727 >>>>> Brick 10.32.9.3:/data/gfs/bricks/brick1/ovi >>>>> rt-backbone-2 49152 0 >>>>> Y 12620 >>>>> Brick 10.32.9.4:/data/gfs/bricks/bricka/ovi >>>>> rt-backbone-2 49152 0 >>>>> Y 8794 >>>>> Brick 10.32.9.8:/data0/gfs/bricks/brick1/ov >>>>> irt-backbone-2 49161 0 >>>>> Y 22333 >>>>> Brick 10.32.9.21:/data0/gfs/bricks/brick1/o >>>>> virt-backbone-2 49152 0 >>>>> Y 15030 >>>>> Brick 10.32.9.5:/data/gfs/bricks/bricka/ovi >>>>> rt-backbone-2 49166 0 >>>>> Y 24592 >>>>> Brick 10.32.9.9:/data0/gfs/bricks/brick1/ov >>>>> irt-backbone-2 49153 0 >>>>> Y 20148 >>>>> Brick 10.32.9.20:/data0/gfs/bricks/brick1/o >>>>> virt-backbone-2 49154 0 >>>>> Y 15413 >>>>> Brick 10.32.9.6:/data/gfs/bricks/bricka/ovi >>>>> rt-backbone-2 49152 0 >>>>> Y 43120 >>>>> Self-heal Daemon on localhost N/A N/A >>>>> Y 44587 >>>>> Self-heal Daemon on 10.201.0.2 N/A N/A >>>>> Y 8401 >>>>> Self-heal Daemon on 10.201.0.5 N/A N/A >>>>> Y 11038 >>>>> Self-heal Daemon on 10.201.0.8 N/A N/A >>>>> Y 9513 >>>>> Self-heal Daemon on 10.32.9.4 N/A N/A >>>>> Y 23736 >>>>> Self-heal Daemon on 10.32.9.20 N/A N/A >>>>> Y 2738 >>>>> Self-heal Daemon on 10.32.9.3 N/A N/A >>>>> Y 25598 >>>>> Self-heal Daemon on 10.32.9.5 N/A N/A >>>>> Y 511 >>>>> Self-heal Daemon on 10.32.9.9 N/A N/A >>>>> Y 23357 >>>>> Self-heal Daemon on 10.32.9.8 N/A N/A >>>>> Y 15225 >>>>> Self-heal Daemon on 10.32.9.7 N/A N/A >>>>> Y 25781 >>>>> Self-heal Daemon on 10.32.9.21 N/A N/A >>>>> Y 5034 >>>>> >>>>> Task Status of Volume ovirt-backbone-2 >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> Task : Rebalance >>>>> ID : 6dfbac43-0125-4568-9ac3-a2c453faaa3d >>>>> Status : completed >>>>> >>>>> gluster version is @3.12.15 and cluster.op-version=31202 >>>>> >>>>> =======================>>>>> >>>>> It would be nice to know if it's possible to mark the files as not >>>>> stale or if i should investigate other things? >>>>> Or should we consider this volume lost? >>>>> Also checking the code at; >>>>> github.com/gluster/glusterfs/blob/master/xlators/features/shard/src/shard.c >>>>> it seems the functions shifted quite some (line 1724 vs. 2243), so maybe >>>>> it's fixed in a future version? >>>>> Any thoughts are welcome. >>>>> >>>>> Thanks Olaf >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> _______________________________________________ >>> Gluster-users mailing list >>> Gluster-users at gluster.org >>> lists.gluster.org/mailman/listinfo/gluster-users >> >>-------------- next part -------------- An HTML attachment was scrubbed... URL: <lists.gluster.org/pipermail/gluster-users/attachments/20190104/3499535f/attachment.html>
@Krutika if you need any further information, please let me know. Thanks Olaf Op vr 4 jan. 2019 om 07:51 schreef Nithya Balachandran <nbalacha at redhat.com>:> Adding Krutika. > > On Wed, 2 Jan 2019 at 20:56, Olaf Buitelaar <olaf.buitelaar at gmail.com> > wrote: > >> Hi Nithya, >> >> Thank you for your reply. >> >> the VM's using the gluster volumes keeps on getting paused/stopped on >> errors like these; >> [2019-01-02 02:33:44.469132] E [MSGID: 133010] >> [shard.c:1724:shard_common_lookup_shards_cbk] 0-ovirt-kube-shard: Lookup on >> shard 101487 failed. Base file gfid = a38d64bc-a28b-4ee1-a0bb-f919e7a1022c >> [Stale file handle] >> [2019-01-02 02:33:44.563288] E [MSGID: 133010] >> [shard.c:1724:shard_common_lookup_shards_cbk] 0-ovirt-kube-shard: Lookup on >> shard 101488 failed. Base file gfid = a38d64bc-a28b-4ee1-a0bb-f919e7a1022c >> [Stale file handle] >> >> Krutika, Can you take a look at this? > > >> >> What i'm trying to find out, if i can purge all gluster volumes from all >> possible stale file handles (and hopefully find a method to prevent this in >> the future), so the VM's can start running stable again. >> For this i need to know when the "shard_common_lookup_shards_cbk" >> function considers a file as stale. >> The statement; "Stale file handle errors show up when a file with a >> specified gfid is not found." doesn't seem to cover it all, as i've shown >> in earlier mails the shard file and glusterfs/xx/xx/uuid file do both >> exist, and have the same inode. >> If the criteria i'm using aren't correct, could you please tell me which >> criteria i should use to determine if a file is stale or not? >> these criteria are just based observations i made, moving the stale files >> manually. After removing them i was able to start the VM again..until some >> time later it hangs on another stale shard file unfortunate. >> >> Thanks Olaf >> >> Op wo 2 jan. 2019 om 14:20 schreef Nithya Balachandran < >> nbalacha at redhat.com>: >> >>> >>> >>> On Mon, 31 Dec 2018 at 01:27, Olaf Buitelaar <olaf.buitelaar at gmail.com> >>> wrote: >>> >>>> Dear All, >>>> >>>> till now a selected group of VM's still seem to produce new stale >>>> file's and getting paused due to this. >>>> I've not updated gluster recently, however i did change the op version >>>> from 31200 to 31202 about a week before this issue arose. >>>> Looking at the .shard directory, i've 100.000+ files sharing the same >>>> characteristics as a stale file. which are found till now, >>>> they all have the sticky bit set, e.g. file permissions; ---------T. >>>> are 0kb in size, and have the trusted.glusterfs.dht.linkto attribute. >>>> >>> >>> These are internal files used by gluster and do not necessarily mean >>> they are stale. They "point" to data files which may be on different bricks >>> (same name, gfid etc but no linkto xattr and no ----T permissions). >>> >>> >>>> These files range from long a go (beginning of the year) till now. >>>> Which makes me suspect this was laying dormant for some time now..and >>>> somehow recently surfaced. >>>> Checking other sub-volumes they contain also 0kb files in the .shard >>>> directory, but don't have the sticky bit and the linkto attribute. >>>> >>>> Does anybody else experience this issue? Could this be a bug or an >>>> environmental issue? >>>> >>> These are most likely valid files- please do not delete them without >>> double-checking. >>> >>> Stale file handle errors show up when a file with a specified gfid is >>> not found. You will need to debug the files for which you see this error by >>> checking the bricks to see if they actually exist. >>> >>>> >>>> Also i wonder if there is any tool or gluster command to clean all >>>> stale file handles? >>>> Otherwise i'm planning to make a simple bash script, which iterates >>>> over the .shard dir, checks each file for the above mentioned criteria, and >>>> (re)moves the file and the corresponding .glusterfs file. >>>> If there are other criteria needed to identify a stale file handle, i >>>> would like to hear that. >>>> If this is a viable and safe operation to do of course. >>>> >>>> Thanks Olaf >>>> >>>> >>>> >>>> Op do 20 dec. 2018 om 13:43 schreef Olaf Buitelaar < >>>> olaf.buitelaar at gmail.com>: >>>> >>>>> Dear All, >>>>> >>>>> I figured it out, it appeared to be the exact same issue as described >>>>> here; >>>>> lists.gluster.org/pipermail/gluster-users/2018-March/033785.html >>>>> Another subvolume also had the shard file, only were all 0 bytes and >>>>> had the dht.linkto >>>>> >>>>> for reference; >>>>> [root at lease-04 ovirt-backbone-2]# getfattr -d -m . -e hex >>>>> .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 >>>>> # file: .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 >>>>> >>>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000 >>>>> trusted.gfid=0x298147e49f9748b2baf1c8fff897244d >>>>> >>>>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030 >>>>> >>>>> trusted.glusterfs.dht.linkto=0x6f766972742d6261636b626f6e652d322d7265706c69636174652d3100 >>>>> >>>>> [root at lease-04 ovirt-backbone-2]# getfattr -d -m . -e hex >>>>> .glusterfs/29/81/298147e4-9f97-48b2-baf1-c8fff897244d >>>>> # file: .glusterfs/29/81/298147e4-9f97-48b2-baf1-c8fff897244d >>>>> >>>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000 >>>>> trusted.gfid=0x298147e49f9748b2baf1c8fff897244d >>>>> >>>>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030 >>>>> >>>>> trusted.glusterfs.dht.linkto=0x6f766972742d6261636b626f6e652d322d7265706c69636174652d3100 >>>>> >>>>> [root at lease-04 ovirt-backbone-2]# stat >>>>> .glusterfs/29/81/298147e4-9f97-48b2-baf1-c8fff897244d >>>>> File: ?.glusterfs/29/81/298147e4-9f97-48b2-baf1-c8fff897244d? >>>>> Size: 0 Blocks: 0 IO Block: 4096 regular >>>>> empty file >>>>> Device: fd01h/64769d Inode: 1918631406 Links: 2 >>>>> Access: (1000/---------T) Uid: ( 0/ root) Gid: ( 0/ >>>>> root) >>>>> Context: system_u:object_r:etc_runtime_t:s0 >>>>> Access: 2018-12-17 21:43:36.405735296 +0000 >>>>> Modify: 2018-12-17 21:43:36.405735296 +0000 >>>>> Change: 2018-12-17 21:43:36.405735296 +0000 >>>>> Birth: - >>>>> >>>>> removing the shard file and glusterfs file from each node resolved the >>>>> issue. >>>>> >>>>> I also found this thread; >>>>> lists.gluster.org/pipermail/gluster-users/2018-December/035460.html >>>>> Maybe he suffers from the same issue. >>>>> >>>>> Best Olaf >>>>> >>>>> >>>>> Op wo 19 dec. 2018 om 21:56 schreef Olaf Buitelaar < >>>>> olaf.buitelaar at gmail.com>: >>>>> >>>>>> Dear All, >>>>>> >>>>>> It appears i've a stale file in one of the volumes, on 2 files. These >>>>>> files are qemu images (1 raw and 1 qcow2). >>>>>> I'll just focus on 1 file since the situation on the other seems the >>>>>> same. >>>>>> >>>>>> The VM get's paused more or less directly after being booted with >>>>>> error; >>>>>> [2018-12-18 14:05:05.275713] E [MSGID: 133010] >>>>>> [shard.c:1724:shard_common_lookup_shards_cbk] 0-ovirt-backbone-2-shard: >>>>>> Lookup on shard 51500 failed. Base file gfid >>>>>> f28cabcb-d169-41fc-a633-9bef4c4a8e40 [Stale file handle] >>>>>> >>>>>> investigating the shard; >>>>>> >>>>>> #on the arbiter node: >>>>>> >>>>>> [root at lease-05 ovirt-backbone-2]# getfattr -n glusterfs.gfid.string >>>>>> /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425 >>>>>> getfattr: Removing leading '/' from absolute path names >>>>>> # file: >>>>>> mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425 >>>>>> glusterfs.gfid.string="f28cabcb-d169-41fc-a633-9bef4c4a8e40" >>>>>> >>>>>> [root at lease-05 ovirt-backbone-2]# getfattr -d -m . -e hex >>>>>> .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 >>>>>> # file: .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 >>>>>> >>>>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000 >>>>>> trusted.afr.dirty=0x000000000000000000000000 >>>>>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0 >>>>>> >>>>>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030 >>>>>> >>>>>> [root at lease-05 ovirt-backbone-2]# getfattr -d -m . -e hex >>>>>> .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>>>>> # file: .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>>>>> >>>>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000 >>>>>> trusted.afr.dirty=0x000000000000000000000000 >>>>>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0 >>>>>> >>>>>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030 >>>>>> >>>>>> [root at lease-05 ovirt-backbone-2]# stat >>>>>> .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>>>>> File: ?.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0? >>>>>> Size: 0 Blocks: 0 IO Block: 4096 regular >>>>>> empty file >>>>>> Device: fd01h/64769d Inode: 537277306 Links: 2 >>>>>> Access: (0660/-rw-rw----) Uid: ( 0/ root) Gid: ( 0/ >>>>>> root) >>>>>> Context: system_u:object_r:etc_runtime_t:s0 >>>>>> Access: 2018-12-17 21:43:36.361984810 +0000 >>>>>> Modify: 2018-12-17 21:43:36.361984810 +0000 >>>>>> Change: 2018-12-18 20:55:29.908647417 +0000 >>>>>> Birth: - >>>>>> >>>>>> [root at lease-05 ovirt-backbone-2]# find . -inum 537277306 >>>>>> ./.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>>>>> ./.shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 >>>>>> >>>>>> #on the data nodes: >>>>>> >>>>>> [root at lease-08 ~]# getfattr -n glusterfs.gfid.string >>>>>> /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425 >>>>>> getfattr: Removing leading '/' from absolute path names >>>>>> # file: >>>>>> mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425 >>>>>> glusterfs.gfid.string="f28cabcb-d169-41fc-a633-9bef4c4a8e40" >>>>>> >>>>>> [root at lease-08 ovirt-backbone-2]# getfattr -d -m . -e hex >>>>>> .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 >>>>>> # file: .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 >>>>>> >>>>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000 >>>>>> trusted.afr.dirty=0x000000000000000000000000 >>>>>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0 >>>>>> >>>>>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030 >>>>>> >>>>>> [root at lease-08 ovirt-backbone-2]# getfattr -d -m . -e hex >>>>>> .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>>>>> # file: .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>>>>> >>>>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000 >>>>>> trusted.afr.dirty=0x000000000000000000000000 >>>>>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0 >>>>>> >>>>>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030 >>>>>> >>>>>> [root at lease-08 ovirt-backbone-2]# stat >>>>>> .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>>>>> File: ?.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0? >>>>>> Size: 2166784 Blocks: 4128 IO Block: 4096 regular >>>>>> file >>>>>> Device: fd03h/64771d Inode: 12893624759 Links: 3 >>>>>> Access: (0660/-rw-rw----) Uid: ( 0/ root) Gid: ( 0/ >>>>>> root) >>>>>> Context: system_u:object_r:etc_runtime_t:s0 >>>>>> Access: 2018-12-18 18:52:38.070776585 +0000 >>>>>> Modify: 2018-12-17 21:43:36.388054443 +0000 >>>>>> Change: 2018-12-18 21:01:47.810506528 +0000 >>>>>> Birth: - >>>>>> >>>>>> [root at lease-08 ovirt-backbone-2]# find . -inum 12893624759 >>>>>> ./.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>>>>> ./.shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 >>>>>> >>>>>> =======================>>>>>> >>>>>> [root at lease-11 ovirt-backbone-2]# getfattr -n glusterfs.gfid.string >>>>>> /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425 >>>>>> getfattr: Removing leading '/' from absolute path names >>>>>> # file: >>>>>> mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425 >>>>>> glusterfs.gfid.string="f28cabcb-d169-41fc-a633-9bef4c4a8e40" >>>>>> >>>>>> [root at lease-11 ovirt-backbone-2]# getfattr -d -m . -e hex >>>>>> .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 >>>>>> # file: .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 >>>>>> >>>>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000 >>>>>> trusted.afr.dirty=0x000000000000000000000000 >>>>>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0 >>>>>> >>>>>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030 >>>>>> >>>>>> [root at lease-11 ovirt-backbone-2]# getfattr -d -m . -e hex >>>>>> .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>>>>> # file: .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>>>>> >>>>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000 >>>>>> trusted.afr.dirty=0x000000000000000000000000 >>>>>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0 >>>>>> >>>>>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030 >>>>>> >>>>>> [root at lease-11 ovirt-backbone-2]# stat >>>>>> .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>>>>> File: ?.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0? >>>>>> Size: 2166784 Blocks: 4128 IO Block: 4096 regular >>>>>> file >>>>>> Device: fd03h/64771d Inode: 12956094809 Links: 3 >>>>>> Access: (0660/-rw-rw----) Uid: ( 0/ root) Gid: ( 0/ >>>>>> root) >>>>>> Context: system_u:object_r:etc_runtime_t:s0 >>>>>> Access: 2018-12-18 20:11:53.595208449 +0000 >>>>>> Modify: 2018-12-17 21:43:36.391580259 +0000 >>>>>> Change: 2018-12-18 19:19:25.888055392 +0000 >>>>>> Birth: - >>>>>> >>>>>> [root at lease-11 ovirt-backbone-2]# find . -inum 12956094809 >>>>>> ./.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0 >>>>>> ./.shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 >>>>>> >>>>>> ===============>>>>>> >>>>>> I don't really see any inconsistencies, except the dates on the stat. >>>>>> However this is only after i tried moving the file out of the volumes to >>>>>> force a heal, which does happen on the data nodes, but not on the arbiter >>>>>> node. Before that they were also the same. >>>>>> I've also compared the file >>>>>> ./.shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 on the 2 nodes and they >>>>>> are exactly the same. >>>>>> >>>>>> Things i've further tried; >>>>>> - gluster v heal ovirt-backbone-2 full => gluster v heal >>>>>> ovirt-backbone-2 info reports 0 entries on all nodes >>>>>> >>>>>> - stop each glusterd and glusterfsd, pause around 40sec and start >>>>>> them again on each node, 1 at a time, waiting for the heal to recover >>>>>> before moving to the next node >>>>>> >>>>>> - force a heal by stopping glusterd on a node and perform these steps; >>>>>> mkdir /mnt/ovirt-backbone-2/trigger >>>>>> rmdir /mnt/ovirt-backbone-2/trigger >>>>>> setfattr -n trusted.non-existent-key -v abc /mnt/ovirt-backbone-2/ >>>>>> setfattr -x trusted.non-existent-key /mnt/ovirt-backbone-2/ >>>>>> start glusterd >>>>>> >>>>>> - gluster volume rebalance ovirt-backbone-2 start => success >>>>>> >>>>>> Whats further interesting is that according the mount log, the volume >>>>>> is in split-brain; >>>>>> [2018-12-18 10:06:04.606870] E [MSGID: 108008] >>>>>> [afr-read-txn.c:90:afr_read_txn_refresh_done] >>>>>> 0-ovirt-backbone-2-replicate-2: Failing FSTAT on gfid >>>>>> 2a57d87d-fe49-4034-919b-fdb79531bf68: split-brain observed. [Input/output >>>>>> error] >>>>>> [2018-12-18 10:06:04.606908] E [MSGID: 133014] >>>>>> [shard.c:1248:shard_common_stat_cbk] 0-ovirt-backbone-2-shard: stat failed: >>>>>> 2a57d87d-fe49-4034-919b-fdb79531bf68 [Input/output error] >>>>>> [2018-12-18 10:06:04.606927] W [fuse-bridge.c:871:fuse_attr_cbk] >>>>>> 0-glusterfs-fuse: 428090: FSTAT() >>>>>> /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids => -1 (Input/output error) >>>>>> [2018-12-18 10:06:05.107729] E [MSGID: 108008] >>>>>> [afr-read-txn.c:90:afr_read_txn_refresh_done] >>>>>> 0-ovirt-backbone-2-replicate-2: Failing FSTAT on gfid >>>>>> 2a57d87d-fe49-4034-919b-fdb79531bf68: split-brain observed. [Input/output >>>>>> error] >>>>>> [2018-12-18 10:06:05.107770] E [MSGID: 133014] >>>>>> [shard.c:1248:shard_common_stat_cbk] 0-ovirt-backbone-2-shard: stat failed: >>>>>> 2a57d87d-fe49-4034-919b-fdb79531bf68 [Input/output error] >>>>>> [2018-12-18 10:06:05.107791] W [fuse-bridge.c:871:fuse_attr_cbk] >>>>>> 0-glusterfs-fuse: 428091: FSTAT() >>>>>> /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids => -1 (Input/output error) >>>>>> [2018-12-18 10:06:05.537244] I [MSGID: 108006] >>>>>> [afr-common.c:5494:afr_local_init] 0-ovirt-backbone-2-replicate-1: no >>>>>> subvolumes up >>>>>> [2018-12-18 10:06:05.538523] E [MSGID: 108008] >>>>>> [afr-read-txn.c:90:afr_read_txn_refresh_done] >>>>>> 0-ovirt-backbone-2-replicate-2: Failing STAT on gfid >>>>>> 00000000-0000-0000-0000-000000000001: split-brain observed. [Input/output >>>>>> error] >>>>>> [2018-12-18 10:06:05.538685] I [MSGID: 108006] >>>>>> [afr-common.c:5494:afr_local_init] 0-ovirt-backbone-2-replicate-1: no >>>>>> subvolumes up >>>>>> [2018-12-18 10:06:05.538794] I [MSGID: 108006] >>>>>> [afr-common.c:5494:afr_local_init] 0-ovirt-backbone-2-replicate-1: no >>>>>> subvolumes up >>>>>> [2018-12-18 10:06:05.539342] I [MSGID: 109063] >>>>>> [dht-layout.c:716:dht_layout_normalize] 0-ovirt-backbone-2-dht: Found >>>>>> anomalies in /b1c2c949-aef4-4aec-999b-b179efeef732 (gfid >>>>>> 8c8598ce-1a52-418e-a7b4-435fee34bae8). Holes=2 overlaps=0 >>>>>> [2018-12-18 10:06:05.539372] W [MSGID: 109005] >>>>>> [dht-selfheal.c:2158:dht_selfheal_directory] 0-ovirt-backbone-2-dht: >>>>>> Directory selfheal failed: 2 subvolumes down.Not fixing. path >>>>>> /b1c2c949-aef4-4aec-999b-b179efeef732, gfid >>>>>> 8c8598ce-1a52-418e-a7b4-435fee34bae8 >>>>>> [2018-12-18 10:06:05.539694] I [MSGID: 108006] >>>>>> [afr-common.c:5494:afr_local_init] 0-ovirt-backbone-2-replicate-1: no >>>>>> subvolumes up >>>>>> [2018-12-18 10:06:05.540652] I [MSGID: 108006] >>>>>> [afr-common.c:5494:afr_local_init] 0-ovirt-backbone-2-replicate-1: no >>>>>> subvolumes up >>>>>> [2018-12-18 10:06:05.608612] E [MSGID: 108008] >>>>>> [afr-read-txn.c:90:afr_read_txn_refresh_done] >>>>>> 0-ovirt-backbone-2-replicate-2: Failing FSTAT on gfid >>>>>> 2a57d87d-fe49-4034-919b-fdb79531bf68: split-brain observed. [Input/output >>>>>> error] >>>>>> [2018-12-18 10:06:05.608657] E [MSGID: 133014] >>>>>> [shard.c:1248:shard_common_stat_cbk] 0-ovirt-backbone-2-shard: stat failed: >>>>>> 2a57d87d-fe49-4034-919b-fdb79531bf68 [Input/output error] >>>>>> [2018-12-18 10:06:05.608672] W [fuse-bridge.c:871:fuse_attr_cbk] >>>>>> 0-glusterfs-fuse: 428096: FSTAT() >>>>>> /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids => -1 (Input/output error) >>>>>> [2018-12-18 10:06:06.109339] E [MSGID: 108008] >>>>>> [afr-read-txn.c:90:afr_read_txn_refresh_done] >>>>>> 0-ovirt-backbone-2-replicate-2: Failing FSTAT on gfid >>>>>> 2a57d87d-fe49-4034-919b-fdb79531bf68: split-brain observed. [Input/output >>>>>> error] >>>>>> [2018-12-18 10:06:06.109378] E [MSGID: 133014] >>>>>> [shard.c:1248:shard_common_stat_cbk] 0-ovirt-backbone-2-shard: stat failed: >>>>>> 2a57d87d-fe49-4034-919b-fdb79531bf68 [Input/output error] >>>>>> [2018-12-18 10:06:06.109399] W [fuse-bridge.c:871:fuse_attr_cbk] >>>>>> 0-glusterfs-fuse: 428097: FSTAT() >>>>>> /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids => -1 (Input/output error) >>>>>> >>>>>> #note i'm able to see ; >>>>>> /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids >>>>>> [root at lease-11 ovirt-backbone-2]# stat >>>>>> /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids >>>>>> File: >>>>>> ?/mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids? >>>>>> Size: 1048576 Blocks: 2048 IO Block: 131072 regular >>>>>> file >>>>>> Device: 41h/65d Inode: 10492258721813610344 Links: 1 >>>>>> Access: (0660/-rw-rw----) Uid: ( 36/ vdsm) Gid: ( 36/ >>>>>> kvm) >>>>>> Context: system_u:object_r:fusefs_t:s0 >>>>>> Access: 2018-12-19 20:07:39.917573869 +0000 >>>>>> Modify: 2018-12-19 20:07:39.928573917 +0000 >>>>>> Change: 2018-12-19 20:07:39.929573921 +0000 >>>>>> Birth: - >>>>>> >>>>>> however checking: gluster v heal ovirt-backbone-2 info split-brain >>>>>> reports no entries. >>>>>> >>>>>> I've also tried mounting the qemu image, and this works fine, i'm >>>>>> able to see all contents; >>>>>> losetup /dev/loop0 >>>>>> /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425 >>>>>> kpartx -a /dev/loop0 >>>>>> vgscan >>>>>> vgchange -ay slave-data >>>>>> mkdir /mnt/slv01 >>>>>> mount /dev/mapper/slave--data-lvol0 /mnt/slv01/ >>>>>> >>>>>> Possible causes for this issue; >>>>>> 1. the machine "lease-11" suffered from a faulty RAM module (ECC), >>>>>> which halted the machine and causes an invalid state. (this machine also >>>>>> hosts other volumes, with similar configurations, which report no issue) >>>>>> 2. after the RAM module was replaced, the VM using the backing qemu >>>>>> image, was restored from a backup (the backup was file based within the VM >>>>>> on a different directory). This is because some files were corrupted. The >>>>>> backup/recovery obviously causes extra IO, possible introducing race >>>>>> conditions? The machine did run for about 12h without issues, and in total >>>>>> for about 36h. >>>>>> 3. since only the client (maybe only gfapi?) reports errors, >>>>>> something is broken there? >>>>>> >>>>>> The volume info; >>>>>> root at lease-06 ~# gluster v info ovirt-backbone-2 >>>>>> >>>>>> Volume Name: ovirt-backbone-2 >>>>>> Type: Distributed-Replicate >>>>>> Volume ID: 85702d35-62c8-4c8c-930d-46f455a8af28 >>>>>> Status: Started >>>>>> Snapshot Count: 0 >>>>>> Number of Bricks: 3 x (2 + 1) = 9 >>>>>> Transport-type: tcp >>>>>> Bricks: >>>>>> Brick1: 10.32.9.7:/data/gfs/bricks/brick1/ovirt-backbone-2 >>>>>> Brick2: 10.32.9.3:/data/gfs/bricks/brick1/ovirt-backbone-2 >>>>>> Brick3: 10.32.9.4:/data/gfs/bricks/bricka/ovirt-backbone-2 (arbiter) >>>>>> Brick4: 10.32.9.8:/data0/gfs/bricks/brick1/ovirt-backbone-2 >>>>>> Brick5: 10.32.9.21:/data0/gfs/bricks/brick1/ovirt-backbone-2 >>>>>> Brick6: 10.32.9.5:/data/gfs/bricks/bricka/ovirt-backbone-2 (arbiter) >>>>>> Brick7: 10.32.9.9:/data0/gfs/bricks/brick1/ovirt-backbone-2 >>>>>> Brick8: 10.32.9.20:/data0/gfs/bricks/brick1/ovirt-backbone-2 >>>>>> Brick9: 10.32.9.6:/data/gfs/bricks/bricka/ovirt-backbone-2 (arbiter) >>>>>> Options Reconfigured: >>>>>> nfs.disable: on >>>>>> transport.address-family: inet >>>>>> performance.quick-read: off >>>>>> performance.read-ahead: off >>>>>> performance.io-cache: off >>>>>> performance.low-prio-threads: 32 >>>>>> network.remote-dio: enable >>>>>> cluster.eager-lock: enable >>>>>> cluster.quorum-type: auto >>>>>> cluster.server-quorum-type: server >>>>>> cluster.data-self-heal-algorithm: full >>>>>> cluster.locking-scheme: granular >>>>>> cluster.shd-max-threads: 8 >>>>>> cluster.shd-wait-qlength: 10000 >>>>>> features.shard: on >>>>>> user.cifs: off >>>>>> storage.owner-uid: 36 >>>>>> storage.owner-gid: 36 >>>>>> features.shard-block-size: 64MB >>>>>> performance.write-behind-window-size: 512MB >>>>>> performance.cache-size: 384MB >>>>>> cluster.brick-multiplex: on >>>>>> >>>>>> The volume status; >>>>>> root at lease-06 ~# gluster v status ovirt-backbone-2 >>>>>> Status of volume: ovirt-backbone-2 >>>>>> Gluster process TCP Port RDMA Port >>>>>> Online Pid >>>>>> >>>>>> ------------------------------------------------------------------------------ >>>>>> Brick 10.32.9.7:/data/gfs/bricks/brick1/ovi >>>>>> rt-backbone-2 49152 0 >>>>>> Y 7727 >>>>>> Brick 10.32.9.3:/data/gfs/bricks/brick1/ovi >>>>>> rt-backbone-2 49152 0 >>>>>> Y 12620 >>>>>> Brick 10.32.9.4:/data/gfs/bricks/bricka/ovi >>>>>> rt-backbone-2 49152 0 >>>>>> Y 8794 >>>>>> Brick 10.32.9.8:/data0/gfs/bricks/brick1/ov >>>>>> irt-backbone-2 49161 0 >>>>>> Y 22333 >>>>>> Brick 10.32.9.21:/data0/gfs/bricks/brick1/o >>>>>> virt-backbone-2 49152 0 >>>>>> Y 15030 >>>>>> Brick 10.32.9.5:/data/gfs/bricks/bricka/ovi >>>>>> rt-backbone-2 49166 0 >>>>>> Y 24592 >>>>>> Brick 10.32.9.9:/data0/gfs/bricks/brick1/ov >>>>>> irt-backbone-2 49153 0 >>>>>> Y 20148 >>>>>> Brick 10.32.9.20:/data0/gfs/bricks/brick1/o >>>>>> virt-backbone-2 49154 0 >>>>>> Y 15413 >>>>>> Brick 10.32.9.6:/data/gfs/bricks/bricka/ovi >>>>>> rt-backbone-2 49152 0 >>>>>> Y 43120 >>>>>> Self-heal Daemon on localhost N/A N/A >>>>>> Y 44587 >>>>>> Self-heal Daemon on 10.201.0.2 N/A N/A >>>>>> Y 8401 >>>>>> Self-heal Daemon on 10.201.0.5 N/A N/A >>>>>> Y 11038 >>>>>> Self-heal Daemon on 10.201.0.8 N/A N/A >>>>>> Y 9513 >>>>>> Self-heal Daemon on 10.32.9.4 N/A N/A >>>>>> Y 23736 >>>>>> Self-heal Daemon on 10.32.9.20 N/A N/A >>>>>> Y 2738 >>>>>> Self-heal Daemon on 10.32.9.3 N/A N/A >>>>>> Y 25598 >>>>>> Self-heal Daemon on 10.32.9.5 N/A N/A >>>>>> Y 511 >>>>>> Self-heal Daemon on 10.32.9.9 N/A N/A >>>>>> Y 23357 >>>>>> Self-heal Daemon on 10.32.9.8 N/A N/A >>>>>> Y 15225 >>>>>> Self-heal Daemon on 10.32.9.7 N/A N/A >>>>>> Y 25781 >>>>>> Self-heal Daemon on 10.32.9.21 N/A N/A >>>>>> Y 5034 >>>>>> >>>>>> Task Status of Volume ovirt-backbone-2 >>>>>> >>>>>> ------------------------------------------------------------------------------ >>>>>> Task : Rebalance >>>>>> ID : 6dfbac43-0125-4568-9ac3-a2c453faaa3d >>>>>> Status : completed >>>>>> >>>>>> gluster version is @3.12.15 and cluster.op-version=31202 >>>>>> >>>>>> =======================>>>>>> >>>>>> It would be nice to know if it's possible to mark the files as not >>>>>> stale or if i should investigate other things? >>>>>> Or should we consider this volume lost? >>>>>> Also checking the code at; >>>>>> github.com/gluster/glusterfs/blob/master/xlators/features/shard/src/shard.c >>>>>> it seems the functions shifted quite some (line 1724 vs. 2243), so maybe >>>>>> it's fixed in a future version? >>>>>> Any thoughts are welcome. >>>>>> >>>>>> Thanks Olaf >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> _______________________________________________ >>>> Gluster-users mailing list >>>> Gluster-users at gluster.org >>>> lists.gluster.org/mailman/listinfo/gluster-users >>> >>>-------------- next part -------------- An HTML attachment was scrubbed... URL: <lists.gluster.org/pipermail/gluster-users/attachments/20190113/6e845f07/attachment.html>