thr3ads.net - Gluster users - [Gluster-users] [Stale file handle] in shard volume [Jan 2019]

Olaf Buitelaar

2019-Jan-02 15:25 UTC

[Gluster-users] [Stale file handle] in shard volume

Hi Nithya,

Thank you for your reply.

The VMs using the gluster volumes keep getting paused/stopped on
errors like these:
[2019-01-02 02:33:44.469132] E [MSGID: 133010]
[shard.c:1724:shard_common_lookup_shards_cbk] 0-ovirt-kube-shard: Lookup on
shard 101487 failed. Base file gfid = a38d64bc-a28b-4ee1-a0bb-f919e7a1022c
[Stale file handle]
[2019-01-02 02:33:44.563288] E [MSGID: 133010]
[shard.c:1724:shard_common_lookup_shards_cbk] 0-ovirt-kube-shard: Lookup on
shard 101488 failed. Base file gfid = a38d64bc-a28b-4ee1-a0bb-f919e7a1022c
[Stale file handle]
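
To turn log entries like these into a list of shard files to inspect on the bricks, something along the following lines might help. This is only a sketch: the regular expression is fitted to the two messages above, the `.shard/<base gfid>.<shard number>` naming is taken from the shard paths shown elsewhere in this thread rather than from the gluster source, and `shard_paths_from_log` is a made-up helper name.

```python
import re

# Pattern fitted to the MSGID 133010 messages quoted in this thread
# (an assumption; other gluster versions may word the message differently).
SHARD_ERR = re.compile(
    r"Lookup on shard (?P<shard>\d+) failed\. "
    r"Base file gfid = (?P<gfid>[0-9a-f-]{36})"
)


def shard_paths_from_log(text):
    """Return the brick-relative .shard paths named in failed shard lookups."""
    # Mail clients wrap long log lines, so collapse all whitespace first.
    flat = " ".join(text.split())
    paths = []
    for match in SHARD_ERR.finditer(flat):
        path = ".shard/{}.{}".format(match.group("gfid"), match.group("shard"))
        if path not in paths:  # keep first-seen order, drop repeats
            paths.append(path)
    return paths


if __name__ == "__main__":
    sample = """[2019-01-02 02:33:44.469132] E [MSGID: 133010]
[shard.c:1724:shard_common_lookup_shards_cbk] 0-ovirt-kube-shard: Lookup on
shard 101487 failed. Base file gfid = a38d64bc-a28b-4ee1-a0bb-f919e7a1022c
[Stale file handle]"""
    for path in shard_paths_from_log(sample):
        print(path)
```

The resulting paths are relative to a brick root, so each one still has to be checked on every brick of the volume.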


What I'm trying to find out is whether I can purge all possible stale file
handles from all gluster volumes (and hopefully find a way to prevent this
in the future), so the VMs can run stably again.
For this I need to know when the "shard_common_lookup_shards_cbk" function
considers a file stale.
The statement "Stale file handle errors show up when a file with a
specified gfid is not found." doesn't seem to cover it all: as I've shown
in earlier mails, the shard file and the .glusterfs/xx/xx/uuid file both
exist and have the same inode.
If the criteria I'm using aren't correct, could you please tell me which
criteria I should use to determine whether a file is stale?
These criteria are based only on observations I made while moving the stale
files manually. After removing them I was able to start the VM again, until
some time later it hung on another stale shard file, unfortunately.
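
The three characteristics discussed in this thread (only the sticky bit set, zero size, and a trusted.glusterfs.dht.linkto attribute) can be written down as a read-only check, roughly as below. This is a sketch under assumptions: the function names are invented for illustration, it only reports and never (re)moves anything, and a match by itself does not show that a file is stale, since the same characteristics also describe ordinary DHT link files.

```python
import os
import stat

LINKTO_XATTR = "trusted.glusterfs.dht.linkto"


def looks_like_linkto(mode, size, xattr_names):
    """True if a brick file shows all three observed characteristics."""
    # stat prints mode 1000 as ---------T: sticky bit and nothing else.
    only_sticky = stat.S_IMODE(mode) == stat.S_ISVTX
    return (
        stat.S_ISREG(mode)
        and only_sticky
        and size == 0
        and LINKTO_XATTR in xattr_names
    )


def check_brick_file(path):
    """Read-only check of one file on a brick.

    Linux only; trusted.* xattrs are visible to root only.
    """
    st = os.lstat(path)
    names = os.listxattr(path, follow_symlinks=False)
    return looks_like_linkto(st.st_mode, st.st_size, names)
```

Keeping the decision in `looks_like_linkto` separate from the filesystem access makes it easy to try the rule against values copied from `stat` and `getfattr` output before running anything on a live brick.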

Thanks Olaf

On Wed, 2 Jan 2019 at 14:20, Nithya Balachandran <nbalacha at redhat.com>
wrote:
>
>
> On Mon, 31 Dec 2018 at 01:27, Olaf Buitelaar <olaf.buitelaar at gmail.com>
> wrote:
>
>> Dear All,
>>
>> Until now, a select group of VMs still seems to produce new stale files
>> and gets paused because of it.
>> I've not updated gluster recently; however, I did change the op-version
>> from 31200 to 31202 about a week before this issue arose.
>> Looking at the .shard directory, I have 100,000+ files sharing the same
>> characteristics as the stale files found so far:
>> they all have the sticky bit set (file permissions ---------T), are
>> 0 KB in size, and have the trusted.glusterfs.dht.linkto attribute.
>>
>
> These are internal files used by gluster and do not necessarily mean they
> are stale. They "point" to data files which may be on different bricks
> (same name, gfid etc. but no linkto xattr and no ----T permissions).
>
>
>> These files range from long ago (the beginning of the year) until now,
>> which makes me suspect this was lying dormant for some time and somehow
>> surfaced only recently.
>> Checking other sub-volumes, they also contain 0 KB files in the .shard
>> directory, but those don't have the sticky bit or the linkto attribute.
>>
>> Does anybody else experience this issue? Could this be a bug or an
>> environmental issue?
>>
> These are most likely valid files- please do not delete them without
> double-checking.
>
> Stale file handle errors show up when a file with a specified gfid is not
> found. You will need to debug the files for which you see this error by
> checking the bricks to see if they actually exist.
>
>>
>> Also, I wonder if there is any tool or gluster command to clean up all
>> stale file handles?
>> Otherwise I'm planning to write a simple bash script that iterates over
>> the .shard dir, checks each file against the above-mentioned criteria, and
>> (re)moves the file and the corresponding .glusterfs file.
>> If other criteria are needed to identify a stale file handle, I would
>> like to hear them.
>> That is, if this is a viable and safe operation, of course.
>>
>> Thanks Olaf
>>
>>
>>
>> On Thu, 20 Dec 2018 at 13:43, Olaf Buitelaar <olaf.buitelaar at gmail.com>
>> wrote:
>>
>>> Dear All,
>>>
>>> I figured it out; it appears to be the exact same issue as described
>>> here:
>>> https://lists.gluster.org/pipermail/gluster-users/2018-March/033785.html
>>> Another subvolume also had the shard file, only there the copies were
>>> all 0 bytes and had the dht.linkto attribute.
>>>
>>> for reference;
>>> [root at lease-04 ovirt-backbone-2]# getfattr -d -m . -e hex .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>>> # file: .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
>>> trusted.gfid=0x298147e49f9748b2baf1c8fff897244d
>>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030
>>> trusted.glusterfs.dht.linkto=0x6f766972742d6261636b626f6e652d322d7265706c69636174652d3100
>>>
>>> [root at lease-04 ovirt-backbone-2]# getfattr -d -m . -e hex .glusterfs/29/81/298147e4-9f97-48b2-baf1-c8fff897244d
>>> # file: .glusterfs/29/81/298147e4-9f97-48b2-baf1-c8fff897244d
>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
>>> trusted.gfid=0x298147e49f9748b2baf1c8fff897244d
>>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030
>>> trusted.glusterfs.dht.linkto=0x6f766972742d6261636b626f6e652d322d7265706c69636174652d3100
>>>
>>> [root at lease-04 ovirt-backbone-2]# stat .glusterfs/29/81/298147e4-9f97-48b2-baf1-c8fff897244d
>>>   File: '.glusterfs/29/81/298147e4-9f97-48b2-baf1-c8fff897244d'
>>>   Size: 0               Blocks: 0          IO Block: 4096   regular empty file
>>> Device: fd01h/64769d    Inode: 1918631406  Links: 2
>>> Access: (1000/---------T)  Uid: (    0/    root)   Gid: (    0/    root)
>>> Context: system_u:object_r:etc_runtime_t:s0
>>> Access: 2018-12-17 21:43:36.405735296 +0000
>>> Modify: 2018-12-17 21:43:36.405735296 +0000
>>> Change: 2018-12-17 21:43:36.405735296 +0000
>>>  Birth: -
>>>
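
The hex values in the getfattr output above are easier to compare once decoded. The following is a small sketch with made-up helper names; the `.glusterfs/<first two hex digits>/<next two>/<gfid>` layout is inferred from the paths shown in this thread, not taken from gluster documentation.

```python
import uuid


def xattr_bytes(hex_value):
    """Decode a value printed by `getfattr -e hex` (0x prefix optional)."""
    digits = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    return bytes.fromhex(digits)


def xattr_text(hex_value):
    """Decode a string-valued xattr, dropping a trailing NUL if present."""
    return xattr_bytes(hex_value).rstrip(b"\x00").decode("utf-8")


def gfid_backend_path(gfid_hex):
    """Map a trusted.gfid value to its brick-relative .glusterfs path."""
    gfid = str(uuid.UUID(bytes=xattr_bytes(gfid_hex)))
    return ".glusterfs/{}/{}/{}".format(gfid[0:2], gfid[2:4], gfid)


if __name__ == "__main__":
    print(xattr_text(
        "0x6f766972742d6261636b626f6e652d322d7265706c69636174652d3100"
    ))
    print(gfid_backend_path("0x298147e49f9748b2baf1c8fff897244d"))
```

Decoded this way, the trusted.glusterfs.dht.linkto value on lease-04 reads `ovirt-backbone-2-replicate-1`, and the trusted.gfid value maps to the `.glusterfs/29/81/...` path used in the stat command above.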
>>> Removing the shard file and the .glusterfs file from each node resolved
>>> the issue.
>>>
>>> I also found this thread:
>>> https://lists.gluster.org/pipermail/gluster-users/2018-December/035460.html
>>> Maybe he suffers from the same issue.
>>>
>>> Best Olaf
>>>
>>>
>>> On Wed, 19 Dec 2018 at 21:56, Olaf Buitelaar <olaf.buitelaar at gmail.com>
>>> wrote:
>>>
>>>> Dear All,
>>>>
>>>> It appears I have stale file handles on 2 files in one of the volumes.
>>>> These files are qemu images (1 raw and 1 qcow2).
>>>> I'll focus on just 1 file, since the situation on the other seems the
>>>> same.
>>>>
>>>> The VM gets paused more or less directly after being booted, with this
>>>> error:
>>>> [2018-12-18 14:05:05.275713] E [MSGID: 133010] [shard.c:1724:shard_common_lookup_shards_cbk] 0-ovirt-backbone-2-shard: Lookup on shard 51500 failed. Base file gfid = f28cabcb-d169-41fc-a633-9bef4c4a8e40 [Stale file handle]
>>>>
>>>> Investigating the shard:
>>>>
>>>> #on the arbiter node:
>>>>
>>>> [root at lease-05 ovirt-backbone-2]# getfattr -n glusterfs.gfid.string /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425
>>>> getfattr: Removing leading '/' from absolute path names
>>>> # file: mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425
>>>> glusterfs.gfid.string="f28cabcb-d169-41fc-a633-9bef4c4a8e40"
>>>>
>>>> [root at lease-05 ovirt-backbone-2]# getfattr -d -m . -e hex .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>>>> # file: .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
>>>> trusted.afr.dirty=0x000000000000000000000000
>>>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0
>>>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030
>>>>
>>>> [root at lease-05 ovirt-backbone-2]# getfattr -d -m . -e hex .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>> # file: .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
>>>> trusted.afr.dirty=0x000000000000000000000000
>>>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0
>>>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030
>>>>
>>>> [root at lease-05 ovirt-backbone-2]# stat .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>>   File: '.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0'
>>>>   Size: 0               Blocks: 0          IO Block: 4096   regular empty file
>>>> Device: fd01h/64769d    Inode: 537277306   Links: 2
>>>> Access: (0660/-rw-rw----)  Uid: (    0/    root)   Gid: (    0/    root)
>>>> Context: system_u:object_r:etc_runtime_t:s0
>>>> Access: 2018-12-17 21:43:36.361984810 +0000
>>>> Modify: 2018-12-17 21:43:36.361984810 +0000
>>>> Change: 2018-12-18 20:55:29.908647417 +0000
>>>>  Birth: -
>>>>
>>>> [root at lease-05 ovirt-backbone-2]# find . -inum 537277306
>>>> ./.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>> ./.shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>>>>
>>>> #on the data nodes:
>>>>
>>>> [root at lease-08 ~]# getfattr -n glusterfs.gfid.string /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425
>>>> getfattr: Removing leading '/' from absolute path names
>>>> # file: mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425
>>>> glusterfs.gfid.string="f28cabcb-d169-41fc-a633-9bef4c4a8e40"
>>>>
>>>> [root at lease-08 ovirt-backbone-2]# getfattr -d -m . -e hex .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>>>> # file: .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
>>>> trusted.afr.dirty=0x000000000000000000000000
>>>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0
>>>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030
>>>>
>>>> [root at lease-08 ovirt-backbone-2]# getfattr -d -m . -e hex .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>> # file: .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
>>>> trusted.afr.dirty=0x000000000000000000000000
>>>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0
>>>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030
>>>>
>>>> [root at lease-08 ovirt-backbone-2]# stat .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>>   File: '.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0'
>>>>   Size: 2166784         Blocks: 4128       IO Block: 4096   regular file
>>>> Device: fd03h/64771d    Inode: 12893624759  Links: 3
>>>> Access: (0660/-rw-rw----)  Uid: (    0/    root)   Gid: (    0/    root)
>>>> Context: system_u:object_r:etc_runtime_t:s0
>>>> Access: 2018-12-18 18:52:38.070776585 +0000
>>>> Modify: 2018-12-17 21:43:36.388054443 +0000
>>>> Change: 2018-12-18 21:01:47.810506528 +0000
>>>>  Birth: -
>>>>
>>>> [root at lease-08 ovirt-backbone-2]# find . -inum 12893624759
>>>> ./.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>> ./.shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>>>>
>>>> =======================
>>>>
>>>> [root at lease-11 ovirt-backbone-2]# getfattr -n glusterfs.gfid.string /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425
>>>> getfattr: Removing leading '/' from absolute path names
>>>> # file: mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425
>>>> glusterfs.gfid.string="f28cabcb-d169-41fc-a633-9bef4c4a8e40"
>>>>
>>>> [root at lease-11 ovirt-backbone-2]# getfattr -d -m . -e hex .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>>>> # file: .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
>>>> trusted.afr.dirty=0x000000000000000000000000
>>>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0
>>>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030
>>>>
>>>> [root at lease-11 ovirt-backbone-2]# getfattr -d -m . -e hex .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>> # file: .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
>>>> trusted.afr.dirty=0x000000000000000000000000
>>>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0
>>>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030
>>>>
>>>> [root at lease-11 ovirt-backbone-2]# stat .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>>   File: '.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0'
>>>>   Size: 2166784         Blocks: 4128       IO Block: 4096   regular file
>>>> Device: fd03h/64771d    Inode: 12956094809  Links: 3
>>>> Access: (0660/-rw-rw----)  Uid: (    0/    root)   Gid: (    0/    root)
>>>> Context: system_u:object_r:etc_runtime_t:s0
>>>> Access: 2018-12-18 20:11:53.595208449 +0000
>>>> Modify: 2018-12-17 21:43:36.391580259 +0000
>>>> Change: 2018-12-18 19:19:25.888055392 +0000
>>>>  Birth: -
>>>>
>>>> [root at lease-11 ovirt-backbone-2]# find . -inum 12956094809
>>>> ./.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>> ./.shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>>>>
>>>> ===============
>>>>
>>>> I don't really see any inconsistencies, except for the dates in the stat
>>>> output. However, those differ only because I tried moving the file out of
>>>> the volumes to force a heal, which does happen on the data nodes, but not
>>>> on the arbiter node. Before that they were also the same.
>>>> I've also compared the file
>>>> ./.shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 on the 2 data nodes,
>>>> and the copies are exactly the same.
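
Comparing the getfattr dumps by eye is error-prone with values this long, so a small diff helper may be useful. This is a hedged sketch: `parse_getfattr` and `differing_xattrs` are invented names, and the parser only understands the plain `name=value` lines that `getfattr -d -e hex` prints in this thread.

```python
def parse_getfattr(dump):
    """Parse `getfattr -d -e hex` output into a {name: value} dict."""
    attrs = {}
    for line in dump.splitlines():
        line = line.strip()
        # Skip blank lines and the "# file: ..." header.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        attrs[name] = value
    return attrs


def differing_xattrs(per_node):
    """Given {node: {name: value}}, return the xattrs that are not identical.

    An xattr missing on some node counts as a difference (value None).
    """
    names = set()
    for attrs in per_node.values():
        names.update(attrs)
    diffs = {}
    for name in sorted(names):
        values = {node: attrs.get(name) for node, attrs in per_node.items()}
        if len(set(values.values())) > 1:
            diffs[name] = values
    return diffs
```

Fed the three dumps from lease-05, lease-08 and lease-11 it should report no differences; adding the lease-04 dump quoted elsewhere in this thread would flag trusted.gfid (0x2981... versus 0x1f86...), along with the xattrs that exist on only one side.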
>>>>
>>>> Things I've tried further:
>>>> - gluster v heal ovirt-backbone-2 full => gluster v heal
>>>> ovirt-backbone-2 info reports 0 entries on all nodes
>>>>
>>>> - stopping glusterd and glusterfsd on each node, pausing around 40 sec
>>>> and starting them again, 1 node at a time, waiting for the heal to
>>>> finish before moving to the next node
>>>>
>>>> - forcing a heal by stopping glusterd on a node and performing these
>>>> steps:
>>>> mkdir /mnt/ovirt-backbone-2/trigger
>>>> rmdir /mnt/ovirt-backbone-2/trigger
>>>> setfattr -n trusted.non-existent-key -v abc /mnt/ovirt-backbone-2/
>>>> setfattr -x trusted.non-existent-key /mnt/ovirt-backbone-2/
>>>> start glusterd
>>>>
>>>> - gluster volume rebalance ovirt-backbone-2 start => success
>>>>
>>>> What's further interesting is that, according to the mount log, the
>>>> volume is in split-brain:
>>>> [2018-12-18 10:06:04.606870] E [MSGID: 108008] [afr-read-txn.c:90:afr_read_txn_refresh_done] 0-ovirt-backbone-2-replicate-2: Failing FSTAT on gfid 2a57d87d-fe49-4034-919b-fdb79531bf68: split-brain observed. [Input/output error]
>>>> [2018-12-18 10:06:04.606908] E [MSGID: 133014] [shard.c:1248:shard_common_stat_cbk] 0-ovirt-backbone-2-shard: stat failed: 2a57d87d-fe49-4034-919b-fdb79531bf68 [Input/output error]
>>>> [2018-12-18 10:06:04.606927] W [fuse-bridge.c:871:fuse_attr_cbk] 0-glusterfs-fuse: 428090: FSTAT() /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids => -1 (Input/output error)
>>>> [2018-12-18 10:06:05.107729] E [MSGID: 108008] [afr-read-txn.c:90:afr_read_txn_refresh_done] 0-ovirt-backbone-2-replicate-2: Failing FSTAT on gfid 2a57d87d-fe49-4034-919b-fdb79531bf68: split-brain observed. [Input/output error]
>>>> [2018-12-18 10:06:05.107770] E [MSGID: 133014] [shard.c:1248:shard_common_stat_cbk] 0-ovirt-backbone-2-shard: stat failed: 2a57d87d-fe49-4034-919b-fdb79531bf68 [Input/output error]
>>>> [2018-12-18 10:06:05.107791] W [fuse-bridge.c:871:fuse_attr_cbk] 0-glusterfs-fuse: 428091: FSTAT() /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids => -1 (Input/output error)
>>>> [2018-12-18 10:06:05.537244] I [MSGID: 108006] [afr-common.c:5494:afr_local_init] 0-ovirt-backbone-2-replicate-1: no subvolumes up
>>>> [2018-12-18 10:06:05.538523] E [MSGID: 108008] [afr-read-txn.c:90:afr_read_txn_refresh_done] 0-ovirt-backbone-2-replicate-2: Failing STAT on gfid 00000000-0000-0000-0000-000000000001: split-brain observed. [Input/output error]
>>>> [2018-12-18 10:06:05.538685] I [MSGID: 108006] [afr-common.c:5494:afr_local_init] 0-ovirt-backbone-2-replicate-1: no subvolumes up
>>>> [2018-12-18 10:06:05.538794] I [MSGID: 108006] [afr-common.c:5494:afr_local_init] 0-ovirt-backbone-2-replicate-1: no subvolumes up
>>>> [2018-12-18 10:06:05.539342] I [MSGID: 109063] [dht-layout.c:716:dht_layout_normalize] 0-ovirt-backbone-2-dht: Found anomalies in /b1c2c949-aef4-4aec-999b-b179efeef732 (gfid 8c8598ce-1a52-418e-a7b4-435fee34bae8). Holes=2 overlaps=0
>>>> [2018-12-18 10:06:05.539372] W [MSGID: 109005] [dht-selfheal.c:2158:dht_selfheal_directory] 0-ovirt-backbone-2-dht: Directory selfheal failed: 2 subvolumes down.Not fixing. path /b1c2c949-aef4-4aec-999b-b179efeef732, gfid = 8c8598ce-1a52-418e-a7b4-435fee34bae8
>>>> [2018-12-18 10:06:05.539694] I [MSGID: 108006] [afr-common.c:5494:afr_local_init] 0-ovirt-backbone-2-replicate-1: no subvolumes up
>>>> [2018-12-18 10:06:05.540652] I [MSGID: 108006] [afr-common.c:5494:afr_local_init] 0-ovirt-backbone-2-replicate-1: no subvolumes up
>>>> [2018-12-18 10:06:05.608612] E [MSGID: 108008] [afr-read-txn.c:90:afr_read_txn_refresh_done] 0-ovirt-backbone-2-replicate-2: Failing FSTAT on gfid 2a57d87d-fe49-4034-919b-fdb79531bf68: split-brain observed. [Input/output error]
>>>> [2018-12-18 10:06:05.608657] E [MSGID: 133014] [shard.c:1248:shard_common_stat_cbk] 0-ovirt-backbone-2-shard: stat failed: 2a57d87d-fe49-4034-919b-fdb79531bf68 [Input/output error]
>>>> [2018-12-18 10:06:05.608672] W [fuse-bridge.c:871:fuse_attr_cbk] 0-glusterfs-fuse: 428096: FSTAT() /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids => -1 (Input/output error)
>>>> [2018-12-18 10:06:06.109339] E [MSGID: 108008] [afr-read-txn.c:90:afr_read_txn_refresh_done] 0-ovirt-backbone-2-replicate-2: Failing FSTAT on gfid 2a57d87d-fe49-4034-919b-fdb79531bf68: split-brain observed. [Input/output error]
>>>> [2018-12-18 10:06:06.109378] E [MSGID: 133014] [shard.c:1248:shard_common_stat_cbk] 0-ovirt-backbone-2-shard: stat failed: 2a57d87d-fe49-4034-919b-fdb79531bf68 [Input/output error]
>>>> [2018-12-18 10:06:06.109399] W [fuse-bridge.c:871:fuse_attr_cbk] 0-glusterfs-fuse: 428097: FSTAT() /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids => -1 (Input/output error)
>>>>
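
A mount log excerpt like the one above is repetitive, and a quick tally per message type makes it easier to see which translator is complaining. The sketch below is an assumption-laden helper, not a gluster tool: the entry pattern is fitted to the lines quoted in this mail, and `tally_log` is a made-up name. It tolerates mail-style wrapping and leading ">" quote markers.

```python
import re
from collections import Counter

# Header of one log entry: timestamp, level, optional MSGID, source location.
ENTRY = re.compile(
    r"\[\d{4}-\d{2}-\d{2} [\d:.]+\] (?P<level>[A-Z]) "
    r"(?:\[MSGID: (?P<msgid>\d+)\] )?"
    r"\[(?P<src>[^\]]+)\]"
)


def tally_log(text):
    """Count entries per (level, msgid, function); msgid is '-' if absent."""
    # Drop quote markers at line starts, then undo the line wrapping.
    lines = (re.sub(r"^[> ]+", "", line) for line in text.splitlines())
    flat = " ".join(" ".join(lines).split())
    counts = Counter()
    for match in ENTRY.finditer(flat):
        func = match.group("src").split(":")[-1]
        counts[(match.group("level"), match.group("msgid") or "-", func)] += 1
    return counts


if __name__ == "__main__":
    import sys
    for key, count in tally_log(sys.stdin.read()).most_common():
        print(count, *key)
```

Run against a saved copy of the log, it prints one line per message type with its count, most frequent first.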
>>>> Note that I'm able to see
>>>> /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids:
>>>> [root at lease-11 ovirt-backbone-2]# stat /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids
>>>>   File: '/mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids'
>>>>   Size: 1048576         Blocks: 2048       IO Block: 131072 regular file
>>>> Device: 41h/65d Inode: 10492258721813610344  Links: 1
>>>> Access: (0660/-rw-rw----)  Uid: (   36/    vdsm)   Gid: (   36/     kvm)
>>>> Context: system_u:object_r:fusefs_t:s0
>>>> Access: 2018-12-19 20:07:39.917573869 +0000
>>>> Modify: 2018-12-19 20:07:39.928573917 +0000
>>>> Change: 2018-12-19 20:07:39.929573921 +0000
>>>>  Birth: -
>>>>
>>>> However, gluster v heal ovirt-backbone-2 info split-brain reports no
>>>> entries.
>>>>
>>>> I've also tried mounting the qemu image, and this works fine; I'm able
>>>> to see all contents:
>>>>  losetup /dev/loop0 /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425
>>>>  kpartx -a /dev/loop0
>>>>  vgscan
>>>>  vgchange -ay slave-data
>>>>  mkdir /mnt/slv01
>>>>  mount /dev/mapper/slave--data-lvol0 /mnt/slv01/
>>>>
>>>> Possible causes for this issue:
>>>> 1. The machine "lease-11" suffered from a faulty RAM module (ECC),
>>>> which halted the machine and caused an invalid state. (This machine
>>>> also hosts other volumes with similar configurations, which report no
>>>> issues.)
>>>> 2. After the RAM module was replaced, the VM using the backing qemu
>>>> image was restored from a backup (the backup was file-based, within the
>>>> VM, on a different directory), because some files were corrupted. The
>>>> backup/recovery obviously causes extra IO, possibly introducing race
>>>> conditions? The machine did run for about 12h without issues, and in
>>>> total for about 36h.
>>>> 3. Since only the client (maybe only gfapi?) reports errors, something
>>>> is broken there?
>>>>
>>>> The volume info;
>>>> root at lease-06 ~# gluster v info ovirt-backbone-2
>>>>
>>>> Volume Name: ovirt-backbone-2
>>>> Type: Distributed-Replicate
>>>> Volume ID: 85702d35-62c8-4c8c-930d-46f455a8af28
>>>> Status: Started
>>>> Snapshot Count: 0
>>>> Number of Bricks: 3 x (2 + 1) = 9
>>>> Transport-type: tcp
>>>> Bricks:
>>>> Brick1: 10.32.9.7:/data/gfs/bricks/brick1/ovirt-backbone-2
>>>> Brick2: 10.32.9.3:/data/gfs/bricks/brick1/ovirt-backbone-2
>>>> Brick3: 10.32.9.4:/data/gfs/bricks/bricka/ovirt-backbone-2 (arbiter)
>>>> Brick4: 10.32.9.8:/data0/gfs/bricks/brick1/ovirt-backbone-2
>>>> Brick5: 10.32.9.21:/data0/gfs/bricks/brick1/ovirt-backbone-2
>>>> Brick6: 10.32.9.5:/data/gfs/bricks/bricka/ovirt-backbone-2 (arbiter)
>>>> Brick7: 10.32.9.9:/data0/gfs/bricks/brick1/ovirt-backbone-2
>>>> Brick8: 10.32.9.20:/data0/gfs/bricks/brick1/ovirt-backbone-2
>>>> Brick9: 10.32.9.6:/data/gfs/bricks/bricka/ovirt-backbone-2 (arbiter)
>>>> Options Reconfigured:
>>>> nfs.disable: on
>>>> transport.address-family: inet
>>>> performance.quick-read: off
>>>> performance.read-ahead: off
>>>> performance.io-cache: off
>>>> performance.low-prio-threads: 32
>>>> network.remote-dio: enable
>>>> cluster.eager-lock: enable
>>>> cluster.quorum-type: auto
>>>> cluster.server-quorum-type: server
>>>> cluster.data-self-heal-algorithm: full
>>>> cluster.locking-scheme: granular
>>>> cluster.shd-max-threads: 8
>>>> cluster.shd-wait-qlength: 10000
>>>> features.shard: on
>>>> user.cifs: off
>>>> storage.owner-uid: 36
>>>> storage.owner-gid: 36
>>>> features.shard-block-size: 64MB
>>>> performance.write-behind-window-size: 512MB
>>>> performance.cache-size: 384MB
>>>> cluster.brick-multiplex: on
>>>>
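
Since the log messages and the linkto xattr refer to subvolumes by name (replicate-1, replicate-2), it helps to map those names back to bricks. The sketch below rests on an assumption that is not confirmed anywhere in this thread: that the replicate subvolumes are numbered from 0 and take the bricks in the order `gluster v info` lists them, three per set for this 3 x (2 + 1) volume. `replica_sets` is a hypothetical helper.

```python
def replica_sets(volname, bricks, set_size):
    """Group bricks, in listed order, into consecutive replica sets.

    Assumes subvolume names of the form <volname>-replicate-<n>, n from 0.
    """
    if len(bricks) % set_size:
        raise ValueError("brick count is not a multiple of the set size")
    return {
        "{}-replicate-{}".format(volname, i // set_size): bricks[i:i + set_size]
        for i in range(0, len(bricks), set_size)
    }


if __name__ == "__main__":
    bricks = [
        "10.32.9.7:/data/gfs/bricks/brick1/ovirt-backbone-2",
        "10.32.9.3:/data/gfs/bricks/brick1/ovirt-backbone-2",
        "10.32.9.4:/data/gfs/bricks/bricka/ovirt-backbone-2",
        "10.32.9.8:/data0/gfs/bricks/brick1/ovirt-backbone-2",
        "10.32.9.21:/data0/gfs/bricks/brick1/ovirt-backbone-2",
        "10.32.9.5:/data/gfs/bricks/bricka/ovirt-backbone-2",
        "10.32.9.9:/data0/gfs/bricks/brick1/ovirt-backbone-2",
        "10.32.9.20:/data0/gfs/bricks/brick1/ovirt-backbone-2",
        "10.32.9.6:/data/gfs/bricks/bricka/ovirt-backbone-2",
    ]
    for name, members in replica_sets("ovirt-backbone-2", bricks, 3).items():
        print(name, members)
```

Under that assumption, ovirt-backbone-2-replicate-1 would be Brick4 to Brick6 (10.32.9.8, 10.32.9.21 and the arbiter on 10.32.9.5).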
>>>> The volume status;
>>>> root at lease-06 ~# gluster v status ovirt-backbone-2
>>>> Status of volume: ovirt-backbone-2
>>>> Gluster process                                             TCP Port  RDMA Port  Online  Pid
>>>> ------------------------------------------------------------------------------
>>>> Brick 10.32.9.7:/data/gfs/bricks/brick1/ovirt-backbone-2    49152     0          Y       7727
>>>> Brick 10.32.9.3:/data/gfs/bricks/brick1/ovirt-backbone-2    49152     0          Y       12620
>>>> Brick 10.32.9.4:/data/gfs/bricks/bricka/ovirt-backbone-2    49152     0          Y       8794
>>>> Brick 10.32.9.8:/data0/gfs/bricks/brick1/ovirt-backbone-2   49161     0          Y       22333
>>>> Brick 10.32.9.21:/data0/gfs/bricks/brick1/ovirt-backbone-2  49152     0          Y       15030
>>>> Brick 10.32.9.5:/data/gfs/bricks/bricka/ovirt-backbone-2    49166     0          Y       24592
>>>> Brick 10.32.9.9:/data0/gfs/bricks/brick1/ovirt-backbone-2   49153     0          Y       20148
>>>> Brick 10.32.9.20:/data0/gfs/bricks/brick1/ovirt-backbone-2  49154     0          Y       15413
>>>> Brick 10.32.9.6:/data/gfs/bricks/bricka/ovirt-backbone-2    49152     0          Y       43120
>>>> Self-heal Daemon on localhost                               N/A       N/A        Y       44587
>>>> Self-heal Daemon on 10.201.0.2                              N/A       N/A        Y       8401
>>>> Self-heal Daemon on 10.201.0.5                              N/A       N/A        Y       11038
>>>> Self-heal Daemon on 10.201.0.8                              N/A       N/A        Y       9513
>>>> Self-heal Daemon on 10.32.9.4                               N/A       N/A        Y       23736
>>>> Self-heal Daemon on 10.32.9.20                              N/A       N/A        Y       2738
>>>> Self-heal Daemon on 10.32.9.3                               N/A       N/A        Y       25598
>>>> Self-heal Daemon on 10.32.9.5                               N/A       N/A        Y       511
>>>> Self-heal Daemon on 10.32.9.9                               N/A       N/A        Y       23357
>>>> Self-heal Daemon on 10.32.9.8                               N/A       N/A        Y       15225
>>>> Self-heal Daemon on 10.32.9.7                               N/A       N/A        Y       25781
>>>> Self-heal Daemon on 10.32.9.21                              N/A       N/A        Y       5034
>>>>
>>>> Task Status of Volume ovirt-backbone-2
>>>> ------------------------------------------------------------------------------
>>>> Task                 : Rebalance
>>>> ID                   : 6dfbac43-0125-4568-9ac3-a2c453faaa3d
>>>> Status               : completed
>>>>
>>>> The gluster version is 3.12.15 and cluster.op-version=31202.
>>>>
>>>> =======================
>>>>
>>>> It would be nice to know whether it's possible to mark the files as not
>>>> stale, or whether I should investigate other things.
>>>> Or should we consider this volume lost?
>>>> Also, checking the code at
>>>> https://github.com/gluster/glusterfs/blob/master/xlators/features/shard/src/shard.c
>>>> it seems the functions have shifted quite a bit (line 1724 vs. 2243), so
>>>> maybe it's fixed in a later version?
>>>> Any thoughts are welcome.
>>>>
>>>> Thanks Olaf

Nithya Balachandran

2019-Jan-04 06:50 UTC

[Gluster-users] [Stale file handle] in shard volume

Adding Krutika.

On Wed, 2 Jan 2019 at 20:56, Olaf Buitelaar <olaf.buitelaar at gmail.com> wrote:
> Hi Nithya,
>
> Thank you for your reply.
>
> The VMs using the gluster volumes keep getting paused/stopped on
> errors like these:
> [2019-01-02 02:33:44.469132] E [MSGID: 133010]
> [shard.c:1724:shard_common_lookup_shards_cbk] 0-ovirt-kube-shard: Lookup on
> shard 101487 failed. Base file gfid = a38d64bc-a28b-4ee1-a0bb-f919e7a1022c
> [Stale file handle]
> [2019-01-02 02:33:44.563288] E [MSGID: 133010]
> [shard.c:1724:shard_common_lookup_shards_cbk] 0-ovirt-kube-shard: Lookup on
> shard 101488 failed. Base file gfid = a38d64bc-a28b-4ee1-a0bb-f919e7a1022c
> [Stale file handle]
>
Krutika, can you take a look at this?
>
> What i'm trying to find out, if i can purge all gluster volumes from
all
> possible stale file handles (and hopefully find a method to prevent this in
> the future), so the VM's can start running stable again.
> For this i need to know when the "shard_common_lookup_shards_cbk"
function
> considers a file as stale.
> The statement; "Stale file handle errors show up when a file with a
> specified gfid is not found." doesn't seem to cover it all, as
i've shown
> in earlier mails the shard file and glusterfs/xx/xx/uuid file do both
> exist, and have the same inode.
> If the criteria i'm using aren't correct, could you please tell me
which
> criteria i should use to determine if a file is stale or not?
> these criteria are just based observations i made, moving the stale files
> manually. After removing them i was able to start the VM again..until some
> time later it hangs on another stale shard file unfortunate.
>
> Thanks Olaf
>
> Op wo 2 jan. 2019 om 14:20 schreef Nithya Balachandran <
> nbalacha at redhat.com>:
>
>>
>>
>> On Mon, 31 Dec 2018 at 01:27, Olaf Buitelaar <olaf.buitelaar at
gmail.com>
>> wrote:
>>
>>> Dear All,
>>>
>>> till now a selected group of VM's still seem to produce new
stale file's
>>> and getting paused due to this.
>>> I've not updated gluster recently, however i did change the op
version
>>> from 31200 to 31202 about a week before this issue arose.
>>> Looking at the .shard directory, i've 100.000+ files sharing
the same
>>> characteristics as a stale file. which are found till now,
>>> they all have the sticky bit set, e.g. file permissions;
---------T. are
>>> 0kb in size, and have the trusted.glusterfs.dht.linkto attribute.
>>>
>>
>> These are internal files used by gluster and do not necessarily mean
they
>> are stale. They "point" to data files which may be on
different bricks
>> (same name, gfid etc but no linkto xattr and no ----T permissions).
>>
>>
>>> These files range from long a go (beginning of the year) till now.
Which
>>> makes me suspect this was laying dormant for some time now..and
somehow
>>> recently surfaced.
>>> Checking other sub-volumes they contain also 0kb files in the
.shard
>>> directory, but don't have the sticky bit and the linkto
attribute.
>>>
>>> Does anybody else experience this issue? Could this be a bug or an
>>> environmental issue?
>>>
>> These are most likely valid files- please do not delete them without
>> double-checking.
>>
>> Stale file handle errors show up when a file with a specified gfid is
not
>> found. You will need to debug the files for which you see this error by
>> checking the bricks to see if they actually exist.
>>
>>>
>>> Also i wonder if there is any tool or gluster command to clean all
stale
>>> file handles?
>>> Otherwise i'm planning to make a simple bash script, which
iterates over
>>> the .shard dir, checks each file for the above mentioned criteria,
and
>>> (re)moves the file and the corresponding .glusterfs file.
>>> If there are other criteria needed to identify a stale file handle,
i
>>> would like to hear that.
>>> If this is a viable and safe operation to do of course.
>>>
>>> Thanks Olaf
>>>
>>>
>>>
>>> Op do 20 dec. 2018 om 13:43 schreef Olaf Buitelaar <
>>> olaf.buitelaar at gmail.com>:
>>>
>>>> Dear All,
>>>>
>>>> I figured it out, it appeared to be the exact same issue as
described
>>>> here;
>>>>
https://lists.gluster.org/pipermail/gluster-users/2018-March/033785.html
>>>> Another subvolume also had the shard file, only were all 0
bytes and
>>>> had the dht.linkto
>>>>
>>>> for reference;
>>>> [root at lease-04 ovirt-backbone-2]# getfattr -d -m . -e hex .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>>>> # file: .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
>>>> trusted.gfid=0x298147e49f9748b2baf1c8fff897244d
>>>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030
>>>> trusted.glusterfs.dht.linkto=0x6f766972742d6261636b626f6e652d322d7265706c69636174652d3100
>>>>
>>>> [root at lease-04 ovirt-backbone-2]# getfattr -d -m . -e hex .glusterfs/29/81/298147e4-9f97-48b2-baf1-c8fff897244d
>>>> # file: .glusterfs/29/81/298147e4-9f97-48b2-baf1-c8fff897244d
>>>> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
>>>> trusted.gfid=0x298147e49f9748b2baf1c8fff897244d
>>>> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030
>>>> trusted.glusterfs.dht.linkto=0x6f766972742d6261636b626f6e652d322d7265706c69636174652d3100
>>>>
>>>> [root at lease-04 ovirt-backbone-2]# stat .glusterfs/29/81/298147e4-9f97-48b2-baf1-c8fff897244d
>>>>   File: ‘.glusterfs/29/81/298147e4-9f97-48b2-baf1-c8fff897244d’
>>>>   Size: 0               Blocks: 0          IO Block: 4096   regular empty file
>>>> Device: fd01h/64769d    Inode: 1918631406  Links: 2
>>>> Access: (1000/---------T)  Uid: (    0/    root)   Gid: (    0/    root)
>>>> Context: system_u:object_r:etc_runtime_t:s0
>>>> Access: 2018-12-17 21:43:36.405735296 +0000
>>>> Modify: 2018-12-17 21:43:36.405735296 +0000
>>>> Change: 2018-12-17 21:43:36.405735296 +0000
>>>>  Birth: -
>>>>
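The hex xattr values in the output above are plain strings. A minimal sketch for decoding them in bash follows; the helper name `decode_xattr_hex` is hypothetical, and it assumes the value is printable text with an optional trailing NUL, which holds for the linkto and gfid2path values shown here but not for binary values such as trusted.gfid.

```shell
#!/usr/bin/env bash
# Decode a getfattr "-e hex" value (0x...) into text, dropping NUL bytes.
decode_xattr_hex() {
    local hex="${1#0x}" out="" ch byte i
    for ((i = 0; i < ${#hex}; i += 2)); do
        byte="${hex:i:2}"
        [ "$byte" = "00" ] && continue      # skip the NUL terminator
        printf -v ch '%b' "\\x$byte"        # turn the hex pair into one character
        out+="$ch"
    done
    printf '%s\n' "$out"
}

# The linkto value from the lease-04 output above:
decode_xattr_hex 0x6f766972742d6261636b626f6e652d322d7265706c69636174652d3100
# prints: ovirt-backbone-2-replicate-1
```

So on lease-04 the empty ---------T file is a pointer to the subvolume ovirt-backbone-2-replicate-1. Running getfattr with `-e text` instead of `-e hex` shows the same string directly.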
>>>> Removing the shard file and the .glusterfs file from each node resolved
>>>> the issue.
>>>>
>>>> I also found this thread:
>>>> https://lists.gluster.org/pipermail/gluster-users/2018-December/035460.html
>>>> Maybe he suffers from the same issue.
>>>>
>>>> Best Olaf
>>>>
>>>>
>>>> On Wed, 19 Dec 2018 at 21:56, Olaf Buitelaar <
>>>> olaf.buitelaar at gmail.com> wrote:
>>>>
>>>>> Dear All,
>>>>>
>>>>> It appears I have stale file handles in one of the volumes, affecting 2
>>>>> files. These files are qemu images (1 raw and 1 qcow2).
>>>>> I'll just focus on 1 file since the situation on the other seems the
>>>>> same.
>>>>>
>>>>> The VM gets paused more or less directly after being booted, with this
>>>>> error:
>>>>> [2018-12-18 14:05:05.275713] E [MSGID: 133010]
>>>>> [shard.c:1724:shard_common_lookup_shards_cbk] 0-ovirt-backbone-2-shard:
>>>>> Lookup on shard 51500 failed. Base file gfid
>>>>> f28cabcb-d169-41fc-a633-9bef4c4a8e40 [Stale file handle]
>>>>>
>>>>> investigating the shard;
>>>>>
>>>>> #on the arbiter node:
>>>>>
>>>>> [root at lease-05 ovirt-backbone-2]# getfattr -n
glusterfs.gfid.string
>>>>>
/mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425
>>>>> getfattr: Removing leading '/' from absolute path
names
>>>>> # file:
>>>>>
mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425
>>>>>
glusterfs.gfid.string="f28cabcb-d169-41fc-a633-9bef4c4a8e40"
>>>>>
>>>>> [root at lease-05 ovirt-backbone-2]# getfattr -d -m . -e
hex
>>>>> .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>>>>> # file: .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>>>>>
>>>>>
security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
>>>>> trusted.afr.dirty=0x000000000000000000000000
>>>>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0
>>>>>
>>>>>
trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030
>>>>>
>>>>> [root at lease-05 ovirt-backbone-2]# getfattr -d -m . -e
hex
>>>>> .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>>> # file:
.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>>>
>>>>>
security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
>>>>> trusted.afr.dirty=0x000000000000000000000000
>>>>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0
>>>>>
>>>>>
trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030
>>>>>
>>>>> [root at lease-05 ovirt-backbone-2]# stat
>>>>> .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>>>   File:
?.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0?
>>>>>   Size: 0               Blocks: 0          IO Block: 4096  
regular
>>>>> empty file
>>>>> Device: fd01h/64769d    Inode: 537277306   Links: 2
>>>>> Access: (0660/-rw-rw----)  Uid: (    0/    root)   Gid: (  
0/
>>>>> root)
>>>>> Context: system_u:object_r:etc_runtime_t:s0
>>>>> Access: 2018-12-17 21:43:36.361984810 +0000
>>>>> Modify: 2018-12-17 21:43:36.361984810 +0000
>>>>> Change: 2018-12-18 20:55:29.908647417 +0000
>>>>>  Birth: -
>>>>>
>>>>> [root at lease-05 ovirt-backbone-2]# find . -inum 537277306
>>>>> ./.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>>> ./.shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>>>>>
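The pairing between trusted.gfid and the .glusterfs path in the output above follows a fixed pattern: the first two hex digits and the next two hex digits of the gfid form the two directory levels. A small sketch of that mapping, with hypothetical helper names:

```shell
#!/usr/bin/env bash
# Turn a trusted.gfid hex value (0x + 32 hex digits) into dashed UUID form.
gfid_hex_to_uuid() {
    local h="${1#0x}"
    printf '%s-%s-%s-%s-%s\n' "${h:0:8}" "${h:8:4}" "${h:12:4}" "${h:16:4}" "${h:20:12}"
}

# Build the brick-relative .glusterfs path for a dashed gfid.
gfid_backend_path() {
    local gfid="$1"
    printf '.glusterfs/%s/%s/%s\n' "${gfid:0:2}" "${gfid:2:2}" "$gfid"
}

gfid_backend_path "$(gfid_hex_to_uuid 0x1f86b4328ec6424699aa48cc6d7b5da0)"
# prints: .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
```

For regular files the .glusterfs entry is a hard link to the data file, which is why `find . -inum` above returns both paths.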
>>>>> #on the data nodes:
>>>>>
>>>>> [root at lease-08 ~]# getfattr -n glusterfs.gfid.string
>>>>>
/mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425
>>>>> getfattr: Removing leading '/' from absolute path
names
>>>>> # file:
>>>>>
mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425
>>>>>
glusterfs.gfid.string="f28cabcb-d169-41fc-a633-9bef4c4a8e40"
>>>>>
>>>>> [root at lease-08 ovirt-backbone-2]# getfattr -d -m . -e
hex
>>>>> .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>>>>> # file: .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>>>>>
>>>>>
security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
>>>>> trusted.afr.dirty=0x000000000000000000000000
>>>>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0
>>>>>
>>>>>
trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030
>>>>>
>>>>> [root at lease-08 ovirt-backbone-2]# getfattr -d -m . -e
hex
>>>>> .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>>> # file:
.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>>>
>>>>>
security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
>>>>> trusted.afr.dirty=0x000000000000000000000000
>>>>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0
>>>>>
>>>>>
trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030
>>>>>
>>>>> [root at lease-08 ovirt-backbone-2]# stat
>>>>> .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>>>   File:
?.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0?
>>>>>   Size: 2166784         Blocks: 4128       IO Block: 4096  
regular
>>>>> file
>>>>> Device: fd03h/64771d    Inode: 12893624759  Links: 3
>>>>> Access: (0660/-rw-rw----)  Uid: (    0/    root)   Gid: (  
0/
>>>>> root)
>>>>> Context: system_u:object_r:etc_runtime_t:s0
>>>>> Access: 2018-12-18 18:52:38.070776585 +0000
>>>>> Modify: 2018-12-17 21:43:36.388054443 +0000
>>>>> Change: 2018-12-18 21:01:47.810506528 +0000
>>>>>  Birth: -
>>>>>
>>>>> [root at lease-08 ovirt-backbone-2]# find . -inum
12893624759
>>>>> ./.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>>> ./.shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>>>>>
>>>>> =======================>>>>>
>>>>> [root at lease-11 ovirt-backbone-2]# getfattr -n
glusterfs.gfid.string
>>>>>
/mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425
>>>>> getfattr: Removing leading '/' from absolute path
names
>>>>> # file:
>>>>>
mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425
>>>>>
glusterfs.gfid.string="f28cabcb-d169-41fc-a633-9bef4c4a8e40"
>>>>>
>>>>> [root at lease-11 ovirt-backbone-2]#  getfattr -d -m . -e
hex
>>>>> .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>>>>> # file: .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>>>>>
>>>>>
security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
>>>>> trusted.afr.dirty=0x000000000000000000000000
>>>>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0
>>>>>
>>>>>
trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030
>>>>>
>>>>> [root at lease-11 ovirt-backbone-2]# getfattr -d -m . -e
hex
>>>>> .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>>> # file:
.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>>>
>>>>>
security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
>>>>> trusted.afr.dirty=0x000000000000000000000000
>>>>> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0
>>>>>
>>>>>
trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030
>>>>>
>>>>> [root at lease-11 ovirt-backbone-2]# stat
>>>>> .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>>>   File:
?.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0?
>>>>>   Size: 2166784         Blocks: 4128       IO Block: 4096  
regular
>>>>> file
>>>>> Device: fd03h/64771d    Inode: 12956094809  Links: 3
>>>>> Access: (0660/-rw-rw----)  Uid: (    0/    root)   Gid: (  
0/
>>>>> root)
>>>>> Context: system_u:object_r:etc_runtime_t:s0
>>>>> Access: 2018-12-18 20:11:53.595208449 +0000
>>>>> Modify: 2018-12-17 21:43:36.391580259 +0000
>>>>> Change: 2018-12-18 19:19:25.888055392 +0000
>>>>>  Birth: -
>>>>>
>>>>> [root at lease-11 ovirt-backbone-2]# find . -inum
12956094809
>>>>> ./.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>>>>> ./.shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>>>>>
>>>>> ===============>>>>>
>>>>> I don't really see any inconsistencies, except the
dates on the stat.
>>>>> However this is only after i tried moving the file out of
the volumes to
>>>>> force a heal, which does happen on the data nodes, but not
on the arbiter
>>>>> node. Before that they were also the same.
>>>>> I've also compared the file
>>>>> ./.shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500 on the
2 nodes and they
>>>>> are exactly the same.
>>>>>
>>>>> Things I've further tried:
>>>>> - gluster v heal ovirt-backbone-2 full => gluster v heal
>>>>> ovirt-backbone-2 info reports 0 entries on all nodes
>>>>>
>>>>> - stop each glusterd and glusterfsd, pause around 40sec and start them
>>>>> again on each node, 1 at a time, waiting for the heal to recover before
>>>>> moving to the next node
>>>>>
>>>>> - force a heal by stopping glusterd on a node and performing these steps:
>>>>> mkdir /mnt/ovirt-backbone-2/trigger
>>>>> rmdir /mnt/ovirt-backbone-2/trigger
>>>>> setfattr -n trusted.non-existent-key -v abc /mnt/ovirt-backbone-2/
>>>>> setfattr -x trusted.non-existent-key /mnt/ovirt-backbone-2/
>>>>> start glusterd
>>>>>
>>>>> - gluster volume rebalance ovirt-backbone-2 start => success
>>>>>
>>>>> What's further interesting is that, according to the mount log, the
>>>>> volume is in split-brain:
>>>>> [2018-12-18 10:06:04.606870] E [MSGID: 108008]
>>>>> [afr-read-txn.c:90:afr_read_txn_refresh_done]
>>>>> 0-ovirt-backbone-2-replicate-2: Failing FSTAT on gfid
>>>>> 2a57d87d-fe49-4034-919b-fdb79531bf68: split-brain observed.
[Input/output
>>>>> error]
>>>>> [2018-12-18 10:06:04.606908] E [MSGID: 133014]
>>>>> [shard.c:1248:shard_common_stat_cbk]
0-ovirt-backbone-2-shard: stat failed:
>>>>> 2a57d87d-fe49-4034-919b-fdb79531bf68 [Input/output error]
>>>>> [2018-12-18 10:06:04.606927] W
[fuse-bridge.c:871:fuse_attr_cbk]
>>>>> 0-glusterfs-fuse: 428090: FSTAT()
>>>>> /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids => -1
(Input/output error)
>>>>> [2018-12-18 10:06:05.107729] E [MSGID: 108008]
>>>>> [afr-read-txn.c:90:afr_read_txn_refresh_done]
>>>>> 0-ovirt-backbone-2-replicate-2: Failing FSTAT on gfid
>>>>> 2a57d87d-fe49-4034-919b-fdb79531bf68: split-brain observed.
[Input/output
>>>>> error]
>>>>> [2018-12-18 10:06:05.107770] E [MSGID: 133014]
>>>>> [shard.c:1248:shard_common_stat_cbk]
0-ovirt-backbone-2-shard: stat failed:
>>>>> 2a57d87d-fe49-4034-919b-fdb79531bf68 [Input/output error]
>>>>> [2018-12-18 10:06:05.107791] W
[fuse-bridge.c:871:fuse_attr_cbk]
>>>>> 0-glusterfs-fuse: 428091: FSTAT()
>>>>> /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids => -1
(Input/output error)
>>>>> [2018-12-18 10:06:05.537244] I [MSGID: 108006]
>>>>> [afr-common.c:5494:afr_local_init]
0-ovirt-backbone-2-replicate-1: no
>>>>> subvolumes up
>>>>> [2018-12-18 10:06:05.538523] E [MSGID: 108008]
>>>>> [afr-read-txn.c:90:afr_read_txn_refresh_done]
>>>>> 0-ovirt-backbone-2-replicate-2: Failing STAT on gfid
>>>>> 00000000-0000-0000-0000-000000000001: split-brain observed.
[Input/output
>>>>> error]
>>>>> [2018-12-18 10:06:05.538685] I [MSGID: 108006]
>>>>> [afr-common.c:5494:afr_local_init]
0-ovirt-backbone-2-replicate-1: no
>>>>> subvolumes up
>>>>> [2018-12-18 10:06:05.538794] I [MSGID: 108006]
>>>>> [afr-common.c:5494:afr_local_init]
0-ovirt-backbone-2-replicate-1: no
>>>>> subvolumes up
>>>>> [2018-12-18 10:06:05.539342] I [MSGID: 109063]
>>>>> [dht-layout.c:716:dht_layout_normalize]
0-ovirt-backbone-2-dht: Found
>>>>> anomalies in /b1c2c949-aef4-4aec-999b-b179efeef732 (gfid
>>>>> 8c8598ce-1a52-418e-a7b4-435fee34bae8). Holes=2 overlaps=0
>>>>> [2018-12-18 10:06:05.539372] W [MSGID: 109005]
>>>>> [dht-selfheal.c:2158:dht_selfheal_directory] 0-ovirt-backbone-2-dht:
>>>>> Directory selfheal failed: 2 subvolumes down.Not fixing. path
>>>>> /b1c2c949-aef4-4aec-999b-b179efeef732, gfid
>>>>> 8c8598ce-1a52-418e-a7b4-435fee34bae8
>>>>> [2018-12-18 10:06:05.539694] I [MSGID: 108006]
>>>>> [afr-common.c:5494:afr_local_init]
0-ovirt-backbone-2-replicate-1: no
>>>>> subvolumes up
>>>>> [2018-12-18 10:06:05.540652] I [MSGID: 108006]
>>>>> [afr-common.c:5494:afr_local_init]
0-ovirt-backbone-2-replicate-1: no
>>>>> subvolumes up
>>>>> [2018-12-18 10:06:05.608612] E [MSGID: 108008]
>>>>> [afr-read-txn.c:90:afr_read_txn_refresh_done]
>>>>> 0-ovirt-backbone-2-replicate-2: Failing FSTAT on gfid
>>>>> 2a57d87d-fe49-4034-919b-fdb79531bf68: split-brain observed.
[Input/output
>>>>> error]
>>>>> [2018-12-18 10:06:05.608657] E [MSGID: 133014]
>>>>> [shard.c:1248:shard_common_stat_cbk]
0-ovirt-backbone-2-shard: stat failed:
>>>>> 2a57d87d-fe49-4034-919b-fdb79531bf68 [Input/output error]
>>>>> [2018-12-18 10:06:05.608672] W
[fuse-bridge.c:871:fuse_attr_cbk]
>>>>> 0-glusterfs-fuse: 428096: FSTAT()
>>>>> /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids => -1
(Input/output error)
>>>>> [2018-12-18 10:06:06.109339] E [MSGID: 108008]
>>>>> [afr-read-txn.c:90:afr_read_txn_refresh_done]
>>>>> 0-ovirt-backbone-2-replicate-2: Failing FSTAT on gfid
>>>>> 2a57d87d-fe49-4034-919b-fdb79531bf68: split-brain observed.
[Input/output
>>>>> error]
>>>>> [2018-12-18 10:06:06.109378] E [MSGID: 133014]
>>>>> [shard.c:1248:shard_common_stat_cbk]
0-ovirt-backbone-2-shard: stat failed:
>>>>> 2a57d87d-fe49-4034-919b-fdb79531bf68 [Input/output error]
>>>>> [2018-12-18 10:06:06.109399] W
[fuse-bridge.c:871:fuse_attr_cbk]
>>>>> 0-glusterfs-fuse: 428097: FSTAT()
>>>>> /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids => -1
(Input/output error)
>>>>>
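To see at a glance which gfids the mount log complains about, the "split-brain observed" lines above can be tallied. This is a rough sketch: the function name is hypothetical, and it assumes each log message sits on a single line in the real log file (the wrapping shown here comes from the mail archive).

```shell
#!/usr/bin/env bash
# Read a gluster mount log on stdin and print "<count> <gfid>" for every
# gfid named in a "Failing <FOP> on gfid <gfid>" message, most frequent first.
summarize_failing_gfids() {
    grep -oE 'Failing [A-Z]+ on gfid [0-9a-f-]{36}' \
        | awk '{ count[$5]++ } END { for (g in count) print count[g], g }' \
        | sort -rn
}
```

Usage would be along the lines of `summarize_failing_gfids < mount.log`, where the log file name is a placeholder for the actual FUSE mount log. In the excerpt above this would report gfid 2a57d87d-fe49-4034-919b-fdb79531bf68 four times and the root gfid 00000000-0000-0000-0000-000000000001 once.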
>>>>> # Note: I'm able to see
>>>>> /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids:
>>>>> [root at lease-11 ovirt-backbone-2]# stat
>>>>>
/mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids
>>>>>   File:
>>>>>
?/mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids?
>>>>>   Size: 1048576         Blocks: 2048       IO Block: 131072
regular
>>>>> file
>>>>> Device: 41h/65d Inode: 10492258721813610344  Links: 1
>>>>> Access: (0660/-rw-rw----)  Uid: (   36/    vdsm)   Gid: (  
36/
>>>>> kvm)
>>>>> Context: system_u:object_r:fusefs_t:s0
>>>>> Access: 2018-12-19 20:07:39.917573869 +0000
>>>>> Modify: 2018-12-19 20:07:39.928573917 +0000
>>>>> Change: 2018-12-19 20:07:39.929573921 +0000
>>>>>  Birth: -
>>>>>
>>>>> However, checking gluster v heal ovirt-backbone-2 info split-brain
>>>>> reports no entries.
>>>>>
>>>>> I've also tried mounting the qemu image, and this works fine; I'm able
>>>>> to see all contents:
>>>>>  losetup /dev/loop0 /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425
>>>>>  kpartx -a /dev/loop0
>>>>>  vgscan
>>>>>  vgchange -ay slave-data
>>>>>  mkdir /mnt/slv01
>>>>>  mount /dev/mapper/slave--data-lvol0 /mnt/slv01/
>>>>>
>>>>> Possible causes for this issue:
>>>>> 1. The machine "lease-11" suffered from a faulty RAM module (ECC), which
>>>>> halted the machine and caused an invalid state. (This machine also hosts
>>>>> other volumes, with similar configurations, which report no issue.)
>>>>> 2. After the RAM module was replaced, the VM using the backing qemu
>>>>> image was restored from a backup (the backup was file based, within the
>>>>> VM, on a different directory), because some files were corrupted. The
>>>>> backup/recovery obviously causes extra IO, possibly introducing race
>>>>> conditions? The machine did run for about 12h without issues, and in
>>>>> total for about 36h.
>>>>> 3. Since only the client (maybe only gfapi?) reports errors, something
>>>>> is broken there?
>>>>>
>>>>> The volume info;
>>>>> root at lease-06 ~# gluster v info ovirt-backbone-2
>>>>>
>>>>> Volume Name: ovirt-backbone-2
>>>>> Type: Distributed-Replicate
>>>>> Volume ID: 85702d35-62c8-4c8c-930d-46f455a8af28
>>>>> Status: Started
>>>>> Snapshot Count: 0
>>>>> Number of Bricks: 3 x (2 + 1) = 9
>>>>> Transport-type: tcp
>>>>> Bricks:
>>>>> Brick1: 10.32.9.7:/data/gfs/bricks/brick1/ovirt-backbone-2
>>>>> Brick2: 10.32.9.3:/data/gfs/bricks/brick1/ovirt-backbone-2
>>>>> Brick3: 10.32.9.4:/data/gfs/bricks/bricka/ovirt-backbone-2 (arbiter)
>>>>> Brick4: 10.32.9.8:/data0/gfs/bricks/brick1/ovirt-backbone-2
>>>>> Brick5: 10.32.9.21:/data0/gfs/bricks/brick1/ovirt-backbone-2
>>>>> Brick6: 10.32.9.5:/data/gfs/bricks/bricka/ovirt-backbone-2 (arbiter)
>>>>> Brick7: 10.32.9.9:/data0/gfs/bricks/brick1/ovirt-backbone-2
>>>>> Brick8: 10.32.9.20:/data0/gfs/bricks/brick1/ovirt-backbone-2
>>>>> Brick9: 10.32.9.6:/data/gfs/bricks/bricka/ovirt-backbone-2 (arbiter)
>>>>> Options Reconfigured:
>>>>> nfs.disable: on
>>>>> transport.address-family: inet
>>>>> performance.quick-read: off
>>>>> performance.read-ahead: off
>>>>> performance.io-cache: off
>>>>> performance.low-prio-threads: 32
>>>>> network.remote-dio: enable
>>>>> cluster.eager-lock: enable
>>>>> cluster.quorum-type: auto
>>>>> cluster.server-quorum-type: server
>>>>> cluster.data-self-heal-algorithm: full
>>>>> cluster.locking-scheme: granular
>>>>> cluster.shd-max-threads: 8
>>>>> cluster.shd-wait-qlength: 10000
>>>>> features.shard: on
>>>>> user.cifs: off
>>>>> storage.owner-uid: 36
>>>>> storage.owner-gid: 36
>>>>> features.shard-block-size: 64MB
>>>>> performance.write-behind-window-size: 512MB
>>>>> performance.cache-size: 384MB
>>>>> cluster.brick-multiplex: on
>>>>>
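With features.shard-block-size set to 64MB, the shard number in the error message can be translated into a byte range of the disk image. The sketch below assumes that shard N holds bytes N*block-size up to (N+1)*block-size minus 1 of the base file (with block 0 stored in the base file itself) and that 64MB means 64 * 1024 * 1024 bytes; the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Print the first and last byte offset covered by a shard.
# Arguments: shard index, block size in MiB (defaults to 64).
shard_byte_range() {
    local index="$1" block_mib="${2:-64}"
    local block=$(( block_mib * 1024 * 1024 ))
    local start=$(( index * block ))
    printf '%d %d\n' "$start" "$(( start + block - 1 ))"
}

shard_byte_range 51500 64
# prints: 3456106496000 3456173604863
```

Under those assumptions, the failing shard 51500 sits roughly 3.14 TiB into the image, which can help to judge whether the guest touches that region right after boot.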
>>>>> The volume status;
>>>>> root at lease-06 ~# gluster v status ovirt-backbone-2
>>>>> Status of volume: ovirt-backbone-2
>>>>> Gluster process                             TCP Port  RDMA Port  Online  Pid
>>>>> ------------------------------------------------------------------------------
>>>>> Brick 10.32.9.7:/data/gfs/bricks/brick1/ovi
>>>>> rt-backbone-2                               49152     0          Y       7727
>>>>> Brick 10.32.9.3:/data/gfs/bricks/brick1/ovi
>>>>> rt-backbone-2                               49152     0          Y       12620
>>>>> Brick 10.32.9.4:/data/gfs/bricks/bricka/ovi
>>>>> rt-backbone-2                               49152     0          Y       8794
>>>>> Brick 10.32.9.8:/data0/gfs/bricks/brick1/ov
>>>>> irt-backbone-2                              49161     0          Y       22333
>>>>> Brick 10.32.9.21:/data0/gfs/bricks/brick1/o
>>>>> virt-backbone-2                             49152     0          Y       15030
>>>>> Brick 10.32.9.5:/data/gfs/bricks/bricka/ovi
>>>>> rt-backbone-2                               49166     0          Y       24592
>>>>> Brick 10.32.9.9:/data0/gfs/bricks/brick1/ov
>>>>> irt-backbone-2                              49153     0          Y       20148
>>>>> Brick 10.32.9.20:/data0/gfs/bricks/brick1/o
>>>>> virt-backbone-2                             49154     0          Y       15413
>>>>> Brick 10.32.9.6:/data/gfs/bricks/bricka/ovi
>>>>> rt-backbone-2                               49152     0          Y       43120
>>>>> Self-heal Daemon on localhost               N/A       N/A        Y       44587
>>>>> Self-heal Daemon on 10.201.0.2              N/A       N/A        Y       8401
>>>>> Self-heal Daemon on 10.201.0.5              N/A       N/A        Y       11038
>>>>> Self-heal Daemon on 10.201.0.8              N/A       N/A        Y       9513
>>>>> Self-heal Daemon on 10.32.9.4               N/A       N/A        Y       23736
>>>>> Self-heal Daemon on 10.32.9.20              N/A       N/A        Y       2738
>>>>> Self-heal Daemon on 10.32.9.3               N/A       N/A        Y       25598
>>>>> Self-heal Daemon on 10.32.9.5               N/A       N/A        Y       511
>>>>> Self-heal Daemon on 10.32.9.9               N/A       N/A        Y       23357
>>>>> Self-heal Daemon on 10.32.9.8               N/A       N/A        Y       15225
>>>>> Self-heal Daemon on 10.32.9.7               N/A       N/A        Y       25781
>>>>> Self-heal Daemon on 10.32.9.21              N/A       N/A        Y       5034
>>>>>
>>>>> Task Status of Volume ovirt-backbone-2
>>>>> ------------------------------------------------------------------------------
>>>>> Task                 : Rebalance
>>>>> ID                   : 6dfbac43-0125-4568-9ac3-a2c453faaa3d
>>>>> Status               : completed
>>>>>
>>>>> The gluster version is 3.12.15 and cluster.op-version=31202
>>>>>
>>>>> =======================
>>>>> It would be nice to know if it's possible to mark the files as not
>>>>> stale, or if I should investigate other things?
>>>>> Or should we consider this volume lost?
>>>>> Also, checking the code at
>>>>> https://github.com/gluster/glusterfs/blob/master/xlators/features/shard/src/shard.c
>>>>> it seems the functions have shifted quite a bit (line 1724 vs. 2243), so
>>>>> maybe it's fixed in a future version?
>>>>> Any thoughts are welcome.
>>>>>
>>>>> Thanks Olaf
>>>>>