Dear All,
I figured it out; it appears to be exactly the same issue as described here:
https://lists.gluster.org/pipermail/gluster-users/2018-March/033785.html
Another subvolume also had the shard file, only there it was 0 bytes and carried the trusted.glusterfs.dht.linkto attribute.
For reference:
[root@lease-04 ovirt-backbone-2]# getfattr -d -m . -e hex .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
# file: .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
trusted.gfid=0x298147e49f9748b2baf1c8fff897244d
trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030
trusted.glusterfs.dht.linkto=0x6f766972742d6261636b626f6e652d322d7265706c69636174652d3100
[root@lease-04 ovirt-backbone-2]# getfattr -d -m . -e hex .glusterfs/29/81/298147e4-9f97-48b2-baf1-c8fff897244d
# file: .glusterfs/29/81/298147e4-9f97-48b2-baf1-c8fff897244d
security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
trusted.gfid=0x298147e49f9748b2baf1c8fff897244d
trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030
trusted.glusterfs.dht.linkto=0x6f766972742d6261636b626f6e652d322d7265706c69636174652d3100
[root@lease-04 ovirt-backbone-2]# stat .glusterfs/29/81/298147e4-9f97-48b2-baf1-c8fff897244d
  File: '.glusterfs/29/81/298147e4-9f97-48b2-baf1-c8fff897244d'
  Size: 0         Blocks: 0          IO Block: 4096   regular empty file
Device: fd01h/64769d Inode: 1918631406 Links: 2
Access: (1000/---------T) Uid: ( 0/ root) Gid: ( 0/ root)
Context: system_u:object_r:etc_runtime_t:s0
Access: 2018-12-17 21:43:36.405735296 +0000
Modify: 2018-12-17 21:43:36.405735296 +0000
Change: 2018-12-17 21:43:36.405735296 +0000
Birth: -
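In case it helps anyone reading along: the hex xattr values above decode as plain strings. A quick sketch, assuming xxd is available on the node:

# decode the dht.linkto target (the trailing 00 is a NUL terminator)
echo 6f766972742d6261636b626f6e652d322d7265706c69636174652d3100 | xxd -r -p
# -> ovirt-backbone-2-replicate-1, i.e. the replica subvolume the link file points to

# decode the gfid2path value
echo 62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030 | xxd -r -p
# -> be318638-e8a0-4c6d-977d-7a937aa84806/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500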
Removing the shard file and its .glusterfs hard link from each node resolved the issue.
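Roughly, the cleanup per affected brick looked like this (a reconstruction from memory rather than a verbatim paste; the brick path is an example, adjust it to your layout):

# on each brick holding the stale 0-byte linkto copy, remove the shard
# file and its hard link under .glusterfs (both point at the same inode)
cd /data/gfs/bricks/brick1/ovirt-backbone-2
rm .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
rm .glusterfs/29/81/298147e4-9f97-48b2-baf1-c8fff897244d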
I also found this thread:
https://lists.gluster.org/pipermail/gluster-users/2018-December/035460.html
Maybe he suffers from the same issue.
Best Olaf
On Wed, 19 Dec 2018 at 21:56, Olaf Buitelaar <olaf.buitelaar@gmail.com> wrote:
> Dear All,
>
> It appears I have stale files in one of the volumes, 2 files in total.
> These files are qemu images (1 raw and 1 qcow2).
> I'll just focus on 1 file, since the situation on the other seems the same.
>
> The VM gets paused more or less directly after being booted, with this error:
> [2018-12-18 14:05:05.275713] E [MSGID: 133010]
> [shard.c:1724:shard_common_lookup_shards_cbk] 0-ovirt-backbone-2-shard:
> Lookup on shard 51500 failed. Base file gfid = f28cabcb-d169-41fc-a633-9bef4c4a8e40 [Stale file handle]
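> (As a quick sanity check, not from the original logs: with this volume's
> 64MB shard-block-size, shard 51500 covers the region starting at
> 51500 * 64MiB into the image:)
>
> echo $((51500 * 64 * 1024 * 1024))
> # -> 3456106496000 bytes, i.e. roughly 3.1 TiB into the disk image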
>
> Investigating the shard:
>
> #on the arbiter node:
>
> [root@lease-05 ovirt-backbone-2]# getfattr -n glusterfs.gfid.string /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425
> getfattr: Removing leading '/' from absolute path names
> # file: mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425
> glusterfs.gfid.string="f28cabcb-d169-41fc-a633-9bef4c4a8e40"
>
> [root@lease-05 ovirt-backbone-2]# getfattr -d -m . -e hex .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
> # file: .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0
> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030
>
> [root@lease-05 ovirt-backbone-2]# getfattr -d -m . -e hex .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
> # file: .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0
> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030
>
> [root@lease-05 ovirt-backbone-2]# stat .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>   File: '.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0'
>   Size: 0         Blocks: 0          IO Block: 4096   regular empty file
> Device: fd01h/64769d Inode: 537277306 Links: 2
> Access: (0660/-rw-rw----) Uid: ( 0/ root) Gid: ( 0/ root)
> Context: system_u:object_r:etc_runtime_t:s0
> Access: 2018-12-17 21:43:36.361984810 +0000
> Modify: 2018-12-17 21:43:36.361984810 +0000
> Change: 2018-12-18 20:55:29.908647417 +0000
> Birth: -
>
> [root@lease-05 ovirt-backbone-2]# find . -inum 537277306
> ./.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
> ./.shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>
> #on the data nodes:
>
> [root@lease-08 ~]# getfattr -n glusterfs.gfid.string /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425
> getfattr: Removing leading '/' from absolute path names
> # file: mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425
> glusterfs.gfid.string="f28cabcb-d169-41fc-a633-9bef4c4a8e40"
>
> [root@lease-08 ovirt-backbone-2]# getfattr -d -m . -e hex .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
> # file: .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0
> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030
>
> [root@lease-08 ovirt-backbone-2]# getfattr -d -m . -e hex .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
> # file: .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0
> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030
>
> [root@lease-08 ovirt-backbone-2]# stat .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>   File: '.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0'
>   Size: 2166784   Blocks: 4128       IO Block: 4096   regular file
> Device: fd03h/64771d Inode: 12893624759 Links: 3
> Access: (0660/-rw-rw----) Uid: ( 0/ root) Gid: ( 0/ root)
> Context: system_u:object_r:etc_runtime_t:s0
> Access: 2018-12-18 18:52:38.070776585 +0000
> Modify: 2018-12-17 21:43:36.388054443 +0000
> Change: 2018-12-18 21:01:47.810506528 +0000
> Birth: -
>
> [root@lease-08 ovirt-backbone-2]# find . -inum 12893624759
> ./.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
> ./.shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>
> =======================>
> [root@lease-11 ovirt-backbone-2]# getfattr -n glusterfs.gfid.string /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425
> getfattr: Removing leading '/' from absolute path names
> # file: mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425
> glusterfs.gfid.string="f28cabcb-d169-41fc-a633-9bef4c4a8e40"
>
> [root@lease-11 ovirt-backbone-2]# getfattr -d -m . -e hex .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
> # file: .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0
> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030
>
> [root@lease-11 ovirt-backbone-2]# getfattr -d -m . -e hex .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
> # file: .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
> security.selinux=0x73797374656d5f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.gfid=0x1f86b4328ec6424699aa48cc6d7b5da0
> trusted.gfid2path.b48064c78d7a85c9=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f66323863616263622d643136392d343166632d613633332d3962656634633461386534302e3531353030
>
> [root@lease-11 ovirt-backbone-2]# stat .glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
>   File: '.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0'
>   Size: 2166784   Blocks: 4128       IO Block: 4096   regular file
> Device: fd03h/64771d Inode: 12956094809 Links: 3
> Access: (0660/-rw-rw----) Uid: ( 0/ root) Gid: ( 0/ root)
> Context: system_u:object_r:etc_runtime_t:s0
> Access: 2018-12-18 20:11:53.595208449 +0000
> Modify: 2018-12-17 21:43:36.391580259 +0000
> Change: 2018-12-18 19:19:25.888055392 +0000
> Birth: -
>
> [root@lease-11 ovirt-backbone-2]# find . -inum 12956094809
> ./.glusterfs/1f/86/1f86b432-8ec6-4246-99aa-48cc6d7b5da0
> ./.shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
>
> ===============>
> I don't really see any inconsistencies, except for the timestamps in the
> stat output. However, that is only because I tried moving the file out of
> the volume to force a heal, which does happen on the data nodes, but not
> on the arbiter node. Before that, they were the same as well.
> I've also compared the file ./.shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500
> on the 2 data nodes, and they are exactly the same.
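> (The comparison was essentially a checksum on each data node; the command
> below is a sketch rather than a verbatim paste from my history:)
>
> # run from the brick root on lease-08 and lease-11, then compare digests
> md5sum .shard/f28cabcb-d169-41fc-a633-9bef4c4a8e40.51500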
>
> Things I've further tried:
> - gluster v heal ovirt-backbone-2 full => gluster v heal ovirt-backbone-2
> info reports 0 entries on all nodes
>
> - stopping each glusterd and glusterfsd, pausing around 40 sec and starting
> them again on each node, 1 at a time, waiting for the heal to recover
> before moving on to the next node
>
> - forcing a heal by stopping glusterd on a node and performing these steps:
> mkdir /mnt/ovirt-backbone-2/trigger
> rmdir /mnt/ovirt-backbone-2/trigger
> setfattr -n trusted.non-existent-key -v abc /mnt/ovirt-backbone-2/
> setfattr -x trusted.non-existent-key /mnt/ovirt-backbone-2/
> then starting glusterd again
>
> - gluster volume rebalance ovirt-backbone-2 start => success
>
> What's further interesting is that, according to the mount log, the volume
> is in split-brain:
> [2018-12-18 10:06:04.606870] E [MSGID: 108008]
> [afr-read-txn.c:90:afr_read_txn_refresh_done]
> 0-ovirt-backbone-2-replicate-2: Failing FSTAT on gfid
> 2a57d87d-fe49-4034-919b-fdb79531bf68: split-brain observed. [Input/output
> error]
> [2018-12-18 10:06:04.606908] E [MSGID: 133014]
> [shard.c:1248:shard_common_stat_cbk] 0-ovirt-backbone-2-shard: stat failed:
> 2a57d87d-fe49-4034-919b-fdb79531bf68 [Input/output error]
> [2018-12-18 10:06:04.606927] W [fuse-bridge.c:871:fuse_attr_cbk]
> 0-glusterfs-fuse: 428090: FSTAT()
> /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids => -1 (Input/output error)
> [2018-12-18 10:06:05.107729] E [MSGID: 108008]
> [afr-read-txn.c:90:afr_read_txn_refresh_done]
> 0-ovirt-backbone-2-replicate-2: Failing FSTAT on gfid
> 2a57d87d-fe49-4034-919b-fdb79531bf68: split-brain observed. [Input/output
> error]
> [2018-12-18 10:06:05.107770] E [MSGID: 133014]
> [shard.c:1248:shard_common_stat_cbk] 0-ovirt-backbone-2-shard: stat failed:
> 2a57d87d-fe49-4034-919b-fdb79531bf68 [Input/output error]
> [2018-12-18 10:06:05.107791] W [fuse-bridge.c:871:fuse_attr_cbk]
> 0-glusterfs-fuse: 428091: FSTAT()
> /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids => -1 (Input/output error)
> [2018-12-18 10:06:05.537244] I [MSGID: 108006]
> [afr-common.c:5494:afr_local_init] 0-ovirt-backbone-2-replicate-1: no
> subvolumes up
> [2018-12-18 10:06:05.538523] E [MSGID: 108008]
> [afr-read-txn.c:90:afr_read_txn_refresh_done]
> 0-ovirt-backbone-2-replicate-2: Failing STAT on gfid
> 00000000-0000-0000-0000-000000000001: split-brain observed. [Input/output
> error]
> [2018-12-18 10:06:05.538685] I [MSGID: 108006]
> [afr-common.c:5494:afr_local_init] 0-ovirt-backbone-2-replicate-1: no
> subvolumes up
> [2018-12-18 10:06:05.538794] I [MSGID: 108006]
> [afr-common.c:5494:afr_local_init] 0-ovirt-backbone-2-replicate-1: no
> subvolumes up
> [2018-12-18 10:06:05.539342] I [MSGID: 109063]
> [dht-layout.c:716:dht_layout_normalize] 0-ovirt-backbone-2-dht: Found
> anomalies in /b1c2c949-aef4-4aec-999b-b179efeef732 (gfid =
> 8c8598ce-1a52-418e-a7b4-435fee34bae8). Holes=2 overlaps=0
> [2018-12-18 10:06:05.539372] W [MSGID: 109005]
> [dht-selfheal.c:2158:dht_selfheal_directory] 0-ovirt-backbone-2-dht:
> Directory selfheal failed: 2 subvolumes down.Not fixing. path =
> /b1c2c949-aef4-4aec-999b-b179efeef732, gfid =
> 8c8598ce-1a52-418e-a7b4-435fee34bae8
> [2018-12-18 10:06:05.539694] I [MSGID: 108006]
> [afr-common.c:5494:afr_local_init] 0-ovirt-backbone-2-replicate-1: no
> subvolumes up
> [2018-12-18 10:06:05.540652] I [MSGID: 108006]
> [afr-common.c:5494:afr_local_init] 0-ovirt-backbone-2-replicate-1: no
> subvolumes up
> [2018-12-18 10:06:05.608612] E [MSGID: 108008]
> [afr-read-txn.c:90:afr_read_txn_refresh_done]
> 0-ovirt-backbone-2-replicate-2: Failing FSTAT on gfid
> 2a57d87d-fe49-4034-919b-fdb79531bf68: split-brain observed. [Input/output
> error]
> [2018-12-18 10:06:05.608657] E [MSGID: 133014]
> [shard.c:1248:shard_common_stat_cbk] 0-ovirt-backbone-2-shard: stat failed:
> 2a57d87d-fe49-4034-919b-fdb79531bf68 [Input/output error]
> [2018-12-18 10:06:05.608672] W [fuse-bridge.c:871:fuse_attr_cbk]
> 0-glusterfs-fuse: 428096: FSTAT()
> /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids => -1 (Input/output error)
> [2018-12-18 10:06:06.109339] E [MSGID: 108008]
> [afr-read-txn.c:90:afr_read_txn_refresh_done]
> 0-ovirt-backbone-2-replicate-2: Failing FSTAT on gfid
> 2a57d87d-fe49-4034-919b-fdb79531bf68: split-brain observed. [Input/output
> error]
> [2018-12-18 10:06:06.109378] E [MSGID: 133014]
> [shard.c:1248:shard_common_stat_cbk] 0-ovirt-backbone-2-shard: stat failed:
> 2a57d87d-fe49-4034-919b-fdb79531bf68 [Input/output error]
> [2018-12-18 10:06:06.109399] W [fuse-bridge.c:871:fuse_attr_cbk]
> 0-glusterfs-fuse: 428097: FSTAT()
> /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids => -1 (Input/output error)
>
> # note: I am able to see /b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids
> [root@lease-11 ovirt-backbone-2]# stat /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids
>   File: '/mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/dom_md/ids'
>   Size: 1048576   Blocks: 2048       IO Block: 131072 regular file
> Device: 41h/65d Inode: 10492258721813610344 Links: 1
> Access: (0660/-rw-rw----) Uid: ( 36/ vdsm) Gid: ( 36/ kvm)
> Context: system_u:object_r:fusefs_t:s0
> Access: 2018-12-19 20:07:39.917573869 +0000
> Modify: 2018-12-19 20:07:39.928573917 +0000
> Change: 2018-12-19 20:07:39.929573921 +0000
> Birth: -
>
> However, checking gluster v heal ovirt-backbone-2 info split-brain
> reports no entries.
>
> I've also tried mounting the qemu image, and this works fine; I'm able to
> see all its contents:
> losetup /dev/loop0 /mnt/ovirt-backbone-2/b1c2c949-aef4-4aec-999b-b179efeef732/images/f6ac9660-a84e-469e-a17c-c6dbc538af4b/d6b09501-5b79-4c92-bf10-2ed3bda1b425
> kpartx -a /dev/loop0
> vgscan
> vgchange -ay slave-data
> mkdir /mnt/slv01
> mount /dev/mapper/slave--data-lvol0 /mnt/slv01/
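> (For completeness, the matching teardown, assumed rather than copied from
> my shell history, would be:)
>
> umount /mnt/slv01              # release the filesystem
> vgchange -an slave-data        # deactivate the volume group again
> kpartx -d /dev/loop0           # remove the partition mappings
> losetup -d /dev/loop0          # detach the loop device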
>
> Possible causes for this issue:
> 1. the machine "lease-11" suffered from a faulty (ECC) RAM module, which
> halted the machine and caused an invalid state. (this machine also hosts
> other volumes, with similar configurations, which report no issue)
> 2. after the RAM module was replaced, the VM using the backing qemu image
> was restored from a backup (the backup was file based within the VM, on a
> different directory), because some files were corrupted. The
> backup/recovery obviously causes extra IO, possibly introducing race
> conditions? The machine ran for about 12h without issues, and in total
> for about 36h.
> 3. since only the client (maybe only gfapi?) reports errors, something is
> broken there?
>
> The volume info:
> root@lease-06 ~# gluster v info ovirt-backbone-2
>
> Volume Name: ovirt-backbone-2
> Type: Distributed-Replicate
> Volume ID: 85702d35-62c8-4c8c-930d-46f455a8af28
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 3 x (2 + 1) = 9
> Transport-type: tcp
> Bricks:
> Brick1: 10.32.9.7:/data/gfs/bricks/brick1/ovirt-backbone-2
> Brick2: 10.32.9.3:/data/gfs/bricks/brick1/ovirt-backbone-2
> Brick3: 10.32.9.4:/data/gfs/bricks/bricka/ovirt-backbone-2 (arbiter)
> Brick4: 10.32.9.8:/data0/gfs/bricks/brick1/ovirt-backbone-2
> Brick5: 10.32.9.21:/data0/gfs/bricks/brick1/ovirt-backbone-2
> Brick6: 10.32.9.5:/data/gfs/bricks/bricka/ovirt-backbone-2 (arbiter)
> Brick7: 10.32.9.9:/data0/gfs/bricks/brick1/ovirt-backbone-2
> Brick8: 10.32.9.20:/data0/gfs/bricks/brick1/ovirt-backbone-2
> Brick9: 10.32.9.6:/data/gfs/bricks/bricka/ovirt-backbone-2 (arbiter)
> Options Reconfigured:
> nfs.disable: on
> transport.address-family: inet
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.low-prio-threads: 32
> network.remote-dio: enable
> cluster.eager-lock: enable
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> cluster.data-self-heal-algorithm: full
> cluster.locking-scheme: granular
> cluster.shd-max-threads: 8
> cluster.shd-wait-qlength: 10000
> features.shard: on
> user.cifs: off
> storage.owner-uid: 36
> storage.owner-gid: 36
> features.shard-block-size: 64MB
> performance.write-behind-window-size: 512MB
> performance.cache-size: 384MB
> cluster.brick-multiplex: on
>
> The volume status:
> root@lease-06 ~# gluster v status ovirt-backbone-2
> Status of volume: ovirt-backbone-2
> Gluster process                             TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick 10.32.9.7:/data/gfs/bricks/brick1/ovi
> rt-backbone-2                               49152     0          Y       7727
> Brick 10.32.9.3:/data/gfs/bricks/brick1/ovi
> rt-backbone-2                               49152     0          Y       12620
> Brick 10.32.9.4:/data/gfs/bricks/bricka/ovi
> rt-backbone-2                               49152     0          Y       8794
> Brick 10.32.9.8:/data0/gfs/bricks/brick1/ov
> irt-backbone-2                              49161     0          Y       22333
> Brick 10.32.9.21:/data0/gfs/bricks/brick1/o
> virt-backbone-2                             49152     0          Y       15030
> Brick 10.32.9.5:/data/gfs/bricks/bricka/ovi
> rt-backbone-2                               49166     0          Y       24592
> Brick 10.32.9.9:/data0/gfs/bricks/brick1/ov
> irt-backbone-2                              49153     0          Y       20148
> Brick 10.32.9.20:/data0/gfs/bricks/brick1/o
> virt-backbone-2                             49154     0          Y       15413
> Brick 10.32.9.6:/data/gfs/bricks/bricka/ovi
> rt-backbone-2                               49152     0          Y       43120
> Self-heal Daemon on localhost               N/A       N/A        Y       44587
> Self-heal Daemon on 10.201.0.2              N/A       N/A        Y       8401
> Self-heal Daemon on 10.201.0.5              N/A       N/A        Y       11038
> Self-heal Daemon on 10.201.0.8              N/A       N/A        Y       9513
> Self-heal Daemon on 10.32.9.4               N/A       N/A        Y       23736
> Self-heal Daemon on 10.32.9.20              N/A       N/A        Y       2738
> Self-heal Daemon on 10.32.9.3               N/A       N/A        Y       25598
> Self-heal Daemon on 10.32.9.5               N/A       N/A        Y       511
> Self-heal Daemon on 10.32.9.9               N/A       N/A        Y       23357
> Self-heal Daemon on 10.32.9.8               N/A       N/A        Y       15225
> Self-heal Daemon on 10.32.9.7               N/A       N/A        Y       25781
> Self-heal Daemon on 10.32.9.21              N/A       N/A        Y       5034
>
> Task Status of Volume ovirt-backbone-2
> ------------------------------------------------------------------------------
> Task : Rebalance
> ID : 6dfbac43-0125-4568-9ac3-a2c453faaa3d
> Status : completed
>
> The gluster version is 3.12.15 and cluster.op-version=31202.
>
> =======================>
> It would be nice to know whether it's possible to mark the files as not
> stale, or whether I should investigate other things. Or should we consider
> this volume lost?
> Also, checking the code at
> https://github.com/gluster/glusterfs/blob/master/xlators/features/shard/src/shard.c
> it seems the functions have shifted quite a bit (line 1724 vs. 2243), so
> maybe it's fixed in a newer version?
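> (One way to hunt for a relevant fix would be something along these lines;
> a hypothetical invocation, I haven't run it against the repo myself:)
>
> git log --oneline v3.12.15..master -- xlators/features/shard/src/shard.c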
> Any thoughts are welcome.
>
> Thanks Olaf