OK I just hit the other issue too, where .shard doesn't get healed. :)
Investigating as to why that is the case. Give me some time.
-Krutika
On Wed, Aug 31, 2016 at 12:39 PM, Krutika Dhananjay <kdhananj at redhat.com> wrote:
> Just figured out that the steps Anuradha has provided won't work if granular
> entry heal is on.
> So when you bring down a brick and create fake2 under / of the volume, the
> granular entry heal feature causes the self-heal daemon to remember only the
> fact that 'fake2' needs to be recreated on the offline brick (because the
> changelogs are granular).
>
> In this case, we would need to indicate to the self-heal daemon that the
> entire directory tree from '/' needs to be repaired on the brick that
> contains no data.
>
> To fix this, I did the following (for users who use granular entry
> self-healing):
>
> 1. Kill the last brick process in the replica (/bricks/3)
>
> 2. [root at server-3 ~]# rm -rf /bricks/3
>
> 3. [root at server-3 ~]# mkdir /bricks/3
>
> 4. Create a new dir on the mount point:
> [root at client-1 ~]# mkdir /mnt/fake
>
> 5. Set some fake xattr on the root of the volume (the mount point), and not
> on the 'fake' directory itself:
> [root at client-1 ~]# setfattr -n "user.some-name" -v "some-value" /mnt
>
> 6. Make sure there's no I/O happening on your volume.
>
> 7. Check the pending xattrs on the brick directories of the two good copies
> (on bricks 1 and 2); you should see the same value for the
> trusted.afr.<VOLNAME>-client-2 xattr on both bricks.
> (Note that the client-<num> xattr key will have the same last digit as the
> index of the brick that is down, counting from 0. So if the first brick is
> the one that is down, it would read trusted.afr.*-client-0; if the second
> brick is the one that is empty and down, it would read
> trusted.afr.*-client-1, and so on.)
>
> [root at server-1 ~]# getfattr -d -m . -e hex /bricks/1
> # file: 1
> security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.afr.rep-client-2=0x000000000000000100000001
> trusted.gfid=0x00000000000000000000000000000001
> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
> trusted.glusterfs.volume-id=0xa349517bb9d44bdf96da8ea324f89e7b
>
> [root at server-2 ~]# getfattr -d -m . -e hex /bricks/2
> # file: 2
> security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.afr.rep-client-2=0x000000000000000100000001
> trusted.gfid=0x00000000000000000000000000000001
> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
> trusted.glusterfs.volume-id=0xa349517bb9d44bdf96da8ea324f89e7b
>
> 8. Flip the 8th digit in the trusted.afr.<VOLNAME>-client-2 to a 1.
>
> [root at server-1 ~]# setfattr -n trusted.afr.rep-client-2 -v 0x000000010000000100000001 /bricks/1
> [root at server-2 ~]# setfattr -n trusted.afr.rep-client-2 -v 0x000000010000000100000001 /bricks/2
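>
> (For reference, if memory serves: the trusted.afr.* value is read as three
> 4-byte counters, for data, metadata and entry pending heals, from left to
> right. So 0x 00000001 00000001 00000001 means one pending data, metadata
> and entry heal respectively; flipping the 8th digit is what marks the data
> heal as pending for client-2.)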
>
> 9. Get the xattrs again and check that they are now set properly:
>
> [root at server-1 ~]# getfattr -d -m . -e hex /bricks/1
> # file: 1
> security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.afr.rep-client-2=0x000000010000000100000001
> trusted.gfid=0x00000000000000000000000000000001
> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
> trusted.glusterfs.volume-id=0xa349517bb9d44bdf96da8ea324f89e7b
>
> [root at server-2 ~]# getfattr -d -m . -e hex /bricks/2
> # file: 2
> security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a6574635f72756e74696d655f743a733000
> trusted.afr.dirty=0x000000000000000000000000
> trusted.afr.rep-client-2=0x000000010000000100000001
> trusted.gfid=0x00000000000000000000000000000001
> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
> trusted.glusterfs.volume-id=0xa349517bb9d44bdf96da8ea324f89e7b
>
> 10. Force-start the volume.
>
> [root at server-1 ~]# gluster volume start rep force
> volume start: rep: success
>
> 11. Monitor the heal-info command output to ensure that the number of
> entries keeps growing.
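>
> For example (adjust the volume name to yours):
> [root at server-1 ~]# gluster volume heal rep info | grep "Number of entries"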
>
> 12. Keep monitoring as in step 11; eventually the number of entries needing
> heal should come down to 0.
> Also, the checksums of the files on the previously empty brick should now
> match the copies on the other two bricks.
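>
> (For the checksum comparison I would just pick a few files and compare them
> directly on the bricks, e.g.:
> [root at server-1 ~]# md5sum /bricks/1/<some-file>
> [root at server-3 ~]# md5sum /bricks/3/<some-file>
> where <some-file> is any file of your choosing; the sums should be
> identical once heal has completed.)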
>
> Could you check if the above steps work for you, in your test environment?
>
> You caught a nice bug in the manual steps to follow when granular
> entry-heal is enabled and an empty brick needs heal. Thanks for reporting
> it. :) We will fix the documentation appropriately.
>
> -Krutika
>
>
> On Wed, Aug 31, 2016 at 11:29 AM, Krutika Dhananjay <kdhananj at redhat.com> wrote:
>
>> Tried this.
>>
>> With me, only 'fake2' gets healed after I bring the 'empty' brick back up,
>> and it stops there unless I do a 'heal-full'.
>>
>> Is that what you're seeing as well?
>>
>> -Krutika
>>
>> On Wed, Aug 31, 2016 at 4:43 AM, David Gossage <dgossage at carouselchecks.com> wrote:
>>
>>> Same issue: brought glusterd up on the problem node; heal count still
>>> stuck at 6330.
>>>
>>> Ran gluster v heal GLUSTER1 full
>>>
>>> glustershd on the problem node shows a sweep starting and finishing in
>>> seconds. The other 2 nodes show no activity in the log. They should start
>>> a sweep too, shouldn't they?
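>>>
>>> (For reference, I'm watching for those sweep messages with something like
>>> grep afr_shd_full_healer /var/log/glusterfs/glustershd.log
>>> on each node; that's where the log lines below come from.)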
>>>
>>> Tried starting from scratch:
>>>
>>> kill -15 brickpid
>>> rm -Rf /brick
>>> mkdir -p /brick
>>> mkdir /gsmount/fake2
>>> setfattr -n "user.some-name" -v "some-value" /gsmount/fake2
>>>
>>> Heals visible dirs instantly then stops.
>>>
>>> gluster v heal GLUSTER1 full
>>>
>>> I see the sweep start on the problem node and end almost instantly. No files
>>> added to the heal list, no files healed, no more logging.
>>>
>>> [2016-08-30 23:11:31.544331] I [MSGID: 108026]
>>> [afr-self-heald.c:646:afr_shd_full_healer] 0-GLUSTER1-replicate-0:
>>> starting full sweep on subvol GLUSTER1-client-1
>>> [2016-08-30 23:11:33.776235] I [MSGID: 108026]
>>> [afr-self-heald.c:656:afr_shd_full_healer] 0-GLUSTER1-replicate-0:
>>> finished full sweep on subvol GLUSTER1-client-1
>>>
>>> Same results no matter which node you run the command on. Still stuck with
>>> 6330 files showing as needing heal out of 19k. Logs still show that no
>>> heals are occurring.
>>>
>>> Is there a way to forcibly reset any prior heal data? Could it be stuck
>>> on some past failed heal start?
>>>
>>>
>>>
>>>
>>> David Gossage
>>> Carousel Checks Inc. | System Administrator
>>> Office 708.613.2284
>>>
>>> On Tue, Aug 30, 2016 at 10:03 AM, David Gossage <dgossage at carouselchecks.com> wrote:
>>>
>>>> On Tue, Aug 30, 2016 at 10:02 AM, David Gossage <dgossage at carouselchecks.com> wrote:
>>>>
>>>>> updated test server to 3.8.3
>>>>>
>>>>> Brick1: 192.168.71.10:/gluster2/brick1/1
>>>>> Brick2: 192.168.71.11:/gluster2/brick2/1
>>>>> Brick3: 192.168.71.12:/gluster2/brick3/1
>>>>> Options Reconfigured:
>>>>> cluster.granular-entry-heal: on
>>>>> performance.readdir-ahead: on
>>>>> performance.read-ahead: off
>>>>> nfs.disable: on
>>>>> nfs.addr-namelookup: off
>>>>> nfs.enable-ino32: off
>>>>> cluster.background-self-heal-count: 16
>>>>> cluster.self-heal-window-size: 1024
>>>>> performance.quick-read: off
>>>>> performance.io-cache: off
>>>>> performance.stat-prefetch: off
>>>>> cluster.eager-lock: enable
>>>>> network.remote-dio: on
>>>>> cluster.quorum-type: auto
>>>>> cluster.server-quorum-type: server
>>>>> storage.owner-gid: 36
>>>>> storage.owner-uid: 36
>>>>> server.allow-insecure: on
>>>>> features.shard: on
>>>>> features.shard-block-size: 64MB
>>>>> performance.strict-o-direct: off
>>>>> cluster.locking-scheme: granular
>>>>>
>>>>> kill -15 brickpid
>>>>> rm -Rf /gluster2/brick3
>>>>> mkdir -p /gluster2/brick3/1
>>>>> mkdir /rhev/data-center/mnt/glusterSD/192.168.71.10\:_glustershard/fake2
>>>>> setfattr -n "user.some-name" -v "some-value" /rhev/data-center/mnt/glusterSD/192.168.71.10\:_glustershard/fake2
>>>>> gluster v start glustershard force
>>>>>
>>>>> At this point the brick process starts and all visible files, including the
>>>>> new dir, are made on the brick.
>>>>> A handful of shards are still in the heal statistics, but no .shard directory
>>>>> is created and there is no increase in the shard count.
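>>>>>
>>>>> (The shard count here is what I'm getting from something like
>>>>> gluster v heal glustershard statistics heal-count
>>>>> along with the entry list from gluster v heal glustershard info.)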
>>>>>
>>>>> gluster v heal glustershard
>>>>>
>>>>> At this point still no increase in count, no dir made, and no additional
>>>>> healing activity generated in the logs. Waited a few minutes tailing logs to
>>>>> check if anything kicked in.
>>>>>
>>>>> gluster v heal glustershard full
>>>>>
>>>>> Shards are added to the list and heal commences. Logs show a full sweep
>>>>> starting on all 3 nodes, though this time it only shows as finishing on
>>>>> one, which looks to be the one that had its brick deleted.
>>>>>
>>>>> [2016-08-30 14:45:33.098589] I [MSGID: 108026]
>>>>> [afr-self-heald.c:646:afr_shd_full_healer]
>>>>> 0-glustershard-replicate-0: starting full sweep on subvol
>>>>> glustershard-client-0
>>>>> [2016-08-30 14:45:33.099492] I [MSGID: 108026]
>>>>> [afr-self-heald.c:646:afr_shd_full_healer]
>>>>> 0-glustershard-replicate-0: starting full sweep on subvol
>>>>> glustershard-client-1
>>>>> [2016-08-30 14:45:33.100093] I [MSGID: 108026]
>>>>> [afr-self-heald.c:646:afr_shd_full_healer]
>>>>> 0-glustershard-replicate-0: starting full sweep on subvol
>>>>> glustershard-client-2
>>>>> [2016-08-30 14:52:29.760213] I [MSGID: 108026]
>>>>> [afr-self-heald.c:656:afr_shd_full_healer]
>>>>> 0-glustershard-replicate-0: finished full sweep on subvol
>>>>> glustershard-client-2
>>>>>
>>>>
>>>> Just realized it's still healing, so that may be why the sweep on the 2
>>>> other bricks hasn't been reported as finished.
>>>>
>>>>>
>>>>>
>>>>> My hope is that later tonight a full heal will work on production. Is
>>>>> it possible the self-heal daemon can get stale or stop listening but still
>>>>> show as active? Would stopping and starting the self-heal daemon from the
>>>>> gluster CLI before doing these heals be helpful?
>>>>>
>>>>>
>>>>> On Tue, Aug 30, 2016 at 9:29 AM, David Gossage <dgossage at carouselchecks.com> wrote:
>>>>>
>>>>>> On Tue, Aug 30, 2016 at 8:52 AM, David Gossage <dgossage at carouselchecks.com> wrote:
>>>>>>
>>>>>>> On Tue, Aug 30, 2016 at 8:01 AM, Krutika Dhananjay <kdhananj at redhat.com> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Aug 30, 2016 at 6:20 PM, Krutika Dhananjay <kdhananj at redhat.com> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Aug 30, 2016 at 6:07 PM, David Gossage <dgossage at carouselchecks.com> wrote:
>>>>>>>>>
>>>>>>>>>> On Tue, Aug 30, 2016 at 7:18 AM, Krutika Dhananjay <kdhananj at redhat.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Could you also share the glustershd logs?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I'll get them when I get to work, sure.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I tried the same steps that you mentioned multiple times, but
>>>>>>>>>>> heal is running to completion without any issues.
>>>>>>>>>>>
>>>>>>>>>>> It must be said that 'heal full' traverses the files and directories
>>>>>>>>>>> in a depth-first order and does heals also in the same order.
>>>>>>>>>>> But if it gets interrupted in the middle (say because the self-heal daemon
>>>>>>>>>>> was either intentionally or unintentionally brought offline and then brought
>>>>>>>>>>> back up), self-heal will only pick up the entries that are so far marked as
>>>>>>>>>>> new entries that need heal, which it will find in the indices/xattrop directory.
>>>>>>>>>>> What this means is that those files and directories that were not visited
>>>>>>>>>>> during the crawl will remain untouched and unhealed in this second
>>>>>>>>>>> iteration of heal, unless you execute a 'heal-full' again.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> So should it start healing shards as it crawls, or not until after
>>>>>>>>>> it crawls the entire .shard directory? At the pace it was going, that could
>>>>>>>>>> be a week, with one node appearing in the cluster but with no shard files if
>>>>>>>>>> anything tries to access a file on that node. From my experience the other day,
>>>>>>>>>> telling it to heal full again did nothing regardless of which node I used.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>> The crawl is started from '/' of the volume. Whenever self-heal detects
>>>>>>>> during the crawl that a file or directory is present in some brick(s) and
>>>>>>>> absent in others, it creates the file on the bricks where it is absent and
>>>>>>>> marks the fact that the file or directory might need data/entry and
>>>>>>>> metadata heal too (this also means that an index is created under
>>>>>>>> .glusterfs/indices/xattrop of the src bricks). And the data/entry and
>>>>>>>> metadata heal are picked up and done in the background with the help of
>>>>>>>> these indices.
>>>>>>>>
>>>>>>>
>>>>>>> Looking at my 3rd node as an example, I find nearly the exact same number
>>>>>>> of files in the xattrop dir as reported by the heal count at the time I
>>>>>>> brought down node2 to try and alleviate read IO errors, which seemed to occur
>>>>>>> from what I was guessing were attempts to use the node with no shards for reads.
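>>>>>>>
>>>>>>> (Counted roughly with something like
>>>>>>> ls /gluster1/BRICK1/1/.glusterfs/indices/xattrop | wc -l
>>>>>>> on that node's brick, assuming that's the right path for the index dir.)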
>>>>>>>
>>>>>>> Also attached are the glustershd logs from the 3 nodes, along with
>>>>>>> the test node I tried yesterday with the same results.
>>>>>>>
>>>>>>
>>>>>> Looking at my own logs I notice that a full sweep was only ever
>>>>>> recorded in glustershd.log on the 2nd node with the missing directory.
>>>>>> I believe I should have found a sweep begun on every node, correct?
>>>>>>
>>>>>> On my test dev when it did work I do see that
>>>>>>
>>>>>> [2016-08-30 13:56:25.223333] I [MSGID: 108026]
>>>>>> [afr-self-heald.c:646:afr_shd_full_healer]
>>>>>> 0-glustershard-replicate-0: starting full sweep on subvol
>>>>>> glustershard-client-0
>>>>>> [2016-08-30 13:56:25.223522] I [MSGID: 108026]
>>>>>> [afr-self-heald.c:646:afr_shd_full_healer]
>>>>>> 0-glustershard-replicate-0: starting full sweep on subvol
>>>>>> glustershard-client-1
>>>>>> [2016-08-30 13:56:25.224616] I [MSGID: 108026]
>>>>>> [afr-self-heald.c:646:afr_shd_full_healer]
>>>>>> 0-glustershard-replicate-0: starting full sweep on subvol
>>>>>> glustershard-client-2
>>>>>> [2016-08-30 14:18:48.333740] I [MSGID: 108026]
>>>>>> [afr-self-heald.c:656:afr_shd_full_healer]
>>>>>> 0-glustershard-replicate-0: finished full sweep on subvol
>>>>>> glustershard-client-2
>>>>>> [2016-08-30 14:18:48.356008] I [MSGID: 108026]
>>>>>> [afr-self-heald.c:656:afr_shd_full_healer]
>>>>>> 0-glustershard-replicate-0: finished full sweep on subvol
>>>>>> glustershard-client-1
>>>>>> [2016-08-30 14:18:49.637811] I [MSGID: 108026]
>>>>>> [afr-self-heald.c:656:afr_shd_full_healer]
>>>>>> 0-glustershard-replicate-0: finished full sweep on subvol
>>>>>> glustershard-client-0
>>>>>>
>>>>>> Whereas when looking at the past few days on the 3 prod nodes, I only found
>>>>>> that on my 2nd node:
>>>>>> [2016-08-27 01:26:42.638772] I [MSGID: 108026]
>>>>>> [afr-self-heald.c:646:afr_shd_full_healer] 0-GLUSTER1-replicate-0:
>>>>>> starting full sweep on subvol GLUSTER1-client-1
>>>>>> [2016-08-27 11:37:01.732366] I [MSGID: 108026]
>>>>>> [afr-self-heald.c:656:afr_shd_full_healer] 0-GLUSTER1-replicate-0:
>>>>>> finished full sweep on subvol GLUSTER1-client-1
>>>>>> [2016-08-27 12:58:34.597228] I [MSGID: 108026]
>>>>>> [afr-self-heald.c:646:afr_shd_full_healer] 0-GLUSTER1-replicate-0:
>>>>>> starting full sweep on subvol GLUSTER1-client-1
>>>>>> [2016-08-27 12:59:28.041173] I [MSGID: 108026]
>>>>>> [afr-self-heald.c:656:afr_shd_full_healer] 0-GLUSTER1-replicate-0:
>>>>>> finished full sweep on subvol GLUSTER1-client-1
>>>>>> [2016-08-27 20:03:42.560188] I [MSGID: 108026]
>>>>>> [afr-self-heald.c:646:afr_shd_full_healer] 0-GLUSTER1-replicate-0:
>>>>>> starting full sweep on subvol GLUSTER1-client-1
>>>>>> [2016-08-27 20:03:44.278274] I [MSGID: 108026]
>>>>>> [afr-self-heald.c:656:afr_shd_full_healer] 0-GLUSTER1-replicate-0:
>>>>>> finished full sweep on subvol GLUSTER1-client-1
>>>>>> [2016-08-27 21:00:42.603315] I [MSGID: 108026]
>>>>>> [afr-self-heald.c:646:afr_shd_full_healer] 0-GLUSTER1-replicate-0:
>>>>>> starting full sweep on subvol GLUSTER1-client-1
>>>>>> [2016-08-27 21:00:46.148674] I [MSGID: 108026]
>>>>>> [afr-self-heald.c:656:afr_shd_full_healer] 0-GLUSTER1-replicate-0:
>>>>>> finished full sweep on subvol GLUSTER1-client-1
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> My suspicion is that this is what happened on your setup. Could
>>>>>>>>>>> you confirm if that was the case?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> The brick was brought online with force start, then a full heal was
>>>>>>>>>> launched. Hours later, after it became evident that it was not adding new
>>>>>>>>>> files to heal, I did try restarting the self-heal daemon and relaunching full
>>>>>>>>>> heal again. But this was after the heal had basically already failed to
>>>>>>>>>> work as intended.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> OK. How did you figure out it was not adding any new files? I need to
>>>>>>>>> know what places you were monitoring to come to this conclusion.
>>>>>>>>>
>>>>>>>>> -Krutika
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> As for those logs, I did manage to do something that caused
>>>>>>>>>>> these warning messages you shared earlier to appear in my client and
>>>>>>>>>>> server logs.
>>>>>>>>>>> Although these logs are annoying and a bit scary too, they
>>>>>>>>>>> didn't do any harm to the data in my volume. Why they appear just after a
>>>>>>>>>>> brick is replaced and under no other circumstances is something I'm still
>>>>>>>>>>> investigating.
>>>>>>>>>>>
>>>>>>>>>>> But for the future, it would be good to follow the steps Anuradha
>>>>>>>>>>> gave, as that would allow self-heal to at least detect that it has some
>>>>>>>>>>> repairing to do whenever it is restarted, whether intentionally or otherwise.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I followed those steps as described on my test box and ended up
>>>>>>>>>> with the exact same outcome: adding shards at an agonizingly slow pace and no
>>>>>>>>>> creation of the .shard directory or heals on the shard directory. Directories
>>>>>>>>>> visible from the mount healed quickly. This was with one VM, so it has only 800
>>>>>>>>>> shards as well. After hours at work it had added a total of 33 shards to
>>>>>>>>>> be healed. I sent those logs yesterday as well, though not the glustershd logs.
>>>>>>>>>>
>>>>>>>>>> Does the replace-brick command copy files in the same manner? For these
>>>>>>>>>> purposes I am contemplating just skipping the heal route.
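>>>>>>>>>>
>>>>>>>>>> (i.e. something along the lines of
>>>>>>>>>> gluster volume replace-brick <volname> <old-brick> <new-brick> commit force
>>>>>>>>>> if that behaves better than heal for a wiped brick.)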
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> -Krutika
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Aug 30, 2016 at 2:22 AM, David Gossage <dgossage at carouselchecks.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Attached brick and client logs from the test machine where the same
>>>>>>>>>>>> behavior occurred; not sure if anything new is there. It's still on 3.8.2.
>>>>>>>>>>>>
>>>>>>>>>>>> Number of Bricks: 1 x 3 = 3
>>>>>>>>>>>> Transport-type: tcp
>>>>>>>>>>>> Bricks:
>>>>>>>>>>>> Brick1: 192.168.71.10:/gluster2/brick1/1
>>>>>>>>>>>> Brick2: 192.168.71.11:/gluster2/brick2/1
>>>>>>>>>>>> Brick3: 192.168.71.12:/gluster2/brick3/1
>>>>>>>>>>>> Options Reconfigured:
>>>>>>>>>>>> cluster.locking-scheme: granular
>>>>>>>>>>>> performance.strict-o-direct: off
>>>>>>>>>>>> features.shard-block-size: 64MB
>>>>>>>>>>>> features.shard: on
>>>>>>>>>>>> server.allow-insecure: on
>>>>>>>>>>>> storage.owner-uid: 36
>>>>>>>>>>>> storage.owner-gid: 36
>>>>>>>>>>>> cluster.server-quorum-type: server
>>>>>>>>>>>> cluster.quorum-type: auto
>>>>>>>>>>>> network.remote-dio: on
>>>>>>>>>>>> cluster.eager-lock: enable
>>>>>>>>>>>> performance.stat-prefetch: off
>>>>>>>>>>>> performance.io-cache: off
>>>>>>>>>>>> performance.quick-read: off
>>>>>>>>>>>> cluster.self-heal-window-size: 1024
>>>>>>>>>>>> cluster.background-self-heal-count: 16
>>>>>>>>>>>> nfs.enable-ino32: off
>>>>>>>>>>>> nfs.addr-namelookup: off
>>>>>>>>>>>> nfs.disable: on
>>>>>>>>>>>> performance.read-ahead: off
>>>>>>>>>>>> performance.readdir-ahead: on
>>>>>>>>>>>> cluster.granular-entry-heal: on
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Aug 29, 2016 at 2:20 PM, David Gossage <dgossage at carouselchecks.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Aug 29, 2016 at 7:01 AM, Anuradha Talur <atalur at redhat.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>>>> > From: "David Gossage" <dgossage at carouselchecks.com>
>>>>>>>>>>>>>> > To: "Anuradha Talur" <atalur at redhat.com>
>>>>>>>>>>>>>> > Cc: "gluster-users at gluster.org List" <Gluster-users at gluster.org>, "Krutika Dhananjay" <kdhananj at redhat.com>
>>>>>>>>>>>>>> > Sent: Monday, August 29, 2016 5:12:42 PM
>>>>>>>>>>>>>> > Subject: Re: [Gluster-users] 3.8.3 Shards Healing Glacier Slow
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > On Mon, Aug 29, 2016 at 5:39 AM, Anuradha Talur <atalur at redhat.com> wrote:
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > > Response inline.
>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>> > > ----- Original Message -----
>>>>>>>>>>>>>> > > > From: "Krutika Dhananjay" <kdhananj at redhat.com>
>>>>>>>>>>>>>> > > > To: "David Gossage" <dgossage at carouselchecks.com>
>>>>>>>>>>>>>> > > > Cc: "gluster-users at gluster.org List" <Gluster-users at gluster.org>
>>>>>>>>>>>>>> > > > Sent: Monday, August 29, 2016 3:55:04 PM
>>>>>>>>>>>>>> > > > Subject: Re: [Gluster-users] 3.8.3 Shards Healing Glacier Slow
>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>> > > > Could you attach both client and brick logs? Meanwhile I will try these
>>>>>>>>>>>>>> > > > steps out on my machines and see if it is easily recreatable.
>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>> > > > -Krutika
>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>> > > > On Mon, Aug 29, 2016 at 2:31 PM, David Gossage <dgossage at carouselchecks.com> wrote:
>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>> > > > Centos 7 Gluster 3.8.3
>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>> > > > Brick1: ccgl1.gl.local:/gluster1/BRICK1/1
>>>>>>>>>>>>>> > > > Brick2: ccgl2.gl.local:/gluster1/BRICK1/1
>>>>>>>>>>>>>> > > > Brick3: ccgl4.gl.local:/gluster1/BRICK1/1
>>>>>>>>>>>>>> > > > Options Reconfigured:
>>>>>>>>>>>>>> > > > cluster.data-self-heal-algorithm: full
>>>>>>>>>>>>>> > > > cluster.self-heal-daemon: on
>>>>>>>>>>>>>> > > > cluster.locking-scheme: granular
>>>>>>>>>>>>>> > > > features.shard-block-size: 64MB
>>>>>>>>>>>>>> > > > features.shard: on
>>>>>>>>>>>>>> > > > performance.readdir-ahead: on
>>>>>>>>>>>>>> > > > storage.owner-uid: 36
>>>>>>>>>>>>>> > > > storage.owner-gid: 36
>>>>>>>>>>>>>> > > > performance.quick-read: off
>>>>>>>>>>>>>> > > > performance.read-ahead: off
>>>>>>>>>>>>>> > > > performance.io-cache: off
>>>>>>>>>>>>>> > > > performance.stat-prefetch: on
>>>>>>>>>>>>>> > > > cluster.eager-lock: enable
>>>>>>>>>>>>>> > > > network.remote-dio: enable
>>>>>>>>>>>>>> > > > cluster.quorum-type: auto
>>>>>>>>>>>>>> > > > cluster.server-quorum-type: server
>>>>>>>>>>>>>> > > > server.allow-insecure: on
>>>>>>>>>>>>>> > > > cluster.self-heal-window-size: 1024
>>>>>>>>>>>>>> > > > cluster.background-self-heal-count: 16
>>>>>>>>>>>>>> > > > performance.strict-write-ordering: off
>>>>>>>>>>>>>> > > > nfs.disable: on
>>>>>>>>>>>>>> > > > nfs.addr-namelookup: off
>>>>>>>>>>>>>> > > > nfs.enable-ino32: off
>>>>>>>>>>>>>> > > > cluster.granular-entry-heal: on
>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>> > > > Friday did rolling upgrade from 3.8.3->3.8.3 no issues.
>>>>>>>>>>>>>> > > > Following steps detailed in previous recommendations, began the process
>>>>>>>>>>>>>> > > > of replacing and healing bricks one node at a time.
>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>> > > > 1) kill pid of brick
>>>>>>>>>>>>>> > > > 2) reconfigure brick from raid6 to raid10
>>>>>>>>>>>>>> > > > 3) recreate directory of brick
>>>>>>>>>>>>>> > > > 4) gluster volume start <> force
>>>>>>>>>>>>>> > > > 5) gluster volume heal <> full
>>>>>>>>>>>>>> > > Hi,
>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>> > > I'd suggest that full heal is not used. There are a few bugs in full heal.
>>>>>>>>>>>>>> > > Better safe than sorry ;)
>>>>>>>>>>>>>> > > Instead I'd suggest the following steps:
>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>> > Currently I brought the node down by systemctl stop glusterd, as I was
>>>>>>>>>>>>>> > getting sporadic io issues and a few VM's paused, so hoping that will help.
>>>>>>>>>>>>>> > I may wait to do this till around 4PM when most work is done, in case it
>>>>>>>>>>>>>> > shoots load up.
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > > 1) kill pid of brick
>>>>>>>>>>>>>> > > 2) do the configuring of the brick that you need
>>>>>>>>>>>>>> > > 3) recreate brick dir
>>>>>>>>>>>>>> > > 4) while the brick is still down, from the mount point:
>>>>>>>>>>>>>> > >    a) create a dummy non-existent dir under / of the mount.
>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > So if node 2 is the down brick, do I pick another node, for example 3, and
>>>>>>>>>>>>>> > make a test dir under its brick directory that doesn't exist on 2, or should
>>>>>>>>>>>>>> > I be doing this over a gluster mount?
>>>>>>>>>>>>>> You should be doing this over the gluster mount.
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > >    b) set a non-existent extended attribute on / of the mount.
>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > Could you give me an example of an attribute to set? I've read a tad on
>>>>>>>>>>>>>> > this, and looked up attributes but haven't set any yet myself.
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> Sure. setfattr -n "user.some-name" -v "some-value" <path-to-mount>
>>>>>>>>>>>>>> > > Doing these steps will ensure that heal happens only from the updated
>>>>>>>>>>>>>> > > brick to the down brick.
>>>>>>>>>>>>>> > > 5) gluster v start <> force
>>>>>>>>>>>>>> > > 6) gluster v heal <>
>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > Will it matter if somewhere in gluster the full heal command was run the
>>>>>>>>>>>>>> > other day? Not sure if it eventually stops or times out.
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> full heal will stop once the crawl is done. So if you want to trigger heal
>>>>>>>>>>>>>> again, run gluster v heal <>. Actually even brick up or volume start force
>>>>>>>>>>>>>> should trigger the heal.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Did this on the test bed today. It's one server with 3 bricks on the
>>>>>>>>>>>>> same machine, so take that for what it's worth. Also it still runs 3.8.2.
>>>>>>>>>>>>> Maybe I'll update and re-run the test.
>>>>>>>>>>>>>
>>>>>>>>>>>>> killed brick
>>>>>>>>>>>>> deleted brick dir
>>>>>>>>>>>>> recreated brick dir
>>>>>>>>>>>>> created fake dir on gluster mount
>>>>>>>>>>>>> set suggested fake attribute on it
>>>>>>>>>>>>> ran volume start <> force
>>>>>>>>>>>>>
>>>>>>>>>>>>> Looked at the files it said needed healing and it was just the 8
>>>>>>>>>>>>> shards that were modified during the few minutes I ran through the steps.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Gave it a few minutes and it stayed the same, then
>>>>>>>>>>>>> ran gluster volume <> heal
>>>>>>>>>>>>>
>>>>>>>>>>>>> It healed all the directories and files you can see over the mount,
>>>>>>>>>>>>> including fakedir.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Same issue for shards though: it adds more shards to heal at a
>>>>>>>>>>>>> glacier pace. Slight jump in speed if I stat every file and dir in a
>>>>>>>>>>>>> running VM, but not all shards.
>>>>>>>>>>>>>
>>>>>>>>>>>>> It started with 8 shards to heal and is now only at 33 out of
>>>>>>>>>>>>> 800, and probably won't finish adding for a few days at the rate it goes.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>> > > > 1st node worked as expected, took 12 hours to heal 1TB of data. Load was
>>>>>>>>>>>>>> > > > a little heavy but nothing shocking.
>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>> > > > About an hour after node 1 finished I began the same process on node2. Heal
>>>>>>>>>>>>>> > > > process kicked in as before and the files in directories visible from the
>>>>>>>>>>>>>> > > > mount and .glusterfs healed in a short time. Then it began the crawl of
>>>>>>>>>>>>>> > > > .shard, adding those files to the heal count, at which point the entire
>>>>>>>>>>>>>> > > > process basically ground to a halt. After 48 hours, out of 19k shards it has
>>>>>>>>>>>>>> > > > added 5900 to the heal list.
>>>>>>>>>>>>>> > > > Load on all 3 machines is negligible. It was suggested to change the
>>>>>>>>>>>>>> > > > cluster.data-self-heal-algorithm value to full and restart the volume, which
>>>>>>>>>>>>>> > > > I did. No effect. Tried relaunching heal, no effect, regardless of the node
>>>>>>>>>>>>>> > > > picked. I started each VM and performed a stat of all files from within it,
>>>>>>>>>>>>>> > > > or a full virus scan, and that seemed to cause short small spikes in shards
>>>>>>>>>>>>>> > > > added, but not by much. Logs are showing no real messages indicating
>>>>>>>>>>>>>> > > > anything is going on. I get hits to the brick log on occasion of null
>>>>>>>>>>>>>> > > > lookups, making me think it's not really crawling the shards directory but
>>>>>>>>>>>>>> > > > waiting for a shard lookup to add it. I'll get the following in the brick
>>>>>>>>>>>>>> > > > log, but not constant, and sometimes multiple times for the same shard.
>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>> > > > [2016-08-29 08:31:57.478125] W [MSGID: 115009]
>>>>>>>>>>>>>> > > > [server-resolve.c:569:server_resolve] 0-GLUSTER1-server: no resolution type
>>>>>>>>>>>>>> > > > for (null) (LOOKUP)
>>>>>>>>>>>>>> > > > [2016-08-29 08:31:57.478170] E [MSGID: 115050]
>>>>>>>>>>>>>> > > > [server-rpc-fops.c:156:server_lookup_cbk] 0-GLUSTER1-server: 12591783:
>>>>>>>>>>>>>> > > > LOOKUP (null) (00000000-0000-0000-0000-000000000000/241a55ed-f0d5-4dbc-a6ce-ab784a0ba6ff.221)
>>>>>>>>>>>>>> > > > ==> (Invalid argument) [Invalid argument]
>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>> > > > This one repeated about 30 times in a row, then nothing
>>>>>>>>>>>>>> > > > for 10 minutes, then one hit for one different shard by itself.
>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>> > > > How can I determine if heal is actually running? How can I kill it or force
>>>>>>>>>>>>>> > > > a restart? Does the node I start it from determine which directory gets
>>>>>>>>>>>>>> > > > crawled to determine heals?
>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>> > > > David Gossage
>>>>>>>>>>>>>> > > > Carousel Checks Inc. | System Administrator
>>>>>>>>>>>>>> > > > Office 708.613.2284
>>>>>>>>>>>>>> > > >
_______________________________________________
>>>>>>>>>>>>>> > > >
Gluster-users mailing list
>>>>>>>>>>>>>> > > >
Gluster-users at gluster.org
>>>>>>>>>>>>>> > > >
http://www.gluster.org/mailman/listinfo/gluster-users
>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>> > > >
>>>>>>>>>>>>>> > > >
_______________________________________________
>>>>>>>>>>>>>> > > >
Gluster-users mailing list
>>>>>>>>>>>>>> > > >
Gluster-users at gluster.org
>>>>>>>>>>>>>> > > >
http://www.gluster.org/mailman/listinfo/gluster-users
>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>> > > --
>>>>>>>>>>>>>> > > Thanks,
>>>>>>>>>>>>>> > > Anuradha.
>>>>>>>>>>>>>> > >
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Anuradha.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>