Tried this.
With me, only 'fake2' gets healed after I bring the 'empty' brick back up,
and it stops there unless I do a 'heal-full'.
Is that what you're seeing as well?
-Krutika
On Wed, Aug 31, 2016 at 4:43 AM, David Gossage <dgossage at
carouselchecks.com>
wrote:
> Same issue. Brought up glusterd on the problem node; heal count is still
> stuck at 6330.
>
> Ran gluster v heal GLUSTER1 full
>
> glustershd on the problem node shows a sweep starting and finishing in
> seconds. The other 2 nodes show no activity in their logs. They should
> start a sweep too, shouldn't they?
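(One way to check which node's self-heal daemon actually ran a crawl is to
grep its glustershd log on each node, assuming the default log location:

    grep "full sweep" /var/log/glusterfs/glustershd.log

The "starting full sweep" / "finished full sweep" pairs show which subvolumes
that daemon crawled and when.)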
>
> Tried starting from scratch
>
> kill -15 brickpid
> rm -Rf /brick
> mkdir -p /brick
> mkdir /gsmount/fake2
> setfattr -n "user.some-name" -v "some-value" /gsmount/fake2
>
> Heals visible dirs instantly then stops.
>
> gluster v heal GLUSTER1 full
>
> see sweep start on problem node and end almost instantly. No files added to
> heal list, no files healed, no more logging.
>
> [2016-08-30 23:11:31.544331] I [MSGID: 108026]
> [afr-self-heald.c:646:afr_shd_full_healer] 0-GLUSTER1-replicate-0:
> starting full sweep on subvol GLUSTER1-client-1
> [2016-08-30 23:11:33.776235] I [MSGID: 108026]
> [afr-self-heald.c:656:afr_shd_full_healer] 0-GLUSTER1-replicate-0:
> finished full sweep on subvol GLUSTER1-client-1
>
> Same results no matter which node you run the command on. Still stuck with
> 6330 files showing as needing heal out of 19k. Logs still show no heals are
> occurring.
>
> Is there a way to forcibly reset any prior heal data? Could it be stuck
> on some past failed heal start?
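(To watch the pending count directly without triggering a new crawl, the
heal-count statistics can be polled; volume name as above, assuming the
subcommand is available in this release:

    gluster volume heal GLUSTER1 statistics heal-count

This prints the number of entries pending heal per brick.)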
>
>
>
>
> *David Gossage*
> *Carousel Checks Inc. | System Administrator*
> *Office* 708.613.2284
>
> On Tue, Aug 30, 2016 at 10:03 AM, David Gossage <
> dgossage at carouselchecks.com> wrote:
>
>> On Tue, Aug 30, 2016 at 10:02 AM, David Gossage <
>> dgossage at carouselchecks.com> wrote:
>>
>>> updated test server to 3.8.3
>>>
>>> Brick1: 192.168.71.10:/gluster2/brick1/1
>>> Brick2: 192.168.71.11:/gluster2/brick2/1
>>> Brick3: 192.168.71.12:/gluster2/brick3/1
>>> Options Reconfigured:
>>> cluster.granular-entry-heal: on
>>> performance.readdir-ahead: on
>>> performance.read-ahead: off
>>> nfs.disable: on
>>> nfs.addr-namelookup: off
>>> nfs.enable-ino32: off
>>> cluster.background-self-heal-count: 16
>>> cluster.self-heal-window-size: 1024
>>> performance.quick-read: off
>>> performance.io-cache: off
>>> performance.stat-prefetch: off
>>> cluster.eager-lock: enable
>>> network.remote-dio: on
>>> cluster.quorum-type: auto
>>> cluster.server-quorum-type: server
>>> storage.owner-gid: 36
>>> storage.owner-uid: 36
>>> server.allow-insecure: on
>>> features.shard: on
>>> features.shard-block-size: 64MB
>>> performance.strict-o-direct: off
>>> cluster.locking-scheme: granular
>>>
>>> kill -15 brickpid
>>> rm -Rf /gluster2/brick3
>>> mkdir -p /gluster2/brick3/1
>>> mkdir /rhev/data-center/mnt/glusterSD/192.168.71.10\:_glustershard/fake2
>>> setfattr -n "user.some-name" -v "some-value" /rhev/data-center/mnt/glusterSD/192.168.71.10\:_glustershard/fake2
>>> gluster v start glustershard force
>>>
>>> At this point the brick process starts and all visible files, including
>>> the new dir, are made on the brick. A handful of shards are still in heal
>>> statistics, but no .shard directory is created and there is no increase
>>> in shard count.
>>>
>>> gluster v heal glustershard
>>>
>>> At this point there is still no increase in count, no dir made, and no
>>> additional healing activity generated in the logs. Waited a few minutes
>>> tailing logs to check if anything kicked in.
>>>
>>> gluster v heal glustershard full
>>>
>>> Shards get added to the list and heal commences. Logs show a full sweep
>>> starting on all 3 nodes, though this time it only shows as finishing on
>>> one, which looks to be the one that had its brick deleted.
>>>
>>> [2016-08-30 14:45:33.098589] I [MSGID: 108026]
>>> [afr-self-heald.c:646:afr_shd_full_healer] 0-glustershard-replicate-0:
>>> starting full sweep on subvol glustershard-client-0
>>> [2016-08-30 14:45:33.099492] I [MSGID: 108026]
>>> [afr-self-heald.c:646:afr_shd_full_healer] 0-glustershard-replicate-0:
>>> starting full sweep on subvol glustershard-client-1
>>> [2016-08-30 14:45:33.100093] I [MSGID: 108026]
>>> [afr-self-heald.c:646:afr_shd_full_healer] 0-glustershard-replicate-0:
>>> starting full sweep on subvol glustershard-client-2
>>> [2016-08-30 14:52:29.760213] I [MSGID: 108026]
>>> [afr-self-heald.c:656:afr_shd_full_healer] 0-glustershard-replicate-0:
>>> finished full sweep on subvol glustershard-client-2
>>>
>>
>> Just realized it's still healing, so that may be why the sweeps on the 2
>> other bricks haven't reported as finished.
>>
>>>
>>>
>>> My hope is that later tonight a full heal will work on production. Is it
>>> possible the self-heal daemon can get stale or stop listening but still
>>> show as active? Would stopping and starting the self-heal daemon from the
>>> gluster cli before doing these heals be helpful?
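(One way to bounce only the self-heal daemon from the CLI, without touching
the brick processes, is to toggle the cluster.self-heal-daemon option; a
sketch using the production volume name, assuming the usual behavior where
disabling the option stops the shd for that volume and re-enabling it spawns
a fresh one:

    gluster volume set GLUSTER1 cluster.self-heal-daemon off
    gluster volume set GLUSTER1 cluster.self-heal-daemon on
    gluster volume status GLUSTER1   # Self-heal Daemon rows should show new PIDs

)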
>>>
>>>
>>> On Tue, Aug 30, 2016 at 9:29 AM, David Gossage <
>>> dgossage at carouselchecks.com> wrote:
>>>
>>>> On Tue, Aug 30, 2016 at 8:52 AM, David Gossage <
>>>> dgossage at carouselchecks.com> wrote:
>>>>
>>>>> On Tue, Aug 30, 2016 at 8:01 AM, Krutika Dhananjay <
>>>>> kdhananj at redhat.com> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Aug 30, 2016 at 6:20 PM, Krutika Dhananjay <
>>>>>> kdhananj at redhat.com> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Aug 30, 2016 at 6:07 PM, David Gossage <
>>>>>>> dgossage at carouselchecks.com> wrote:
>>>>>>>
>>>>>>>> On Tue, Aug 30, 2016 at 7:18 AM, Krutika Dhananjay <
>>>>>>>> kdhananj at redhat.com> wrote:
>>>>>>>>
>>>>>>>>> Could you also share the glustershd logs?
>>>>>>>>>
>>>>>>>>
>>>>>>>> I'll get them when I get to work, sure.
>>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> I tried the same steps that you mentioned multiple times, but
>>>>>>>>> heal is running to completion without any issues.
>>>>>>>>>
>>>>>>>>> It must be said that 'heal full' traverses the files and
>>>>>>>>> directories in a depth-first order and does heals also in the
>>>>>>>>> same order. But if it gets interrupted in the middle (say
>>>>>>>>> because self-heal-daemon was either intentionally or
>>>>>>>>> unintentionally brought offline and then brought back up),
>>>>>>>>> self-heal will only pick up the entries that are so far marked
>>>>>>>>> as new entries that need heal, which it will find in the
>>>>>>>>> indices/xattrop directory. What this means is that those files
>>>>>>>>> and directories that were not visited during the crawl will
>>>>>>>>> remain untouched and unhealed in this second iteration of heal,
>>>>>>>>> unless you execute a 'heal-full' again.
>>>>>>>>>
>>>>>>>>
>>>>>>>> So should it start healing shards as it crawls, or not until after
>>>>>>>> it crawls the entire .shard directory? At the pace it was going,
>>>>>>>> that could be a week with one node appearing in the cluster but with
>>>>>>>> no shard files if anything tries to access a file on that node. From
>>>>>>>> my experience the other day, telling it to heal full again did
>>>>>>>> nothing regardless of the node used.
>>>>>>>>
>>>>>>>
>>>>>> Crawl is started from '/' of the volume. Whenever self-heal detects
>>>>>> during the crawl that a file or directory is present in some brick(s)
>>>>>> and absent in others, it creates the file on the bricks where it is
>>>>>> absent and marks the fact that the file or directory might need
>>>>>> data/entry and metadata heal too (this also means that an index is
>>>>>> created under .glusterfs/indices/xattrop of the src bricks). And the
>>>>>> data/entry and metadata heal are picked up and done in the background
>>>>>> with the help of these indices.
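(To see how many entries are queued this way on a given brick, the index
directory can be listed directly on the brick; a sketch using one of the
production brick paths from later in this thread, assuming the usual layout
where each pending entry is a gfid-named hard link alongside a base
xattrop-* file:

    ls /gluster1/BRICK1/1/.glusterfs/indices/xattrop | grep -v '^xattrop-' | wc -l

The count should roughly track what 'gluster volume heal GLUSTER1 info'
reports for that brick.)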
>>>>>>
>>>>>
>>>>> Looking at my 3rd node as an example, I find nearly the exact same
>>>>> number of files in the xattrop dir as reported by the heal count at the
>>>>> time I brought down node2 to try and alleviate read io errors, which I
>>>>> was guessing came from attempts to use the node with no shards for
>>>>> reads.
>>>>>
>>>>> Also attached are the glustershd logs from the 3 nodes, along with the
>>>>> test node I tried yesterday with the same results.
>>>>>
>>>>
>>>> Looking at my own logs, I notice that a full sweep was only ever
>>>> recorded in glustershd.log on the 2nd node with the missing directory. I
>>>> believe I should have found a sweep begun on every node, correct?
>>>>
>>>> On my test dev when it did work I do see that
>>>>
>>>> [2016-08-30 13:56:25.223333] I [MSGID: 108026]
>>>> [afr-self-heald.c:646:afr_shd_full_healer] 0-glustershard-replicate-0:
>>>> starting full sweep on subvol glustershard-client-0
>>>> [2016-08-30 13:56:25.223522] I [MSGID: 108026]
>>>> [afr-self-heald.c:646:afr_shd_full_healer] 0-glustershard-replicate-0:
>>>> starting full sweep on subvol glustershard-client-1
>>>> [2016-08-30 13:56:25.224616] I [MSGID: 108026]
>>>> [afr-self-heald.c:646:afr_shd_full_healer] 0-glustershard-replicate-0:
>>>> starting full sweep on subvol glustershard-client-2
>>>> [2016-08-30 14:18:48.333740] I [MSGID: 108026]
>>>> [afr-self-heald.c:656:afr_shd_full_healer] 0-glustershard-replicate-0:
>>>> finished full sweep on subvol glustershard-client-2
>>>> [2016-08-30 14:18:48.356008] I [MSGID: 108026]
>>>> [afr-self-heald.c:656:afr_shd_full_healer] 0-glustershard-replicate-0:
>>>> finished full sweep on subvol glustershard-client-1
>>>> [2016-08-30 14:18:49.637811] I [MSGID: 108026]
>>>> [afr-self-heald.c:656:afr_shd_full_healer] 0-glustershard-replicate-0:
>>>> finished full sweep on subvol glustershard-client-0
>>>>
>>>> While looking at the past few days on the 3 prod nodes, I only found
>>>> the following, and only on my 2nd node:
>>>> [2016-08-27 01:26:42.638772] I [MSGID: 108026]
>>>> [afr-self-heald.c:646:afr_shd_full_healer] 0-GLUSTER1-replicate-0:
>>>> starting full sweep on subvol GLUSTER1-client-1
>>>> [2016-08-27 11:37:01.732366] I [MSGID: 108026]
>>>> [afr-self-heald.c:656:afr_shd_full_healer] 0-GLUSTER1-replicate-0:
>>>> finished full sweep on subvol GLUSTER1-client-1
>>>> [2016-08-27 12:58:34.597228] I [MSGID: 108026]
>>>> [afr-self-heald.c:646:afr_shd_full_healer] 0-GLUSTER1-replicate-0:
>>>> starting full sweep on subvol GLUSTER1-client-1
>>>> [2016-08-27 12:59:28.041173] I [MSGID: 108026]
>>>> [afr-self-heald.c:656:afr_shd_full_healer] 0-GLUSTER1-replicate-0:
>>>> finished full sweep on subvol GLUSTER1-client-1
>>>> [2016-08-27 20:03:42.560188] I [MSGID: 108026]
>>>> [afr-self-heald.c:646:afr_shd_full_healer] 0-GLUSTER1-replicate-0:
>>>> starting full sweep on subvol GLUSTER1-client-1
>>>> [2016-08-27 20:03:44.278274] I [MSGID: 108026]
>>>> [afr-self-heald.c:656:afr_shd_full_healer] 0-GLUSTER1-replicate-0:
>>>> finished full sweep on subvol GLUSTER1-client-1
>>>> [2016-08-27 21:00:42.603315] I [MSGID: 108026]
>>>> [afr-self-heald.c:646:afr_shd_full_healer] 0-GLUSTER1-replicate-0:
>>>> starting full sweep on subvol GLUSTER1-client-1
>>>> [2016-08-27 21:00:46.148674] I [MSGID: 108026]
>>>> [afr-self-heald.c:656:afr_shd_full_healer] 0-GLUSTER1-replicate-0:
>>>> finished full sweep on subvol GLUSTER1-client-1
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>>
>>>>>>
>>>>>>>>
>>>>>>>>> My suspicion is that this is what happened on your setup. Could
>>>>>>>>> you confirm if that was the case?
>>>>>>>>>
>>>>>>>>
>>>>>>>> The brick was brought online with force start, then a full heal was
>>>>>>>> launched. Hours later, after it became evident that it was not
>>>>>>>> adding new files to heal, I did try restarting the self-heal daemon
>>>>>>>> and relaunching the full heal again. But this was after the heal had
>>>>>>>> basically already failed to work as intended.
>>>>>>>>
>>>>>>>
>>>>>>> OK. How did you figure it was not adding any new files? I need to
>>>>>>> know what places you were monitoring to come to this conclusion.
>>>>>>>
>>>>>>> -Krutika
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> As for those logs, I did manage to do something that caused the
>>>>>>>>> warning messages you shared earlier to appear in my client and
>>>>>>>>> server logs. Although these logs are annoying and a bit scary too,
>>>>>>>>> they didn't do any harm to the data in my volume. Why they appear
>>>>>>>>> just after a brick is replaced and under no other circumstances is
>>>>>>>>> something I'm still investigating.
>>>>>>>>>
>>>>>>>>> But for the future, it would be good to follow the steps Anuradha
>>>>>>>>> gave, as that would allow self-heal to at least detect that it has
>>>>>>>>> some repairing to do whenever it is restarted, whether
>>>>>>>>> intentionally or otherwise.
>>>>>>>>>
>>>>>>>>
>>>>>>>> I followed those steps as described on my test box and ended up
>>>>>>>> with the exact same outcome: shards added at an agonizingly slow
>>>>>>>> pace, and no creation of the .shard directory or heals on the shard
>>>>>>>> directory. Directories visible from the mount healed quickly. This
>>>>>>>> was with one VM, so it has only 800 shards as well. After hours at
>>>>>>>> work it had added a total of 33 shards to be healed. I sent those
>>>>>>>> logs yesterday as well, though not the glustershd logs.
>>>>>>>>
>>>>>>>> Does the replace-brick command copy files in the same manner? For
>>>>>>>> these purposes I am contemplating just skipping the heal route.
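(For reference, the replace-brick form supported here is 'commit force' (the
data-migration variants were removed in recent releases); it swaps in the new
empty brick and then relies on the same self-heal mechanism to repopulate it,
so it would not copy the shards any differently. A sketch with a hypothetical
new brick path:

    gluster volume replace-brick GLUSTER1 ccgl2.gl.local:/gluster1/BRICK1/1 ccgl2.gl.local:/gluster1/BRICK1/new commit force

)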
>>>>>>>>
>>>>>>>>
>>>>>>>>> -Krutika
>>>>>>>>>
>>>>>>>>> On Tue, Aug 30, 2016 at 2:22 AM, David Gossage <
>>>>>>>>> dgossage at carouselchecks.com> wrote:
>>>>>>>>>
>>>>>>>>>> Attached brick and client logs from the test machine where the
>>>>>>>>>> same behavior occurred; not sure if anything new is there. It's
>>>>>>>>>> still on 3.8.2.
>>>>>>>>>>
>>>>>>>>>> Number of Bricks: 1 x 3 = 3
>>>>>>>>>> Transport-type: tcp
>>>>>>>>>> Bricks:
>>>>>>>>>> Brick1: 192.168.71.10:/gluster2/brick1/1
>>>>>>>>>> Brick2: 192.168.71.11:/gluster2/brick2/1
>>>>>>>>>> Brick3: 192.168.71.12:/gluster2/brick3/1
>>>>>>>>>> Options Reconfigured:
>>>>>>>>>> cluster.locking-scheme: granular
>>>>>>>>>> performance.strict-o-direct: off
>>>>>>>>>> features.shard-block-size: 64MB
>>>>>>>>>> features.shard: on
>>>>>>>>>> server.allow-insecure: on
>>>>>>>>>> storage.owner-uid: 36
>>>>>>>>>> storage.owner-gid: 36
>>>>>>>>>> cluster.server-quorum-type: server
>>>>>>>>>> cluster.quorum-type: auto
>>>>>>>>>> network.remote-dio: on
>>>>>>>>>> cluster.eager-lock: enable
>>>>>>>>>> performance.stat-prefetch: off
>>>>>>>>>> performance.io-cache: off
>>>>>>>>>> performance.quick-read: off
>>>>>>>>>> cluster.self-heal-window-size: 1024
>>>>>>>>>> cluster.background-self-heal-count: 16
>>>>>>>>>> nfs.enable-ino32: off
>>>>>>>>>> nfs.addr-namelookup: off
>>>>>>>>>> nfs.disable: on
>>>>>>>>>> performance.read-ahead: off
>>>>>>>>>> performance.readdir-ahead: on
>>>>>>>>>> cluster.granular-entry-heal: on
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Aug 29, 2016 at 2:20 PM, David Gossage <
>>>>>>>>>> dgossage at carouselchecks.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> On Mon, Aug 29, 2016 at 7:01 AM, Anuradha Talur <
>>>>>>>>>>> atalur at redhat.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> ----- Original Message -----
>>>>>>>>>>>> > From: "David Gossage" <dgossage at carouselchecks.com>
>>>>>>>>>>>> > To: "Anuradha Talur" <atalur at redhat.com>
>>>>>>>>>>>> > Cc: "gluster-users at gluster.org List" <Gluster-users at gluster.org>, "Krutika Dhananjay" <kdhananj at redhat.com>
>>>>>>>>>>>> > Sent: Monday, August 29, 2016 5:12:42 PM
>>>>>>>>>>>> > Subject: Re: [Gluster-users] 3.8.3 Shards Healing Glacier Slow
>>>>>>>>>>>> >
>>>>>>>>>>>> > On Mon, Aug 29, 2016 at 5:39 AM, Anuradha Talur <
>>>>>>>>>>>> > atalur at redhat.com> wrote:
>>>>>>>>>>>> >
>>>>>>>>>>>> > > Response inline.
>>>>>>>>>>>> > >
>>>>>>>>>>>> > > ----- Original Message -----
>>>>>>>>>>>> > > > From: "Krutika Dhananjay" <kdhananj at redhat.com>
>>>>>>>>>>>> > > > To: "David Gossage" <dgossage at carouselchecks.com>
>>>>>>>>>>>> > > > Cc: "gluster-users at gluster.org List" <Gluster-users at gluster.org>
>>>>>>>>>>>> > > > Sent: Monday, August 29, 2016 3:55:04 PM
>>>>>>>>>>>> > > > Subject: Re: [Gluster-users] 3.8.3 Shards Healing Glacier Slow
>>>>>>>>>>>> > > >
>>>>>>>>>>>> > > > Could you attach both client and brick logs? Meanwhile I
>>>>>>>>>>>> > > > will try these steps out on my machines and see if it is
>>>>>>>>>>>> > > > easily recreatable.
>>>>>>>>>>>> > > >
>>>>>>>>>>>> > > > -Krutika
>>>>>>>>>>>> > > >
>>>>>>>>>>>> > > > On Mon, Aug 29, 2016 at 2:31 PM, David Gossage <
>>>>>>>>>>>> > > > dgossage at carouselchecks.com> wrote:
>>>>>>>>>>>> > > >
>>>>>>>>>>>> > > >
>>>>>>>>>>>> > > >
>>>>>>>>>>>> > > > Centos 7 Gluster 3.8.3
>>>>>>>>>>>> > > >
>>>>>>>>>>>> > > > Brick1: ccgl1.gl.local:/gluster1/BRICK1/1
>>>>>>>>>>>> > > > Brick2: ccgl2.gl.local:/gluster1/BRICK1/1
>>>>>>>>>>>> > > > Brick3: ccgl4.gl.local:/gluster1/BRICK1/1
>>>>>>>>>>>> > > > Options Reconfigured:
>>>>>>>>>>>> > > > cluster.data-self-heal-algorithm: full
>>>>>>>>>>>> > > > cluster.self-heal-daemon: on
>>>>>>>>>>>> > > > cluster.locking-scheme: granular
>>>>>>>>>>>> > > > features.shard-block-size: 64MB
>>>>>>>>>>>> > > > features.shard: on
>>>>>>>>>>>> > > > performance.readdir-ahead: on
>>>>>>>>>>>> > > > storage.owner-uid: 36
>>>>>>>>>>>> > > > storage.owner-gid: 36
>>>>>>>>>>>> > > > performance.quick-read: off
>>>>>>>>>>>> > > > performance.read-ahead: off
>>>>>>>>>>>> > > > performance.io-cache: off
>>>>>>>>>>>> > > > performance.stat-prefetch: on
>>>>>>>>>>>> > > > cluster.eager-lock: enable
>>>>>>>>>>>> > > > network.remote-dio: enable
>>>>>>>>>>>> > > > cluster.quorum-type: auto
>>>>>>>>>>>> > > > cluster.server-quorum-type: server
>>>>>>>>>>>> > > > server.allow-insecure: on
>>>>>>>>>>>> > > > cluster.self-heal-window-size: 1024
>>>>>>>>>>>> > > > cluster.background-self-heal-count: 16
>>>>>>>>>>>> > > > performance.strict-write-ordering: off
>>>>>>>>>>>> > > > nfs.disable: on
>>>>>>>>>>>> > > > nfs.addr-namelookup: off
>>>>>>>>>>>> > > > nfs.enable-ino32: off
>>>>>>>>>>>> > > > cluster.granular-entry-heal: on
>>>>>>>>>>>> > > >
>>>>>>>>>>>> > > > Friday did rolling upgrade from 3.8.3->3.8.3 no issues.
>>>>>>>>>>>> > > > Following steps detailed in previous recommendations, began
>>>>>>>>>>>> > > > the process of replacing and healing bricks one node at a
>>>>>>>>>>>> > > > time.
>>>>>>>>>>>> > > >
>>>>>>>>>>>> > > > 1) kill pid of brick
>>>>>>>>>>>> > > > 2) reconfigure brick from raid6 to raid10
>>>>>>>>>>>> > > > 3) recreate directory of brick
>>>>>>>>>>>> > > > 4) gluster volume start <> force
>>>>>>>>>>>> > > > 5) gluster volume heal <> full
>>>>>>>>>>>> > > Hi,
>>>>>>>>>>>> > >
>>>>>>>>>>>> > > I'd suggest that full heal is not used. There are a few
>>>>>>>>>>>> > > bugs in full heal. Better safe than sorry ;)
>>>>>>>>>>>> > > Instead I'd suggest the following steps:
>>>>>>>>>>>> > >
>>>>>>>>>>>> > Currently I brought the node down by systemctl stop glusterd,
>>>>>>>>>>>> > as I was getting sporadic io issues and a few VMs paused, so
>>>>>>>>>>>> > hoping that will help. I may wait to do this till around 4PM
>>>>>>>>>>>> > when most work is done, in case it shoots load up.
>>>>>>>>>>>> >
>>>>>>>>>>>> >
>>>>>>>>>>>> > > 1) kill pid of brick
>>>>>>>>>>>> > > 2) do the reconfiguring of the brick that you need
>>>>>>>>>>>> > > 3) recreate brick dir
>>>>>>>>>>>> > > 4) while the brick is still down, from the mount point:
>>>>>>>>>>>> > >    a) create a dummy non-existent dir under / of mount.
>>>>>>>>>>>> > >
>>>>>>>>>>>> >
>>>>>>>>>>>> > So if node 2 is the down brick, pick another node, for example
>>>>>>>>>>>> > 3, and make a test dir under its brick directory that doesn't
>>>>>>>>>>>> > exist on 2, or should I be doing this over a gluster mount?
>>>>>>>>>>>> You should be doing this over the gluster mount.
>>>>>>>>>>>> >
>>>>>>>>>>>> > >    b) set a non-existent extended attribute on / of mount.
>>>>>>>>>>>> > >
>>>>>>>>>>>> >
>>>>>>>>>>>> > Could you give me an example of an attribute to set? I've read
>>>>>>>>>>>> > a tad on this, and looked up attributes, but haven't set any
>>>>>>>>>>>> > yet myself.
>>>>>>>>>>>> >
>>>>>>>>>>>> Sure. setfattr -n "user.some-name" -v "some-value" <path-to-mount>
>>>>>>>>>>>> > > Doing these steps will ensure that heal happens only from
>>>>>>>>>>>> > > the updated brick to the down brick.
>>>>>>>>>>>> > > 5) gluster v start <> force
>>>>>>>>>>>> > > 6) gluster v heal <>
>>>>>>>>>>>> > >
>>>>>>>>>>>> >
>>>>>>>>>>>> > Will it matter if somewhere in gluster the full heal command
>>>>>>>>>>>> > was run the other day? Not sure if it eventually stops or
>>>>>>>>>>>> > times out.
>>>>>>>>>>>> >
>>>>>>>>>>>> Full heal will stop once the crawl is done. So if you want to
>>>>>>>>>>>> trigger heal again, run gluster v heal <>. Actually, even brick
>>>>>>>>>>>> up or volume start force should trigger the heal.
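(For reference, the two triggers being discussed, with the production volume
name from this thread:

    gluster volume heal GLUSTER1        # index heal: only entries recorded under .glusterfs/indices/xattrop
    gluster volume heal GLUSTER1 full   # full crawl starting from / of the volume

)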
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Did this on the test bed today. It's one server with 3 bricks on
>>>>>>>>>>> the same machine, so take that for what it's worth. Also it still
>>>>>>>>>>> runs 3.8.2. Maybe I'll update and re-run the test.
>>>>>>>>>>>
>>>>>>>>>>> killed brick
>>>>>>>>>>> deleted brick dir
>>>>>>>>>>> recreated brick dir
>>>>>>>>>>> created fake dir on gluster mount
>>>>>>>>>>> set suggested fake attribute on it
>>>>>>>>>>> ran volume start <> force
>>>>>>>>>>>
>>>>>>>>>>> Looked at files it said needed healing and it was just 8 shards
>>>>>>>>>>> that were modified during the few minutes I ran through the steps.
>>>>>>>>>>>
>>>>>>>>>>> Gave it a few minutes and it stayed the same.
>>>>>>>>>>> Ran gluster volume <> heal.
>>>>>>>>>>>
>>>>>>>>>>> It healed all the directories and files you can see over the
>>>>>>>>>>> mount, including fakedir.
>>>>>>>>>>>
>>>>>>>>>>> Same issue for shards though. It adds more shards to heal at a
>>>>>>>>>>> glacier pace. Slight jump in speed if I stat every file and dir
>>>>>>>>>>> in the running VM, but not all shards.
>>>>>>>>>>>
>>>>>>>>>>> It started with 8 shards to heal and is now only at 33 out of
>>>>>>>>>>> 800, and probably won't finish adding for a few days at the rate
>>>>>>>>>>> it goes.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> > >
>>>>>>>>>>>> > > > 1st node worked as expected, took 12 hours to heal 1TB of
>>>>>>>>>>>> > > > data. Load was a little heavy but nothing shocking.
>>>>>>>>>>>> > > >
>>>>>>>>>>>> > > > About an hour after node 1 finished I began the same
>>>>>>>>>>>> > > > process on node2. The heal process kicked in as before and
>>>>>>>>>>>> > > > the files in directories visible from the mount and
>>>>>>>>>>>> > > > .glusterfs healed in a short time. Then it began the crawl
>>>>>>>>>>>> > > > of .shard, adding those files to the heal count, at which
>>>>>>>>>>>> > > > point the entire process basically ground to a halt. After
>>>>>>>>>>>> > > > 48 hours, out of 19k shards it has added 5900 to the heal
>>>>>>>>>>>> > > > list. Load on all 3 machines is negligible. It was
>>>>>>>>>>>> > > > suggested to change cluster.data-self-heal-algorithm to
>>>>>>>>>>>> > > > full and restart the volume, which I did. No effect. Tried
>>>>>>>>>>>> > > > relaunching heal, no effect, regardless of the node picked.
>>>>>>>>>>>> > > > I started each VM and performed a stat of all files from
>>>>>>>>>>>> > > > within it, or a full virus scan, and that seemed to cause
>>>>>>>>>>>> > > > short small spikes in shards added, but not by much. Logs
>>>>>>>>>>>> > > > are showing no real messages indicating anything is going
>>>>>>>>>>>> > > > on. I get hits in the brick log on occasion of null
>>>>>>>>>>>> > > > lookups, making me think it's not really crawling the
>>>>>>>>>>>> > > > shards directory but waiting for a shard lookup to add it.
>>>>>>>>>>>> > > > I'll get the following in the brick log, but not constant,
>>>>>>>>>>>> > > > and sometimes multiple for the same shard.
>>>>>>>>>>>> > > >
>>>>>>>>>>>> > > > [2016-08-29 08:31:57.478125] W [MSGID: 115009]
>>>>>>>>>>>> > > > [server-resolve.c:569:server_resolve] 0-GLUSTER1-server:
>>>>>>>>>>>> > > > no resolution type for (null) (LOOKUP)
>>>>>>>>>>>> > > > [2016-08-29 08:31:57.478170] E [MSGID: 115050]
>>>>>>>>>>>> > > > [server-rpc-fops.c:156:server_lookup_cbk] 0-GLUSTER1-server:
>>>>>>>>>>>> > > > 12591783: LOOKUP (null) (00000000-0000-0000-0000-000000000000/241a55ed-f0d5-4dbc-a6ce-ab784a0ba6ff.221)
>>>>>>>>>>>> > > > ==> (Invalid argument) [Invalid argument]
>>>>>>>>>>>> > > >
>>>>>>>>>>>> > > > This one repeated about 30 times in a row, then nothing
>>>>>>>>>>>> > > > for 10 minutes, then one hit for one different shard by
>>>>>>>>>>>> > > > itself.
>>>>>>>>>>>> > > >
>>>>>>>>>>>> > > > How can I determine if heal is actually running? How can
>>>>>>>>>>>> > > > I kill it or force a restart? Does the node I start it
>>>>>>>>>>>> > > > from determine which directory gets crawled to determine
>>>>>>>>>>>> > > > heals?
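(A few ways to check whether the self-heal daemon is alive and actually
working, assuming default log locations:

    gluster volume status GLUSTER1              # lists a Self-heal Daemon entry with a PID per node
    gluster volume heal GLUSTER1 info           # entries currently pending heal
    tail -f /var/log/glusterfs/glustershd.log   # sweep and heal messages appear here while a crawl runs

)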
>>>>>>>>>>>> > > >
>>>>>>>>>>>> > > > David Gossage
>>>>>>>>>>>> > > > Carousel Checks Inc. | System Administrator
>>>>>>>>>>>> > > > Office 708.613.2284
>>>>>>>>>>>> > > >
>>>>>>>>>>>> > >
>>>>>>>>>>>> > > --
>>>>>>>>>>>> > > Thanks,
>>>>>>>>>>>> > > Anuradha.
>>>>>>>>>>>> > >
>>>>>>>>>>>> >
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Anuradha.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>