On Tue, Aug 30, 2016 at 8:01 AM, Krutika Dhananjay <kdhananj at redhat.com> wrote:
>
>
> On Tue, Aug 30, 2016 at 6:20 PM, Krutika Dhananjay <kdhananj at redhat.com> wrote:
>
>>
>>
>> On Tue, Aug 30, 2016 at 6:07 PM, David Gossage <
>> dgossage at carouselchecks.com> wrote:
>>
>>> On Tue, Aug 30, 2016 at 7:18 AM, Krutika Dhananjay <kdhananj at redhat.com> wrote:
>>>
>>>> Could you also share the glustershd logs?
>>>>
>>>
>>> Sure, I'll get them when I get to work.
>>>
>>
>>>
>>>>
>>>> I tried the same steps that you mentioned multiple times, but heal is
>>>> running to completion without any issues.
>>>>
>>>> It must be said that 'heal full' traverses the files and directories in
>>>> a depth-first order and does heals also in the same order. But if it gets
>>>> interrupted in the middle (say because self-heal-daemon was either
>>>> intentionally or unintentionally brought offline and then brought back up),
>>>> self-heal will only pick up the entries that are so far marked as new
>>>> entries that need heal, which it will find in the indices/xattrop directory.
>>>> What this means is that those files and directories that were not visited
>>>> during the crawl will remain untouched and unhealed in this second
>>>> iteration of heal, unless you execute 'heal full' again.
>>>>
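[A minimal shell sketch for re-triggering and watching that crawl, assuming the volume name GLUSTER1 and brick path /gluster1/BRICK1/1 seen in this thread; adjust to your layout:

    gluster volume heal GLUSTER1 full                         # restart the depth-first crawl from /
    gluster volume heal GLUSTER1 statistics heal-count        # entries the self-heal daemon still has queued
    ls /gluster1/BRICK1/1/.glusterfs/indices/xattrop | wc -l  # rough count of index entries on one brick
]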
>>>
>>> So should it start healing shards as it crawls, or not until after it
>>> has crawled the entire .shard directory? At the pace it was going that
>>> could be a week, with one node appearing in the cluster but having no
>>> shard files if anything tries to access a file on that node. From my
>>> experience the other day, telling it to heal full again did nothing,
>>> regardless of which node I ran it from.
>>>
>>
> Crawl is started from '/' of the volume. Whenever self-heal detects during
> the crawl that a file or directory is present in some brick(s) and absent
> in others, it creates the file on the bricks where it is absent and marks
> the fact that the file or directory might need data/entry and metadata heal
> too (this also means that an index is created under
> .glusterfs/indices/xattrop of the src bricks). And the data/entry and
> metadata heal are picked up and done in the background with the help of
> these indices.
>
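[A hedged aside on those markers: the per-file pending-heal state is kept in the trusted.afr.* extended attributes on each brick copy and can be inspected directly, e.g.

    getfattr -d -m . -e hex /gluster1/BRICK1/1/.shard/<gfid>.<n>

where the shard path is only a placeholder; non-zero trusted.afr.<volname>-client-* values mean data/entry/metadata heal is still pending against that brick.]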
Looking at my 3rd node as an example, I find nearly the exact same number of
files in the xattrop dir as reported by the heal count at the time I brought
down node2 to try and alleviate the read io errors, which I was guessing
occurred from attempts to use the node with no shards for reads.
Also attached are the glustershd logs from the 3 nodes, along with the test
node I tried yesterday with the same results.
>
>
>>>
>>>> My suspicion is that this is what happened on your setup. Could you
>>>> confirm if that was the case?
>>>>
>>>
>>> The brick was brought online with a force start, then a full heal was
>>> launched. Hours later, after it became evident that it was not adding new
>>> files to heal, I did try restarting the self-heal daemon and relaunching
>>> the full heal again. But this was after the heal had basically already
>>> failed to work as intended.
>>>
>>
>> OK. How did you figure it was not adding any new files? I need to know
>> what places you were monitoring to come to this conclusion.
>>
>> -Krutika
>>
>>
>>>
>>>
>>>> As for those logs, I did manage to do something that caused these
>>>> warning messages you shared earlier to appear in my client and server logs.
>>>> Although these logs are annoying and a bit scary too, they didn't do
>>>> any harm to the data in my volume. Why they appear just after a brick is
>>>> replaced and under no other circumstances is something I'm still
>>>> investigating.
>>>>
>>>> But for the future, it would be good to follow the steps Anuradha gave, as
>>>> that would allow self-heal to at least detect that it has some repairing to
>>>> do whenever it is restarted, whether intentionally or otherwise.
>>>>
>>>
>>> I followed those steps as described on my test box and ended up with the
>>> exact same outcome: shards added at an agonizingly slow pace and no
>>> creation of the .shard directory or heals on the shard directory.
>>> Directories visible from the mount healed quickly. This was with one VM,
>>> so it has only 800 shards as well. After hours at work it had added a
>>> total of 33 shards to be healed. I sent those logs yesterday as well,
>>> though not the glustershd ones.
>>>
>>> Does the replace-brick command copy files in the same manner? For these
>>> purposes I am contemplating just skipping the heal route.
>>>
>>>
>>>> -Krutika
>>>>
>>>> On Tue, Aug 30, 2016 at 2:22 AM, David Gossage <
>>>> dgossage at carouselchecks.com> wrote:
>>>>
>>>>> Attached brick and client logs from the test machine where the same
>>>>> behavior occurred; not sure if anything new is there. It's still on 3.8.2.
>>>>>
>>>>> Number of Bricks: 1 x 3 = 3
>>>>> Transport-type: tcp
>>>>> Bricks:
>>>>> Brick1: 192.168.71.10:/gluster2/brick1/1
>>>>> Brick2: 192.168.71.11:/gluster2/brick2/1
>>>>> Brick3: 192.168.71.12:/gluster2/brick3/1
>>>>> Options Reconfigured:
>>>>> cluster.locking-scheme: granular
>>>>> performance.strict-o-direct: off
>>>>> features.shard-block-size: 64MB
>>>>> features.shard: on
>>>>> server.allow-insecure: on
>>>>> storage.owner-uid: 36
>>>>> storage.owner-gid: 36
>>>>> cluster.server-quorum-type: server
>>>>> cluster.quorum-type: auto
>>>>> network.remote-dio: on
>>>>> cluster.eager-lock: enable
>>>>> performance.stat-prefetch: off
>>>>> performance.io-cache: off
>>>>> performance.quick-read: off
>>>>> cluster.self-heal-window-size: 1024
>>>>> cluster.background-self-heal-count: 16
>>>>> nfs.enable-ino32: off
>>>>> nfs.addr-namelookup: off
>>>>> nfs.disable: on
>>>>> performance.read-ahead: off
>>>>> performance.readdir-ahead: on
>>>>> cluster.granular-entry-heal: on
>>>>>
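[A hedged example of the option change discussed further down in this thread; the volume name is a placeholder:

    gluster volume set <volname> cluster.data-self-heal-algorithm full

'full' makes self-heal copy whole files rather than compute rolling checksums, which tends to be the cheaper choice for large VM images and shards.]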
>>>>>
>>>>>
>>>>> On Mon, Aug 29, 2016 at 2:20 PM, David Gossage <
>>>>> dgossage at carouselchecks.com> wrote:
>>>>>
>>>>>> On Mon, Aug 29, 2016 at 7:01 AM, Anuradha Talur <atalur at redhat.com>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> ----- Original Message -----
>>>>>>> > From: "David Gossage" <dgossage
at carouselchecks.com>
>>>>>>> > To: "Anuradha Talur" <atalur at
redhat.com>
>>>>>>> > Cc: "gluster-users at gluster.org
List" <Gluster-users at gluster.org>,
>>>>>>> "Krutika Dhananjay" <kdhananj at
redhat.com>
>>>>>>> > Sent: Monday, August 29, 2016 5:12:42 PM
>>>>>>> > Subject: Re: [Gluster-users] 3.8.3 Shards
Healing Glacier Slow
>>>>>>> >
>>>>>>> > On Mon, Aug 29, 2016 at 5:39 AM, Anuradha Talur <atalur at redhat.com>
>>>>>>> > wrote:
>>>>>>> >
>>>>>>> > > Response inline.
>>>>>>> > >
>>>>>>> > > ----- Original Message -----
>>>>>>> > > > From: "Krutika Dhananjay"
<kdhananj at redhat.com>
>>>>>>> > > > To: "David Gossage"
<dgossage at carouselchecks.com>
>>>>>>> > > > Cc: "gluster-users at
gluster.org List" <
>>>>>>> Gluster-users at gluster.org>
>>>>>>> > > > Sent: Monday, August 29, 2016
3:55:04 PM
>>>>>>> > > > Subject: Re: [Gluster-users] 3.8.3
Shards Healing Glacier Slow
>>>>>>> > > >
>>>>>>> > > > Could you attach both client and brick logs? Meanwhile I will
>>>>>>> > > > try these steps out on my machines and see if it is easily
>>>>>>> > > > recreatable.
>>>>>>> > > >
>>>>>>> > > > -Krutika
>>>>>>> > > >
>>>>>>> > > > On Mon, Aug 29, 2016 at 2:31 PM, David Gossage <
>>>>>>> > > > dgossage at carouselchecks.com> wrote:
>>>>>>> > > >
>>>>>>> > > >
>>>>>>> > > >
>>>>>>> > > > Centos 7 Gluster 3.8.3
>>>>>>> > > >
>>>>>>> > > > Brick1: ccgl1.gl.local:/gluster1/BRICK1/1
>>>>>>> > > > Brick2: ccgl2.gl.local:/gluster1/BRICK1/1
>>>>>>> > > > Brick3: ccgl4.gl.local:/gluster1/BRICK1/1
>>>>>>> > > > Options Reconfigured:
>>>>>>> > > > cluster.data-self-heal-algorithm: full
>>>>>>> > > > cluster.self-heal-daemon: on
>>>>>>> > > > cluster.locking-scheme: granular
>>>>>>> > > > features.shard-block-size: 64MB
>>>>>>> > > > features.shard: on
>>>>>>> > > > performance.readdir-ahead: on
>>>>>>> > > > storage.owner-uid: 36
>>>>>>> > > > storage.owner-gid: 36
>>>>>>> > > > performance.quick-read: off
>>>>>>> > > > performance.read-ahead: off
>>>>>>> > > > performance.io-cache: off
>>>>>>> > > > performance.stat-prefetch: on
>>>>>>> > > > cluster.eager-lock: enable
>>>>>>> > > > network.remote-dio: enable
>>>>>>> > > > cluster.quorum-type: auto
>>>>>>> > > > cluster.server-quorum-type: server
>>>>>>> > > > server.allow-insecure: on
>>>>>>> > > > cluster.self-heal-window-size: 1024
>>>>>>> > > > cluster.background-self-heal-count: 16
>>>>>>> > > > performance.strict-write-ordering: off
>>>>>>> > > > nfs.disable: on
>>>>>>> > > > nfs.addr-namelookup: off
>>>>>>> > > > nfs.enable-ino32: off
>>>>>>> > > > cluster.granular-entry-heal: on
>>>>>>> > > >
>>>>>>> > > > Friday did a rolling upgrade from 3.8.3->3.8.3, no issues.
>>>>>>> > > > Following steps detailed in previous recommendations, began the
>>>>>>> > > > process of replacing and healing bricks one node at a time.
>>>>>>> > > >
>>>>>>> > > > 1) kill pid of brick
>>>>>>> > > > 2) reconfigure brick from raid6 to raid10
>>>>>>> > > > 3) recreate directory of brick
>>>>>>> > > > 4) gluster volume start <> force
>>>>>>> > > > 5) gluster volume heal <> full
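[A small hedged sketch for step 1 above: the brick PID can be read from the volume status output instead of ps; volume name and PID are placeholders:

    gluster volume status <volname>          # the PID column lists each brick process
    kill <pid-of-the-brick-being-replaced>
]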
>>>>>>> > > Hi,
>>>>>>> > >
>>>>>>> > > I'd suggest that full heal is not used. There are a few bugs in
>>>>>>> > > full heal. Better safe than sorry ;)
>>>>>>> > > Instead I'd suggest the following steps:
>>>>>>> > >
>>>>>>> > Currently I brought the node down by systemctl stop glusterd, as I was
>>>>>>> > getting sporadic io issues and a few VMs paused, so hoping that will
>>>>>>> > help. I may wait to do this till around 4PM when most work is done, in
>>>>>>> > case it shoots the load up.
>>>>>>> >
>>>>>>> >
>>>>>>> > > 1) kill pid of brick
>>>>>>> > > 2) do the reconfiguring of the brick that you need
>>>>>>> > > 3) recreate brick dir
>>>>>>> > > 4) while the brick is still down, from the mount point:
>>>>>>> > >    a) create a dummy non-existent dir under / of mount.
>>>>>>> > >
>>>>>>> >
>>>>>>> > so if node 2 is the down brick, do I pick, for example, node 3 and
>>>>>>> > make a test dir under its brick directory that doesn't exist on 2,
>>>>>>> > or should I be doing this over a gluster mount?
>>>>>>> You should be doing this over gluster mount.
>>>>>>> >
>>>>>>> > >    b) set a non-existent extended attribute on / of mount.
>>>>>>> > >
>>>>>>> >
>>>>>>> > Could you give me an example of an attribute to set? I've read a tad
>>>>>>> > on this, and looked up attributes, but haven't set any yet myself.
>>>>>>> >
>>>>>>> Sure. setfattr -n "user.some-name" -v "some-value" <path-to-mount>
>>>>>>> > > Doing these steps will ensure that heal happens only from the
>>>>>>> > > updated bricks to the down brick.
>>>>>>> > > 5) gluster v start <> force
>>>>>>> > > 6) gluster v heal <>
>>>>>>> > >
>>>>>>> >
>>>>>>> > Will it matter if somewhere in gluster the full heal command was run
>>>>>>> > the other day? Not sure if it eventually stops or times out.
>>>>>>> >
>>>>>>> Full heal will stop once the crawl is done. So if you want to trigger
>>>>>>> heal again, run gluster v heal <>. Actually, even brick up or volume
>>>>>>> start force should trigger the heal.
>>>>>>>
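[Pulling those steps together as one hedged shell sketch; the brick path comes from the test volume above, while the volume name, mount point, and dummy dir/xattr names are placeholders:

    kill <brick-pid>                                  # 1) stop only the brick being replaced
    # 2)-3) rebuild the underlying storage, then recreate the empty brick dir
    mkdir -p /gluster2/brick2/1
    # 4) from a client mount of the volume, while the brick is still down
    mkdir /mnt/<volname>/dummy-heal-dir
    setfattr -n user.dummy-heal-marker -v 1 /mnt/<volname>
    # 5)-6) bring the brick back and let index-based heal repair it
    gluster volume start <volname> force
    gluster volume heal <volname>
]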
>>>>>>
>>>>>> Did this on the test bed today. It's one server with 3 bricks on the
>>>>>> same machine, so take that for what it's worth. Also, it still runs
>>>>>> 3.8.2. Maybe I'll update and re-run the test.
>>>>>>
>>>>>> killed brick
>>>>>> deleted brick dir
>>>>>> recreated brick dir
>>>>>> created fake dir on gluster mount
>>>>>> set suggested fake attribute on it
>>>>>> ran volume start <> force
>>>>>>
>>>>>> looked at the files it said needed healing, and it was just the 8 shards
>>>>>> that were modified during the few minutes I ran through the steps
>>>>>>
>>>>>> gave it a few minutes and it stayed the same
>>>>>> ran gluster volume heal <>
>>>>>>
>>>>>> it healed all the directories and files you can see over the mount,
>>>>>> including the fake dir.
>>>>>>
>>>>>> same issue for shards though. it adds more shards to heal at a glacial
>>>>>> pace. slight jump in speed if I stat every file and dir in the running
>>>>>> VM, but not all shards.
>>>>>>
>>>>>> It started with 8 shards to heal and is now only at 33 out of 800, and
>>>>>> probably won't finish adding for a few days at the rate it goes.
>>>>>>
>>>>>>
>>>>>>
>>>>>>> > >
>>>>>>> > > > 1st node worked as expected, took 12 hours to heal 1TB of data.
>>>>>>> > > > Load was a little heavy but nothing shocking.
>>>>>>> > > >
>>>>>>> > > > About an hour after node 1 finished, I began the same process on
>>>>>>> > > > node2. The heal process kicked in as before, and the files in
>>>>>>> > > > directories visible from the mount and .glusterfs healed in a
>>>>>>> > > > short time. Then it began the crawl of .shard, adding those files
>>>>>>> > > > to the heal count, at which point the entire process basically
>>>>>>> > > > ground to a halt. After 48 hours, out of 19k shards it has added
>>>>>>> > > > 5900 to the heal list. Load on all 3 machines is negligible.
>>>>>>> > > > It was suggested to change cluster.data-self-heal-algorithm to
>>>>>>> > > > full and restart the volume, which I did. No effect. Tried
>>>>>>> > > > relaunching the heal, no effect, regardless of which node was
>>>>>>> > > > picked. I started each VM and performed a stat of all files from
>>>>>>> > > > within it, or a full virus scan, and that seemed to cause short
>>>>>>> > > > small spikes in shards added, but not by much. Logs are showing no
>>>>>>> > > > real messages indicating anything is going on. I get hits to the
>>>>>>> > > > brick log on occasion of null lookups, making me think it's not
>>>>>>> > > > really crawling the shards directory but waiting for a shard
>>>>>>> > > > lookup to add it. I'll get the following in the brick log, but not
>>>>>>> > > > constant, and sometimes multiple for the same shard.
>>>>>>> > > >
>>>>>>> > > > [2016-08-29 08:31:57.478125] W [MSGID: 115009]
>>>>>>> > > > [server-resolve.c:569:server_resolve] 0-GLUSTER1-server: no
>>>>>>> > > > resolution type for (null) (LOOKUP)
>>>>>>> > > > [2016-08-29 08:31:57.478170] E [MSGID: 115050]
>>>>>>> > > > [server-rpc-fops.c:156:server_lookup_cbk] 0-GLUSTER1-server:
>>>>>>> > > > 12591783: LOOKUP (null)
>>>>>>> > > > (00000000-0000-0000-0000-000000000000/241a55ed-f0d5-4dbc-a6ce-ab784a0ba6ff.221)
>>>>>>> > > > ==> (Invalid argument) [Invalid argument]
>>>>>>> > > > This one repeated about 30 times in a row, then nothing for 10
>>>>>>> > > > minutes, then one hit for a different shard by itself.
>>>>>>> > > >
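[A hedged way to see how often those two messages are actually hitting the brick log; the path below follows the usual /var/log/glusterfs/bricks/<brick-path-with-dashes>.log naming, so adjust it to your brick:

    grep -c 'server_resolve\|server_lookup_cbk' /var/log/glusterfs/bricks/gluster1-BRICK1-1.log
]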
>>>>>>> > > > How can I determine if heal is actually running? How can I kill
>>>>>>> > > > it or force a restart? Does the node I start it from determine
>>>>>>> > > > which directory gets crawled to determine heals?
>>>>>>> > > >
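[Hedged pointers for those questions, limited to commands known to exist in 3.8; the volume name is a placeholder:

    gluster volume status <volname>         # shows whether the Self-heal Daemon is online on each node, with its PID
    gluster volume heal <volname> info      # lists the entries currently queued for heal
    gluster volume start <volname> force    # respawns any brick or self-heal daemon process that is down

For 'heal full', the crawl is reportedly performed by only one of the self-heal daemons, not necessarily on the node the command was issued from.]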
>>>>>>> > > > David Gossage
>>>>>>> > > > Carousel Checks Inc. | System Administrator
>>>>>>> > > > Office 708.613.2284
>>>>>>> > > >
>>>>>>> > >
>>>>>>> > > --
>>>>>>> > > Thanks,
>>>>>>> > > Anuradha.
>>>>>>> > >
>>>>>>> >
>>>>>>>
>>>>>>> --
>>>>>>> Thanks,
>>>>>>> Anuradha.
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
[Attachments: glustershd-node1, glustershd-node2.gz, glustershd-node3.gz, glustershd-testnode]