----- Original Message -----
> From: "David Gossage" <dgossage at carouselchecks.com>
> To: "Anuradha Talur" <atalur at redhat.com>
> Cc: "gluster-users at gluster.org List" <Gluster-users at gluster.org>, "Krutika Dhananjay" <kdhananj at redhat.com>
> Sent: Monday, August 29, 2016 5:12:42 PM
> Subject: Re: [Gluster-users] 3.8.3 Shards Healing Glacier Slow
>
> On Mon, Aug 29, 2016 at 5:39 AM, Anuradha Talur <atalur at redhat.com> wrote:
>
> > Response inline.
> >
> > ----- Original Message -----
> > > From: "Krutika Dhananjay" <kdhananj at redhat.com>
> > > To: "David Gossage" <dgossage at carouselchecks.com>
> > > Cc: "gluster-users at gluster.org List" <Gluster-users at gluster.org>
> > > Sent: Monday, August 29, 2016 3:55:04 PM
> > > Subject: Re: [Gluster-users] 3.8.3 Shards Healing Glacier Slow
> > >
> > > Could you attach both client and brick logs? Meanwhile I will try these steps out on my machines and see if it is easily recreatable.
> > >
> > > -Krutika
> > >
> > > On Mon, Aug 29, 2016 at 2:31 PM, David Gossage <dgossage at carouselchecks.com> wrote:
> > >
> > > CentOS 7, Gluster 3.8.3
> > >
> > > Brick1: ccgl1.gl.local:/gluster1/BRICK1/1
> > > Brick2: ccgl2.gl.local:/gluster1/BRICK1/1
> > > Brick3: ccgl4.gl.local:/gluster1/BRICK1/1
> > > Options Reconfigured:
> > > cluster.data-self-heal-algorithm: full
> > > cluster.self-heal-daemon: on
> > > cluster.locking-scheme: granular
> > > features.shard-block-size: 64MB
> > > features.shard: on
> > > performance.readdir-ahead: on
> > > storage.owner-uid: 36
> > > storage.owner-gid: 36
> > > performance.quick-read: off
> > > performance.read-ahead: off
> > > performance.io-cache: off
> > > performance.stat-prefetch: on
> > > cluster.eager-lock: enable
> > > network.remote-dio: enable
> > > cluster.quorum-type: auto
> > > cluster.server-quorum-type: server
> > > server.allow-insecure: on
> > > cluster.self-heal-window-size: 1024
> > > cluster.background-self-heal-count: 16
> > > performance.strict-write-ordering: off
> > > nfs.disable: on
> > > nfs.addr-namelookup: off
> > > nfs.enable-ino32: off
> > > cluster.granular-entry-heal: on
> > >
> > > Friday I did a rolling upgrade from 3.8.3->3.8.3 with no issues.
> > > Following the steps detailed in previous recommendations, I began the process of replacing and healing bricks one node at a time.
> > >
> > > 1) kill pid of brick
> > > 2) reconfigure brick from raid6 to raid10
> > > 3) recreate directory of brick
> > > 4) gluster volume start <> force
> > > 5) gluster volume heal <> full
> >
> > Hi,
> >
> > I'd suggest that full heal not be used; there are a few bugs in full heal. Better safe than sorry ;)
> > Instead I'd suggest the following steps:
>
> Currently I brought the node down by systemctl stop glusterd, as I was getting sporadic IO issues and a few VMs paused, so hoping that will help. I may wait to do this till around 4 PM when most work is done, in case it shoots the load up.
>
> > 1) kill pid of brick
> > 2) do the reconfiguring of the brick that you need
> > 3) recreate brick dir
> > 4) while the brick is still down, from the mount point:
> >    a) create a dummy non-existent dir under / of mount.
>
> So if node 2 is the down brick, do I pick a node, for example 3, and make a test dir under its brick directory that doesn't exist on 2, or should I be doing this over a gluster mount?

You should be doing this over the gluster mount.

> > b) set a non-existent extended attribute on / of mount.
>
> Could you give me an example of an attribute to set? I've read a tad on this, and looked up attributes, but haven't set any yet myself.

Sure. setfattr -n "user.some-name" -v "some-value" <path-to-mount>

> > Doing these steps will ensure that heal happens only from the updated brick to the down brick.
> > 5) gluster v start <> force
> > 6) gluster v heal <>
>
> Will it matter if somewhere in gluster the full heal command was run the other day? Not sure if it eventually stops or times out.

Full heal will stop once the crawl is done. So if you want to trigger heal again, run gluster v heal <>. Actually, even the brick coming up or volume start force should trigger the heal.

> > > 1st node worked as expected, took 12 hours to heal 1TB of data. Load was a little heavy but nothing shocking.
> > >
> > > About an hour after node 1 finished I began the same process on node 2. The heal process kicked in as before, and the files in the directories visible from the mount and .glusterfs healed in a short time. Then it began the crawl of .shard, adding those files to the heal count, at which point the entire process basically ground to a halt. After 48 hours, out of 19k shards it has added 5900 to the heal list. Load on all 3 machines is negligible. It was suggested to change cluster.data-self-heal-algorithm to full and restart the volume, which I did. No effect. Tried relaunching heal, no effect, regardless of which node was picked. I started each VM and performed a stat of all files from within it, or a full virus scan, and that seemed to cause short small spikes in shards added, but not by much. Logs are showing no real messages indicating anything is going on. I get occasional hits in the brick log for null lookups, making me think it's not really crawling the shards directory but waiting for a shard lookup to add it. I'll get the following in the brick log, but not constantly, and sometimes multiple times for the same shard.
> > >
> > > [2016-08-29 08:31:57.478125] W [MSGID: 115009] [server-resolve.c:569:server_resolve] 0-GLUSTER1-server: no resolution type for (null) (LOOKUP)
> > > [2016-08-29 08:31:57.478170] E [MSGID: 115050] [server-rpc-fops.c:156:server_lookup_cbk] 0-GLUSTER1-server: 12591783: LOOKUP (null) (00000000-0000-0000-0000-000000000000/241a55ed-f0d5-4dbc-a6ce-ab784a0ba6ff.221) ==> (Invalid argument) [Invalid argument]
> > >
> > > This one repeated about 30 times in a row, then nothing for 10 minutes, then one hit for a different shard by itself.
> > >
> > > How can I determine if heal is actually running? How can I kill it or force a restart? Does the node I start it from determine which directory gets crawled to determine heals?
> > >
> > > David Gossage
> > > Carousel Checks Inc. | System Administrator
> > > Office 708.613.2284
> > >
> > > _______________________________________________
> > > Gluster-users mailing list
> > > Gluster-users at gluster.org
> > > http://www.gluster.org/mailman/listinfo/gluster-users
> >
> > --
> > Thanks,
> > Anuradha.
>

--
Thanks,
Anuradha.
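To make the suggestion above easier to follow, here is the replace-a-brick sequence pulled together as shell commands. Only the volume name GLUSTER1 (seen in the brick logs above) and the brick path are taken from this thread; the mount point /mnt/glusterfs and the dummy names are illustrative assumptions, so substitute your own values. Treat it as a sketch of the steps being discussed, not an official procedure.

    # 1) Find and kill the brick process on the node whose brick is being rebuilt.
    gluster volume status GLUSTER1      # note the PID of the brick to replace
    kill <brick-pid>

    # 2) Reconfigure the underlying storage as needed (e.g. raid6 -> raid10),
    #    then recreate the now-empty brick directory.
    mkdir -p /gluster1/BRICK1/1

    # 3) While the brick is still down, from a FUSE mount of the volume,
    #    create a dummy directory and set a dummy extended attribute on the
    #    root of the mount, so that the surviving bricks are marked as the
    #    source for the heal.
    mkdir /mnt/glusterfs/dummy-heal-marker
    setfattr -n "user.dummy-heal-marker" -v "1" /mnt/glusterfs

    # 4) Bring the brick back and trigger an index heal (note: no "full").
    gluster volume start GLUSTER1 force
    gluster volume heal GLUSTER1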
On Mon, Aug 29, 2016 at 7:01 AM, Anuradha Talur <atalur at redhat.com> wrote:

> ----- Original Message -----
> > From: "David Gossage" <dgossage at carouselchecks.com>
> > To: "Anuradha Talur" <atalur at redhat.com>
> > Cc: "gluster-users at gluster.org List" <Gluster-users at gluster.org>, "Krutika Dhananjay" <kdhananj at redhat.com>
> > Sent: Monday, August 29, 2016 5:12:42 PM
> > Subject: Re: [Gluster-users] 3.8.3 Shards Healing Glacier Slow
> >
> > On Mon, Aug 29, 2016 at 5:39 AM, Anuradha Talur <atalur at redhat.com> wrote:
> >
> > > Response inline.
> > >
> > > ----- Original Message -----
> > > > From: "Krutika Dhananjay" <kdhananj at redhat.com>
> > > > To: "David Gossage" <dgossage at carouselchecks.com>
> > > > Cc: "gluster-users at gluster.org List" <Gluster-users at gluster.org>
> > > > Sent: Monday, August 29, 2016 3:55:04 PM
> > > > Subject: Re: [Gluster-users] 3.8.3 Shards Healing Glacier Slow
> > > >
> > > > Could you attach both client and brick logs? Meanwhile I will try these steps out on my machines and see if it is easily recreatable.
> > > >
> > > > -Krutika
> > > >
> > > > On Mon, Aug 29, 2016 at 2:31 PM, David Gossage <dgossage at carouselchecks.com> wrote:
> > > >
> > > > CentOS 7, Gluster 3.8.3
> > > >
> > > > Brick1: ccgl1.gl.local:/gluster1/BRICK1/1
> > > > Brick2: ccgl2.gl.local:/gluster1/BRICK1/1
> > > > Brick3: ccgl4.gl.local:/gluster1/BRICK1/1
> > > > Options Reconfigured:
> > > > cluster.data-self-heal-algorithm: full
> > > > cluster.self-heal-daemon: on
> > > > cluster.locking-scheme: granular
> > > > features.shard-block-size: 64MB
> > > > features.shard: on
> > > > performance.readdir-ahead: on
> > > > storage.owner-uid: 36
> > > > storage.owner-gid: 36
> > > > performance.quick-read: off
> > > > performance.read-ahead: off
> > > > performance.io-cache: off
> > > > performance.stat-prefetch: on
> > > > cluster.eager-lock: enable
> > > > network.remote-dio: enable
> > > > cluster.quorum-type: auto
> > > > cluster.server-quorum-type: server
> > > > server.allow-insecure: on
> > > > cluster.self-heal-window-size: 1024
> > > > cluster.background-self-heal-count: 16
> > > > performance.strict-write-ordering: off
> > > > nfs.disable: on
> > > > nfs.addr-namelookup: off
> > > > nfs.enable-ino32: off
> > > > cluster.granular-entry-heal: on
> > > >
> > > > Friday I did a rolling upgrade from 3.8.3->3.8.3 with no issues.
> > > > Following the steps detailed in previous recommendations, I began the process of replacing and healing bricks one node at a time.
> > > >
> > > > 1) kill pid of brick
> > > > 2) reconfigure brick from raid6 to raid10
> > > > 3) recreate directory of brick
> > > > 4) gluster volume start <> force
> > > > 5) gluster volume heal <> full
> > >
> > > Hi,
> > >
> > > I'd suggest that full heal not be used; there are a few bugs in full heal. Better safe than sorry ;)
> > > Instead I'd suggest the following steps:
> >
> > Currently I brought the node down by systemctl stop glusterd, as I was getting sporadic IO issues and a few VMs paused, so hoping that will help. I may wait to do this till around 4 PM when most work is done, in case it shoots the load up.
> >
> > > 1) kill pid of brick
> > > 2) do the reconfiguring of the brick that you need
> > > 3) recreate brick dir
> > > 4) while the brick is still down, from the mount point:
> > >    a) create a dummy non-existent dir under / of mount.
> >
> > So if node 2 is the down brick, do I pick a node, for example 3, and make a test dir under its brick directory that doesn't exist on 2, or should I be doing this over a gluster mount?
>
> You should be doing this over the gluster mount.
>
> > > b) set a non-existent extended attribute on / of mount.
> >
> > Could you give me an example of an attribute to set? I've read a tad on this, and looked up attributes, but haven't set any yet myself.
>
> Sure. setfattr -n "user.some-name" -v "some-value" <path-to-mount>

And that can be done over the gluster mount as well? And if not, and it is done on a brick, would it need to be done from both nodes that are up?

> > > Doing these steps will ensure that heal happens only from the updated brick to the down brick.
> > > 5) gluster v start <> force
> > > 6) gluster v heal <>
> >
> > Will it matter if somewhere in gluster the full heal command was run the other day? Not sure if it eventually stops or times out.
>
> Full heal will stop once the crawl is done. So if you want to trigger heal again, run gluster v heal <>. Actually, even the brick coming up or volume start force should trigger the heal.

So until it stops the initial crawl that was started with heal full before, which was almost not moving, am I stuck? Does a volume restart or killing a certain process release it? Turning off the self-heal daemon or something?

> > > > 1st node worked as expected, took 12 hours to heal 1TB of data. Load was a little heavy but nothing shocking.
> > > >
> > > > About an hour after node 1 finished I began the same process on node 2. The heal process kicked in as before, and the files in the directories visible from the mount and .glusterfs healed in a short time. Then it began the crawl of .shard, adding those files to the heal count, at which point the entire process basically ground to a halt. After 48 hours, out of 19k shards it has added 5900 to the heal list. Load on all 3 machines is negligible. It was suggested to change cluster.data-self-heal-algorithm to full and restart the volume, which I did. No effect. Tried relaunching heal, no effect, regardless of which node was picked. I started each VM and performed a stat of all files from within it, or a full virus scan, and that seemed to cause short small spikes in shards added, but not by much. Logs are showing no real messages indicating anything is going on. I get occasional hits in the brick log for null lookups, making me think it's not really crawling the shards directory but waiting for a shard lookup to add it. I'll get the following in the brick log, but not constantly, and sometimes multiple times for the same shard.
> > > >
> > > > [2016-08-29 08:31:57.478125] W [MSGID: 115009] [server-resolve.c:569:server_resolve] 0-GLUSTER1-server: no resolution type for (null) (LOOKUP)
> > > > [2016-08-29 08:31:57.478170] E [MSGID: 115050] [server-rpc-fops.c:156:server_lookup_cbk] 0-GLUSTER1-server: 12591783: LOOKUP (null) (00000000-0000-0000-0000-000000000000/241a55ed-f0d5-4dbc-a6ce-ab784a0ba6ff.221) ==> (Invalid argument) [Invalid argument]
> > > >
> > > > This one repeated about 30 times in a row, then nothing for 10 minutes, then one hit for a different shard by itself.
> > > >
> > > > How can I determine if heal is actually running? How can I kill it or force a restart? Does the node I start it from determine which directory gets crawled to determine heals?
> > > >
> > > > David Gossage
> > > > Carousel Checks Inc. | System Administrator
> > > > Office 708.613.2284
> > > >
> > > > _______________________________________________
> > > > Gluster-users mailing list
> > > > Gluster-users at gluster.org
> > > > http://www.gluster.org/mailman/listinfo/gluster-users
> > >
> > > --
> > > Thanks,
> > > Anuradha.
> >
>
> --
> Thanks,
> Anuradha.
>
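The questions above (did the xattr land, and how do you stop or restart healing) can be checked with standard tools; the commands below are a sketch, not advice from the thread. The mount point is assumed, user.some-name is the example attribute from the reply above, and toggling cluster.self-heal-daemon is just one way to bounce the self-heal daemon, with the caveat that healing for the volume pauses while it is off.

    # Confirm the dummy xattr set over the mount is really there:
    getfattr -n user.some-name /mnt/glusterfs

    # On a surviving brick, the pending-heal markers kept for the downed
    # brick appear as trusted.afr.* xattrs on the brick root:
    getfattr -d -m . -e hex /gluster1/BRICK1/1

    # Bounce the self-heal daemon for the volume (healing stops while off,
    # and a fresh crawl can start once it is re-enabled):
    gluster volume set GLUSTER1 cluster.self-heal-daemon off
    gluster volume set GLUSTER1 cluster.self-heal-daemon on

    # Then re-trigger an index heal:
    gluster volume heal GLUSTER1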
On Mon, Aug 29, 2016 at 7:01 AM, Anuradha Talur <atalur at redhat.com> wrote:

> ----- Original Message -----
> > From: "David Gossage" <dgossage at carouselchecks.com>
> > To: "Anuradha Talur" <atalur at redhat.com>
> > Cc: "gluster-users at gluster.org List" <Gluster-users at gluster.org>, "Krutika Dhananjay" <kdhananj at redhat.com>
> > Sent: Monday, August 29, 2016 5:12:42 PM
> > Subject: Re: [Gluster-users] 3.8.3 Shards Healing Glacier Slow
> >
> > On Mon, Aug 29, 2016 at 5:39 AM, Anuradha Talur <atalur at redhat.com> wrote:
> >
> > > Response inline.
> > >
> > > ----- Original Message -----
> > > > From: "Krutika Dhananjay" <kdhananj at redhat.com>
> > > > To: "David Gossage" <dgossage at carouselchecks.com>
> > > > Cc: "gluster-users at gluster.org List" <Gluster-users at gluster.org>
> > > > Sent: Monday, August 29, 2016 3:55:04 PM
> > > > Subject: Re: [Gluster-users] 3.8.3 Shards Healing Glacier Slow
> > > >
> > > > Could you attach both client and brick logs? Meanwhile I will try these steps out on my machines and see if it is easily recreatable.
> > > >
> > > > -Krutika
> > > >
> > > > On Mon, Aug 29, 2016 at 2:31 PM, David Gossage <dgossage at carouselchecks.com> wrote:
> > > >
> > > > CentOS 7, Gluster 3.8.3
> > > >
> > > > Brick1: ccgl1.gl.local:/gluster1/BRICK1/1
> > > > Brick2: ccgl2.gl.local:/gluster1/BRICK1/1
> > > > Brick3: ccgl4.gl.local:/gluster1/BRICK1/1
> > > > Options Reconfigured:
> > > > cluster.data-self-heal-algorithm: full
> > > > cluster.self-heal-daemon: on
> > > > cluster.locking-scheme: granular
> > > > features.shard-block-size: 64MB
> > > > features.shard: on
> > > > performance.readdir-ahead: on
> > > > storage.owner-uid: 36
> > > > storage.owner-gid: 36
> > > > performance.quick-read: off
> > > > performance.read-ahead: off
> > > > performance.io-cache: off
> > > > performance.stat-prefetch: on
> > > > cluster.eager-lock: enable
> > > > network.remote-dio: enable
> > > > cluster.quorum-type: auto
> > > > cluster.server-quorum-type: server
> > > > server.allow-insecure: on
> > > > cluster.self-heal-window-size: 1024
> > > > cluster.background-self-heal-count: 16
> > > > performance.strict-write-ordering: off
> > > > nfs.disable: on
> > > > nfs.addr-namelookup: off
> > > > nfs.enable-ino32: off
> > > > cluster.granular-entry-heal: on
> > > >
> > > > Friday I did a rolling upgrade from 3.8.3->3.8.3 with no issues.
> > > > Following the steps detailed in previous recommendations, I began the process of replacing and healing bricks one node at a time.
> > > >
> > > > 1) kill pid of brick
> > > > 2) reconfigure brick from raid6 to raid10
> > > > 3) recreate directory of brick
> > > > 4) gluster volume start <> force
> > > > 5) gluster volume heal <> full
> > >
> > > Hi,
> > >
> > > I'd suggest that full heal not be used; there are a few bugs in full heal. Better safe than sorry ;)
> > > Instead I'd suggest the following steps:
> >
> > Currently I brought the node down by systemctl stop glusterd, as I was getting sporadic IO issues and a few VMs paused, so hoping that will help. I may wait to do this till around 4 PM when most work is done, in case it shoots the load up.
> >
> > > 1) kill pid of brick
> > > 2) do the reconfiguring of the brick that you need
> > > 3) recreate brick dir
> > > 4) while the brick is still down, from the mount point:
> > >    a) create a dummy non-existent dir under / of mount.
> >
> > So if node 2 is the down brick, do I pick a node, for example 3, and make a test dir under its brick directory that doesn't exist on 2, or should I be doing this over a gluster mount?
>
> You should be doing this over the gluster mount.
>
> > > b) set a non-existent extended attribute on / of mount.
> >
> > Could you give me an example of an attribute to set? I've read a tad on this, and looked up attributes, but haven't set any yet myself.
>
> Sure. setfattr -n "user.some-name" -v "some-value" <path-to-mount>
>
> > > Doing these steps will ensure that heal happens only from the updated brick to the down brick.
> > > 5) gluster v start <> force
> > > 6) gluster v heal <>
> >
> > Will it matter if somewhere in gluster the full heal command was run the other day? Not sure if it eventually stops or times out.
>
> Full heal will stop once the crawl is done. So if you want to trigger heal again, run gluster v heal <>. Actually, even the brick coming up or volume start force should trigger the heal.

Did this on the test bed today. It's one server with 3 bricks on the same machine, so take that for what it's worth. Also, it still runs 3.8.2; maybe I'll update and re-run the test.

killed brick
deleted brick dir
recreated brick dir
created fake dir on gluster mount
set suggested fake attribute on it
ran volume start <> force

Looking at the files it said needed healing, it was just 8 shards that were modified in the few minutes I ran through the steps. Gave it a few minutes and it stayed the same, then ran gluster volume <> heal.

It healed all the directories and files you can see over the mount, including fakedir. Same issue for the shards though: it adds more shards to heal at a glacial pace. There is a slight jump in speed if I stat every file and dir in the running VM, but not all shards get added. It started with 8 shards to heal and is now only at 33 out of 800, and probably won't finish adding for a few days at the rate it goes.

> > > > 1st node worked as expected, took 12 hours to heal 1TB of data. Load was a little heavy but nothing shocking.
> > > >
> > > > About an hour after node 1 finished I began the same process on node 2. The heal process kicked in as before, and the files in the directories visible from the mount and .glusterfs healed in a short time. Then it began the crawl of .shard, adding those files to the heal count, at which point the entire process basically ground to a halt. After 48 hours, out of 19k shards it has added 5900 to the heal list. Load on all 3 machines is negligible. It was suggested to change cluster.data-self-heal-algorithm to full and restart the volume, which I did. No effect. Tried relaunching heal, no effect, regardless of which node was picked. I started each VM and performed a stat of all files from within it, or a full virus scan, and that seemed to cause short small spikes in shards added, but not by much. Logs are showing no real messages indicating anything is going on. I get occasional hits in the brick log for null lookups, making me think it's not really crawling the shards directory but waiting for a shard lookup to add it. I'll get the following in the brick log, but not constantly, and sometimes multiple times for the same shard.
> > > >
> > > > [2016-08-29 08:31:57.478125] W [MSGID: 115009] [server-resolve.c:569:server_resolve] 0-GLUSTER1-server: no resolution type for (null) (LOOKUP)
> > > > [2016-08-29 08:31:57.478170] E [MSGID: 115050] [server-rpc-fops.c:156:server_lookup_cbk] 0-GLUSTER1-server: 12591783: LOOKUP (null) (00000000-0000-0000-0000-000000000000/241a55ed-f0d5-4dbc-a6ce-ab784a0ba6ff.221) ==> (Invalid argument) [Invalid argument]
> > > >
> > > > This one repeated about 30 times in a row, then nothing for 10 minutes, then one hit for a different shard by itself.
> > > >
> > > > How can I determine if heal is actually running? How can I kill it or force a restart? Does the node I start it from determine which directory gets crawled to determine heals?
> > > >
> > > > David Gossage
> > > > Carousel Checks Inc. | System Administrator
> > > > Office 708.613.2284
> > > >
> > > > _______________________________________________
> > > > Gluster-users mailing list
> > > > Gluster-users at gluster.org
> > > > http://www.gluster.org/mailman/listinfo/gluster-users
> > >
> > > --
> > > Thanks,
> > > Anuradha.
> >
>
> --
> Thanks,
> Anuradha.
>
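For watching whether the shard heal queue is actually moving, the heal subcommands of the gluster CLI give per-brick counts and crawl history. A small sketch follows; the volume name GLUSTER1 is again taken from the brick logs in this thread and the 60-second interval is arbitrary.

    # How many entries each brick still has queued for heal:
    gluster volume heal GLUSTER1 statistics heal-count

    # Which entries (base files and shard GFIDs) are pending, per brick:
    gluster volume heal GLUSTER1 info

    # Crawl history: which crawl ran last, when it started and ended,
    # and how many entries it healed:
    gluster volume heal GLUSTER1 statistics

    # Re-check the count every minute to see if the queue is shrinking,
    # or still growing as shards are discovered:
    watch -n 60 'gluster volume heal GLUSTER1 statistics heal-count'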