gencer at gencgiyen.com
2017-Jun-30 12:03 UTC
[Gluster-users] Very slow performance on Sharded GlusterFS
Hi Krutika,

Sure, here is the volume info:

root at sr-09-loc-50-14-18:/# gluster volume info testvol

Volume Name: testvol
Type: Distributed-Replicate
Volume ID: 30426017-59d5-4091-b6bc-279a905b704a
Status: Started
Snapshot Count: 0
Number of Bricks: 10 x 2 = 20
Transport-type: tcp
Bricks:
Brick1: sr-09-loc-50-14-18:/bricks/brick1
Brick2: sr-09-loc-50-14-18:/bricks/brick2
Brick3: sr-09-loc-50-14-18:/bricks/brick3
Brick4: sr-09-loc-50-14-18:/bricks/brick4
Brick5: sr-09-loc-50-14-18:/bricks/brick5
Brick6: sr-09-loc-50-14-18:/bricks/brick6
Brick7: sr-09-loc-50-14-18:/bricks/brick7
Brick8: sr-09-loc-50-14-18:/bricks/brick8
Brick9: sr-09-loc-50-14-18:/bricks/brick9
Brick10: sr-09-loc-50-14-18:/bricks/brick10
Brick11: sr-10-loc-50-14-18:/bricks/brick1
Brick12: sr-10-loc-50-14-18:/bricks/brick2
Brick13: sr-10-loc-50-14-18:/bricks/brick3
Brick14: sr-10-loc-50-14-18:/bricks/brick4
Brick15: sr-10-loc-50-14-18:/bricks/brick5
Brick16: sr-10-loc-50-14-18:/bricks/brick6
Brick17: sr-10-loc-50-14-18:/bricks/brick7
Brick18: sr-10-loc-50-14-18:/bricks/brick8
Brick19: sr-10-loc-50-14-18:/bricks/brick9
Brick20: sr-10-loc-50-14-18:/bricks/brick10
Options Reconfigured:
features.shard-block-size: 32MB
features.shard: on
transport.address-family: inet
nfs.disable: on

-Gencer.

From: Krutika Dhananjay [mailto:kdhananj at redhat.com]
Sent: Friday, June 30, 2017 2:50 PM
To: gencer at gencgiyen.com
Cc: gluster-user <gluster-users at gluster.org>
Subject: Re: [Gluster-users] Very slow performance on Sharded GlusterFS

Could you please provide the volume-info output?

-Krutika

On Fri, Jun 30, 2017 at 4:23 PM, <gencer at gencgiyen.com> wrote:

Hi,

I have 2 nodes with 20 bricks in total (10+10).

First test:

2 nodes with Distributed + Striped + Replicated (2 x 2)
10GbE link between the nodes

"dd" performance: 400 MB/s and higher
Downloading a large file from the internet directly onto the Gluster volume: 250-300 MB/s

Now the same test without Stripe but with sharding. The results are the same whether I set the shard size to 4MB or 32MB. (Again 2x replica here.)

dd performance: 70 MB/s
Download directly onto the Gluster volume: 60 MB/s

If we run this test twice at the same time (two dd runs or two downloads in parallel), it drops below 25 MB/s each, or slower.

I expected sharding to be at least equal, or maybe a little slower, but these results are terribly slow.

I tried tuning (cache, window-size, etc.). Nothing helps.

GlusterFS 3.11 on Debian 9. The kernel is also tuned. Disks are "xfs" and 4TB each.

Is there any tweak/tuning out there to make it fast?

Or is this expected behavior? If it is, it is unacceptable. It is so slow that I cannot use it in production.

The reason I use shard instead of stripe is that I would like to eliminate files that are bigger than the brick size.

Thanks,
Gencer.
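The dd figures quoted in this thread are presumably from a large sequential write into the FUSE mount. A minimal sketch of such a test is below; the mount point, block size, count and file name are assumptions, since the exact command is not given in the thread.

# Sequential write test against the mounted volume. conv=fdatasync flushes
# the data before dd exits, so the reported rate is not just page-cache speed.
dd if=/dev/zero of=/mnt/testvol/ddtest.bin bs=1M count=4096 conv=fdatasync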
Krutika Dhananjay
2017-Jun-30 12:49 UTC
[Gluster-users] Very slow performance on Sharded GlusterFS
Just noticed that the way you configured your brick order during volume-create makes both replicas of every set reside on the same machine.

That apart, do you see any difference if you change shard-block-size to 512MB? Could you try that?

If it doesn't help, could you share the volume-profile output for both tests (separately)? Here's what you do:

1. Start profiling before starting your test - it could be dd or it could be a file download.
# gluster volume profile <VOL> start
2. Run your test - again, either dd or the file download.
3. Once the test has completed, run `gluster volume profile <VOL> info` and redirect its output to a tmp file.
4. Stop profiling.
# gluster volume profile <VOL> stop

Then attach the volume-profile output file that you saved at a temporary location in step 3.

-Krutika

On Fri, Jun 30, 2017 at 5:33 PM, <gencer at gencgiyen.com> wrote:
> [...]
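Run back to back, Krutika's four steps amount to something like the following sketch. The volume name testvol comes from the thread; the output path under /tmp is arbitrary.

# 1. Start collecting per-brick FOP statistics for the volume
gluster volume profile testvol start

# 2. Run the workload being measured here (the dd write or the file download)

# 3. Dump the statistics gathered so far into a file to attach to the reply
gluster volume profile testvol info > /tmp/testvol-profile.txt

# 4. Stop profiling
gluster volume profile testvol stop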
gencer at gencgiyen.com
2017-Jun-30 13:50 UTC
[Gluster-users] Very slow performance on Sharded GlusterFS
I already tried 512MB, but I retried it again now and the results are the same.

Both without tuning:

Stripe 2 replica 2: dd performs ~250 MB/s, but shard gives 77 MB/s.

I attached the two logs (shard and stripe).

Note: I also noticed that you said "order". Do you mean that when we create the volume we have to specify the bricks in a particular order? I thought Gluster handles that (and does the math) itself.

Gencer

From: Krutika Dhananjay [mailto:kdhananj at redhat.com]
Sent: Friday, June 30, 2017 3:50 PM
To: gencer at gencgiyen.com
Cc: gluster-user <gluster-users at gluster.org>
Subject: Re: [Gluster-users] Very slow performance on Sharded GlusterFS

[...]

Attachments:
shard.log (28988 bytes): <http://lists.gluster.org/pipermail/gluster-users/attachments/20170630/fac7c7d9/attachment.obj>
stripe.log (35756 bytes): <http://lists.gluster.org/pipermail/gluster-users/attachments/20170630/fac7c7d9/attachment-0001.obj>
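To illustrate the brick ordering Krutika is referring to: in the volume info above, all ten sr-09 bricks are listed before the sr-10 bricks, and with "replica 2" consecutive bricks in the argument list form a replica pair, so both copies of each pair land on the same server. A hedged sketch of a create command that alternates the two servers follows, along with the 512MB shard-block-size change; the create command is illustrative only (only the brick paths and option names come from the thread), and re-creating a volume is of course disruptive.

# Alternate the servers so each replica pair has one brick on sr-09 and one on sr-10.
gluster volume create testvol replica 2 \
  sr-09-loc-50-14-18:/bricks/brick1  sr-10-loc-50-14-18:/bricks/brick1 \
  sr-09-loc-50-14-18:/bricks/brick2  sr-10-loc-50-14-18:/bricks/brick2 \
  sr-09-loc-50-14-18:/bricks/brick3  sr-10-loc-50-14-18:/bricks/brick3 \
  sr-09-loc-50-14-18:/bricks/brick4  sr-10-loc-50-14-18:/bricks/brick4 \
  sr-09-loc-50-14-18:/bricks/brick5  sr-10-loc-50-14-18:/bricks/brick5 \
  sr-09-loc-50-14-18:/bricks/brick6  sr-10-loc-50-14-18:/bricks/brick6 \
  sr-09-loc-50-14-18:/bricks/brick7  sr-10-loc-50-14-18:/bricks/brick7 \
  sr-09-loc-50-14-18:/bricks/brick8  sr-10-loc-50-14-18:/bricks/brick8 \
  sr-09-loc-50-14-18:/bricks/brick9  sr-10-loc-50-14-18:/bricks/brick9 \
  sr-09-loc-50-14-18:/bricks/brick10 sr-10-loc-50-14-18:/bricks/brick10

# Krutika's other suggestion: bump the shard block size for the next test run.
# The new size only applies to files created after the change.
gluster volume set testvol features.shard on
gluster volume set testvol features.shard-block-size 512MB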
gencer at gencgiyen.com
2017-Jul-03 15:12 UTC
[Gluster-users] Very slow performance on Sharded GlusterFS
Hi Krutika,

Have you been able to look at my profiles? Do you have any clue, idea or suggestion?

Thanks,
-Gencer

From: Krutika Dhananjay [mailto:kdhananj at redhat.com]
Sent: Friday, June 30, 2017 3:50 PM
To: gencer at gencgiyen.com
Cc: gluster-user <gluster-users at gluster.org>
Subject: Re: [Gluster-users] Very slow performance on Sharded GlusterFS

[...]