David Cunningham
2019-Dec-23 00:09 UTC
[Gluster-users] GFS performance under heavy traffic
Hi Strahil,

Thanks for that. We do have one backup server specified, but will add the
second backup as well.

On Sat, 21 Dec 2019 at 11:26, Strahil <hunter86_bg at yahoo.com> wrote:

> Hi David,
>
> Also consider using the mount option to specify backup servers via
> 'backupvolfile-server=server2:server3'. You can define more, but I don't
> think replica volumes greater than 3 are useful, except maybe in some
> special cases.
>
> That way, when the primary is lost, your client can reach a backup one
> without disruption.
>
> P.S.: The client may 'hang' if the primary server was rebooted
> ungracefully, as the communication must time out before FUSE addresses
> the next server. There is a special script for killing gluster processes
> in '/usr/share/gluster/scripts' which can be used to set up a systemd
> service that does this for you on shutdown.
>
> Best Regards,
> Strahil Nikolov
>
> On Dec 20, 2019 23:49, David Cunningham <dcunningham at voisonics.com> wrote:
>
> Hi Strahil,
>
> Ah, that is an important point. One of the nodes is not accessible from
> the client, and we assumed that it only needed to reach the GFS node that
> was mounted, so didn't think anything of it.
>
> We will try making all nodes accessible, as well as
> "direct-io-mode=disable".
>
> Thank you.
>
>
> On Sat, 21 Dec 2019 at 10:29, Strahil Nikolov <hunter86_bg at yahoo.com>
> wrote:
>
> Actually I haven't clarified myself.
> A FUSE mount on the client side connects directly to all the bricks that
> make up the volume.
> If for some reason (bad routing, a firewall block) the client can only
> reach 2 out of 3 bricks, this can constantly cause healing to happen (as
> one of the bricks is never updated), which will degrade performance and
> cause excessive network usage.
> As your attachment is from one of the gluster nodes, this could be the
> case.
>
> Best Regards,
> Strahil Nikolov
>
> On Friday, 20 December 2019 at 01:49:56 GMT+2, David Cunningham <
> dcunningham at voisonics.com> wrote:
>
>
> Hi Strahil,
>
> The chart attached to my original email is taken from the GFS server.
>
> I'm not sure what you mean by accessing all bricks simultaneously. We've
> mounted it from the client like this:
> gfs1:/gvol0 /mnt/glusterfs/ glusterfs
> defaults,direct-io-mode=disable,_netdev,backupvolfile-server=gfs2,fetch-attempts=10
> 0 0
>
> Should we do something different to access all bricks simultaneously?
>
> Thanks for your help!
>
>
> On Fri, 20 Dec 2019 at 11:47, Strahil Nikolov <hunter86_bg at yahoo.com>
> wrote:
>
> I'm not sure whether you measured the traffic from the client side
> (tcpdump on a client machine) or from the server side.
>
> In both cases, please verify that the client accesses all bricks
> simultaneously, as failing to do so can cause unnecessary heals.
>
> Have you thought about upgrading to v6? There are some enhancements in
> v6 which could be beneficial.
>
> Yet, it is indeed strange that so much traffic is generated with FUSE.
>
> Another approach is to test with NFS-Ganesha, which supports pNFS and
> can speak natively with Gluster. That can bring you closer to the
> previous setup and also provide some extra performance.
>
>
> Best Regards,
> Strahil Nikolov
>
--
David Cunningham, Voisonics Limited
http://voisonics.com/
USA: +1 213 221 1092
New Zealand: +64 (0)28 2558 3782
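For reference, the client's fstab line with both backup servers listed
would look roughly like this (a sketch only: the third node's hostname,
gfs3, is an assumption, since only gfs1 and gfs2 are named in the thread):

    gfs1:/gvol0 /mnt/glusterfs/ glusterfs defaults,direct-io-mode=disable,_netdev,backupvolfile-server=gfs2:gfs3,fetch-attempts=10 0 0

With that in place the client can fetch the volume file from gfs2 or gfs3
whenever gfs1 is unreachable at mount time.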
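A minimal sketch of the systemd service Strahil describes, assuming the
kill script is named stop-all-gluster-processes.sh (the script name, path,
and unit name are all assumptions; check the scripts directory on your
installation for the real filename):

    # /etc/systemd/system/gluster-kill.service (hypothetical unit name)
    [Unit]
    Description=Kill remaining Gluster processes cleanly at shutdown
    After=network.target

    [Service]
    Type=oneshot
    RemainAfterExit=yes
    # The unit is started once at boot and stays 'active'; systemd then
    # runs ExecStop during shutdown, before the network is torn down.
    ExecStart=/bin/true
    ExecStop=/usr/share/gluster/scripts/stop-all-gluster-processes.sh

    [Install]
    WantedBy=multi-user.target

Enable it once with 'systemctl enable --now gluster-kill.service'.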
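To measure the traffic from the client side, as suggested in the thread, a
capture along these lines works (the interface name eth0 and the third
hostname gfs3 are assumptions):

    # Record all traffic between this client and the three Gluster nodes.
    tcpdump -ni eth0 -w /tmp/gluster-client.pcap host gfs1 or host gfs2 or host gfs3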
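To verify that the client really talks to every brick, and whether a
missing connection is driving constant heals, something like the following
can be used; the ports shown are the usual Gluster defaults (24007 for
management, 49152 and up for bricks) and may differ on a given setup:

    # On the client: expect one established connection per brick.
    ss -tn state established '( dport = :24007 or dport >= :49152 )'

    # On any server: entries that never drain suggest one brick is not
    # receiving writes from the client.
    gluster volume heal gvol0 info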