thr3ads.net - Gluster users - [Gluster-users] GFS performance under heavy traffic [Dec 2019]

David Cunningham

2019-Dec-19 00:28 UTC

[Gluster-users] GFS performance under heavy traffic

Hi Raghavendra and Strahil,

We are using GFS version 5.6-1.el7 from the CentOS repository.
Unfortunately we can't modify the application and it expects to read and
write from a normal filesystem.

There's around 25GB of data being written during a business day, so over 10
hours that's around 0.7 MBps, which has me mystified as to how it can
generate 114MBps of network traffic. Granted we have read traffic as well,
but still. The chart shows much more inbound traffic to the GFS server than
outbound, suggesting the problem is with data writes.
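
For reference, the arithmetic behind these figures can be checked with a short Python sketch. It assumes decimal units (1 GB = 1000 MB) and that "MBps" means megabytes per second; the function name is purely illustrative. Note that a daily average hides any bursts, so this only shows how far the observed traffic is from the average write rate, not what causes the gap.

```python
def average_rate_mb_per_s(total_gb: float, hours: float) -> float:
    """Average throughput in MB/s, using decimal units (1 GB = 1000 MB)."""
    return total_gb * 1000 / (hours * 3600)


written = average_rate_mb_per_s(25, 10)  # 25GB over a 10-hour business day
observed = 114.0                         # MB/s of network traffic on the chart
link_capacity = 1000 / 8                 # a 1Gbps link expressed in MB/s

print(f"average write rate: {written:.2f} MB/s")
print(f"observed / average: {observed / written:.0f}x")
print(f"link utilisation:   {observed / link_capacity:.0%}")
```

Under those assumptions the observed traffic is roughly 164 times the average write rate and about 91% of the link's nominal capacity.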

Is it possible with GFS to not check with the other nodes when reading? Our
data is mostly static and we don't require 100% guarantee that the data is
up-to-date when reading.

Thanks for any assistance.


On Wed, 18 Dec 2019 at 16:39, Raghavendra Gowdappa <rgowdapp at redhat.com> wrote:
> What version of Glusterfs are you using? Though, not sure what's the root
> cause of your problem, just wanted to point out a bug with read-ahead which
> would cause read-amplification over network [1][2], which should be fixed
> in recent versions.
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1214489
> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1393419
>
> On Wed, Dec 18, 2019 at 2:50 AM David Cunningham <
> dcunningham at voisonics.com> wrote:
>
>> Hello,
>>
>> We switched a production system to using GFS instead of NFS at the
>> weekend, however it didn't go well on Monday when full load hit. The
>> application started crashing regularly and we had to revert to NFS. It
>> seems that the problem was high network traffic used by GFS.
>>
>> We've two GFS nodes plus one arbiter node, each about 1.3ms latency from
>> each other. Attached is a chart of network traffic on one of the GFS nodes.
>> We see that it saturated the 1Gbps link before we reverted to NFS at 15:10.
>>
>> The question is, why does GFS use so much network traffic and is there
>> anything we can do about it? NFS traffic doesn't exceed 4MBps, so 120MBps
>> for GFS seems awfully high.
>>
>> It would also be good to have faster read performance from GFS, but
>> that's another issue.
>>
>> Thanks in advance for any assistance.
>>
>> --
>> David Cunningham, Voisonics Limited
>> http://voisonics.com/
>> USA: +1 213 221 1092
>> New Zealand: +64 (0)28 2558 3782
>> ________
>>
>> Community Meeting Calendar:
>>
>> APAC Schedule -
>> Every 2nd and 4th Tuesday at 11:30 AM IST
>> Bridge: https://bluejeans.com/441850968
>>
>> NA/EMEA Schedule -
>> Every 1st and 3rd Tuesday at 01:00 PM EDT
>> Bridge: https://bluejeans.com/441850968
>>
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>
>
-- 
David Cunningham, Voisonics Limited
http://voisonics.com/
USA: +1 213 221 1092
New Zealand: +64 (0)28 2558 3782

Jorick Astrego

2019-Dec-19 13:55 UTC

[Gluster-users] GFS performance under heavy traffic

Hi David,

Did you try setting "direct-io-mode=disable" on the client mounts? As it
is mostly static content it would help to use the kernel caching and
read-ahead mechanisms.

I think the default is enabled.
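
As a rough illustration of the suggestion above, the Python sketch below assembles a FUSE mount command with this option set and prints it without running anything. The server name `gfs1`, volume name `gvol0`, and mount point `/mnt/glusterfs` are hypothetical placeholders rather than values from this thread, and the helper function is not part of any Gluster tooling.

```python
import shlex


def gluster_mount_command(server, volume, mount_point, direct_io="disable"):
    """Build the argument list for mounting a Gluster volume over FUSE."""
    if direct_io not in ("enable", "disable"):
        raise ValueError("direct_io must be 'enable' or 'disable'")
    return [
        "mount", "-t", "glusterfs",
        "-o", f"direct-io-mode={direct_io}",
        f"{server}:/{volume}", mount_point,
    ]


# Hypothetical names; substitute the real server, volume and mount point.
cmd = gluster_mount_command("gfs1", "gvol0", "/mnt/glusterfs")
print(shlex.join(cmd))
```

The same option string can also go in the options column of an /etc/fstab entry for the mount.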

Regards,

Jorick Astrego

With kind regards,

Jorick Astrego

Netbulae Virtualization Experts 

----------------

	Tel: 053 20 30 270 	info at netbulae.eu 	Staalsteden 4-3A 	KvK 08198180
 	Fax: 053 20 30 271 	www.netbulae.eu 	7547 TA Enschede 	BTW NL821234584B01

----------------

Strahil Nikolov

2019-Dec-19 22:47 UTC

[Gluster-users] GFS performance under heavy traffic

I'm not sure whether you measured the traffic from the client side (tcpdump on a
client machine) or from the server side.
In both cases, please verify that the client accesses all bricks
simultaneously, as otherwise this can cause unnecessary heals.
Have you thought about upgrading to v6? There are some enhancements in v6 which
could be beneficial.
Yet, it is indeed strange that so much traffic is generated with FUSE.
Another approach is to test with NFS-Ganesha, which supports pNFS and can natively
speak with Gluster, which can bring you closer to the previous setup and also
provide some extra performance.

Best Regards,
Strahil Nikolov
