thr3ads.net - Gluster users - [Gluster-users] Gluster High CPU/Clients Hanging on Heavy Writes [Aug 2018]


Yuhao Zhang

2018-Aug-08 05:49 UTC

[Gluster-users] Gluster High CPU/Clients Hanging on Heavy Writes

Hi Xavi,

Thank you for the suggestions, these are extremely helpful. I hadn't
considered that it could be a ZFS problem. I went back and checked a longer
monitoring window, and now I can see a pattern. Please see the attached Grafana
screenshot (also available here: https://cl.ly/070J2y3n1u0F . Note that the
data gaps are from when I took the server down to reboot it):

Between 8/4 and 8/6, I ran two transfer tests and hit the Gluster hanging
problem twice: once during the first transfer, and once shortly after the
second transfer. I marked both with pink lines in the screenshot.

It looks like free memory was almost exhausted during my transfer tests. The
system showed a very large amount of cached memory, which I think was due to
the ZFS ARC. However, I was under the impression that ZFS releases space from
the ARC when it sees that available system memory is low. I am not sure why it
didn't do that.

I didn't tweak any related ZFS parameters. zfs_arc_max was set to 0 (the
default value). According to the documentation, it is the "Max arc size of ARC
in bytes. If set to 0 then it will consume 1/2 of system RAM." So it appeared
that this setting didn't work.
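
One way to check whether the ARC really is above its ceiling is to compare the "size" and "c_max" rows of the arcstats kstat file. The following is only a sketch, assuming the usual ZFS on Linux location /proc/spl/kstat/zfs/arcstats and its "name type data" column layout; the function name arc_usage is made up for illustration.

```shell
#!/bin/sh
# Print the ARC's current size and its effective ceiling, both in bytes.
# Assumes the ZFS on Linux kstat layout: one "name type data" row per statistic.
# The path argument is optional and defaults to the usual kstat location.
arc_usage() {
    stats="${1:-/proc/spl/kstat/zfs/arcstats}"
    awk '$1 == "size"  { size = $3 }
         $1 == "c_max" { cmax = $3 }
         END { print "size=" size " c_max=" cmax }' "$stats"
}
```

Running arc_usage with no argument on a brick reads the live counters. If "size" stays well above "c_max" during a transfer, that would support the idea that the limit is not being honoured.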

When the server was under heavy IO, used memory actually decreased, which
I can't explain.

May I ask if you, or anyone else in this group, has recommendations on ZFS
settings for my setup? My server has 64GB of physical memory and 150GB of SSD
space reserved for L2ARC. The zpool has 6 vdevs, each a raidz2 of ten 12TB hard
drives. Total usable space in the zpool is 482TB.

Thank you,
Yuhao

> On Aug 7, 2018, at 01:36, Xavi Hernandez <jahernan at redhat.com> wrote:
>
> Hi Yuhao,
>
> On Mon, 6 Aug 2018, 15:26 Yuhao Zhang, <zzyzxd at gmail.com> wrote:
>
>> Hello,
>>
>> I just experienced another hanging one hour ago and the server was not
>> even under heavy IO.
>>
>> Atin, I attached the process monitoring results and another statedump.
>>
>> Xavi, ZFS was fine, during the hanging, I can still write directly to the
>> ZFS volume. My ZFS version: ZFS: Loaded module v0.6.5.6-0ubuntu16, ZFS
>> pool version 5000, ZFS filesystem version 5
>
> I highly recommend you to upgrade to version 0.6.5.8 at least. It fixes a
> kernel panic that can happen when used with gluster. However this is not
> your current problem.
>
> Top statistics show low available memory and high CPU utilization of
> kswapd process (along with one of the gluster processes). I've seen
> frequent memory management problems with ZFS. Have you configured any ZFS
> parameters? It's highly recommendable to tweak some memory limits.
>
> If that were the problem, there's one thing that should alleviate it (and
> see if it could be related):
>
> echo 3 >/proc/sys/vm/drop_caches
>
> This should be done on all bricks from time to time. You can wait until
> the problem appears, but in this case the recovery time can be larger.
>
> I think this should fix the high CPU usage of kswapd. If so, we'll need to
> tweak some ZFS parameters.
>
> I'm not sure if the high CPU usage of gluster could be related to this or
> not.
>
> Xavi
>
>> Thank you,
>> Yuhao
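
The cache-drop workaround quoted above could be run only when memory is actually low, rather than on a fixed schedule. This is a minimal sketch, not something proposed in the thread: the function names and the threshold are hypothetical, and it assumes a kernel that reports MemAvailable in /proc/meminfo.

```shell
#!/bin/sh
# Return the MemAvailable value (in kB) from a meminfo-style file.
mem_available_kb() {
    awk '$1 == "MemAvailable:" { print $2 }' "${1:-/proc/meminfo}"
}

# Drop caches only when available memory is below threshold_kb.
# threshold_kb is a hypothetical tuning knob, not a value from the thread.
# The meminfo and target paths are parameters so the logic can be tried
# against ordinary files before pointing it at the real /proc entries.
drop_caches_if_low() {
    threshold_kb="$1"
    meminfo="${2:-/proc/meminfo}"
    target="${3:-/proc/sys/vm/drop_caches}"
    avail=$(mem_available_kb "$meminfo")
    if [ "$avail" -lt "$threshold_kb" ]; then
        sync                  # write out dirty pages so more cache can be freed
        echo 3 > "$target"    # requires root when target is the real /proc file
        echo "dropped (MemAvailable=${avail} kB)"
    else
        echo "skipped (MemAvailable=${avail} kB)"
    fi
}
```

For example, calling "drop_caches_if_low 4194304" as root from cron on each brick would drop caches whenever less than 4GB is available. The right threshold depends on the workload.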
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Image 2018-08-07 at 23.59.09.png
Type: image/png
Size: 471519 bytes
Desc: not available
URL:
<http://lists.gluster.org/pipermail/gluster-users/attachments/20180808/a5a9a628/attachment-0001.png>

Pui Edylie

2018-Aug-08 06:23 UTC

[Gluster-users] Gluster High CPU/Clients Hanging on Heavy Writes

Hi Yuhao,

Since RAM is relatively inexpensive, if you have another 64GB lying
around, why don't you add it to bring the total to 128GB to serve your
482TB?

From what I have read, there appears to be a general recommendation of 1GB
of RAM per 1TB of disk.

In our setup we use a RAID card to build a RAID 10 array and present it
to Gluster as a single brick. I have seen a performance issue as well: writing
directly to the RAID 10 I get around 250MB/sec, through the Gluster-mounted
volume around 100MB/sec, and with NFS on top it drops to 30MB/sec.

Regards,
Edy


Xavi Hernandez

2018-Aug-23 23:28 UTC

[Gluster-users] Gluster High CPU/Clients Hanging on Heavy Writes

Hi Yuhao,

Sorry for the late answer. I was on holiday and have just returned.

On Wed, 8 Aug 2018, 07:49 Yuhao Zhang, <zzyzxd at gmail.com> wrote:
> Hi Xavi,
>
> Thank you for the suggestions, these are extremely helpful. I hadn't
> considered that it could be a ZFS problem. I went back and checked a longer
> monitoring window, and now I can see a pattern. Please see the attached
> Grafana screenshot (also available here: https://cl.ly/070J2y3n1u0F .
> Note that the data gaps are from when I took the server down to reboot it):
>
> Between 8/4 and 8/6, I ran two transfer tests and hit the Gluster hanging
> problem twice: once during the first transfer, and once shortly after the
> second transfer. I marked both with pink lines in the screenshot.
>
> It looks like free memory was almost exhausted during my transfer tests.
> The system showed a very large amount of cached memory, which I think was
> due to the ZFS ARC. However, I was under the impression that ZFS releases
> space from the ARC when it sees that available system memory is low. I am
> not sure why it didn't do that.
>
Yes, it should release memory, but for some reason I don't understand, when
there's high metadata load, it's not able to release the allocated memory
fast enough (or so it seems). I've observed high CPU utilization by a ZFS
process at this point.

>
> I didn't tweak any related ZFS parameters. zfs_arc_max was set to 0 (the
> default value). According to the documentation, it is the "Max arc size of
> ARC in bytes. If set to 0 then it will consume 1/2 of system RAM." So it
> appeared that this setting didn't work.
>
In my experience, this limit is not respected under high metadata load.
Limiting the ARC to 1/8 of system RAM seemed to keep memory consumption under
control, at least for the workloads I used.

In theory, ZFS 0.7.x.y should solve the memory management problems, but I
haven't tested it.

> When the server was under heavy IO, used memory actually decreased,
> which I can't explain.
>
I've only seen this problem when accessing large numbers of different files
(typical of a copy, rsync or find on a volume with thousands or millions of
files and directories). High IO on a small set of files doesn't cause any
trouble.

It's related to metadata caching: high IO on a small set of files doesn't
require much metadata.

> May I ask if you, or anyone else in this group, has recommendations on ZFS
> settings for my setup? My server has 64GB of physical memory and 150GB of
> SSD space reserved for L2ARC. The zpool has 6 vdevs, each a raidz2 of ten
> 12TB hard drives. Total usable space in the zpool is 482TB.
>
As I said, I would try limiting the ARC to 1/8 of system memory (it will use
more than that anyway). Dropping caches also helps when memory is getting
exhausted: it causes ZFS to release memory faster, though I don't consider
it a good solution.
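
The 1/8 figure above has to be given to zfs_arc_max in bytes. As a rough sketch (the helper names arc_max_bytes and arc_max_option are invented for this example), the value can be derived from MemTotal in /proc/meminfo, which is reported in kB:

```shell
#!/bin/sh
# Compute one eighth of MemTotal, in bytes, from a meminfo-style file.
# MemTotal is reported in kB, so bytes / 8 = kB * 1024 / 8.
arc_max_bytes() {
    awk '$1 == "MemTotal:" { printf "%.0f\n", $2 * 1024 / 8 }' "${1:-/proc/meminfo}"
}

# Print the modprobe option line that would set the ARC limit to that value.
arc_max_option() {
    echo "options zfs zfs_arc_max=$(arc_max_bytes "$1")"
}
```

On a machine reporting exactly 64GiB of MemTotal this gives 8589934592 (8GiB). The printed line would typically go into a file under /etc/modprobe.d/ so it applies at module load. The same number can also be written to /sys/module/zfs/parameters/zfs_arc_max on a running system.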

Also make sure that zfs_txg_timeout is set to 5 or a similar value to avoid
long disk access bursts. Other options to consider, depending on the use
case, are zfs_prefetch_disable=1 and zfs_nocacheflush=1.
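
Before changing any of these, it may help to see what the module is currently using. The sketch below assumes the standard sysfs directory for ZFS on Linux module parameters. The function name show_zfs_params is hypothetical. Note that the prefetch switch is spelled zfs_prefetch_disable in the ZFS on Linux module parameters.

```shell
#!/bin/sh
# Print "name=value" for each tunable discussed in this thread.
# The directory argument is optional and defaults to the sysfs location
# where the loaded zfs module exposes its parameters.
show_zfs_params() {
    dir="${1:-/sys/module/zfs/parameters}"
    for p in zfs_arc_max zfs_txg_timeout zfs_prefetch_disable zfs_nocacheflush; do
        if [ -r "$dir/$p" ]; then
            echo "$p=$(cat "$dir/$p")"
        else
            echo "$p=(not available)"
        fi
    done
}
```

Calling show_zfs_params with no argument on a brick lists the live values. A parameter shown as "(not available)" is either unreadable or not exposed by the installed ZFS version.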

For better performance with gluster, the xattr property on ZFS datasets should
be set to "sa", but this needs to be done when the dataset is created, before
creating any files. Otherwise it will only apply to files created afterwards.
To use "sa" safely, version 0.6.5.8 or higher should be used.
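
The xattr advice above could be applied along these lines when a new brick dataset is created. This is only a sketch: the dataset name tank/brick1 and the function names are placeholders, and the ZFS variable exists only so the command can be previewed (for example with ZFS=echo) without touching a pool.

```shell
#!/bin/sh
# Create a dataset with xattr=sa set from the start, so every file written
# to it stores its extended attributes as system attributes.
# ZFS may be overridden (e.g. ZFS=echo) to print the command instead of running it.
create_brick_dataset() {
    ${ZFS:-zfs} create -o xattr=sa "$1"
}

# Report the xattr property currently in effect for a dataset.
check_xattr() {
    ${ZFS:-zfs} get -H -o value xattr "$1"
}
```

For example, "create_brick_dataset tank/brick1" followed by "check_xattr tank/brick1" should report "sa" on a system where the pool tank exists.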

Xavi

