thr3ads.net - Gluster users - [Gluster-users] How to configure? [Mar 2023]


Diego Zuccato

2023-Mar-15 07:54 UTC

[Gluster-users] How to configure?

I enabled brick multiplexing yesterday, and that greatly reduced memory pressure.
Current volume info:
-8<--
Volume Name: cluster_data
Type: Distributed-Replicate
Volume ID: a8caaa90-d161-45bb-a68c-278263a8531a
Status: Started
Snapshot Count: 0
Number of Bricks: 45 x (2 + 1) = 135
Transport-type: tcp
Bricks:
Brick1: clustor00:/srv/bricks/00/d
Brick2: clustor01:/srv/bricks/00/d
Brick3: clustor02:/srv/bricks/00/q (arbiter)
[...]
Brick133: clustor01:/srv/bricks/29/d
Brick134: clustor02:/srv/bricks/29/d
Brick135: clustor00:/srv/bricks/14/q (arbiter)
Options Reconfigured:
performance.quick-read: off
cluster.entry-self-heal: on
cluster.data-self-heal-algorithm: full
cluster.metadata-self-heal: on
cluster.shd-max-threads: 2
network.inode-lru-limit: 500000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
features.quota-deem-statfs: on
performance.readdir-ahead: on
cluster.granular-entry-heal: enable
features.scrub: Active
features.bitrot: on
cluster.lookup-optimize: on
performance.stat-prefetch: on
performance.cache-refresh-timeout: 60
performance.parallel-readdir: on
performance.write-behind-window-size: 128MB
cluster.self-heal-daemon: enable
features.inode-quota: on
features.quota: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
client.event-threads: 1
features.scrub-throttle: normal
diagnostics.brick-log-level: ERROR
diagnostics.client-log-level: ERROR
config.brick-threads: 0
cluster.lookup-unhashed: on
config.client-threads: 1
cluster.use-anonymous-inode: off
diagnostics.brick-sys-log-level: CRITICAL
features.scrub-freq: monthly
cluster.data-self-heal: on
cluster.brick-multiplex: on
cluster.daemon-log-level: ERROR
-8<--
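
As a quick sanity check of the "Number of Bricks" line, here is a minimal sketch of the arithmetic. It assumes the bricks are spread evenly over the three servers, which is consistent with the "30 data bricks + 15 arbiters" figure in the original message; the variable names are only illustrative.

```python
# Figures taken from the volume info above.
subvolumes = 45           # distribute subvolumes ("45 x (2 + 1)")
data_per_subvol = 2       # data bricks in each replica set
arbiters_per_subvol = 1   # arbiter brick in each replica set
servers = 3               # clustor00, clustor01, clustor02

total_bricks = subvolumes * (data_per_subvol + arbiters_per_subvol)

# Assumption: bricks are distributed evenly across the three servers.
data_per_server = subvolumes * data_per_subvol // servers
arbiters_per_server = subvolumes * arbiters_per_subvol // servers

print(total_bricks, data_per_server, arbiters_per_server)
```

Under that assumption each server hosts 45 bricks in total, so without multiplexing there would be one brick process per brick on every node.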

htop reports that memory usage is up to 143G, with 602 tasks and
5232 threads (~20 running) on clustor00, 117G/49 tasks/1565 threads on
clustor01 and 126G/45 tasks/1574 threads on clustor02.
I see quite a lot (284!) of glfsheal processes running on clustor00 (a
"gluster v heal cluster_data info summary" has been running on clustor02
since yesterday, still with no output). Shouldn't there be just one per brick?

Diego

On 15/03/2023 08:30, Strahil Nikolov wrote:
> Do you use brick multiplexing?
> 
> Best Regards,
> Strahil Nikolov
> 
>     On Tue, Mar 14, 2023 at 16:44, Diego Zuccato
>     <diego.zuccato at unibo.it> wrote:
>     Hello all.
> 
>     Our Gluster 9.6 cluster is showing increasing problems.
>     Currently it's composed of 3 servers (2x Intel Xeon 4210 [20 cores,
>     dual thread, total 40 threads], 192GB RAM, 30x HGST HUH721212AL5200
>     [12TB]), configured in replica 3 arbiter 1. Using Debian packages from
>     the Gluster 9.x latest repository.
> 
>     It seems 192G of RAM is not enough to handle 30 data bricks + 15
>     arbiters, and I often had to reload glusterfsd because glusterfs
>     processes got killed for OOM.
>     On top of that, performance has been quite bad, especially once we
>     reached about 20M files. In addition, one of the servers has had
>     mobo issues that resulted in memory errors that corrupted some
>     bricks' filesystems (XFS, it required "xfs_repair -L" to fix).
>     Now I'm getting lots of "stale file handle" errors and other errors
>     (like directories that seem empty from the client but still contain
>     files in some bricks), and auto healing seems unable to complete.
> 
>     Since I can't keep up with manually fixing all the issues, I'm
>     thinking about a backup+destroy+recreate strategy.
> 
>     I think that if I reduce the number of bricks per server to just 5
>     (RAID1 of 6x12TB disks) I might resolve the RAM issues, at the cost of
>     longer heal times in case a disk fails. Am I right, or is it useless?
>     Other recommendations?
>     The servers have space for another 6 disks. Maybe those could be used
>     for some SSDs to speed up access?
> 
>     TIA.
> 
>     -- 
>     Diego Zuccato
>     DIFA - Dip. di Fisica e Astronomia
>     Servizi Informatici
>     Alma Mater Studiorum - Università di Bologna
>     V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
>     tel.: +39 051 20 95786
>     ________
> 
> 
> 
>     Community Meeting Calendar:
> 
>     Schedule -
>     Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
>     Bridge: https://meet.google.com/cpu-eiue-hvk
>     <https://meet.google.com/cpu-eiue-hvk>
>     Gluster-users mailing list
>     Gluster-users at gluster.org <mailto:Gluster-users at gluster.org>
>     https://lists.gluster.org/mailman/listinfo/gluster-users
>     <https://lists.gluster.org/mailman/listinfo/gluster-users>
> 
-- 
Diego Zuccato
DIFA - Dip. di Fisica e Astronomia
Servizi Informatici
Alma Mater Studiorum - Università di Bologna
V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
tel.: +39 051 20 95786

Strahil Nikolov

2023-Mar-15 19:11 UTC


[Gluster-users] How to configure?

If you don't experience any OOM, you can focus on the heals.
284 glfsheal processes seems odd.
Can you check the ppid for 2-3 randomly picked ones?
ps -o ppid= <pid>
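
To look at all of them at once rather than a random few, a rough sketch along these lines could group the glfsheal processes by parent PID. It assumes a Linux procps `ps` and that the process name appears as `glfsheal` in the `comm` column. The function names and the sample listing are made up for illustration and are not output from this cluster.

```python
import subprocess
from collections import Counter


def parse_ps(ps_output):
    """Parse 'PID PPID COMMAND' lines into (pid, ppid, comm) tuples."""
    rows = []
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        pid, ppid, comm = parts
        rows.append((int(pid), int(ppid), comm.strip()))
    return rows


def glfsheal_parents(rows, name="glfsheal"):
    """Count processes called `name` per parent PID."""
    return Counter(ppid for _pid, ppid, comm in rows if comm == name)


def snapshot():
    """Run ps once and parse it (separate -o options keep headers empty)."""
    out = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "ppid=", "-o", "comm="],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_ps(out)


# Made-up sample listing, only to show the grouping.
sample = """\
 1001     1 glusterd
 2001  1001 glfsheal
 2002  1001 glfsheal
 2003     1 glfsheal
 3001     1 glusterfsd
"""
print(glfsheal_parents(parse_ps(sample)))
```

On a real node, `glfsheal_parents(snapshot())` would give the same kind of per-parent count. A parent PID of 1 would suggest that whatever launched those processes has already exited.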

Best Regards,
Strahil Nikolov
 
 
