Mauro Tridici
2018-Sep-26 10:08 UTC
[Gluster-users] Rebalance failed on Distributed Disperse volume based on 3.12.14 version
Dear All, Dear Nithya,

after upgrading from version 3.10.5 to 3.12.14, I tried to start a rebalance process to distribute data across the bricks, but something went wrong. The rebalance failed on different nodes, and the estimated time needed to complete the procedure seems to be very high.

[root at s01 ~]# gluster volume rebalance tier2 status
Node       Rebalanced-files   size       scanned   failures   skipped   status        run time in h:m:s
---------  -----------------  ---------  --------  ---------  --------  ------------  ------------------
localhost  19                 161.6GB    537       2          2         in progress   0:32:23
s02-stg    25                 212.7GB    526       5          2         in progress   0:32:25
s03-stg    4                  69.1GB     511       0          0         in progress   0:32:25
s04-stg    4                  484Bytes   12283     0          3         in progress   0:32:25
s05-stg    23                 484Bytes   11049     0          10        in progress   0:32:25
s06-stg    3                  1.2GB      8032      11         3         failed        0:17:57
Estimated time left for rebalance to complete : 3601:05:41
volume rebalance: tier2: success

When rebalance processes fail, I can see the following kinds of errors in /var/log/glusterfs/tier2-rebalance.log

Error type 1)
[2018-09-26 08:50:19.872575] W [MSGID: 122053] [ec-common.c:269:ec_check_status] 0-tier2-disperse-10: Operation failed on 2 of 6 subvolumes.(up=111111, mask=100111, remaining=000000, good=100111, bad=011000)
[2018-09-26 08:50:19.901792] W [MSGID: 122053] [ec-common.c:269:ec_check_status] 0-tier2-disperse-11: Operation failed on 1 of 6 subvolumes.(up=111111, mask=111101, remaining=000000, good=111101, bad=000010)

Error type 2)
[2018-09-26 08:53:31.566836] W [socket.c:600:__socket_rwv] 0-tier2-client-53: readv on 192.168.0.55:49153 failed (Connection reset by peer)

Error type 3)
[2018-09-26 08:57:37.852590] W [MSGID: 122035] [ec-common.c:571:ec_child_select] 0-tier2-disperse-9: Executing operation with some subvolumes unavailable (10)
[2018-09-26 08:57:39.282306] W [MSGID: 122035] [ec-common.c:571:ec_child_select] 0-tier2-disperse-9: Executing operation with some subvolumes unavailable (10)
[2018-09-26 09:02:04.928408] W [MSGID: 109023] [dht-rebalance.c:1013:__dht_check_free_space] 0-tier2-dht: data movement of file {blocks:0 name:(/OPA/archive/historical/dts/MREA/Observations/Observations/MREA14/Cs-1/CMCC/raw/CS013.ext)} would result in dst node (tier2-disperse-5:2440190848) having lower disk space than the source node (tier2-disperse-11:71373083776). Skipping file.

Error type 4)
W [rpc-clnt-ping.c:223:rpc_clnt_ping_cbk] 0-tier2-client-7: socket disconnected

Error type 5)
[2018-09-26 09:07:42.333720] W [glusterfsd.c:1375:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7e25) [0x7f0417e0ee25] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x5590086004b5] -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x55900860032b] ) 0-: received signum (15), shutting down

Error type 6)
[2018-09-25 08:09:18.340658] C [rpc-clnt-ping.c:166:rpc_clnt_ping_timer_expired] 0-tier2-client-4: server 192.168.0.52:49153 has not responded in the last 42 seconds, disconnecting.

It seems that there are some network or timeout problems, but the network usage/traffic values are not that high. Do you think that, in my volume configuration, I should modify some volume options related to thread and/or network parameters? Could you please help me to understand the cause of the problems above?
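For clarity, these are the thread/network related options I was thinking about. The commands below are only meant to show what I would check or tentatively change; the value 8 is just a placeholder I have not tested, not something already applied on the cluster:

# show the current values of the options I suspect
gluster volume get tier2 network.ping-timeout
gluster volume get tier2 client.event-threads
gluster volume get tier2 server.event-threads

# tentatively raise the event threads (example value only)
gluster volume set tier2 client.event-threads 8
gluster volume set tier2 server.event-threads 8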
You can find below our volume info (the volume is implemented on 6 servers; each server has 2 x 10-core CPUs, 64GB RAM, 1 SSD dedicated to the OS, and 12 x 10TB HDDs):

[root at s04 ~]# gluster vol info

Volume Name: tier2
Type: Distributed-Disperse
Volume ID: a28d88c5-3295-4e35-98d4-210b3af9358c
Status: Started
Snapshot Count: 0
Number of Bricks: 12 x (4 + 2) = 72
Transport-type: tcp
Bricks:
Brick1: s01-stg:/gluster/mnt1/brick
Brick2: s02-stg:/gluster/mnt1/brick
Brick3: s03-stg:/gluster/mnt1/brick
Brick4: s01-stg:/gluster/mnt2/brick
Brick5: s02-stg:/gluster/mnt2/brick
Brick6: s03-stg:/gluster/mnt2/brick
Brick7: s01-stg:/gluster/mnt3/brick
Brick8: s02-stg:/gluster/mnt3/brick
Brick9: s03-stg:/gluster/mnt3/brick
Brick10: s01-stg:/gluster/mnt4/brick
Brick11: s02-stg:/gluster/mnt4/brick
Brick12: s03-stg:/gluster/mnt4/brick
Brick13: s01-stg:/gluster/mnt5/brick
Brick14: s02-stg:/gluster/mnt5/brick
Brick15: s03-stg:/gluster/mnt5/brick
Brick16: s01-stg:/gluster/mnt6/brick
Brick17: s02-stg:/gluster/mnt6/brick
Brick18: s03-stg:/gluster/mnt6/brick
Brick19: s01-stg:/gluster/mnt7/brick
Brick20: s02-stg:/gluster/mnt7/brick
Brick21: s03-stg:/gluster/mnt7/brick
Brick22: s01-stg:/gluster/mnt8/brick
Brick23: s02-stg:/gluster/mnt8/brick
Brick24: s03-stg:/gluster/mnt8/brick
Brick25: s01-stg:/gluster/mnt9/brick
Brick26: s02-stg:/gluster/mnt9/brick
Brick27: s03-stg:/gluster/mnt9/brick
Brick28: s01-stg:/gluster/mnt10/brick
Brick29: s02-stg:/gluster/mnt10/brick
Brick30: s03-stg:/gluster/mnt10/brick
Brick31: s01-stg:/gluster/mnt11/brick
Brick32: s02-stg:/gluster/mnt11/brick
Brick33: s03-stg:/gluster/mnt11/brick
Brick34: s01-stg:/gluster/mnt12/brick
Brick35: s02-stg:/gluster/mnt12/brick
Brick36: s03-stg:/gluster/mnt12/brick
Brick37: s04-stg:/gluster/mnt1/brick
Brick38: s04-stg:/gluster/mnt2/brick
Brick39: s04-stg:/gluster/mnt3/brick
Brick40: s04-stg:/gluster/mnt4/brick
Brick41: s04-stg:/gluster/mnt5/brick
Brick42: s04-stg:/gluster/mnt6/brick
Brick43: s04-stg:/gluster/mnt7/brick
Brick44: s04-stg:/gluster/mnt8/brick
Brick45: s04-stg:/gluster/mnt9/brick
Brick46: s04-stg:/gluster/mnt10/brick
Brick47: s04-stg:/gluster/mnt11/brick
Brick48: s04-stg:/gluster/mnt12/brick
Brick49: s05-stg:/gluster/mnt1/brick
Brick50: s05-stg:/gluster/mnt2/brick
Brick51: s05-stg:/gluster/mnt3/brick
Brick52: s05-stg:/gluster/mnt4/brick
Brick53: s05-stg:/gluster/mnt5/brick
Brick54: s05-stg:/gluster/mnt6/brick
Brick55: s05-stg:/gluster/mnt7/brick
Brick56: s05-stg:/gluster/mnt8/brick
Brick57: s05-stg:/gluster/mnt9/brick
Brick58: s05-stg:/gluster/mnt10/brick
Brick59: s05-stg:/gluster/mnt11/brick
Brick60: s05-stg:/gluster/mnt12/brick
Brick61: s06-stg:/gluster/mnt1/brick
Brick62: s06-stg:/gluster/mnt2/brick
Brick63: s06-stg:/gluster/mnt3/brick
Brick64: s06-stg:/gluster/mnt4/brick
Brick65: s06-stg:/gluster/mnt5/brick
Brick66: s06-stg:/gluster/mnt6/brick
Brick67: s06-stg:/gluster/mnt7/brick
Brick68: s06-stg:/gluster/mnt8/brick
Brick69: s06-stg:/gluster/mnt9/brick
Brick70: s06-stg:/gluster/mnt10/brick
Brick71: s06-stg:/gluster/mnt11/brick
Brick72: s06-stg:/gluster/mnt12/brick
Options Reconfigured:
network.ping-timeout: 60
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
cluster.server-quorum-type: server
features.default-soft-limit: 90
features.quota-deem-statfs: on
performance.io-thread-count: 16
disperse.cpu-extensions: auto
performance.io-cache: off
network.inode-lru-limit: 50000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
cluster.readdir-optimize: on
performance.parallel-readdir: off
performance.readdir-ahead: on
cluster.lookup-optimize: on
client.event-threads: 4
server.event-threads: 4
nfs.disable: on
transport.address-family: inet
cluster.quorum-type: auto
cluster.min-free-disk: 10
performance.client-io-threads: on
features.quota: on
features.inode-quota: on
features.bitrot: on
features.scrub: Active
cluster.brick-multiplex: on
cluster.server-quorum-ratio: 51%

If it can help, I paste here the output of the "free -m" command executed on all the cluster nodes; the result is almost the same on every node. In your opinion, is the available RAM enough to support the data movement?

[root at s06 ~]# free -m
              total        used        free      shared  buff/cache   available
Mem:          64309       10409         464          15       53434       52998
Swap:         65535         103       65432

Thank you in advance. Sorry for the long message, but I'm trying to give you all the available information.

Regards,
Mauro
Ashish Pandey
2018-Sep-26 12:13 UTC
[Gluster-users] Rebalance failed on Distributed Disperse volume based on 3.12.14 version
I think we don't have enough logs to debug this, so I would suggest you provide more logs/info.

I have also observed that the configuration and setup of your volume is not very efficient. For example:

Brick37: s04-stg:/gluster/mnt1/brick
Brick38: s04-stg:/gluster/mnt2/brick
Brick39: s04-stg:/gluster/mnt3/brick
Brick40: s04-stg:/gluster/mnt4/brick
Brick41: s04-stg:/gluster/mnt5/brick
Brick42: s04-stg:/gluster/mnt6/brick
Brick43: s04-stg:/gluster/mnt7/brick
Brick44: s04-stg:/gluster/mnt8/brick
Brick45: s04-stg:/gluster/mnt9/brick
Brick46: s04-stg:/gluster/mnt10/brick
Brick47: s04-stg:/gluster/mnt11/brick
Brick48: s04-stg:/gluster/mnt12/brick

These 12 bricks are all on the same node, and the disperse subvolumes made up of these bricks are therefore hosted entirely on that single node, which is not good. The same is true for the bricks hosted on s05-stg and s06-stg. I think you added these bricks after creating the volume. The probability of a connection disruption affecting these bricks is higher in this case.

---
Ashish
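Just to illustrate the idea (this is only a sketch, not a command that was actually run on this cluster): when bricks are added to a 4+2 disperse volume, each consecutive group of 6 bricks on the command line becomes one disperse set, so interleaving the servers keeps any single set from living entirely on one node. Reusing the host and brick paths from your volume info, the same interleaved order already used for s01/s02/s03 could have looked like this for the new servers:

# add one 4+2 set with two bricks per server instead of six bricks from one server
# (gluster may warn that multiple bricks of the set share a server and ask for 'force')
gluster volume add-brick tier2 \
  s04-stg:/gluster/mnt1/brick s04-stg:/gluster/mnt2/brick \
  s05-stg:/gluster/mnt1/brick s05-stg:/gluster/mnt2/brick \
  s06-stg:/gluster/mnt1/brick s06-stg:/gluster/mnt2/brick

With this layout, losing any one of the three new servers takes out only two bricks of the set, which is still within the redundancy of 2.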