thr3ads.net - Gluster users - [Gluster-users] Rebalance failed on Distributed Disperse volume based on 3.12.14 version [Sep 2018]

Mauro Tridici

2018-Sep-26 17:56 UTC

[Gluster-users] Rebalance failed on Distributed Disperse volume based on 3.12.14 version

Hi Ashish,

sure, no problem! We are a little bit worried, but we can wait  :-)
Thank you very much for your support and your availability.

Regards,
Mauro

> On 26 Sep 2018, at 19:33, Ashish Pandey <aspandey at redhat.com> wrote:
> 
> Hi Mauro,
> 
> Yes, I can provide you with a step-by-step procedure to correct it.
> Is it fine if I provide the steps tomorrow? It is quite late over here and I don't want to miss anything in a hurry.
> 
> ---
> Ashish
> 
> From: "Mauro Tridici" <mauro.tridici at cmcc.it>
> To: "Ashish Pandey" <aspandey at redhat.com>
> Cc: "gluster-users" <gluster-users at gluster.org>
> Sent: Wednesday, September 26, 2018 6:54:19 PM
> Subject: Re: [Gluster-users] Rebalance failed on Distributed Disperse volume based on 3.12.14 version
> 
> 
> Hi Ashish,
> 
> attached you can find the rebalance log file and the most recently updated brick log file (the other files in the /var/log/glusterfs/bricks directory seem to be too old).
> I have just stopped the running rebalance (as you can see at the bottom of the rebalance log file).
> So, if a safe procedure exists to correct the problem, I would like to execute it.
> 
> I don't know if I can ask this of you, but, if possible, could you please describe step by step the right procedure to remove the newly added bricks without losing the data that has already been rebalanced?
> 
> The following outputs show the result of the "df -h" command executed on one of the first 3 nodes (s01, s02, s03), which already existed, and on one of the last 3 nodes (s04, s05, s06), which were added recently.
> 
> [root at s06 bricks]# df -h
> Filesystem                           Size  Used   Avail Use% Mounted on
> /dev/mapper/cl_s06-root              100G  2,1G     98G   3% /
> devtmpfs                              32G     0     32G   0% /dev
> tmpfs                                 32G  4,0K     32G   1% /dev/shm
> tmpfs                                 32G   26M     32G   1% /run
> tmpfs                                 32G     0     32G   0% /sys/fs/cgroup
> /dev/mapper/cl_s06-var               100G  2,0G     99G   2% /var
> /dev/mapper/cl_s06-gluster           100G   33M    100G   1% /gluster
> /dev/sda1                           1014M  152M    863M  15% /boot
> /dev/mapper/gluster_vgd-gluster_lvd  9,0T  807G    8,3T   9% /gluster/mnt3
> /dev/mapper/gluster_vgg-gluster_lvg  9,0T  807G    8,3T   9% /gluster/mnt6
> /dev/mapper/gluster_vgc-gluster_lvc  9,0T  807G    8,3T   9% /gluster/mnt2
> /dev/mapper/gluster_vge-gluster_lve  9,0T  807G    8,3T   9% /gluster/mnt4
> /dev/mapper/gluster_vgj-gluster_lvj  9,0T  887G    8,2T  10% /gluster/mnt9
> /dev/mapper/gluster_vgb-gluster_lvb  9,0T  807G    8,3T   9% /gluster/mnt1
> /dev/mapper/gluster_vgh-gluster_lvh  9,0T  887G    8,2T  10% /gluster/mnt7
> /dev/mapper/gluster_vgf-gluster_lvf  9,0T  807G    8,3T   9% /gluster/mnt5
> /dev/mapper/gluster_vgi-gluster_lvi  9,0T  887G    8,2T  10% /gluster/mnt8
> /dev/mapper/gluster_vgl-gluster_lvl  9,0T  887G    8,2T  10% /gluster/mnt11
> /dev/mapper/gluster_vgk-gluster_lvk  9,0T  887G    8,2T  10% /gluster/mnt10
> /dev/mapper/gluster_vgm-gluster_lvm  9,0T  887G    8,2T  10% /gluster/mnt12
> tmpfs                                6,3G     0    6,3G   0% /run/user/0
> 
> [root at s01 ~]# df -h
> Filesystem                           Size  Used   Avail Use% Mounted on
> /dev/mapper/cl_s01-root              100G  5,3G     95G   6% /
> devtmpfs                              32G     0     32G   0% /dev
> tmpfs                                 32G   39M     32G   1% /dev/shm
> tmpfs                                 32G   26M     32G   1% /run
> tmpfs                                 32G     0     32G   0% /sys/fs/cgroup
> /dev/mapper/cl_s01-var               100G   11G     90G  11% /var
> /dev/md127                          1015M  151M    865M  15% /boot
> /dev/mapper/cl_s01-gluster           100G   33M    100G   1% /gluster
> /dev/mapper/gluster_vgi-gluster_lvi  9,0T  5,5T    3,6T  61% /gluster/mnt7
> /dev/mapper/gluster_vgm-gluster_lvm  9,0T  5,4T    3,6T  61% /gluster/mnt11
> /dev/mapper/gluster_vgf-gluster_lvf  9,0T  5,7T    3,4T  63% /gluster/mnt4
> /dev/mapper/gluster_vgl-gluster_lvl  9,0T  5,8T    3,3T  64% /gluster/mnt10
> /dev/mapper/gluster_vgj-gluster_lvj  9,0T  5,5T    3,6T  61% /gluster/mnt8
> /dev/mapper/gluster_vgn-gluster_lvn  9,0T  5,4T    3,6T  61% /gluster/mnt12
> /dev/mapper/gluster_vgk-gluster_lvk  9,0T  5,8T    3,3T  64% /gluster/mnt9
> /dev/mapper/gluster_vgh-gluster_lvh  9,0T  5,6T    3,5T  63% /gluster/mnt6
> /dev/mapper/gluster_vgg-gluster_lvg  9,0T  5,6T    3,5T  63% /gluster/mnt5
> /dev/mapper/gluster_vge-gluster_lve  9,0T  5,7T    3,4T  63% /gluster/mnt3
> /dev/mapper/gluster_vgc-gluster_lvc  9,0T  5,6T    3,5T  62% /gluster/mnt1
> /dev/mapper/gluster_vgd-gluster_lvd  9,0T  5,6T    3,5T  62% /gluster/mnt2
> tmpfs                                6,3G     0    6,3G   0% /run/user/0
> s01-stg:tier2                        420T  159T    262T  38% /tier2
> 
> As you can see, the used space on each brick of the newly added servers is about 800GB.
> 
> Thank you,
> Mauro
> 
> 
> 
> 
> 
> 
> 
> 
> On 26 Sep 2018, at 14:51, Ashish Pandey <aspandey at redhat.com> wrote:
> 
> Hi Mauro,
> 
> The rebalance and brick logs should be the first thing we go through.
> 
> There is a procedure to correct the configuration/setup, but in your current situation it is difficult to follow.
> You should have added the bricks hosted on s04-stg, s05-stg and s06-stg the same way as in the previous configuration.
> That means 2 bricks on each node for one subvolume.
> The procedure will require a lot of replace-brick operations, which will in turn need healing. In addition to that, we have to wait for the rebalance to complete.
> 
> I would suggest that, if not all of the data has been rebalanced and you can stop the rebalance and remove these newly added bricks properly, you should remove them.
> After that, add these bricks again so that each subvolume has 2 bricks on each of the 3 newly added nodes.
> 
> Yes, it is like undoing the whole effort, but it is better to do it now than to face issues in the future, when it will be almost impossible to correct these things if you have lots of data.
> 
> ---
> Ashish
> 
> 
> 
> From: "Mauro Tridici" <mauro.tridici at cmcc.it>
> To: "Ashish Pandey" <aspandey at redhat.com>
> Cc: "gluster-users" <gluster-users at gluster.org>
> Sent: Wednesday, September 26, 2018 5:55:02 PM
> Subject: Re: [Gluster-users] Rebalance failed on Distributed Disperse volume based on 3.12.14 version
> 
> 
> Dear Ashish,
> 
> thank you for your answer.
> I can provide you with the entire log files for glusterd, glusterfsd and rebalance.
> Could you please indicate which one you need first?
> 
> Yes, we added the last 36 bricks after creating the volume. Is there a procedure to correct this error? Is it still possible to do it?
> 
> Many thanks,
> Mauro
> 
> On 26 Sep 2018, at 14:13, Ashish Pandey <aspandey at redhat.com> wrote:
> 
> 
> I think we don't have enough logs to debug this, so I would suggest that you provide more logs/info.
> I have also observed that the configuration and setup of your volume is not very efficient.
> 
> For example: 
> Brick37: s04-stg:/gluster/mnt1/brick
> Brick38: s04-stg:/gluster/mnt2/brick
> Brick39: s04-stg:/gluster/mnt3/brick
> Brick40: s04-stg:/gluster/mnt4/brick
> Brick41: s04-stg:/gluster/mnt5/brick
> Brick42: s04-stg:/gluster/mnt6/brick
> Brick43: s04-stg:/gluster/mnt7/brick
> Brick44: s04-stg:/gluster/mnt8/brick
> Brick45: s04-stg:/gluster/mnt9/brick
> Brick46: s04-stg:/gluster/mnt10/brick
> Brick47: s04-stg:/gluster/mnt11/brick
> Brick48: s04-stg:/gluster/mnt12/brick
> 
> These 12 bricks are on the same node, so each subvolume made up of these bricks sits entirely on a single node, which is not good. The same is true for the bricks hosted on s05-stg and s06-stg.
> I think you added these bricks after creating the volume. The probability of a disruption in the connection to these bricks will be higher in this case.
> 
> ---
> Ashish
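
A minimal Python sketch of the layout point above. It assumes that Gluster builds each disperse subvolume from consecutive bricks in the listed order, six at a time for a 4 + 2 volume; the helper name `subvolume_hosts` is hypothetical and is not part of any Gluster tool.

```python
from collections import Counter


def subvolume_hosts(bricks, width=6):
    """Group bricks into consecutive sets of `width` and count bricks per host."""
    groups = [bricks[i:i + width] for i in range(0, len(bricks), width)]
    return [Counter(brick.split(":", 1)[0] for brick in group) for group in groups]


# Bricks 1-6 of the original layout: hosts alternate, two bricks per host.
old = [f"s0{h}-stg:/gluster/mnt{m}/brick" for m in (1, 2) for h in (1, 2, 3)]
# Bricks 37-42 as added later: all six on s04-stg.
new = [f"s04-stg:/gluster/mnt{m}/brick" for m in range(1, 7)]

for name, bricks in (("old", old), ("new", new)):
    for hosts in subvolume_hosts(bricks):
        worst = max(hosts.values())
        # A 4 + 2 subvolume tolerates at most 2 missing bricks.
        print(name, dict(hosts), "survives one host down:", worst <= 2)
```

Under that assumption, losing one of the original hosts removes only 2 bricks from a subvolume, while losing s04-stg removes all 6 bricks of a subvolume at once.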
> 
> From: "Mauro Tridici" <mauro.tridici at cmcc.it>
> To: "gluster-users" <gluster-users at gluster.org>
> Sent: Wednesday, September 26, 2018 3:38:35 PM
> Subject: [Gluster-users] Rebalance failed on Distributed Disperse volume based on 3.12.14 version
> 
> Dear All, Dear Nithya,
> 
> after upgrading from version 3.10.5 to 3.12.14, I tried to start a rebalance process to distribute data across the bricks, but something went wrong.
> The rebalance failed on different nodes, and the estimated time needed to complete the procedure seems to be very high.
> 
> [root at s01 ~]# gluster volume rebalance tier2 status
>      Node  Rebalanced-files      size  scanned  failures  skipped       status  run time in h:m:s
> ---------  ----------------  --------  -------  --------  -------  -----------  -----------------
> localhost                19   161.6GB      537         2        2  in progress            0:32:23
>   s02-stg                25   212.7GB      526         5        2  in progress            0:32:25
>   s03-stg                 4    69.1GB      511         0        0  in progress            0:32:25
>   s04-stg                 4  484Bytes    12283         0        3  in progress            0:32:25
>   s05-stg                23  484Bytes    11049         0       10  in progress            0:32:25
>   s06-stg                 3     1.2GB     8032        11        3       failed            0:17:57
> Estimated time left for rebalance to complete :     3601:05:41
> volume rebalance: tier2: success
> 
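
For scale, the estimate printed in the status output above can be converted to days. A small Python sketch; the function name `hms_to_days` is illustrative only.

```python
def hms_to_days(hms: str) -> float:
    """Convert an 'h:m:s' duration, as printed by rebalance status, to days."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return (hours * 3600 + minutes * 60 + seconds) / 86400


print(f"{hms_to_days('3601:05:41'):.1f} days")
```

The estimated 3601 hours correspond to roughly 150 days.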
> When the rebalance processes fail, I can see the following kinds of errors in /var/log/glusterfs/tier2-rebalance.log:
> 
> Error type 1)
> 
> [2018-09-26 08:50:19.872575] W [MSGID: 122053] [ec-common.c:269:ec_check_status] 0-tier2-disperse-10: Operation failed on 2 of 6 subvolumes.(up=111111, mask=100111, remaining=000000, good=100111, bad=011000)
> [2018-09-26 08:50:19.901792] W [MSGID: 122053] [ec-common.c:269:ec_check_status] 0-tier2-disperse-11: Operation failed on 1 of 6 subvolumes.(up=111111, mask=111101, remaining=000000, good=111101, bad=000010)
> 
> Error type 2)
> 
> [2018-09-26 08:53:31.566836] W [socket.c:600:__socket_rwv] 0-tier2-client-53: readv on 192.168.0.55:49153 failed (Connection reset by peer)
> 
> Error type 3)
> 
> [2018-09-26 08:57:37.852590] W [MSGID: 122035] [ec-common.c:571:ec_child_select] 0-tier2-disperse-9: Executing operation with some subvolumes unavailable (10)
> [2018-09-26 08:57:39.282306] W [MSGID: 122035] [ec-common.c:571:ec_child_select] 0-tier2-disperse-9: Executing operation with some subvolumes unavailable (10)
> [2018-09-26 09:02:04.928408] W [MSGID: 109023] [dht-rebalance.c:1013:__dht_check_free_space] 0-tier2-dht: data movement of file {blocks:0 name:(/OPA/archive/historical/dts/MREA/Observations/Observations/MREA14/Cs-1/CMCC/raw/CS013.ext)} would result in dst node (tier2-disperse-5:2440190848) having lower disk space than the source node (tier2-disperse-11:71373083776).Skipping file.
> 
> Error type 4)
> 
> W [rpc-clnt-ping.c:223:rpc_clnt_ping_cbk] 0-tier2-client-7: socket disconnected
> 
> Error type 5)
> 
> [2018-09-26 09:07:42.333720] W [glusterfsd.c:1375:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7e25) [0x7f0417e0ee25] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x5590086004b5] -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x55900860032b] ) 0-: received signum (15), shutting down
> 
> Error type 6)
> 
> [2018-09-25 08:09:18.340658] C [rpc-clnt-ping.c:166:rpc_clnt_ping_timer_expired] 0-tier2-client-4: server 192.168.0.52:49153 has not responded in the last 42 seconds, disconnecting.
> 
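
The ec_check_status warnings under error type 1 above report per-brick state as bit strings. A minimal Python sketch for reading them follows. It only counts bits and compares the count with the 4 + 2 redundancy, and it deliberately makes no assumption about which bit position maps to which brick. The function name `summarize_ec_masks` is hypothetical.

```python
import re


def summarize_ec_masks(line: str, redundancy: int = 2) -> dict:
    """Extract the up/mask/good/bad bit strings from an ec_check_status message."""
    fields = dict(re.findall(r"(up|mask|remaining|good|bad)=([01]+)", line))
    failed = fields["bad"].count("1")
    return {
        "bricks_up": fields["up"].count("1"),
        "failed": failed,
        # Assumption: a 4 + 2 subvolume tolerates up to `redundancy` failed bricks.
        "within_redundancy": failed <= redundancy,
    }


msg = ("Operation failed on 2 of 6 subvolumes.(up=111111, mask=100111, "
       "remaining=000000, good=100111, bad=011000)")
print(summarize_ec_masks(msg))
```

For the first warning this reports all 6 bricks up and 2 failed operations, which is the maximum a 4 + 2 subvolume can absorb.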
> It seems that there are some network or timeout problems, but the network usage/traffic values are not that high.
> Do you think that I have to modify some volume options related to threads and/or network parameters in my volume configuration?
> Could you please help me understand the cause of the problems above?
> 
> You can find below our volume info:
> (the volume is implemented on 6 servers; each server's configuration: 2 CPUs with 10 cores each, 64GB RAM, 1 SSD dedicated to the OS, 12 x 10TB HDDs)
> 
> [root at s04 ~]# gluster vol info
>  
> Volume Name: tier2
> Type: Distributed-Disperse
> Volume ID: a28d88c5-3295-4e35-98d4-210b3af9358c
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 12 x (4 + 2) = 72
> Transport-type: tcp
> Bricks:
> Brick1: s01-stg:/gluster/mnt1/brick
> Brick2: s02-stg:/gluster/mnt1/brick
> Brick3: s03-stg:/gluster/mnt1/brick
> Brick4: s01-stg:/gluster/mnt2/brick
> Brick5: s02-stg:/gluster/mnt2/brick
> Brick6: s03-stg:/gluster/mnt2/brick
> Brick7: s01-stg:/gluster/mnt3/brick
> Brick8: s02-stg:/gluster/mnt3/brick
> Brick9: s03-stg:/gluster/mnt3/brick
> Brick10: s01-stg:/gluster/mnt4/brick
> Brick11: s02-stg:/gluster/mnt4/brick
> Brick12: s03-stg:/gluster/mnt4/brick
> Brick13: s01-stg:/gluster/mnt5/brick
> Brick14: s02-stg:/gluster/mnt5/brick
> Brick15: s03-stg:/gluster/mnt5/brick
> Brick16: s01-stg:/gluster/mnt6/brick
> Brick17: s02-stg:/gluster/mnt6/brick
> Brick18: s03-stg:/gluster/mnt6/brick
> Brick19: s01-stg:/gluster/mnt7/brick
> Brick20: s02-stg:/gluster/mnt7/brick
> Brick21: s03-stg:/gluster/mnt7/brick
> Brick22: s01-stg:/gluster/mnt8/brick
> Brick23: s02-stg:/gluster/mnt8/brick
> Brick24: s03-stg:/gluster/mnt8/brick
> Brick25: s01-stg:/gluster/mnt9/brick
> Brick26: s02-stg:/gluster/mnt9/brick
> Brick27: s03-stg:/gluster/mnt9/brick
> Brick28: s01-stg:/gluster/mnt10/brick
> Brick29: s02-stg:/gluster/mnt10/brick
> Brick30: s03-stg:/gluster/mnt10/brick
> Brick31: s01-stg:/gluster/mnt11/brick
> Brick32: s02-stg:/gluster/mnt11/brick
> Brick33: s03-stg:/gluster/mnt11/brick
> Brick34: s01-stg:/gluster/mnt12/brick
> Brick35: s02-stg:/gluster/mnt12/brick
> Brick36: s03-stg:/gluster/mnt12/brick
> Brick37: s04-stg:/gluster/mnt1/brick
> Brick38: s04-stg:/gluster/mnt2/brick
> Brick39: s04-stg:/gluster/mnt3/brick
> Brick40: s04-stg:/gluster/mnt4/brick
> Brick41: s04-stg:/gluster/mnt5/brick
> Brick42: s04-stg:/gluster/mnt6/brick
> Brick43: s04-stg:/gluster/mnt7/brick
> Brick44: s04-stg:/gluster/mnt8/brick
> Brick45: s04-stg:/gluster/mnt9/brick
> Brick46: s04-stg:/gluster/mnt10/brick
> Brick47: s04-stg:/gluster/mnt11/brick
> Brick48: s04-stg:/gluster/mnt12/brick
> Brick49: s05-stg:/gluster/mnt1/brick
> Brick50: s05-stg:/gluster/mnt2/brick
> Brick51: s05-stg:/gluster/mnt3/brick
> Brick52: s05-stg:/gluster/mnt4/brick
> Brick53: s05-stg:/gluster/mnt5/brick
> Brick54: s05-stg:/gluster/mnt6/brick
> Brick55: s05-stg:/gluster/mnt7/brick
> Brick56: s05-stg:/gluster/mnt8/brick
> Brick57: s05-stg:/gluster/mnt9/brick
> Brick58: s05-stg:/gluster/mnt10/brick
> Brick59: s05-stg:/gluster/mnt11/brick
> Brick60: s05-stg:/gluster/mnt12/brick
> Brick61: s06-stg:/gluster/mnt1/brick
> Brick62: s06-stg:/gluster/mnt2/brick
> Brick63: s06-stg:/gluster/mnt3/brick
> Brick64: s06-stg:/gluster/mnt4/brick
> Brick65: s06-stg:/gluster/mnt5/brick
> Brick66: s06-stg:/gluster/mnt6/brick
> Brick67: s06-stg:/gluster/mnt7/brick
> Brick68: s06-stg:/gluster/mnt8/brick
> Brick69: s06-stg:/gluster/mnt9/brick
> Brick70: s06-stg:/gluster/mnt10/brick
> Brick71: s06-stg:/gluster/mnt11/brick
> Brick72: s06-stg:/gluster/mnt12/brick
> Options Reconfigured:
> network.ping-timeout: 60
> diagnostics.count-fop-hits: on
> diagnostics.latency-measurement: on
> cluster.server-quorum-type: server
> features.default-soft-limit: 90
> features.quota-deem-statfs: on
> performance.io-thread-count: 16
> disperse.cpu-extensions: auto
> performance.io-cache: off
> network.inode-lru-limit: 50000
> performance.md-cache-timeout: 600
> performance.cache-invalidation: on
> performance.stat-prefetch: on
> features.cache-invalidation-timeout: 600
> features.cache-invalidation: on
> cluster.readdir-optimize: on
> performance.parallel-readdir: off
> performance.readdir-ahead: on
> cluster.lookup-optimize: on
> client.event-threads: 4
> server.event-threads: 4
> nfs.disable: on
> transport.address-family: inet
> cluster.quorum-type: auto
> cluster.min-free-disk: 10
> performance.client-io-threads: on
> features.quota: on
> features.inode-quota: on
> features.bitrot: on
> features.scrub: Active
> cluster.brick-multiplex: on
> cluster.server-quorum-ratio: 51%
> 
> If it helps, I paste here the output of the "free -m" command executed on the cluster nodes.
> 
> The result is almost the same on every node. In your opinion, is the available RAM enough to support the data movement?
> 
> [root at s06 ~]# free -m
>               total        used        free      shared  buff/cache   available
> Mem:          64309       10409         464          15       53434       52998
> Swap:         65535         103       65432
> 
> Thank you in advance.
> Sorry for my long message, but I'm trying to give you all the available information.
> 
> Regards,
> Mauro
> 
> 
> 
> 
> 
> 
> 
> 
> 
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
> 
> 
> 
> 

-------------------------
Mauro Tridici

Fondazione CMCC
CMCC Supercomputing Center
presso Complesso Ecotekne - Università del Salento -
Strada Prov.le Lecce - Monteroni sn
73100 Lecce  IT
http://www.cmcc.it

mobile: (+39) 327 5630841
email: mauro.tridici at cmcc.it
https://it.linkedin.com/in/mauro-tridici-5977238b


Mauro Tridici

2018-Sep-27 10:33 UTC

[Gluster-users] Rebalance failed on Distributed Disperse volume based on 3.12.14 version

Dear Ashish,

I hope I am not disturbing you too much, but I would like to ask whether you have had some time to dedicate to our problem.
Please forgive my insistence.

Thank you in advance,
Mauro
> Il giorno 26 set 2018, alle ore 19:56, Mauro Tridici <mauro.tridici at
cmcc.it> ha scritto:
> 
> Hi Ashish,
> 
> sure, no problem! We are a little bit worried, but we can wait  :-)
> Thank you very much for your support and your availability.
> 
> Regards,
> Mauro
> 
> 
>> Il giorno 26 set 2018, alle ore 19:33, Ashish Pandey <aspandey at
redhat.com <mailto:aspandey at redhat.com>> ha scritto:
>> 
>> Hi Mauro,
>> 
>> Yes, I can provide you step by step procedure to correct it. 
>> Is it fine If i provide you the steps tomorrow as it is quite late over
here and I don't want to miss anything in hurry?
>> 
>> ---
>> Ashish
>> 
>> From: "Mauro Tridici" <mauro.tridici at cmcc.it
<mailto:mauro.tridici at cmcc.it>>
>> To: "Ashish Pandey" <aspandey at redhat.com
<mailto:aspandey at redhat.com>>
>> Cc: "gluster-users" <gluster-users at gluster.org
<mailto:gluster-users at gluster.org>>
>> Sent: Wednesday, September 26, 2018 6:54:19 PM
>> Subject: Re: [Gluster-users] Rebalance failed on Distributed Disperse
volume        based on 3.12.14 version
>> 
>> 
>> Hi Ashish,
>> 
>> in attachment you can find the rebalance log file and the last updated
brick log file (the other files in /var/log/glusterfs/bricks directory seem to
be too old).
>> I just stopped the running rebalance (as you can see at the bottom of
the rebalance log file).
>> So, if exists a safe procedure to correct the problem I would like
execute it.
>> 
>> I don?t know if I can ask you it, but, if it is possible, could you
please describe me step by step the right procedure to remove the newly added
bricks without losing the data that have been already rebalanced?
>> 
>> The following outputs show the result of ?df -h? command executed on
one of the first 3 nodes (s01, s02, s03) already existing  and on one of the
last 3 nodes (s04, s05, s06) added recently.
>> 
>> [root at s06 bricks]# df -h
>> File system                          Dim. Usati Dispon. Uso% Montato su
>> /dev/mapper/cl_s06-root              100G  2,1G     98G   3% /
>> devtmpfs                              32G     0     32G   0% /dev
>> tmpfs                                 32G  4,0K     32G   1% /dev/shm
>> tmpfs                                 32G   26M     32G   1% /run
>> tmpfs                                 32G     0     32G   0%
/sys/fs/cgroup
>> /dev/mapper/cl_s06-var               100G  2,0G     99G   2% /var
>> /dev/mapper/cl_s06-gluster           100G   33M    100G   1% /gluster
>> /dev/sda1                           1014M  152M    863M  15% /boot
>> /dev/mapper/gluster_vgd-gluster_lvd  9,0T  807G    8,3T   9%
/gluster/mnt3
>> /dev/mapper/gluster_vgg-gluster_lvg  9,0T  807G    8,3T   9%
/gluster/mnt6
>> /dev/mapper/gluster_vgc-gluster_lvc  9,0T  807G    8,3T   9%
/gluster/mnt2
>> /dev/mapper/gluster_vge-gluster_lve  9,0T  807G    8,3T   9%
/gluster/mnt4
>> /dev/mapper/gluster_vgj-gluster_lvj  9,0T  887G    8,2T  10%
/gluster/mnt9
>> /dev/mapper/gluster_vgb-gluster_lvb  9,0T  807G    8,3T   9%
/gluster/mnt1
>> /dev/mapper/gluster_vgh-gluster_lvh  9,0T  887G    8,2T  10%
/gluster/mnt7
>> /dev/mapper/gluster_vgf-gluster_lvf  9,0T  807G    8,3T   9%
/gluster/mnt5
>> /dev/mapper/gluster_vgi-gluster_lvi  9,0T  887G    8,2T  10%
/gluster/mnt8
>> /dev/mapper/gluster_vgl-gluster_lvl  9,0T  887G    8,2T  10%
/gluster/mnt11
>> /dev/mapper/gluster_vgk-gluster_lvk  9,0T  887G    8,2T  10%
/gluster/mnt10
>> /dev/mapper/gluster_vgm-gluster_lvm  9,0T  887G    8,2T  10%
/gluster/mnt12
>> tmpfs                                6,3G     0    6,3G   0%
/run/user/0
>> 
>> [root at s01 ~]# df -h
>> File system                          Dim. Usati Dispon. Uso% Montato su
>> /dev/mapper/cl_s01-root              100G  5,3G     95G   6% /
>> devtmpfs                              32G     0     32G   0% /dev
>> tmpfs                                 32G   39M     32G   1% /dev/shm
>> tmpfs                                 32G   26M     32G   1% /run
>> tmpfs                                 32G     0     32G   0%
/sys/fs/cgroup
>> /dev/mapper/cl_s01-var               100G   11G     90G  11% /var
>> /dev/md127                          1015M  151M    865M  15% /boot
>> /dev/mapper/cl_s01-gluster           100G   33M    100G   1% /gluster
>> /dev/mapper/gluster_vgi-gluster_lvi  9,0T  5,5T    3,6T  61%
/gluster/mnt7
>> /dev/mapper/gluster_vgm-gluster_lvm  9,0T  5,4T    3,6T  61%
/gluster/mnt11
>> /dev/mapper/gluster_vgf-gluster_lvf  9,0T  5,7T    3,4T  63%
/gluster/mnt4
>> /dev/mapper/gluster_vgl-gluster_lvl  9,0T  5,8T    3,3T  64%
/gluster/mnt10
>> /dev/mapper/gluster_vgj-gluster_lvj  9,0T  5,5T    3,6T  61%
/gluster/mnt8
>> /dev/mapper/gluster_vgn-gluster_lvn  9,0T  5,4T    3,6T  61%
/gluster/mnt12
>> /dev/mapper/gluster_vgk-gluster_lvk  9,0T  5,8T    3,3T  64%
/gluster/mnt9
>> /dev/mapper/gluster_vgh-gluster_lvh  9,0T  5,6T    3,5T  63%
/gluster/mnt6
>> /dev/mapper/gluster_vgg-gluster_lvg  9,0T  5,6T    3,5T  63%
/gluster/mnt5
>> /dev/mapper/gluster_vge-gluster_lve  9,0T  5,7T    3,4T  63%
/gluster/mnt3
>> /dev/mapper/gluster_vgc-gluster_lvc  9,0T  5,6T    3,5T  62%
/gluster/mnt1
>> /dev/mapper/gluster_vgd-gluster_lvd  9,0T  5,6T    3,5T  62%
/gluster/mnt2
>> tmpfs                                6,3G     0    6,3G   0%
/run/user/0
>> s01-stg:tier2                        420T  159T    262T  38% /tier2
>> 
>> As you can see, used space value of each brick of the last servers is
about 800GB.
>> 
>> Thank you,
>> Mauro
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> Il giorno 26 set 2018, alle ore 14:51, Ashish Pandey <aspandey at
redhat.com <mailto:aspandey at redhat.com>> ha scritto:
>> 
>> Hi Mauro,
>> 
>> rebalance and brick logs should be the first thing we should go
through.
>> 
>> There is a procedure to correct the configuration/setup but the
situation you are in is difficult to follow that procedure.
>> You should have added the bricks hosted on s04-stg, s05-stg and s06-stg
the same way you had the previous configuration.
>> That means 2 bricks on each node for one subvolume.
>> The procedure will require a lot of replace bricks which will again
need healing and all. In addition to that we have to wait for re-balance to
complete.
>> 
>> I would suggest that if whole data has not been rebalanced and if you
can stop the rebalance and remove these newly added bricks properly then you
should remove these newly added bricks.
>> After that, add these bricks so that you have 2 bricks of each volume
on 3 newly added nodes.
>> 
>> Yes, it is like undoing whole effort but it is better to do it now then
facing issues in future when it will be almost impossible to correct these
things if you have lots of data.
>> 
>> ---
>> Ashish
>> 
>> 
>> 
>> From: "Mauro Tridici" <mauro.tridici at cmcc.it
<mailto:mauro.tridici at cmcc.it>>
>> To: "Ashish Pandey" <aspandey at redhat.com
<mailto:aspandey at redhat.com>>
>> Cc: "gluster-users" <gluster-users at gluster.org
<mailto:gluster-users at gluster.org>>
>> Sent: Wednesday, September 26, 2018 5:55:02 PM
>> Subject: Re: [Gluster-users] Rebalance failed on Distributed Disperse
volume        based on 3.12.14 version
>> 
>> 
>> Dear Ashish,
>> 
>> thank you for you answer.
>> I could provide you the entire log file related to glusterd, glusterfsd
and rebalance.
>> Please, could you indicate which one you need first?
>> 
>> Yes, we added the last 36 bricks after creating vol. Is there a
procedure to correct this error? Is it still possible to do it?
>> 
>> Many thanks,
>> Mauro
>> 
>> Il giorno 26 set 2018, alle ore 14:13, Ashish Pandey <aspandey at
redhat.com <mailto:aspandey at redhat.com>> ha scritto:
>> 
>> 
>> I think we don't have enough logs to debug this so I would suggest
you to provide more logs/info.
>> I have also observed that the configuration and setup of your volume is
not very efficient.
>> 
>> For example: 
>> Brick37: s04-stg:/gluster/mnt1/brick
>> Brick38: s04-stg:/gluster/mnt2/brick
>> Brick39: s04-stg:/gluster/mnt3/brick
>> Brick40: s04-stg:/gluster/mnt4/brick
>> Brick41: s04-stg:/gluster/mnt5/brick
>> Brick42: s04-stg:/gluster/mnt6/brick
>> Brick43: s04-stg:/gluster/mnt7/brick
>> Brick44: s04-stg:/gluster/mnt8/brick
>> Brick45: s04-stg:/gluster/mnt9/brick
>> Brick46: s04-stg:/gluster/mnt10/brick
>> Brick47: s04-stg:/gluster/mnt11/brick
>> Brick48: s04-stg:/gluster/mnt12/brick
>> 
>> These 12 bricks are on same node and the sub volume made up of these
bricks will be of same subvolume, which is not good. Same is true for the bricks
hosted on s05-stg and s06-stg
>> I think you have added these bricks after creating vol. The probability
of disruption in connection of these bricks will be higher in this case.
>> 
>> ---
>> Ashish
>> 
>> From: "Mauro Tridici" <mauro.tridici at cmcc.it
<mailto:mauro.tridici at cmcc.it>>
>> To: "gluster-users" <gluster-users at gluster.org
<mailto:gluster-users at gluster.org>>
>> Sent: Wednesday, September 26, 2018 3:38:35 PM
>> Subject: [Gluster-users] Rebalance failed on Distributed Disperse
volume        based on 3.12.14 version
>> 
>> Dear All, Dear Nithya,
>> 
>> after upgrading from 3.10.5 version to 3.12.14, I tried to start a
rebalance process to distribute data across the bricks, but something goes
wrong.
>> Rebalance failed on different nodes and the time value needed to
complete the procedure seems to be very high.
>> 
>> [root at s01 ~]# gluster volume rebalance tier2 status
>>                                     Node Rebalanced-files          size
scanned      failures       skipped               status  run time in h:m:s
>>                                ---------      -----------   -----------
-----------   -----------   -----------         ------------     --------------
>>                                localhost               19       161.6GB
537             2             2          in progress        0:32:23
>>                                  s02-stg               25       212.7GB
526             5             2          in progress        0:32:25
>>                                  s03-stg                4        69.1GB
511             0             0          in progress        0:32:25
>>                                  s04-stg                4      484Bytes
12283             0             3          in progress        0:32:25
>>                                  s05-stg               23      484Bytes
11049             0            10          in progress        0:32:25
>>                                  s06-stg                3         1.2GB
8032            11             3               failed        0:17:57
>> Estimated time left for rebalance to complete :     3601:05:41
>> volume rebalance: tier2: success
>> 
>> When rebalance processes fail, I can see the following kind of errors
in /var/log/glusterfs/tier2-rebalance.log
>> 
>> Error type 1)
>> 
>> [2018-09-26 08:50:19.872575] W [MSGID: 122053]
[ec-common.c:269:ec_check_status] 0-tier2-disperse-10: Operation failed on 2 of
6 subvolumes.(up=111111, mask=100111, remaining>> 000000, good=100111,
bad=011000)
>> [2018-09-26 08:50:19.901792] W [MSGID: 122053]
[ec-common.c:269:ec_check_status] 0-tier2-disperse-11: Operation failed on 1 of
6 subvolumes.(up=111111, mask=111101, remaining>> 000000, good=111101,
bad=000010)
>> 
>> Error type 2)
>> 
>> [2018-09-26 08:53:31.566836] W [socket.c:600:__socket_rwv]
0-tier2-client-53: readv on 192.168.0.55:49153 failed (Connection reset by peer)
>> 
>> Error type 3)
>> 
>> [2018-09-26 08:57:37.852590] W [MSGID: 122035]
[ec-common.c:571:ec_child_select] 0-tier2-disperse-9: Executing operation with
some subvolumes unavailable (10)
>> [2018-09-26 08:57:39.282306] W [MSGID: 122035]
[ec-common.c:571:ec_child_select] 0-tier2-disperse-9: Executing operation with
some subvolumes unavailable (10)
>> [2018-09-26 09:02:04.928408] W [MSGID: 109023]
[dht-rebalance.c:1013:__dht_check_free_space] 0-tier2-dht: data movement of file
{blocks:0 name:(/OPA/archive/historical/dts/MRE
>> A/Observations/Observations/MREA14/Cs-1/CMCC/raw/CS013.ext)} would
result in dst node (tier2-disperse-5:2440190848) having lower disk space than
the source node (tier2-dispers
>> e-11:71373083776).Skipping file.
>> 
>> Error type 4)
>> 
>> W [rpc-clnt-ping.c:223:rpc_clnt_ping_cbk] 0-tier2-client-7: socket
disconnected
>> 
>> Error type 5)
>> 
>> [2018-09-26 09:07:42.333720] W [glusterfsd.c:1375:cleanup_and_exit]
(-->/lib64/libpthread.so.0(+0x7e25) [0x7f0417e0ee25]
-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x55
>> 90086004b5] -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b)
[0x55900860032b] ) 0-: received signum (15), shutting down
>> 
>> Error type 6)
>> 
>> [2018-09-25 08:09:18.340658] C [rpc-clnt-ping.c:166:rpc_clnt_ping_timer_expired] 0-tier2-client-4: server 192.168.0.52:49153 has not responded in the last 42 seconds, disconnecting.
>> 
>> There seem to be some network or timeout problems, but the network usage and traffic values are not particularly high.
>> Do you think I need to modify some volume options related to threads and/or network parameters in my volume configuration?
>> Could you please help me understand the cause of the problems above?
>> 
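To see which of the error types above dominate and which clients or subvolumes they involve, the log can be tallied by log level, emitting function, and translator name. This is a rough sketch that assumes each log entry sits on a single line in the "[timestamp] LEVEL [MSGID: n] [file:line:function] name: message" layout seen in the excerpts; the function name summarize is hypothetical, and lines that do not fit the layout (such as the cleanup_and_exit backtrace) are simply ignored.

```python
import re
from collections import Counter

LOG_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"(?:\[MSGID: (?P<msgid>\d+)\] )?"
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\] "
    r"(?P<name>\S+): "
)


def summarize(lines):
    """Count log entries per (level, function) and per translator name."""
    by_func = Counter()
    by_name = Counter()
    for line in lines:
        m = LOG_RE.match(line)
        if m is None:
            continue  # not in the assumed layout
        by_func[(m["level"], m["func"])] += 1
        by_name[m["name"]] += 1
    return by_func, by_name


sample = [
    "[2018-09-26 08:50:19.872575] W [MSGID: 122053] [ec-common.c:269:ec_check_status] "
    "0-tier2-disperse-10: Operation failed on 2 of 6 subvolumes.",
    "[2018-09-26 08:53:31.566836] W [socket.c:600:__socket_rwv] "
    "0-tier2-client-53: readv on 192.168.0.55:49153 failed (Connection reset by peer)",
    "[2018-09-25 08:09:18.340658] C [rpc-clnt-ping.c:166:rpc_clnt_ping_timer_expired] "
    "0-tier2-client-4: server 192.168.0.52:49153 has not responded in the last 42 seconds, disconnecting.",
]
by_func, by_name = summarize(sample)
for key, count in by_func.most_common():
    print(key, count)
```

On the real log, the same function can be fed an open file object for /var/log/glusterfs/tier2-rebalance.log instead of the sample list.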
>> You can find our volume info below.
>> (The volume is implemented on 6 servers; each server has 2 x 10-core CPUs, 64 GB RAM, 1 SSD dedicated to the OS, and 12 x 10 TB HDs.)
>> 
>> [root@s04 ~]# gluster vol info
>>  
>> Volume Name: tier2
>> Type: Distributed-Disperse
>> Volume ID: a28d88c5-3295-4e35-98d4-210b3af9358c
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 12 x (4 + 2) = 72
>> Transport-type: tcp
>> Bricks:
>> Brick1: s01-stg:/gluster/mnt1/brick
>> Brick2: s02-stg:/gluster/mnt1/brick
>> Brick3: s03-stg:/gluster/mnt1/brick
>> Brick4: s01-stg:/gluster/mnt2/brick
>> Brick5: s02-stg:/gluster/mnt2/brick
>> Brick6: s03-stg:/gluster/mnt2/brick
>> Brick7: s01-stg:/gluster/mnt3/brick
>> Brick8: s02-stg:/gluster/mnt3/brick
>> Brick9: s03-stg:/gluster/mnt3/brick
>> Brick10: s01-stg:/gluster/mnt4/brick
>> Brick11: s02-stg:/gluster/mnt4/brick
>> Brick12: s03-stg:/gluster/mnt4/brick
>> Brick13: s01-stg:/gluster/mnt5/brick
>> Brick14: s02-stg:/gluster/mnt5/brick
>> Brick15: s03-stg:/gluster/mnt5/brick
>> Brick16: s01-stg:/gluster/mnt6/brick
>> Brick17: s02-stg:/gluster/mnt6/brick
>> Brick18: s03-stg:/gluster/mnt6/brick
>> Brick19: s01-stg:/gluster/mnt7/brick
>> Brick20: s02-stg:/gluster/mnt7/brick
>> Brick21: s03-stg:/gluster/mnt7/brick
>> Brick22: s01-stg:/gluster/mnt8/brick
>> Brick23: s02-stg:/gluster/mnt8/brick
>> Brick24: s03-stg:/gluster/mnt8/brick
>> Brick25: s01-stg:/gluster/mnt9/brick
>> Brick26: s02-stg:/gluster/mnt9/brick
>> Brick27: s03-stg:/gluster/mnt9/brick
>> Brick28: s01-stg:/gluster/mnt10/brick
>> Brick29: s02-stg:/gluster/mnt10/brick
>> Brick30: s03-stg:/gluster/mnt10/brick
>> Brick31: s01-stg:/gluster/mnt11/brick
>> Brick32: s02-stg:/gluster/mnt11/brick
>> Brick33: s03-stg:/gluster/mnt11/brick
>> Brick34: s01-stg:/gluster/mnt12/brick
>> Brick35: s02-stg:/gluster/mnt12/brick
>> Brick36: s03-stg:/gluster/mnt12/brick
>> Brick37: s04-stg:/gluster/mnt1/brick
>> Brick38: s04-stg:/gluster/mnt2/brick
>> Brick39: s04-stg:/gluster/mnt3/brick
>> Brick40: s04-stg:/gluster/mnt4/brick
>> Brick41: s04-stg:/gluster/mnt5/brick
>> Brick42: s04-stg:/gluster/mnt6/brick
>> Brick43: s04-stg:/gluster/mnt7/brick
>> Brick44: s04-stg:/gluster/mnt8/brick
>> Brick45: s04-stg:/gluster/mnt9/brick
>> Brick46: s04-stg:/gluster/mnt10/brick
>> Brick47: s04-stg:/gluster/mnt11/brick
>> Brick48: s04-stg:/gluster/mnt12/brick
>> Brick49: s05-stg:/gluster/mnt1/brick
>> Brick50: s05-stg:/gluster/mnt2/brick
>> Brick51: s05-stg:/gluster/mnt3/brick
>> Brick52: s05-stg:/gluster/mnt4/brick
>> Brick53: s05-stg:/gluster/mnt5/brick
>> Brick54: s05-stg:/gluster/mnt6/brick
>> Brick55: s05-stg:/gluster/mnt7/brick
>> Brick56: s05-stg:/gluster/mnt8/brick
>> Brick57: s05-stg:/gluster/mnt9/brick
>> Brick58: s05-stg:/gluster/mnt10/brick
>> Brick59: s05-stg:/gluster/mnt11/brick
>> Brick60: s05-stg:/gluster/mnt12/brick
>> Brick61: s06-stg:/gluster/mnt1/brick
>> Brick62: s06-stg:/gluster/mnt2/brick
>> Brick63: s06-stg:/gluster/mnt3/brick
>> Brick64: s06-stg:/gluster/mnt4/brick
>> Brick65: s06-stg:/gluster/mnt5/brick
>> Brick66: s06-stg:/gluster/mnt6/brick
>> Brick67: s06-stg:/gluster/mnt7/brick
>> Brick68: s06-stg:/gluster/mnt8/brick
>> Brick69: s06-stg:/gluster/mnt9/brick
>> Brick70: s06-stg:/gluster/mnt10/brick
>> Brick71: s06-stg:/gluster/mnt11/brick
>> Brick72: s06-stg:/gluster/mnt12/brick
>> Options Reconfigured:
>> network.ping-timeout: 60
>> diagnostics.count-fop-hits: on
>> diagnostics.latency-measurement: on
>> cluster.server-quorum-type: server
>> features.default-soft-limit: 90
>> features.quota-deem-statfs: on
>> performance.io-thread-count: 16
>> disperse.cpu-extensions: auto
>> performance.io-cache: off
>> network.inode-lru-limit: 50000
>> performance.md-cache-timeout: 600
>> performance.cache-invalidation: on
>> performance.stat-prefetch: on
>> features.cache-invalidation-timeout: 600
>> features.cache-invalidation: on
>> cluster.readdir-optimize: on
>> performance.parallel-readdir: off
>> performance.readdir-ahead: on
>> cluster.lookup-optimize: on
>> client.event-threads: 4
>> server.event-threads: 4
>> nfs.disable: on
>> transport.address-family: inet
>> cluster.quorum-type: auto
>> cluster.min-free-disk: 10
>> performance.client-io-threads: on
>> features.quota: on
>> features.inode-quota: on
>> features.bitrot: on
>> features.scrub: Active
>> cluster.brick-multiplex: on
>> cluster.server-quorum-ratio: 51%
>> 
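The "12 x (4 + 2) = 72" line means 12 disperse sets of 6 bricks each. The sketch below rebuilds the brick list in the order printed above and groups it into sets, assuming that consecutive bricks in the listing form one set and that sets are numbered from 0 in listing order (tier2-disperse-0 to tier2-disperse-11). The function names are hypothetical. Under that assumption, sets 0 to 5 hold two bricks on each of s01, s02 and s03, while sets 6 to 11 each sit entirely on a single server (s04, s05 or s06).

```python
from collections import Counter


def build_bricks():
    """Rebuild the 72-brick list in the order shown by 'gluster vol info'."""
    bricks = []
    # Brick1-36: s01, s02, s03 interleaved for each of mnt1..mnt12.
    for mnt in range(1, 13):
        for host in ("s01-stg", "s02-stg", "s03-stg"):
            bricks.append(f"{host}:/gluster/mnt{mnt}/brick")
    # Brick37-72: all 12 mounts of s04, then s05, then s06.
    for host in ("s04-stg", "s05-stg", "s06-stg"):
        for mnt in range(1, 13):
            bricks.append(f"{host}:/gluster/mnt{mnt}/brick")
    return bricks


def hosts_per_set(bricks, set_size=6):
    """Count bricks per host in each group of set_size consecutive bricks.

    Assumption: consecutive bricks in the listing form one disperse set.
    """
    groups = [bricks[i:i + set_size] for i in range(0, len(bricks), set_size)]
    return [Counter(b.split(":")[0] for b in group) for group in groups]


bricks = build_bricks()
layout = hosts_per_set(bricks)
for index, hosts in enumerate(layout):
    print(f"tier2-disperse-{index}: {dict(hosts)}")
```

A 4 + 2 set can lose at most two bricks, so the per-host counts show directly how many bricks of each set depend on a single server.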
>> In case it helps, here is the output of the 'free -m' command, which I ran on all the cluster nodes.
>> 
>> The result is almost the same on every node. In your opinion, is the available RAM enough to support data movement?
>> 
>> [root@s06 ~]# free -m
>>               total        used        free      shared  buff/cache   available
>> Mem:          64309       10409         464          15       53434       52998
>> Swap:         65535         103       65432
>> 
>> Thank you in advance.
>> Sorry for the long message, but I'm trying to give you all the available information.
>> 
>> Regards,
>> Mauro
>> 
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-users
>> 
> 
> 
> -------------------------
> Mauro Tridici
> 
> Fondazione CMCC
> CMCC Supercomputing Center
> presso Complesso Ecotekne - Università del Salento -
> Strada Prov.le Lecce - Monteroni sn
> 73100 Lecce  IT
> http://www.cmcc.it
> 
> mobile: (+39) 327 5630841
> email: mauro.tridici at cmcc.it
> https://it.linkedin.com/in/mauro-tridici-5977238b
