Mauro Tridici
2018-Oct-03 15:48 UTC
[Gluster-users] Rebalance failed on Distributed Disperse volume based on 3.12.14 version
Hi Nithya,

to answer your question as quickly as possible, I have so far checked only the content of one brick of server s06 (the content of /gluster/mnt1/brick is attached).

[root at s06 ~]# df -h
File system                          Dim.  Usati Dispon. Uso% Montato su
/dev/mapper/cl_s06-root              100G  2,1G  98G     3%   /
devtmpfs                             32G   0     32G     0%   /dev
tmpfs                                32G   4,0K  32G     1%   /dev/shm
tmpfs                                32G   106M  32G     1%   /run
tmpfs                                32G   0     32G     0%   /sys/fs/cgroup
/dev/mapper/cl_s06-var               100G  3,0G  97G     3%   /var
/dev/mapper/cl_s06-gluster           100G  33M   100G    1%   /gluster
/dev/sda1                            1014M 152M  863M    15%  /boot
/dev/mapper/gluster_vgd-gluster_lvd  9,0T  12G   9,0T    1%   /gluster/mnt3
/dev/mapper/gluster_vgg-gluster_lvg  9,0T  12G   9,0T    1%   /gluster/mnt6
/dev/mapper/gluster_vgc-gluster_lvc  9,0T  12G   9,0T    1%   /gluster/mnt2
/dev/mapper/gluster_vge-gluster_lve  9,0T  12G   9,0T    1%   /gluster/mnt4
/dev/mapper/gluster_vgj-gluster_lvj  9,0T  1,4T  7,7T    16%  /gluster/mnt9
/dev/mapper/gluster_vgb-gluster_lvb  9,0T  12G   9,0T    1%   /gluster/mnt1
/dev/mapper/gluster_vgh-gluster_lvh  9,0T  1,4T  7,7T    16%  /gluster/mnt7
/dev/mapper/gluster_vgf-gluster_lvf  9,0T  12G   9,0T    1%   /gluster/mnt5
/dev/mapper/gluster_vgi-gluster_lvi  9,0T  1,4T  7,7T    16%  /gluster/mnt8
/dev/mapper/gluster_vgl-gluster_lvl  9,0T  1,4T  7,7T    16%  /gluster/mnt11
/dev/mapper/gluster_vgk-gluster_lvk  9,0T  1,4T  7,7T    16%  /gluster/mnt10
/dev/mapper/gluster_vgm-gluster_lvm  9,0T  1,4T  7,7T    16%  /gluster/mnt12

The scenario is almost the same for all the bricks removed from servers s04, s05 and s06.
In the next few hours I will check every file on each removed brick.

So, if I understand correctly, I can proceed with deleting the directories and files left on the bricks only if every file carries the T tag, right?

Thank you in advance,
Mauro

> On 3 Oct 2018, at 16:49, Nithya Balachandran <nbalacha at redhat.com> wrote:
>
>
>
> On 1 October 2018 at 15:35, Mauro Tridici <mauro.tridici at cmcc.it <mailto:mauro.tridici at cmcc.it>> wrote:
> Good morning Ashish,
>
> your explanations are always very useful, thank you very much: I will remember these suggestions for any future needs.
> Anyway, during the weekend, the remove-brick procedures ended successfully and we were able to free up all bricks defined on servers s04 and s05, and 6 bricks of 12 on server s06.
> So, we can say that, thanks to your suggestions, we are about to complete this first phase (removal of all bricks defined on the s04, s05 and s06 servers).
>
> I really appreciated your support.
> Now I have a last question (I hope): after remove-brick commit I noticed that some data remain on each brick (about 1.2GB of data).
> Please take a look at 'df-h_on_s04_s05_s06.txt'.
> The situation is almost the same on all 3 servers mentioned above: a long list of directory names and some files that are still on the brick, but their size is 0.
> > Examples: > > a lot of empty directories on /gluster/mnt*/brick/.glusterfs > > 8 /gluster/mnt2/brick/.glusterfs/b7/1b > 0 /gluster/mnt2/brick/.glusterfs/b7/ee/b7ee94a5-a77c-4c02-85a5-085992840c83 > 0 /gluster/mnt2/brick/.glusterfs/b7/ee/b7ee85d4-ce48-43a7-a89a-69c728ee8273 > > some empty files in directories in /gluster/mnt*/brick/* > > [root at s04 ~]# cd /gluster/mnt1/brick/ > [root at s04 brick]# ls -l > totale 32 > drwxr-xr-x 7 root root 100 11 set 22.14 archive_calypso > > [root at s04 brick]# cd archive_calypso/ > [root at s04 archive_calypso]# ll > totale 0 > drwxr-x--- 3 root 5200 29 11 set 22.13 ans002 > drwxr-x--- 3 5104 5100 32 11 set 22.14 ans004 > drwxr-x--- 3 4506 4500 31 11 set 22.14 ans006 > drwxr-x--- 3 4515 4500 28 11 set 22.14 ans015 > drwxr-x--- 4 4321 4300 54 11 set 22.14 ans021 > [root at s04 archive_calypso]# du -a * > 0 ans002/archive/ans002/HINDCASTS/RUN_ATMWANG_LANSENS/19810501.0/echam5/echam_sf006_198110.01.gz > 0 ans002/archive/ans002/HINDCASTS/RUN_ATMWANG_LANSENS/19810501.0/echam5 > 0 ans002/archive/ans002/HINDCASTS/RUN_ATMWANG_LANSENS/19810501.0 > 0 ans002/archive/ans002/HINDCASTS/RUN_ATMWANG_LANSENS/19810501.1/echam5/echam_sf006_198105.01.gz > 0 ans002/archive/ans002/HINDCASTS/RUN_ATMWANG_LANSENS/19810501.1/echam5/echam_sf006_198109.01.gz > 8 ans002/archive/ans002/HINDCASTS/RUN_ATMWANG_LANSENS/19810501.1/echam5 > > What we have to do with this data? Should I backup this ?empty? dirs and files on a different storage before deleting them? > > Hi Mauro, > > Are you sure these files and directories are empty? Please provide the ls -l output for the files. If they are 'T' files , they can be ignored. > > Regards, > Nithya > > As soon as all the bricks will be empty, I plan to re-add the new bricks using the following commands: > > gluster peer detach s04 > gluster peer detach s05 > gluster peer detach s06 > > gluster peer probe s04 > gluster peer probe s05 > gluster peer probe s06 > > gluster volume add-brick tier2 s04-stg:/gluster/mnt1/brick s05-stg:/gluster/mnt1/brick s06-stg:/gluster/mnt1/brick s04-stg:/gluster/mnt2/brick s05-stg:/gluster/mnt2/brick s06-stg:/gluster/mnt2/brick s04-stg:/gluster/mnt3/brick s05-stg:/gluster/mnt3/brick s06-stg:/gluster/mnt3/brick s04-stg:/gluster/mnt4/brick s05-stg:/gluster/mnt4/brick s06-stg:/gluster/mnt4/brick s04-stg:/gluster/mnt5/brick s05-stg:/gluster/mnt5/brick s06-stg:/gluster/mnt5/brick s04-stg:/gluster/mnt6/brick s05-stg:/gluster/mnt6/brick s06-stg:/gluster/mnt6/brick s04-stg:/gluster/mnt7/brick s05-stg:/gluster/mnt7/brick s06-stg:/gluster/mnt7/brick s04-stg:/gluster/mnt8/brick s05-stg:/gluster/mnt8/brick s06-stg:/gluster/mnt8/brick s04-stg:/gluster/mnt9/brick s05-stg:/gluster/mnt9/brick s06-stg:/gluster/mnt9/brick s04-stg:/gluster/mnt10/brick s05-stg:/gluster/mnt10/brick s06-stg:/gluster/mnt10/brick s04-stg:/gluster/mnt11/brick s05-stg:/gluster/mnt11/brick s06-stg:/gluster/mnt11/brick s04-stg:/gluster/mnt12/brick s05-stg:/gluster/mnt12/brick s06-stg:/gluster/mnt12/brick force > > gluster volume rebalance tier2 fix-layout start > > gluster volume rebalance tier2 start > > From your point of view, are they the right commands to close this repairing task? > > Thank you very much for your help. > Regards, > Mauro > > > > > >> Il giorno 01 ott 2018, alle ore 09:17, Ashish Pandey <aspandey at redhat.com <mailto:aspandey at redhat.com>> ha scritto: >> >> >> Ohh!! It is because brick-multiplexing is "ON" on your setup. Not sure if it is by default ON for 3.12.14 or not. 
>> >> See "cluster.brick-multiplex: on" in gluster v <volname> info >> If brick multiplexing is ON, you will see only one process running for all the bricks on a Node. >> >> So we have to do following step to kill any one brick on a node. >> >> Steps to kill a brick when multiplex is on - >> >> Step - 1 >> Find unix domain_socket of the process on a node. >> Run "ps -aef | grep glusterfsd" on a node. Example : >> >> This is on my machine when I have all the bricks on same machine >> >> [root at apandey glusterfs]# ps -aef | grep glusterfsd | grep -v mnt >> root 28311 1 0 11:16 ? 00:00:06 /usr/local/sbin/glusterfsd -s apandey --volfile-id vol.apandey.home-apandey-bricks-gluster-vol-1 -p /var/run/gluster/vols/vol/apandey-home-apandey-bricks-gluster-vol-1.pid -S /var/run/gluster/1259033d2ff4f4e5.socket --brick-name /home/apandey/bricks/gluster/vol-1 -l /var/log/glusterfs/bricks/home-apandey-bricks-gluster-vol-1.log --xlator-option *-posix.glusterd-uuid=61b4524c-ccf3-4219-aaff-b3497ac6dd24 --process-name brick --brick-port 49158 --xlator-option vol-server.listen-port=49158 >> >> Here, /var/run/gluster/1259033d2ff4f4e5.socket is the unix domain socket >> >> Step - 2 >> Run following command to kill a brick on the same node - >> >> gf_attach -d <unix domain_socket> brick_path_on_that_node >> >> Example: >> >> gf_attach -d /var/run/gluster/1259033d2ff4f4e5.socket /home/apandey/bricks/gluster/vol-6 >> >> Status of volume: vol >> Gluster process TCP Port RDMA Port Online Pid >> ------------------------------------------------------------------------------ >> Brick apandey:/home/apandey/bricks/gluster/ >> vol-1 49158 0 Y 28311 >> Brick apandey:/home/apandey/bricks/gluster/ >> vol-2 49158 0 Y 28311 >> Brick apandey:/home/apandey/bricks/gluster/ >> vol-3 49158 0 Y 28311 >> Brick apandey:/home/apandey/bricks/gluster/ >> vol-4 49158 0 Y 28311 >> Brick apandey:/home/apandey/bricks/gluster/ >> vol-5 49158 0 Y 28311 >> Brick apandey:/home/apandey/bricks/gluster/ >> vol-6 49158 0 Y 28311 >> Self-heal Daemon on localhost N/A N/A Y 29787 >> >> Task Status of Volume vol >> ------------------------------------------------------------------------------ >> There are no active volume tasks >> >> [root at apandey glusterfs]# >> [root at apandey glusterfs]# >> [root at apandey glusterfs]# gf_attach -d /var/run/gluster/1259033d2ff4f4e5.socket /home/apandey/bricks/gluster/vol-6 >> OK >> [root at apandey glusterfs]# gluster v status >> Status of volume: vol >> Gluster process TCP Port RDMA Port Online Pid >> ------------------------------------------------------------------------------ >> Brick apandey:/home/apandey/bricks/gluster/ >> vol-1 49158 0 Y 28311 >> Brick apandey:/home/apandey/bricks/gluster/ >> vol-2 49158 0 Y 28311 >> Brick apandey:/home/apandey/bricks/gluster/ >> vol-3 49158 0 Y 28311 >> Brick apandey:/home/apandey/bricks/gluster/ >> vol-4 49158 0 Y 28311 >> Brick apandey:/home/apandey/bricks/gluster/ >> vol-5 49158 0 Y 28311 >> Brick apandey:/home/apandey/bricks/gluster/ >> vol-6 N/A N/A N N/A >> Self-heal Daemon on localhost N/A N/A Y 29787 >> >> Task Status of Volume vol >> ------------------------------------------------------------------------------ >> There are no active volume tasks >> >> >> To start a brick we just need to start volume using "force" >> >> gluster v start <volname> force >> >> ---- >> Ashish >> >> >> >> >> >> >> From: "Mauro Tridici" <mauro.tridici at cmcc.it <mailto:mauro.tridici at cmcc.it>> >> To: "Ashish Pandey" <aspandey at redhat.com <mailto:aspandey at redhat.com>> >> Cc: 
"Gluster Users" <gluster-users at gluster.org <mailto:gluster-users at gluster.org>> >> Sent: Friday, September 28, 2018 9:25:53 PM >> Subject: Re: [Gluster-users] Rebalance failed on Distributed Disperse volume based on 3.12.14 version >> >> >> I asked you how to detect the PID of a specific brick because I see that more than one brick has the same PID (also on my virtual env). >> If I kill one of them I risk to kill some other brick. Is it normal? >> >> [root at s01 ~]# gluster vol status >> Status of volume: tier2 >> Gluster process TCP Port RDMA Port Online Pid >> ------------------------------------------------------------------------------ >> Brick s01-stg:/gluster/mnt1/brick 49153 0 Y 3956 >> Brick s02-stg:/gluster/mnt1/brick 49153 0 Y 3956 >> Brick s03-stg:/gluster/mnt1/brick 49153 0 Y 3953 >> Brick s01-stg:/gluster/mnt2/brick 49153 0 Y 3956 >> Brick s02-stg:/gluster/mnt2/brick 49153 0 Y 3956 >> Brick s03-stg:/gluster/mnt2/brick 49153 0 Y 3953 >> Brick s01-stg:/gluster/mnt3/brick 49153 0 Y 3956 >> Brick s02-stg:/gluster/mnt3/brick 49153 0 Y 3956 >> Brick s03-stg:/gluster/mnt3/brick 49153 0 Y 3953 >> Brick s01-stg:/gluster/mnt4/brick 49153 0 Y 3956 >> Brick s02-stg:/gluster/mnt4/brick 49153 0 Y 3956 >> Brick s03-stg:/gluster/mnt4/brick 49153 0 Y 3953 >> Brick s01-stg:/gluster/mnt5/brick 49153 0 Y 3956 >> Brick s02-stg:/gluster/mnt5/brick 49153 0 Y 3956 >> Brick s03-stg:/gluster/mnt5/brick 49153 0 Y 3953 >> Brick s01-stg:/gluster/mnt6/brick 49153 0 Y 3956 >> Brick s02-stg:/gluster/mnt6/brick 49153 0 Y 3956 >> Brick s03-stg:/gluster/mnt6/brick 49153 0 Y 3953 >> Brick s01-stg:/gluster/mnt7/brick 49153 0 Y 3956 >> Brick s02-stg:/gluster/mnt7/brick 49153 0 Y 3956 >> Brick s03-stg:/gluster/mnt7/brick 49153 0 Y 3953 >> Brick s01-stg:/gluster/mnt8/brick 49153 0 Y 3956 >> Brick s02-stg:/gluster/mnt8/brick 49153 0 Y 3956 >> Brick s03-stg:/gluster/mnt8/brick 49153 0 Y 3953 >> Brick s01-stg:/gluster/mnt9/brick 49153 0 Y 3956 >> Brick s02-stg:/gluster/mnt9/brick 49153 0 Y 3956 >> Brick s03-stg:/gluster/mnt9/brick 49153 0 Y 3953 >> Brick s01-stg:/gluster/mnt10/brick 49153 0 Y 3956 >> Brick s02-stg:/gluster/mnt10/brick 49153 0 Y 3956 >> Brick s03-stg:/gluster/mnt10/brick 49153 0 Y 3953 >> Brick s01-stg:/gluster/mnt11/brick 49153 0 Y 3956 >> Brick s02-stg:/gluster/mnt11/brick 49153 0 Y 3956 >> Brick s03-stg:/gluster/mnt11/brick 49153 0 Y 3953 >> Brick s01-stg:/gluster/mnt12/brick 49153 0 Y 3956 >> Brick s02-stg:/gluster/mnt12/brick 49153 0 Y 3956 >> Brick s03-stg:/gluster/mnt12/brick 49153 0 Y 3953 >> Brick s04-stg:/gluster/mnt1/brick 49153 0 Y 3433 >> Brick s04-stg:/gluster/mnt2/brick 49153 0 Y 3433 >> Brick s04-stg:/gluster/mnt3/brick 49153 0 Y 3433 >> Brick s04-stg:/gluster/mnt4/brick 49153 0 Y 3433 >> Brick s04-stg:/gluster/mnt5/brick 49153 0 Y 3433 >> Brick s04-stg:/gluster/mnt6/brick 49153 0 Y 3433 >> Brick s04-stg:/gluster/mnt7/brick 49153 0 Y 3433 >> Brick s04-stg:/gluster/mnt8/brick 49153 0 Y 3433 >> Brick s04-stg:/gluster/mnt9/brick 49153 0 Y 3433 >> Brick s04-stg:/gluster/mnt10/brick 49153 0 Y 3433 >> Brick s04-stg:/gluster/mnt11/brick 49153 0 Y 3433 >> Brick s04-stg:/gluster/mnt12/brick 49153 0 Y 3433 >> Brick s05-stg:/gluster/mnt1/brick 49153 0 Y 3709 >> Brick s05-stg:/gluster/mnt2/brick 49153 0 Y 3709 >> Brick s05-stg:/gluster/mnt3/brick 49153 0 Y 3709 >> Brick s05-stg:/gluster/mnt4/brick 49153 0 Y 3709 >> Brick s05-stg:/gluster/mnt5/brick 49153 0 Y 3709 >> Brick s05-stg:/gluster/mnt6/brick 49153 0 Y 3709 >> Brick s05-stg:/gluster/mnt7/brick 49153 0 Y 3709 >> Brick 
s05-stg:/gluster/mnt8/brick 49153 0 Y 3709 >> Brick s05-stg:/gluster/mnt9/brick 49153 0 Y 3709 >> Brick s05-stg:/gluster/mnt10/brick 49153 0 Y 3709 >> Brick s05-stg:/gluster/mnt11/brick 49153 0 Y 3709 >> Brick s05-stg:/gluster/mnt12/brick 49153 0 Y 3709 >> Brick s06-stg:/gluster/mnt1/brick 49153 0 Y 3644 >> Brick s06-stg:/gluster/mnt2/brick 49153 0 Y 3644 >> Brick s06-stg:/gluster/mnt3/brick 49153 0 Y 3644 >> Brick s06-stg:/gluster/mnt4/brick 49153 0 Y 3644 >> Brick s06-stg:/gluster/mnt5/brick 49153 0 Y 3644 >> Brick s06-stg:/gluster/mnt6/brick 49153 0 Y 3644 >> Brick s06-stg:/gluster/mnt7/brick 49153 0 Y 3644 >> Brick s06-stg:/gluster/mnt8/brick 49153 0 Y 3644 >> Brick s06-stg:/gluster/mnt9/brick 49153 0 Y 3644 >> Brick s06-stg:/gluster/mnt10/brick 49153 0 Y 3644 >> Brick s06-stg:/gluster/mnt11/brick 49153 0 Y 3644 >> Brick s06-stg:/gluster/mnt12/brick 49153 0 Y 3644 >> Self-heal Daemon on localhost N/A N/A Y 79376 >> Quota Daemon on localhost N/A N/A Y 79472 >> Bitrot Daemon on localhost N/A N/A Y 79485 >> Scrubber Daemon on localhost N/A N/A Y 79505 >> Self-heal Daemon on s03-stg N/A N/A Y 77073 >> Quota Daemon on s03-stg N/A N/A Y 77148 >> Bitrot Daemon on s03-stg N/A N/A Y 77160 >> Scrubber Daemon on s03-stg N/A N/A Y 77191 >> Self-heal Daemon on s02-stg N/A N/A Y 80150 >> Quota Daemon on s02-stg N/A N/A Y 80226 >> Bitrot Daemon on s02-stg N/A N/A Y 80238 >> Scrubber Daemon on s02-stg N/A N/A Y 80269 >> Self-heal Daemon on s04-stg N/A N/A Y 106815 >> Quota Daemon on s04-stg N/A N/A Y 106866 >> Bitrot Daemon on s04-stg N/A N/A Y 106878 >> Scrubber Daemon on s04-stg N/A N/A Y 106897 >> Self-heal Daemon on s05-stg N/A N/A Y 130807 >> Quota Daemon on s05-stg N/A N/A Y 130884 >> Bitrot Daemon on s05-stg N/A N/A Y 130896 >> Scrubber Daemon on s05-stg N/A N/A Y 130927 >> Self-heal Daemon on s06-stg N/A N/A Y 157146 >> Quota Daemon on s06-stg N/A N/A Y 157239 >> Bitrot Daemon on s06-stg N/A N/A Y 157252 >> Scrubber Daemon on s06-stg N/A N/A Y 157288 >> >> Task Status of Volume tier2 >> ------------------------------------------------------------------------------ >> Task : Remove brick >> ID : 06ec63bb-a441-4b85-b3cf-ac8e9df4830f >> Removed bricks: >> s04-stg:/gluster/mnt1/brick >> s04-stg:/gluster/mnt2/brick >> s04-stg:/gluster/mnt3/brick >> s04-stg:/gluster/mnt4/brick >> s04-stg:/gluster/mnt5/brick >> s04-stg:/gluster/mnt6/brick >> Status : in progress >> >> [root at s01 ~]# ps -ef|grep glusterfs >> root 3956 1 79 set25 ? 2-14:33:57 /usr/sbin/glusterfsd -s s01-stg --volfile-id tier2.s01-stg.gluster-mnt1-brick -p /var/run/gluster/vols/tier2/s01-stg-gluster-mnt1-brick.pid -S /var/run/gluster/a889b8a21ac2afcbfa0563b9dd4db265.socket --brick-name /gluster/mnt1/brick -l /var/log/glusterfs/bricks/gluster-mnt1-brick.log --xlator-option *-posix.glusterd-uuid=b734b083-4630-4523-9402-05d03565efee --brick-port 49153 --xlator-option tier2-server.listen-port=49153 >> root 79376 1 0 09:16 ? 00:04:16 /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/run/gluster/glustershd/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/gluster/4fab1a27e6ee700b3b9a3b3393ab7445.socket --xlator-option *replicate*.node-uuid=b734b083-4630-4523-9402-05d03565efee >> root 79472 1 0 09:16 ? 
00:00:42 /usr/sbin/glusterfs -s localhost --volfile-id gluster/quotad -p /var/run/gluster/quotad/quotad.pid -l /var/log/glusterfs/quotad.log -S /var/run/gluster/958ab34799fc58f4dfe20e5732eea70b.socket --xlator-option *replicate*.data-self-heal=off --xlator-option *replicate*.metadata-self-heal=off --xlator-option *replicate*.entry-self-heal=off >> root 79485 1 7 09:16 ? 00:40:43 /usr/sbin/glusterfs -s localhost --volfile-id gluster/bitd -p /var/run/gluster/bitd/bitd.pid -l /var/log/glusterfs/bitd.log -S /var/run/gluster/b2ea9da593fae1bc4d94e65aefdbdda9.socket --global-timer-wheel >> root 79505 1 0 09:16 ? 00:00:01 /usr/sbin/glusterfs -s localhost --volfile-id gluster/scrub -p /var/run/gluster/scrub/scrub.pid -l /var/logglusterfs/scrub.log -S /var/run/gluster/ee7886cbcf8d2adf261084b608c905d5.socket --global-timer-wheel >> root 137362 137225 0 17:53 pts/0 00:00:00 grep --color=auto glusterfs >> >> Il giorno 28 set 2018, alle ore 17:47, Ashish Pandey <aspandey at redhat.com <mailto:aspandey at redhat.com>> ha scritto: >> >> >> >> From: "Mauro Tridici" <mauro.tridici at cmcc.it <mailto:mauro.tridici at cmcc.it>> >> To: "Ashish Pandey" <aspandey at redhat.com <mailto:aspandey at redhat.com>> >> Cc: "Gluster Users" <gluster-users at gluster.org <mailto:gluster-users at gluster.org>> >> Sent: Friday, September 28, 2018 9:08:52 PM >> Subject: Re: [Gluster-users] Rebalance failed on Distributed Disperse volume based on 3.12.14 version >> >> Thank you, Ashish. >> >> I will study and try your solution on my virtual env. >> How I can detect the process of a brick on gluster server? >> >> Many Thanks, >> Mauro >> >> >> gluster v status <volname> will give you the list of bricks and the respective process id. >> Also, you can use "ps aux | grep glusterfs" to see all the processes on a node but I think the above step also do the same. >> >> --- >> Ashish >> >> >> >> Il ven 28 set 2018 16:39 Ashish Pandey <aspandey at redhat.com <mailto:aspandey at redhat.com>> ha scritto: >> >> >> From: "Mauro Tridici" <mauro.tridici at cmcc.it <mailto:mauro.tridici at cmcc.it>> >> To: "Ashish Pandey" <aspandey at redhat.com <mailto:aspandey at redhat.com>> >> Cc: "gluster-users" <gluster-users at gluster.org <mailto:gluster-users at gluster.org>> >> Sent: Friday, September 28, 2018 7:08:41 PM >> Subject: Re: [Gluster-users] Rebalance failed on Distributed Disperse volume based on 3.12.14 version >> >> >> Dear Ashish, >> >> please excuse me, I'm very sorry for misunderstanding. >> Before contacting you during last days, we checked all network devices (switch 10GbE, cables, NICs, servers ports, and so on), operating systems version and settings, network bonding configuration, gluster packages versions, tuning profiles, etc. but everything seems to be ok. The first 3 servers (and volume) operated without problem for one year. After we added the new 3 servers we noticed something wrong. >> Fortunately, yesterday you gave me an hand to understand where is (or could be) the problem. >> >> At this moment, after we re-launched the remove-brick command, it seems that the rebalance is going ahead without errors, but it is only scanning the files. >> May be that during the future data movement some errors could appear. >> >> For this reason, it could be useful to know how to proceed in case of a new failure: insist with approach n.1 or change the strategy? >> We are thinking to try to complete the running remove-brick procedure and make a decision based on the outcome. 
>> >> Question: could we start approach n.2 also after having successfully removed the V1 subvolume?! >> >> >>> Yes, we can do that. My idea is to use replace-brick command. >> We will kill "ONLY" one brick process on s06. We will format this brick. Then use replace-brick command to replace brick of a volume on s05 with this formatted brick. >> heal will be triggered and data of the respective volume will be placed on this brick. >> >> Now, we can format the brick which got freed up on s05 and replace the brick which we killed on s06 to s05. >> During this process, we have to make sure heal completed before trying any other replace/kill brick. >> >> It is tricky but looks doable. Think about it and try to perform it on your virtual environment first before trying on production. >> ------- >> >> If it is still possible, could you please illustrate the approach n.2 even if I dont have free disks? >> I would like to start thinking about it and test it on a virtual environment. >> >> Thank you in advance for your help and patience. >> Regards, >> Mauro >> >> >> >> Il giorno 28 set 2018, alle ore 14:36, Ashish Pandey <aspandey at redhat.com <mailto:aspandey at redhat.com>> ha scritto: >> >> >> We could have taken approach -2 even if you did not have free disks. You should have told me why are you >> opting Approach-1 or perhaps I should have asked. >> I was wondering for approach 1 because sometimes re-balance takes time depending upon the data size. >> >> Anyway, I hope whole setup is stable, I mean it is not in the middle of something which we can not stop. >> If free disks are the only concern I will give you some more steps to deal with it and follow the approach 2. >> >> Let me know once you think everything is fine with the system and there is nothing to heal. >> >> --- >> Ashish >> >> From: "Mauro Tridici" <mauro.tridici at cmcc.it <mailto:mauro.tridici at cmcc.it>> >> To: "Ashish Pandey" <aspandey at redhat.com <mailto:aspandey at redhat.com>> >> Cc: "gluster-users" <gluster-users at gluster.org <mailto:gluster-users at gluster.org>> >> Sent: Friday, September 28, 2018 4:21:03 PM >> Subject: Re: [Gluster-users] Rebalance failed on Distributed Disperse volume based on 3.12.14 version >> >> >> Hi Ashish, >> >> as I said in my previous message, we adopted the first approach you suggested (setting network.ping-timeout option to 0). >> This choice was due to the absence of empty brick to be used as indicated in the second approach. >> >> So, we launched remove-brick command on the first subvolume (V1, bricks 1,2,3,4,5,6 on server s04). >> Rebalance started moving the data across the other bricks, but, after about 3TB of moved data, rebalance speed slowed down and some transfer errors appeared in the rebalance.log of server s04. 
>> At this point, since remaining 1,8TB need to be moved in order to complete the step, we decided to stop the remove-brick execution and start it again (I hope it doesn?t stop again before complete the rebalance) >> >> Now rebalance is not moving data, it?s only scanning files (please, take a look to the following output) >> >> [root at s01 ~]# gluster volume remove-brick tier2 s04-stg:/gluster/mnt1/brick s04-stg:/gluster/mnt2/brick s04-stg:/gluster/mnt3/brick s04-stg:/gluster/mnt4/brick s04-stg:/gluster/mnt5/brick s04-stg:/gluster/mnt6/brick status >> Node Rebalanced-files size scanned failures skipped status run time in h:m:s >> --------- ----------- ----------- ----------- ----------- ----------- ------------ -------------- >> s04-stg 0 0Bytes 182008 0 0 in progress 3:08:09 >> Estimated time left for rebalance to complete : 442:45:06 >> >> If I?m not wrong, remove-brick rebalances entire cluster each time it start. >> Is there a way to speed up this procedure? Do you have some other suggestion that, in this particular case, could be useful to reduce errors (I know that they are related to the current volume configuration) and improve rebalance performance avoiding to rebalance the entire cluster? >> >> Thank you in advance, >> Mauro >> >> Il giorno 27 set 2018, alle ore 13:14, Ashish Pandey <aspandey at redhat.com <mailto:aspandey at redhat.com>> ha scritto: >> >> >> Yes, you can. >> If not me others may also reply. >> >> --- >> Ashish >> >> From: "Mauro Tridici" <mauro.tridici at cmcc.it <mailto:mauro.tridici at cmcc.it>> >> To: "Ashish Pandey" <aspandey at redhat.com <mailto:aspandey at redhat.com>> >> Cc: "gluster-users" <gluster-users at gluster.org <mailto:gluster-users at gluster.org>> >> Sent: Thursday, September 27, 2018 4:24:12 PM >> Subject: Re: [Gluster-users] Rebalance failed on Distributed Disperse volume based on 3.12.14 version >> >> >> Dear Ashish, >> >> I can not thank you enough! >> Your procedure and description is very detailed. >> I think to follow the first approach after setting network.ping-timeout option to 0 (If I?m not wrong ?0" means ?infinite?...I noticed that this value reduced rebalance errors). >> After the fix I will set network.ping-timeout option to default value. >> >> Could I contact you again if I need some kind of suggestion? >> >> Thank you very much again. >> Have a good day, >> Mauro >> >> >> Il giorno 27 set 2018, alle ore 12:38, Ashish Pandey <aspandey at redhat.com <mailto:aspandey at redhat.com>> ha scritto: >> >> >> Hi Mauro, >> >> We can divide the 36 newly added bricks into 6 set of 6 bricks each starting from brick37. >> That means, there are 6 ec subvolumes and we have to deal with one sub volume at a time. >> I have named it V1 to V6. >> >> Problem: >> Take the case of V1. >> The best configuration/setup would be to have all the 6 bricks of V1 on 6 different nodes. >> However, in your case you have added 3 new nodes. So, at least we should have 2 bricks on 3 different newly added nodes. >> This way, in 4+2 EC configuration, even if one node goes down you will have 4 other bricks of that volume > > _______________________________________________ > Gluster-users mailing list > Gluster-users at gluster.org <mailto:Gluster-users at gluster.org> > https://lists.gluster.org/mailman/listinfo/gluster-users <https://lists.gluster.org/mailman/listinfo/gluster-users> > ... > > [Message clipped] >------------------------- Mauro Tridici Fondazione CMCC CMCC Supercomputing Center presso Complesso Ecotekne - Universit? 
del Salento - Strada Prov.le Lecce - Monteroni sn 73100 Lecce IT
http://www.cmcc.it
mobile: (+39) 327 5630841
email: mauro.tridici at cmcc.it <mailto:mauro.tridici at cmcc.it>
https://it.linkedin.com/in/mauro-tridici-5977238b

[Attachment: ls-l_on_a_brick.txt.gz, application/x-gzip, 2747799 bytes]
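A note on the 'T' tag mentioned above: the entries Nithya refers to are DHT link-to files, i.e. zero-byte placeholders whose permission string shows as ---------T (mode 1000) and which carry the trusted.glusterfs.dht.linkto extended attribute. A minimal, hedged sketch of how they can be spotted on a brick follows; the brick path and file name below are examples only:

# list candidate link-to files on one brick: regular files whose mode is exactly 1000 (---------T)
find /gluster/mnt1/brick -type f -perm 1000 ! -path '*/.glusterfs/*'

# double-check a single entry: a link-to file exposes the dht.linkto xattr (run as root on the brick)
getfattr -m trusted.glusterfs.dht.linkto -d -e text /gluster/mnt1/brick/path/to/leftover_file

Entries that pass both checks are only DHT pointers and, as noted above, can be ignored; anything else should be treated as real data.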
Mauro Tridici
2018-Oct-03 21:12 UTC
[Gluster-users] Rebalance failed on Distributed Disperse volume based on 3.12.14 version
Hi Nithya,

I created and executed the following simple script in order to check the content of each brick.

---
#!/bin/bash
# For each of the 12 bricks on this node:
#  - save an "ls -l" line for every regular file left on the brick to $HOSTNAME.brick$i.txt
#  - append to report.txt the line count plus every ls entry that does not match the
#    "--T" link-to permission pattern (i.e. everything that is not a DHT link-to file)
for i in {1..12}
do
  # ls -lR /gluster/mnt$i/brick/ > $HOSTNAME.brick$i.txt
  find /gluster/mnt$i/brick -type f -print0 | xargs -0r ls -l > $HOSTNAME.brick$i.txt
  wc -l $HOSTNAME.brick$i.txt >> report.txt
  grep -v '\-\-T' $HOSTNAME.brick$i.txt >> report.txt
done
---

It scans all the files left on the bricks and saves, for each brick, the "ls -l" output to a separate log file (named s04.brick#.txt). The script also creates a report file (report.txt) that collects every file whose permission string does not carry the "--T" tag.

[root at s04 left]# ll
totale 557236
-rwxr--r-- 1 root root      273  3 ott 22.45 check
-rw------- 1 root root        0  3 ott 22.46 nohup.out
-rw-r--r-- 1 root root     7581  3 ott 22.49 report.txt
-rw-r--r-- 1 root root 44801236  3 ott 22.48 s04.brick10.txt
-rw-r--r-- 1 root root 44801236  3 ott 22.49 s04.brick11.txt
-rw-r--r-- 1 root root 44801236  3 ott 22.49 s04.brick12.txt
-rw-r--r-- 1 root root 45007600  3 ott 22.46 s04.brick1.txt
-rw-r--r-- 1 root root 45007600  3 ott 22.46 s04.brick2.txt
-rw-r--r-- 1 root root 45007600  3 ott 22.47 s04.brick3.txt
-rw-r--r-- 1 root root 45007600  3 ott 22.47 s04.brick4.txt
-rw-r--r-- 1 root root 45007600  3 ott 22.47 s04.brick5.txt
-rw-r--r-- 1 root root 45007600  3 ott 22.47 s04.brick6.txt
-rw-r--r-- 1 root root 44474106  3 ott 22.48 s04.brick7.txt
-rw-r--r-- 1 root root 44474106  3 ott 22.48 s04.brick8.txt
-rw-r--r-- 1 root root 44474106  3 ott 22.48 s04.brick9.txt

So, at the end of the script execution, I obtained the following:

- the s04 server bricks do not contain any file without the "--T" tag, except for the following ones (I think I can delete them, right?):

-rw-r--r-- 1 root root  4096 11 set 11.22 /gluster/mnt12/brick/.glusterfs/brick.db
-rw-r--r-- 1 root root 32768 16 set 03.21 /gluster/mnt12/brick/.glusterfs/brick.db-shm
-rw-r--r-- 1 root root 20632 11 set 11.22 /gluster/mnt12/brick/.glusterfs/brick.db-wal
-rw-r--r-- 1 root root    19 29 set 15.14 /gluster/mnt12/brick/.glusterfs/health_check
---------- 1 root root     0 29 set 00.05 /gluster/mnt12/brick/.glusterfs/indices/xattrop/xattrop-9040d2ea-6acb-42c2-b515-0a44380e60d8
---------- 1 root root     0 11 set 11.22 /gluster/mnt12/brick/.glusterfs/quarantine/stub-00000000-0000-0000-0000-000000000008

- the s05 server bricks do not contain any file without the "--T" tag, except for the following ones:

-rw-r--r-- 1 root root  4096 11 set 11.22 /gluster/mnt8/brick/.glusterfs/brick.db
-rw-r--r-- 1 root root 32768 16 set 03.19 /gluster/mnt8/brick/.glusterfs/brick.db-shm
-rw-r--r-- 1 root root 20632 11 set 11.22 /gluster/mnt8/brick/.glusterfs/brick.db-wal
-rw-r--r-- 1 root root    19  1 ott 07.30 /gluster/mnt8/brick/.glusterfs/health_check
---------- 1 root root     0 30 set 16.42 /gluster/mnt8/brick/.glusterfs/indices/xattrop/xattrop-9db3d840-35e0-4359-8d7a-14d305760247
---------- 1 root root     0 11 set 11.22 /gluster/mnt8/brick/.glusterfs/quarantine/stub-00000000-0000-0000-0000-000000000008

- the s06 server bricks DO contain some files that I think are important.
This is the files list:

-rw-r--r-- 2  5219  5200 519226880 14 set 17.29 /gluster/mnt6/brick/.glusterfs/ef/87/ef870cb8-03be-45c8-8b72-38941f08b8a5
-rw-r--r-- 2  5219  5200    844800 17 gen  2017 /gluster/mnt6/brick/.glusterfs/ef/98/ef98b463-3a0a-46a2-ad18-37149d4dd65c
-rw-r--r-- 2  5219  5200   3164160 23 apr  2016 /gluster/mnt6/brick/.glusterfs/a4/25/a4255b8e-de1f-4acc-a5cf-d47ac7767d46
-rw-r--r-- 2 12001 12000         0 12 ago 05.06 /gluster/mnt6/brick/.glusterfs/a4/52/a4520383-eaa1-4c82-a6f0-d9ea7de4c48d
---------- 1 root  root          0 11 set 11.22 /gluster/mnt6/brick/.glusterfs/quarantine/stub-00000000-0000-0000-0000-000000000008
-rw-r--r-- 2 12001 12000         0 23 lug 22.22 /gluster/mnt6/brick/OPA/tessa01/work/REA_exp/rea_1986_simu/prepobs_19860823/COST.DAT
-rw-r--r-- 2 12001 12000         0 24 lug 00.28 /gluster/mnt6/brick/OPA/tessa01/work/REA_exp/rea_1986_simu/prepobs_19860828/COST.DAT
-rw-r--r-- 2 12001 12000         0 24 lug 08.27 /gluster/mnt6/brick/OPA/tessa01/work/REA_exp/rea_1986_simu/prepobs_19860916/COST.DAT
-rw-r--r-- 2 12001 12000         0 26 lug 00.50 /gluster/mnt6/brick/OPA/tessa01/work/REA_exp/rea_1986_simu/prepobs_19861221/COST.DAT
-rw-r--r-- 2 12001 12000         0 26 lug 03.23 /gluster/mnt6/brick/OPA/tessa01/work/REA_exp/rea_1986_simu/prepobs_19861230/COST.DAT
-rw-r--r-- 2 12001 12000         0  7 ago 15.21 /gluster/mnt6/brick/OPA/tessa01/work/REA_exp/rea_1987_simu/model/wind/in/procday.20180101
-rw-r--r-- 2 12001 12000         0  7 ago 15.22 /gluster/mnt6/brick/OPA/tessa01/work/REA_exp/rea_1987_simu/model/wind/work/err.log
-rw-r--r-- 2 12001 12000         0 14 ago 12.55 /gluster/mnt6/brick/OPA/tessa01/work/REA_exp/rea_1988_simu/assim/no_obs_19881231
-rw-r--r-- 2 12001 12000         0 14 ago 12.55 /gluster/mnt6/brick/OPA/tessa01/work/REA_exp/rea_1988_simu/model/wind/in/endday.19881231
-rw-r--r-- 2 12001 12000         0 14 ago 12.55 /gluster/mnt6/brick/OPA/tessa01/work/REA_exp/rea_1988_simu/model/wind/in/procday.20180101
-rw-r--r-- 2 12001 12000         0 14 ago 12.55 /gluster/mnt6/brick/OPA/tessa01/work/REA_exp/rea_1988_simu/model/wind/in/startday.19881230
-rw-r--r-- 2 12001 12000         0 14 ago 12.55 /gluster/mnt6/brick/OPA/tessa01/work/REA_exp/rea_1988_simu/model/wind/work/err.log
-rw-r--r-- 2 12001 12000         0  8 ago 13.51 /gluster/mnt6/brick/OPA/tessa01/work/REA_exp/rea_1988_simu/prepobs_19880114/COST.DAT
-rw-r--r-- 2  5219  5200    844800 22 giu  2016 /gluster/mnt6/brick/CSP/sp1/CESM/archive/sps_199301_001/atm/hist/postproc/sps_199301_001.cam.h0.1993-03_grid.nc
-rw-r--r-- 2  5219  5200   3203072 22 apr  2016 /gluster/mnt6/brick/CSP/sp1/CESM/archive/sps_199301_001/lnd/hist/lnd/hist/sps_199301_001.clm2.h0.1993-03.nc.gz
-rw-r--r-- 2  5219  5200   3164160 23 apr  2016 /gluster/mnt6/brick/CSP/sp1/CESM/archive/sps_199301_001/lnd/hist/lnd/hist/sps_199301_001.clm2.h0.1993-05.nc.gz
-rw-r--r-- 2  5219  5200    844800 17 gen  2017 /gluster/mnt6/brick/CSP/sp1/CESM/archive/sps_199301_002/atm/hist/postproc/sps_199301_002.cam.h0.1993-01_grid.nc
-rw-r--r-- 2  5219  5200    844800 17 gen  2017 /gluster/mnt6/brick/CSP/sp1/CESM/archive/sps_199301_002/atm/hist/postproc/sps_199301_002.cam.h0.1993-05_grid.nc
-rw-r--r-- 2  5219  5200    844800 17 gen  2017 /gluster/mnt6/brick/CSP/sp1/CESM/archive/sps_199301_002/atm/hist/postproc/sps_199301_002.cam.h0.1993-06_grid.nc
-rw-r--r-- 2  5219  5200    844800 22 giu  2016 /gluster/mnt6/brick/CSP/sp1/CESM/archive/sps_199301_003/atm/hist/postproc/sps_199301_003.cam.h0.1993-03_grid.nc

What can I do with the files not moved by rebalance?
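Before anything on these bricks is deleted, one way to triage the non-'T' leftovers is to check whether each path is still reachable through a client mount of the volume; note that on a disperse volume the brick stores erasure-coded fragments, so file sizes on the brick will not match the sizes seen on the mount. A rough sketch, assuming the tier2 volume is mounted at /mnt/tier2 on the node where it runs (the mount point is an assumption):

#!/bin/bash
BRICK=/gluster/mnt6/brick
MNT=/mnt/tier2                              # adjust to the real client mount point
# walk every regular, non link-to file left on the brick (link-to files have mode 1000)
find "$BRICK" -type f ! -perm 1000 ! -path '*/.glusterfs/*' -print | while read -r f; do
  rel=${f#"$BRICK"/}                        # volume-relative path
  if [ -e "$MNT/$rel" ]; then
    echo "present on volume : $rel"
  else
    echo "MISSING on volume : $rel"         # needs manual attention before the brick is wiped
  fi
done

Paths reported as missing are the ones that deserve attention (e.g. restoring them from the original data source), while files already reachable through the mount can be left to the normal cleanup.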
Thank you, Mauro> Il giorno 03 ott 2018, alle ore 17:48, Mauro Tridici <mauro.tridici at cmcc.it> ha scritto: > > > Hi Nithya, > > in order to give an answer to your question as soon as possible, I just considered only the content of one brick of server s06 (in attachment you can find the content of /gluster/mnt1/brick). > > [root at s06 ~]# df -h > File system Dim. Usati Dispon. Uso% Montato su > /dev/mapper/cl_s06-root 100G 2,1G 98G 3% / > devtmpfs 32G 0 32G 0% /dev > tmpfs 32G 4,0K 32G 1% /dev/shm > tmpfs 32G 106M 32G 1% /run > tmpfs 32G 0 32G 0% /sys/fs/cgroup > /dev/mapper/cl_s06-var 100G 3,0G 97G 3% /var > /dev/mapper/cl_s06-gluster 100G 33M 100G 1% /gluster > /dev/sda1 1014M 152M 863M 15% /boot > /dev/mapper/gluster_vgd-gluster_lvd 9,0T 12G 9,0T 1% /gluster/mnt3 > /dev/mapper/gluster_vgg-gluster_lvg 9,0T 12G 9,0T 1% /gluster/mnt6 > /dev/mapper/gluster_vgc-gluster_lvc 9,0T 12G 9,0T 1% /gluster/mnt2 > /dev/mapper/gluster_vge-gluster_lve 9,0T 12G 9,0T 1% /gluster/mnt4 > /dev/mapper/gluster_vgj-gluster_lvj 9,0T 1,4T 7,7T 16% /gluster/mnt9 > /dev/mapper/gluster_vgb-gluster_lvb 9,0T 12G 9,0T 1% /gluster/mnt1 > /dev/mapper/gluster_vgh-gluster_lvh 9,0T 1,4T 7,7T 16% /gluster/mnt7 > /dev/mapper/gluster_vgf-gluster_lvf 9,0T 12G 9,0T 1% /gluster/mnt5 > /dev/mapper/gluster_vgi-gluster_lvi 9,0T 1,4T 7,7T 16% /gluster/mnt8 > /dev/mapper/gluster_vgl-gluster_lvl 9,0T 1,4T 7,7T 16% /gluster/mnt11 > /dev/mapper/gluster_vgk-gluster_lvk 9,0T 1,4T 7,7T 16% /gluster/mnt10 > /dev/mapper/gluster_vgm-gluster_lvm 9,0T 1,4T 7,7T 16% /gluster/mnt12 > > The scenario is almost the same for all the bricks removed from server s04, s05 and s06. > In the next hours, I will check every files on each removed bricks. > > So, if I understand, I can proceed with deletion of directories and files left on the bricks only if each file have T tag, right? > > Thank you in advance, > Mauro > > <ls-l_on_a_brick.txt.gz> >> Il giorno 03 ott 2018, alle ore 16:49, Nithya Balachandran <nbalacha at redhat.com <mailto:nbalacha at redhat.com>> ha scritto: >> >> >> >> On 1 October 2018 at 15:35, Mauro Tridici <mauro.tridici at cmcc.it <mailto:mauro.tridici at cmcc.it>> wrote: >> Good morning Ashish, >> >> your explanations are always very useful, thank you very much: I will remember these suggestions for any future needs. >> Anyway, during the week-end, the remove-brick procedures ended successfully and we were able to free up all bricks defined on server s04, s05 and 6 bricks of 12 on server s06. >> So, we can say that, thanks to your suggestions, we are about to complete this first phase (removing of all bricks defined on s04, s05 and s06 servers). >> >> I really appreciated your support. >> Now I have a last question (I hope): after remove-brick commit I noticed that some data remain on each brick (about 1.2GB of data). >> Please, take a look to the ?df-h_on_s04_s05_s06.txt?. >> The situation is almost the same on all 3 servers mentioned above: a long list of directories names and some files that are still on the brick, but respective size is 0. 
>> >> Examples: >> >> a lot of empty directories on /gluster/mnt*/brick/.glusterfs >> >> 8 /gluster/mnt2/brick/.glusterfs/b7/1b >> 0 /gluster/mnt2/brick/.glusterfs/b7/ee/b7ee94a5-a77c-4c02-85a5-085992840c83 >> 0 /gluster/mnt2/brick/.glusterfs/b7/ee/b7ee85d4-ce48-43a7-a89a-69c728ee8273 >> >> some empty files in directories in /gluster/mnt*/brick/* >> >> [root at s04 ~]# cd /gluster/mnt1/brick/ >> [root at s04 brick]# ls -l >> totale 32 >> drwxr-xr-x 7 root root 100 11 set 22.14 archive_calypso >> >> [root at s04 brick]# cd archive_calypso/ >> [root at s04 archive_calypso]# ll >> totale 0 >> drwxr-x--- 3 root 5200 29 11 set 22.13 ans002 >> drwxr-x--- 3 5104 5100 32 11 set 22.14 ans004 >> drwxr-x--- 3 4506 4500 31 11 set 22.14 ans006 >> drwxr-x--- 3 4515 4500 28 11 set 22.14 ans015 >> drwxr-x--- 4 4321 4300 54 11 set 22.14 ans021 >> [root at s04 archive_calypso]# du -a * >> 0 ans002/archive/ans002/HINDCASTS/RUN_ATMWANG_LANSENS/19810501.0/echam5/echam_sf006_198110.01.gz >> 0 ans002/archive/ans002/HINDCASTS/RUN_ATMWANG_LANSENS/19810501.0/echam5 >> 0 ans002/archive/ans002/HINDCASTS/RUN_ATMWANG_LANSENS/19810501.0 >> 0 ans002/archive/ans002/HINDCASTS/RUN_ATMWANG_LANSENS/19810501.1/echam5/echam_sf006_198105.01.gz >> 0 ans002/archive/ans002/HINDCASTS/RUN_ATMWANG_LANSENS/19810501.1/echam5/echam_sf006_198109.01.gz >> 8 ans002/archive/ans002/HINDCASTS/RUN_ATMWANG_LANSENS/19810501.1/echam5 >> >> What we have to do with this data? Should I backup this ?empty? dirs and files on a different storage before deleting them? >> >> Hi Mauro, >> >> Are you sure these files and directories are empty? Please provide the ls -l output for the files. If they are 'T' files , they can be ignored. >> >> Regards, >> Nithya >> >> As soon as all the bricks will be empty, I plan to re-add the new bricks using the following commands: >> >> gluster peer detach s04 >> gluster peer detach s05 >> gluster peer detach s06 >> >> gluster peer probe s04 >> gluster peer probe s05 >> gluster peer probe s06 >> >> gluster volume add-brick tier2 s04-stg:/gluster/mnt1/brick s05-stg:/gluster/mnt1/brick s06-stg:/gluster/mnt1/brick s04-stg:/gluster/mnt2/brick s05-stg:/gluster/mnt2/brick s06-stg:/gluster/mnt2/brick s04-stg:/gluster/mnt3/brick s05-stg:/gluster/mnt3/brick s06-stg:/gluster/mnt3/brick s04-stg:/gluster/mnt4/brick s05-stg:/gluster/mnt4/brick s06-stg:/gluster/mnt4/brick s04-stg:/gluster/mnt5/brick s05-stg:/gluster/mnt5/brick s06-stg:/gluster/mnt5/brick s04-stg:/gluster/mnt6/brick s05-stg:/gluster/mnt6/brick s06-stg:/gluster/mnt6/brick s04-stg:/gluster/mnt7/brick s05-stg:/gluster/mnt7/brick s06-stg:/gluster/mnt7/brick s04-stg:/gluster/mnt8/brick s05-stg:/gluster/mnt8/brick s06-stg:/gluster/mnt8/brick s04-stg:/gluster/mnt9/brick s05-stg:/gluster/mnt9/brick s06-stg:/gluster/mnt9/brick s04-stg:/gluster/mnt10/brick s05-stg:/gluster/mnt10/brick s06-stg:/gluster/mnt10/brick s04-stg:/gluster/mnt11/brick s05-stg:/gluster/mnt11/brick s06-stg:/gluster/mnt11/brick s04-stg:/gluster/mnt12/brick s05-stg:/gluster/mnt12/brick s06-stg:/gluster/mnt12/brick force >> >> gluster volume rebalance tier2 fix-layout start >> >> gluster volume rebalance tier2 start >> >> From your point of view, are they the right commands to close this repairing task? >> >> Thank you very much for your help. >> Regards, >> Mauro >> >> >> >> >> >>> Il giorno 01 ott 2018, alle ore 09:17, Ashish Pandey <aspandey at redhat.com <mailto:aspandey at redhat.com>> ha scritto: >>> >>> >>> Ohh!! It is because brick-multiplexing is "ON" on your setup. 
Not sure if it is by default ON for 3.12.14 or not. >>> >>> See "cluster.brick-multiplex: on" in gluster v <volname> info >>> If brick multiplexing is ON, you will see only one process running for all the bricks on a Node. >>> >>> So we have to do following step to kill any one brick on a node. >>> >>> Steps to kill a brick when multiplex is on - >>> >>> Step - 1 >>> Find unix domain_socket of the process on a node. >>> Run "ps -aef | grep glusterfsd" on a node. Example : >>> >>> This is on my machine when I have all the bricks on same machine >>> >>> [root at apandey glusterfs]# ps -aef | grep glusterfsd | grep -v mnt >>> root 28311 1 0 11:16 ? 00:00:06 /usr/local/sbin/glusterfsd -s apandey --volfile-id vol.apandey.home-apandey-bricks-gluster-vol-1 -p /var/run/gluster/vols/vol/apandey-home-apandey-bricks-gluster-vol-1.pid -S /var/run/gluster/1259033d2ff4f4e5.socket --brick-name /home/apandey/bricks/gluster/vol-1 -l /var/log/glusterfs/bricks/home-apandey-bricks-gluster-vol-1.log --xlator-option *-posix.glusterd-uuid=61b4524c-ccf3-4219-aaff-b3497ac6dd24 --process-name brick --brick-port 49158 --xlator-option vol-server.listen-port=49158 >>> >>> Here, /var/run/gluster/1259033d2ff4f4e5.socket is the unix domain socket >>> >>> Step - 2 >>> Run following command to kill a brick on the same node - >>> >>> gf_attach -d <unix domain_socket> brick_path_on_that_node >>> >>> Example: >>> >>> gf_attach -d /var/run/gluster/1259033d2ff4f4e5.socket /home/apandey/bricks/gluster/vol-6 >>> >>> Status of volume: vol >>> Gluster process TCP Port RDMA Port Online Pid >>> ------------------------------------------------------------------------------ >>> Brick apandey:/home/apandey/bricks/gluster/ >>> vol-1 49158 0 Y 28311 >>> Brick apandey:/home/apandey/bricks/gluster/ >>> vol-2 49158 0 Y 28311 >>> Brick apandey:/home/apandey/bricks/gluster/ >>> vol-3 49158 0 Y 28311 >>> Brick apandey:/home/apandey/bricks/gluster/ >>> vol-4 49158 0 Y 28311 >>> Brick apandey:/home/apandey/bricks/gluster/ >>> vol-5 49158 0 Y 28311 >>> Brick apandey:/home/apandey/bricks/gluster/ >>> vol-6 49158 0 Y 28311 >>> Self-heal Daemon on localhost N/A N/A Y 29787 >>> >>> Task Status of Volume vol >>> ------------------------------------------------------------------------------ >>> There are no active volume tasks >>> >>> [root at apandey glusterfs]# >>> [root at apandey glusterfs]# >>> [root at apandey glusterfs]# gf_attach -d /var/run/gluster/1259033d2ff4f4e5.socket /home/apandey/bricks/gluster/vol-6 >>> OK >>> [root at apandey glusterfs]# gluster v status >>> Status of volume: vol >>> Gluster process TCP Port RDMA Port Online Pid >>> ------------------------------------------------------------------------------ >>> Brick apandey:/home/apandey/bricks/gluster/ >>> vol-1 49158 0 Y 28311 >>> Brick apandey:/home/apandey/bricks/gluster/ >>> vol-2 49158 0 Y 28311 >>> Brick apandey:/home/apandey/bricks/gluster/ >>> vol-3 49158 0 Y 28311 >>> Brick apandey:/home/apandey/bricks/gluster/ >>> vol-4 49158 0 Y 28311 >>> Brick apandey:/home/apandey/bricks/gluster/ >>> vol-5 49158 0 Y 28311 >>> Brick apandey:/home/apandey/bricks/gluster/ >>> vol-6 N/A N/A N N/A >>> Self-heal Daemon on localhost N/A N/A Y 29787 >>> >>> Task Status of Volume vol >>> ------------------------------------------------------------------------------ >>> There are no active volume tasks >>> >>> >>> To start a brick we just need to start volume using "force" >>> >>> gluster v start <volname> force >>> >>> ---- >>> Ashish >>> >>> >>> >>> >>> >>> >>> From: "Mauro Tridici" 
<mauro.tridici at cmcc.it <mailto:mauro.tridici at cmcc.it>> >>> To: "Ashish Pandey" <aspandey at redhat.com <mailto:aspandey at redhat.com>> >>> Cc: "Gluster Users" <gluster-users at gluster.org <mailto:gluster-users at gluster.org>> >>> Sent: Friday, September 28, 2018 9:25:53 PM >>> Subject: Re: [Gluster-users] Rebalance failed on Distributed Disperse volume based on 3.12.14 version >>> >>> >>> I asked you how to detect the PID of a specific brick because I see that more than one brick has the same PID (also on my virtual env). >>> If I kill one of them I risk to kill some other brick. Is it normal? >>> >>> [root at s01 ~]# gluster vol status >>> Status of volume: tier2 >>> Gluster process TCP Port RDMA Port Online Pid >>> ------------------------------------------------------------------------------ >>> Brick s01-stg:/gluster/mnt1/brick 49153 0 Y 3956 >>> Brick s02-stg:/gluster/mnt1/brick 49153 0 Y 3956 >>> Brick s03-stg:/gluster/mnt1/brick 49153 0 Y 3953 >>> Brick s01-stg:/gluster/mnt2/brick 49153 0 Y 3956 >>> Brick s02-stg:/gluster/mnt2/brick 49153 0 Y 3956 >>> Brick s03-stg:/gluster/mnt2/brick 49153 0 Y 3953 >>> Brick s01-stg:/gluster/mnt3/brick 49153 0 Y 3956 >>> Brick s02-stg:/gluster/mnt3/brick 49153 0 Y 3956 >>> Brick s03-stg:/gluster/mnt3/brick 49153 0 Y 3953 >>> Brick s01-stg:/gluster/mnt4/brick 49153 0 Y 3956 >>> Brick s02-stg:/gluster/mnt4/brick 49153 0 Y 3956 >>> Brick s03-stg:/gluster/mnt4/brick 49153 0 Y 3953 >>> Brick s01-stg:/gluster/mnt5/brick 49153 0 Y 3956 >>> Brick s02-stg:/gluster/mnt5/brick 49153 0 Y 3956 >>> Brick s03-stg:/gluster/mnt5/brick 49153 0 Y 3953 >>> Brick s01-stg:/gluster/mnt6/brick 49153 0 Y 3956 >>> Brick s02-stg:/gluster/mnt6/brick 49153 0 Y 3956 >>> Brick s03-stg:/gluster/mnt6/brick 49153 0 Y 3953 >>> Brick s01-stg:/gluster/mnt7/brick 49153 0 Y 3956 >>> Brick s02-stg:/gluster/mnt7/brick 49153 0 Y 3956 >>> Brick s03-stg:/gluster/mnt7/brick 49153 0 Y 3953 >>> Brick s01-stg:/gluster/mnt8/brick 49153 0 Y 3956 >>> Brick s02-stg:/gluster/mnt8/brick 49153 0 Y 3956 >>> Brick s03-stg:/gluster/mnt8/brick 49153 0 Y 3953 >>> Brick s01-stg:/gluster/mnt9/brick 49153 0 Y 3956 >>> Brick s02-stg:/gluster/mnt9/brick 49153 0 Y 3956 >>> Brick s03-stg:/gluster/mnt9/brick 49153 0 Y 3953 >>> Brick s01-stg:/gluster/mnt10/brick 49153 0 Y 3956 >>> Brick s02-stg:/gluster/mnt10/brick 49153 0 Y 3956 >>> Brick s03-stg:/gluster/mnt10/brick 49153 0 Y 3953 >>> Brick s01-stg:/gluster/mnt11/brick 49153 0 Y 3956 >>> Brick s02-stg:/gluster/mnt11/brick 49153 0 Y 3956 >>> Brick s03-stg:/gluster/mnt11/brick 49153 0 Y 3953 >>> Brick s01-stg:/gluster/mnt12/brick 49153 0 Y 3956 >>> Brick s02-stg:/gluster/mnt12/brick 49153 0 Y 3956 >>> Brick s03-stg:/gluster/mnt12/brick 49153 0 Y 3953 >>> Brick s04-stg:/gluster/mnt1/brick 49153 0 Y 3433 >>> Brick s04-stg:/gluster/mnt2/brick 49153 0 Y 3433 >>> Brick s04-stg:/gluster/mnt3/brick 49153 0 Y 3433 >>> Brick s04-stg:/gluster/mnt4/brick 49153 0 Y 3433 >>> Brick s04-stg:/gluster/mnt5/brick 49153 0 Y 3433 >>> Brick s04-stg:/gluster/mnt6/brick 49153 0 Y 3433 >>> Brick s04-stg:/gluster/mnt7/brick 49153 0 Y 3433 >>> Brick s04-stg:/gluster/mnt8/brick 49153 0 Y 3433 >>> Brick s04-stg:/gluster/mnt9/brick 49153 0 Y 3433 >>> Brick s04-stg:/gluster/mnt10/brick 49153 0 Y 3433 >>> Brick s04-stg:/gluster/mnt11/brick 49153 0 Y 3433 >>> Brick s04-stg:/gluster/mnt12/brick 49153 0 Y 3433 >>> Brick s05-stg:/gluster/mnt1/brick 49153 0 Y 3709 >>> Brick s05-stg:/gluster/mnt2/brick 49153 0 Y 3709 >>> Brick s05-stg:/gluster/mnt3/brick 49153 0 Y 3709 >>> Brick 
s05-stg:/gluster/mnt4/brick 49153 0 Y 3709 >>> Brick s05-stg:/gluster/mnt5/brick 49153 0 Y 3709 >>> Brick s05-stg:/gluster/mnt6/brick 49153 0 Y 3709 >>> Brick s05-stg:/gluster/mnt7/brick 49153 0 Y 3709 >>> Brick s05-stg:/gluster/mnt8/brick 49153 0 Y 3709 >>> Brick s05-stg:/gluster/mnt9/brick 49153 0 Y 3709 >>> Brick s05-stg:/gluster/mnt10/brick 49153 0 Y 3709 >>> Brick s05-stg:/gluster/mnt11/brick 49153 0 Y 3709 >>> Brick s05-stg:/gluster/mnt12/brick 49153 0 Y 3709 >>> Brick s06-stg:/gluster/mnt1/brick 49153 0 Y 3644 >>> Brick s06-stg:/gluster/mnt2/brick 49153 0 Y 3644 >>> Brick s06-stg:/gluster/mnt3/brick 49153 0 Y 3644 >>> Brick s06-stg:/gluster/mnt4/brick 49153 0 Y 3644 >>> Brick s06-stg:/gluster/mnt5/brick 49153 0 Y 3644 >>> Brick s06-stg:/gluster/mnt6/brick 49153 0 Y 3644 >>> Brick s06-stg:/gluster/mnt7/brick 49153 0 Y 3644 >>> Brick s06-stg:/gluster/mnt8/brick 49153 0 Y 3644 >>> Brick s06-stg:/gluster/mnt9/brick 49153 0 Y 3644 >>> Brick s06-stg:/gluster/mnt10/brick 49153 0 Y 3644 >>> Brick s06-stg:/gluster/mnt11/brick 49153 0 Y 3644 >>> Brick s06-stg:/gluster/mnt12/brick 49153 0 Y 3644 >>> Self-heal Daemon on localhost N/A N/A Y 79376 >>> Quota Daemon on localhost N/A N/A Y 79472 >>> Bitrot Daemon on localhost N/A N/A Y 79485 >>> Scrubber Daemon on localhost N/A N/A Y 79505 >>> Self-heal Daemon on s03-stg N/A N/A Y 77073 >>> Quota Daemon on s03-stg N/A N/A Y 77148 >>> Bitrot Daemon on s03-stg N/A N/A Y 77160 >>> Scrubber Daemon on s03-stg N/A N/A Y 77191 >>> Self-heal Daemon on s02-stg N/A N/A Y 80150 >>> Quota Daemon on s02-stg N/A N/A Y 80226 >>> Bitrot Daemon on s02-stg N/A N/A Y 80238 >>> Scrubber Daemon on s02-stg N/A N/A Y 80269 >>> Self-heal Daemon on s04-stg N/A N/A Y 106815 >>> Quota Daemon on s04-stg N/A N/A Y 106866 >>> Bitrot Daemon on s04-stg N/A N/A Y 106878 >>> Scrubber Daemon on s04-stg N/A N/A Y 106897 >>> Self-heal Daemon on s05-stg N/A N/A Y 130807 >>> Quota Daemon on s05-stg N/A N/A Y 130884 >>> Bitrot Daemon on s05-stg N/A N/A Y 130896 >>> Scrubber Daemon on s05-stg N/A N/A Y 130927 >>> Self-heal Daemon on s06-stg N/A N/A Y 157146 >>> Quota Daemon on s06-stg N/A N/A Y 157239 >>> Bitrot Daemon on s06-stg N/A N/A Y 157252 >>> Scrubber Daemon on s06-stg N/A N/A Y 157288 >>> >>> Task Status of Volume tier2 >>> ------------------------------------------------------------------------------ >>> Task : Remove brick >>> ID : 06ec63bb-a441-4b85-b3cf-ac8e9df4830f >>> Removed bricks: >>> s04-stg:/gluster/mnt1/brick >>> s04-stg:/gluster/mnt2/brick >>> s04-stg:/gluster/mnt3/brick >>> s04-stg:/gluster/mnt4/brick >>> s04-stg:/gluster/mnt5/brick >>> s04-stg:/gluster/mnt6/brick >>> Status : in progress >>> >>> [root at s01 ~]# ps -ef|grep glusterfs >>> root 3956 1 79 set25 ? 2-14:33:57 /usr/sbin/glusterfsd -s s01-stg --volfile-id tier2.s01-stg.gluster-mnt1-brick -p /var/run/gluster/vols/tier2/s01-stg-gluster-mnt1-brick.pid -S /var/run/gluster/a889b8a21ac2afcbfa0563b9dd4db265.socket --brick-name /gluster/mnt1/brick -l /var/log/glusterfs/bricks/gluster-mnt1-brick.log --xlator-option *-posix.glusterd-uuid=b734b083-4630-4523-9402-05d03565efee --brick-port 49153 --xlator-option tier2-server.listen-port=49153 >>> root 79376 1 0 09:16 ? 00:04:16 /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/run/gluster/glustershd/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/gluster/4fab1a27e6ee700b3b9a3b3393ab7445.socket --xlator-option *replicate*.node-uuid=b734b083-4630-4523-9402-05d03565efee >>> root 79472 1 0 09:16 ? 
00:00:42 /usr/sbin/glusterfs -s localhost --volfile-id gluster/quotad -p /var/run/gluster/quotad/quotad.pid -l /var/log/glusterfs/quotad.log -S /var/run/gluster/958ab34799fc58f4dfe20e5732eea70b.socket --xlator-option *replicate*.data-self-heal=off --xlator-option *replicate*.metadata-self-heal=off --xlator-option *replicate*.entry-self-heal=off >>> root 79485 1 7 09:16 ? 00:40:43 /usr/sbin/glusterfs -s localhost --volfile-id gluster/bitd -p /var/run/gluster/bitd/bitd.pid -l /var/log/glusterfs/bitd.log -S /var/run/gluster/b2ea9da593fae1bc4d94e65aefdbdda9.socket --global-timer-wheel >>> root 79505 1 0 09:16 ? 00:00:01 /usr/sbin/glusterfs -s localhost --volfile-id gluster/scrub -p /var/run/gluster/scrub/scrub.pid -l /var/logglusterfs/scrub.log -S /var/run/gluster/ee7886cbcf8d2adf261084b608c905d5.socket --global-timer-wheel >>> root 137362 137225 0 17:53 pts/0 00:00:00 grep --color=auto glusterfs >>> >>> Il giorno 28 set 2018, alle ore 17:47, Ashish Pandey <aspandey at redhat.com <mailto:aspandey at redhat.com>> ha scritto: >>> >>> >>> >>> From: "Mauro Tridici" <mauro.tridici at cmcc.it <mailto:mauro.tridici at cmcc.it>> >>> To: "Ashish Pandey" <aspandey at redhat.com <mailto:aspandey at redhat.com>> >>> Cc: "Gluster Users" <gluster-users at gluster.org <mailto:gluster-users at gluster.org>> >>> Sent: Friday, September 28, 2018 9:08:52 PM >>> Subject: Re: [Gluster-users] Rebalance failed on Distributed Disperse volume based on 3.12.14 version >>> >>> Thank you, Ashish. >>> >>> I will study and try your solution on my virtual env. >>> How I can detect the process of a brick on gluster server? >>> >>> Many Thanks, >>> Mauro >>> >>> >>> gluster v status <volname> will give you the list of bricks and the respective process id. >>> Also, you can use "ps aux | grep glusterfs" to see all the processes on a node but I think the above step also do the same. >>> >>> --- >>> Ashish >>> >>> >>> >>> Il ven 28 set 2018 16:39 Ashish Pandey <aspandey at redhat.com <mailto:aspandey at redhat.com>> ha scritto: >>> >>> >>> From: "Mauro Tridici" <mauro.tridici at cmcc.it <mailto:mauro.tridici at cmcc.it>> >>> To: "Ashish Pandey" <aspandey at redhat.com <mailto:aspandey at redhat.com>> >>> Cc: "gluster-users" <gluster-users at gluster.org <mailto:gluster-users at gluster.org>> >>> Sent: Friday, September 28, 2018 7:08:41 PM >>> Subject: Re: [Gluster-users] Rebalance failed on Distributed Disperse volume based on 3.12.14 version >>> >>> >>> Dear Ashish, >>> >>> please excuse me, I'm very sorry for misunderstanding. >>> Before contacting you during last days, we checked all network devices (switch 10GbE, cables, NICs, servers ports, and so on), operating systems version and settings, network bonding configuration, gluster packages versions, tuning profiles, etc. but everything seems to be ok. The first 3 servers (and volume) operated without problem for one year. After we added the new 3 servers we noticed something wrong. >>> Fortunately, yesterday you gave me an hand to understand where is (or could be) the problem. >>> >>> At this moment, after we re-launched the remove-brick command, it seems that the rebalance is going ahead without errors, but it is only scanning the files. >>> May be that during the future data movement some errors could appear. >>> >>> For this reason, it could be useful to know how to proceed in case of a new failure: insist with approach n.1 or change the strategy? >>> We are thinking to try to complete the running remove-brick procedure and make a decision based on the outcome. 
>>> >>> Question: could we start approach n.2 also after having successfully removed the V1 subvolume?! >>> >>> >>> Yes, we can do that. My idea is to use replace-brick command. >>> We will kill "ONLY" one brick process on s06. We will format this brick. Then use replace-brick command to replace brick of a volume on s05 with this formatted brick. >>> heal will be triggered and data of the respective volume will be placed on this brick. >>> >>> Now, we can format the brick which got freed up on s05 and replace the brick which we killed on s06 to s05. >>> During this process, we have to make sure heal completed before trying any other replace/kill brick. >>> >>> It is tricky but looks doable. Think about it and try to perform it on your virtual environment first before trying on production. >>> ------- >>> >>> If it is still possible, could you please illustrate the approach n.2 even if I dont have free disks? >>> I would like to start thinking about it and test it on a virtual environment. >>> >>> Thank you in advance for your help and patience. >>> Regards, >>> Mauro >>> >>> >>> >>> Il giorno 28 set 2018, alle ore 14:36, Ashish Pandey <aspandey at redhat.com <mailto:aspandey at redhat.com>> ha scritto: >>> >>> >>> We could have taken approach -2 even if you did not have free disks. You should have told me why are you >>> opting Approach-1 or perhaps I should have asked. >>> I was wondering for approach 1 because sometimes re-balance takes time depending upon the data size. >>> >>> Anyway, I hope whole setup is stable, I mean it is not in the middle of something which we can not stop. >>> If free disks are the only concern I will give you some more steps to deal with it and follow the approach 2. >>> >>> Let me know once you think everything is fine with the system and there is nothing to heal. >>> >>> --- >>> Ashish >>> >>> From: "Mauro Tridici" <mauro.tridici at cmcc.it <mailto:mauro.tridici at cmcc.it>> >>> To: "Ashish Pandey" <aspandey at redhat.com <mailto:aspandey at redhat.com>> >>> Cc: "gluster-users" <gluster-users at gluster.org <mailto:gluster-users at gluster.org>> >>> Sent: Friday, September 28, 2018 4:21:03 PM >>> Subject: Re: [Gluster-users] Rebalance failed on Distributed Disperse volume based on 3.12.14 version >>> >>> >>> Hi Ashish, >>> >>> as I said in my previous message, we adopted the first approach you suggested (setting network.ping-timeout option to 0). >>> This choice was due to the absence of empty brick to be used as indicated in the second approach. >>> >>> So, we launched remove-brick command on the first subvolume (V1, bricks 1,2,3,4,5,6 on server s04). >>> Rebalance started moving the data across the other bricks, but, after about 3TB of moved data, rebalance speed slowed down and some transfer errors appeared in the rebalance.log of server s04. 
>>> At this point, since the remaining 1.8 TB still needs to be moved in order to complete this step, we decided to stop the remove-brick execution and start it again (I hope it doesn't stop again before the rebalance completes).
>>>
>>> Right now rebalance is not moving any data, it's only scanning files (please take a look at the following output):
>>>
>>> [root at s01 ~]# gluster volume remove-brick tier2 s04-stg:/gluster/mnt1/brick s04-stg:/gluster/mnt2/brick s04-stg:/gluster/mnt3/brick s04-stg:/gluster/mnt4/brick s04-stg:/gluster/mnt5/brick s04-stg:/gluster/mnt6/brick status
>>>     Node   Rebalanced-files   size     scanned   failures   skipped   status        run time in h:m:s
>>>     ---------   -----------   -----------   -----------   -----------   -----------   ------------   --------------
>>>     s04-stg     0             0Bytes        182008        0             0             in progress    3:08:09
>>> Estimated time left for rebalance to complete : 442:45:06
>>>
>>> If I'm not wrong, remove-brick rebalances the entire cluster each time it starts.
>>> Is there a way to speed up this procedure? Do you have any other suggestion that, in this particular case, could help reduce the errors (I know they are related to the current volume configuration) and improve rebalance performance without rebalancing the entire cluster?
>>>
>>> Thank you in advance,
>>> Mauro
>>>
>>> On 27 Sep 2018, at 13:14, Ashish Pandey <aspandey at redhat.com> wrote:
>>>
>>> Yes, you can.
>>> If not me, others may also reply.
>>>
>>> ---
>>> Ashish
>>>
>>> From: "Mauro Tridici" <mauro.tridici at cmcc.it>
>>> To: "Ashish Pandey" <aspandey at redhat.com>
>>> Cc: "gluster-users" <gluster-users at gluster.org>
>>> Sent: Thursday, September 27, 2018 4:24:12 PM
>>> Subject: Re: [Gluster-users] Rebalance failed on Distributed Disperse volume based on 3.12.14 version
>>>
>>> Dear Ashish,
>>>
>>> I cannot thank you enough!
>>> Your procedure and description are very detailed.
>>> I plan to follow the first approach after setting the network.ping-timeout option to 0 (if I'm not wrong, "0" means "infinite"; I noticed that this value reduced the rebalance errors).
>>> After the fix I will set network.ping-timeout back to its default value.
>>>
>>> Could I contact you again if I need some kind of suggestion?
>>>
>>> Thank you very much again.
>>> Have a good day,
>>> Mauro
>>>
>>> On 27 Sep 2018, at 12:38, Ashish Pandey <aspandey at redhat.com> wrote:
>>>
>>> Hi Mauro,
>>>
>>> We can divide the 36 newly added bricks into 6 sets of 6 bricks each, starting from brick37.
>>> That means there are 6 EC subvolumes and we have to deal with one subvolume at a time.
>>> I have named them V1 to V6.
>>>
>>> Problem:
>>> Take the case of V1.
>>> The best configuration/setup would be to have all 6 bricks of V1 on 6 different nodes.
>>> However, in your case you have added only 3 new nodes, so at best we can have 2 bricks on each of the 3 newly added nodes.
>>> This way, in a 4+2 EC configuration, even if one node goes down you will still have 4 other bricks of that subvolume
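For reference, one round of the replace-brick approach Ashish describes above could look roughly like the commands below. This is only a sketch: the source/target brick paths are hypothetical examples (the real ones have to be picked per EC subvolume), and the whole sequence should be rehearsed on a test volume first.

# locate the brick processes first (with brick multiplexing on, many bricks share one PID)
gluster volume status tier2
# move one brick of an EC subvolume from s05 onto the freshly formatted disk on s06
# (hypothetical paths; "commit force" is the only replace-brick mode supported on 3.x)
gluster volume replace-brick tier2 s05-stg:/gluster/mnt2/brick s06-stg:/gluster/mnt2/brick_new commit force
# make sure self-heal has fully rebuilt the data before killing or replacing any other brick
gluster volume heal tier2 info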
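The ping-timeout workaround and the remove-brick cycle discussed above translate into commands like the following (again only a sketch; the volume and brick list are the ones already used in the thread, and the default network.ping-timeout of 42 seconds is restored at the end with a volume reset):

# temporarily disable the ping timeout while the rebalance runs
gluster volume set tier2 network.ping-timeout 0
# start draining the first EC subvolume and monitor it
gluster volume remove-brick tier2 s04-stg:/gluster/mnt1/brick s04-stg:/gluster/mnt2/brick s04-stg:/gluster/mnt3/brick s04-stg:/gluster/mnt4/brick s04-stg:/gluster/mnt5/brick s04-stg:/gluster/mnt6/brick start
gluster volume remove-brick tier2 s04-stg:/gluster/mnt1/brick s04-stg:/gluster/mnt2/brick s04-stg:/gluster/mnt3/brick s04-stg:/gluster/mnt4/brick s04-stg:/gluster/mnt5/brick s04-stg:/gluster/mnt6/brick status
# commit only when status reports "completed" with no failures, then restore the option
gluster volume remove-brick tier2 s04-stg:/gluster/mnt1/brick s04-stg:/gluster/mnt2/brick s04-stg:/gluster/mnt3/brick s04-stg:/gluster/mnt4/brick s04-stg:/gluster/mnt5/brick s04-stg:/gluster/mnt6/brick commit
gluster volume reset tier2 network.ping-timeout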
Nithya Balachandran
2018-Oct-04 04:05 UTC
[Gluster-users] Rebalance failed on Distributed Disperse volume based on 3.12.14 version
Hi Mauro,

It looks like all of these are actual files that were not migrated. Please send me the rebalance logs for this node so I can check for any migration errors.

As this is a disperse volume, copying the files to the mount will be difficult. Ashish, how would we go about this?

Regards,
Nithya
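A quick way to check whether the leftover entries are DHT link-to ("T") files, as Nithya suggests, is sketched below. The brick path is one of those already mentioned in the thread and the file path is a placeholder; the check assumes the usual DHT markers (mode ---------T and the trusted.glusterfs.dht.linkto xattr).

# link-to files are zero-byte entries whose mode is exactly 1000 (shown by ls -l as ---------T)
find /gluster/mnt1/brick -type f -perm 1000 ! -path "*/.glusterfs/*" -ls
# a genuine link-to file also carries the DHT linkto xattr pointing at the subvolume that holds the real file
getfattr -m . -d -e hex /gluster/mnt1/brick/path/to/leftover_file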