I was able to solve the issue by restarting all servers.

Now I have another issue! I just powered off the gluster01 server and the geo-replication entered a Faulty state. I tried to stop and resume the geo-replication like this:

gluster volume geo-replication DATA root@gluster03::DATA-SLAVE resume
Peer gluster01.home.local, which is a part of DATA volume, is down. Please bring up the peer and retry.
geo-replication command failed

How can I have geo-replication with 2 masters and 1 slave?

Thanks

---
Gilberto Nunes Ferreira


On Mon, Oct 26, 2020 at 17:23, Gilberto Nunes <gilberto.nunes32 at gmail.com> wrote:

> Hi there...
>
> I created a replica 2 Gluster volume plus another Gluster server acting as a
> backup server, using geo-replication.
> So on gluster01 I issued the commands:
>
> gluster peer probe gluster02; gluster peer probe gluster03
> gluster vol create DATA replica 2 gluster01:/DATA/master01-data gluster02:/DATA/master01-data/
>
> Then on the gluster03 server:
>
> gluster vol create DATA-SLAVE gluster03:/DATA/slave-data/
>
> I set up passwordless SSH sessions between these 3 servers.
>
> Then I used this script
>
> https://github.com/gilbertoferreira/georepsetup
>
> like this:
>
> georepsetup
> /usr/local/lib/python2.7/dist-packages/paramiko-2.7.2-py2.7.egg/paramiko/transport.py:33: CryptographyDeprecationWarning: Python 2 is no longer supported by the Python core team. Support for it is now deprecated in cryptography, and will be removed in a future release.
>   from cryptography.hazmat.backends import default_backend
> usage: georepsetup [-h] [--force] [--no-color] MASTERVOL SLAVE SLAVEVOL
> georepsetup: error: too few arguments
>
> gluster01:~# georepsetup DATA gluster03 DATA-SLAVE
> /usr/local/lib/python2.7/dist-packages/paramiko-2.7.2-py2.7.egg/paramiko/transport.py:33: CryptographyDeprecationWarning: Python 2 is no longer supported by the Python core team. Support for it is now deprecated in cryptography, and will be removed in a future release.
>   from cryptography.hazmat.backends import default_backend
> Geo-replication session will be established between DATA and gluster03::DATA-SLAVE
> Root password of gluster03 is required to complete the setup. NOTE: Password will not be stored.
>
> root@gluster03's password:
> [    OK] gluster03 is Reachable (Port 22)
> [    OK] SSH Connection established root@gluster03
> [    OK] Master Volume and Slave Volume are compatible (Version: 8.2)
> [    OK] Common secret pub file present at /var/lib/glusterd/geo-replication/common_secret.pem.pub
> [    OK] common_secret.pem.pub file copied to gluster03
> [    OK] Master SSH Keys copied to all Up Slave nodes
> [    OK] Updated Master SSH Keys to all Up Slave nodes authorized_keys file
> [    OK] Geo-replication Session Established
>
> Then I rebooted the 3 servers...
> After a while everything works OK, but after a few minutes I get a Faulty
> status on gluster01...
>
> Here is the log:
>
> [2020-10-26 20:16:41.362584] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}]
> [2020-10-26 20:16:41.362937] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/DATA/master01-data}, {slave_node=gluster03}]
> [2020-10-26 20:16:41.508884] I [resource(worker /DATA/master01-data):1387:connect_remote] SSH: Initializing SSH connection between master and slave...
> [2020-10-26 20:16:42.996678] I [resource(worker /DATA/master01-data):1436:connect_remote] SSH: SSH connection between master and slave established. [{duration=1.4873}]
> [2020-10-26 20:16:42.997121] I [resource(worker /DATA/master01-data):1116:connect] GLUSTER: Mounting gluster volume locally...
> [2020-10-26 20:16:44.170661] E [syncdutils(worker /DATA/master01-data):110:gf_mount_ready] <top>: failed to get the xattr value
> [2020-10-26 20:16:44.171281] I [resource(worker /DATA/master01-data):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.1739}]
> [2020-10-26 20:16:44.171772] I [subcmds(worker /DATA/master01-data):84:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor
> [2020-10-26 20:16:46.200603] I [master(worker /DATA/master01-data):1645:register] _GMaster: Working dir [{path=/var/lib/misc/gluster/gsyncd/DATA_gluster03_DATA-SLAVE/DATA-master01-data}]
> [2020-10-26 20:16:46.201798] I [resource(worker /DATA/master01-data):1292:service_loop] GLUSTER: Register time [{time=1603743406}]
> [2020-10-26 20:16:46.226415] I [gsyncdstatus(worker /DATA/master01-data):281:set_active] GeorepStatus: Worker Status Change [{status=Active}]
> [2020-10-26 20:16:46.395112] I [gsyncdstatus(worker /DATA/master01-data):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change [{status=History Crawl}]
> [2020-10-26 20:16:46.396491] I [master(worker /DATA/master01-data):1559:crawl] _GMaster: starting history crawl [{turns=1}, {stime=(1603742506, 0)}, {etime=1603743406}, {entry_stime=(1603743226, 0)}]
> [2020-10-26 20:16:46.399292] E [resource(worker /DATA/master01-data):1312:service_loop] GLUSTER: Changelog History Crawl failed [{error=[Errno 0] Sucesso}]
> [2020-10-26 20:16:47.177205] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/DATA/master01-data}]
> [2020-10-26 20:16:47.184525] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}]
>
> Any advice will be welcome.
>
> Thanks
>
> ---
> Gilberto Nunes Ferreira
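
For reference, the steps the georepsetup script automates map roughly onto Gluster's built-in geo-replication CLI. The following is only a sketch, not the exact commands the script runs: it reuses the hostnames and volume names from the message above, and the exact options can differ between Gluster releases.

# on a master node (e.g. gluster01): generate the common pem keys
gluster system:: execute gsec_create

# create the geo-replication session and push the pem keys to the slave
gluster volume geo-replication DATA gluster03::DATA-SLAVE create push-pem

# start the session and verify its state
gluster volume geo-replication DATA gluster03::DATA-SLAVE start
gluster volume geo-replication DATA gluster03::DATA-SLAVE status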
Usually there is only 1 "master", but when you power off one of the 2 nodes the geo-replication should handle that and the second node should take over the job.

How long did you wait after gluster01 was rebooted?

Best Regards,
Strahil Nikolov


On Monday, 26 October 2020, 22:46:21 GMT+2, Gilberto Nunes <gilberto.nunes32 at gmail.com> wrote:

> I just powered off the gluster01 server and the geo-replication entered a Faulty state.
> I tried to stop and resume the geo-replication like this:
>
> gluster volume geo-replication DATA root@gluster03::DATA-SLAVE resume
> Peer gluster01.home.local, which is a part of DATA volume, is down. Please bring up the peer and retry.
> geo-replication command failed
>
> How can I have geo-replication with 2 masters and 1 slave?
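
A quick way to confirm whether the surviving node has taken over, as Strahil describes, is the geo-replication status output, which lists one worker per replica pair as Active and the other as Passive. A minimal sketch, assuming the session names used in this thread:

# summary view: look for Active/Passive/Faulty per brick
gluster volume geo-replication DATA gluster03::DATA-SLAVE status

# detailed view adds crawl status and last-synced times
gluster volume geo-replication DATA gluster03::DATA-SLAVE status detail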
Hi Gilberto,

Happy to see the georepsetup tool is useful for you. The repo has moved to https://github.com/aravindavk/gluster-georep-tools (renamed to "gluster-georep-setup").

I think the geo-replication command failure is because the respective node's (peer's) glusterd is not reachable/down.

Aravinda Vishwanathapura
https://kadalu.io


> On 27-Oct-2020, at 2:15 AM, Gilberto Nunes <gilberto.nunes32 at gmail.com> wrote:
>
> I was able to solve the issue by restarting all servers.
>
> Now I have another issue!
>
> I just powered off the gluster01 server and the geo-replication entered a Faulty state.
> I tried to stop and resume the geo-replication like this:
>
> gluster volume geo-replication DATA root@gluster03::DATA-SLAVE resume
> Peer gluster01.home.local, which is a part of DATA volume, is down. Please bring up the peer and retry.
> geo-replication command failed
>
> How can I have geo-replication with 2 masters and 1 slave?
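
The checks implied by Aravinda's reply can be sketched as below. This is not from his message: it assumes a systemd-based install and that the resume syntax matches the Gluster 8 documentation, and the trailing "force" behavior should be verified against your version before relying on it.

# confirm which peer glusterd considers disconnected
gluster peer status
gluster pool list

# once glusterd on gluster01 is reachable again, retry the resume
systemctl start glusterd                                   # run on gluster01
gluster volume geo-replication DATA root@gluster03::DATA-SLAVE resume

# some geo-replication subcommands accept a trailing "force" to proceed
# while a peer is down; check the built-in help on your release first
gluster volume geo-replication DATA root@gluster03::DATA-SLAVE resume force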