Hi Strahil,
The B cluster nodes communicate with each other via a LAN, and it seems the
A cluster has picked up B's LAN addresses (which aren't accessible from the
internet, including from the A cluster) through the geo-replication process.
That being the case, I think we'll have to re-do the B cluster to replicate
using public addresses instead of the LAN.
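For reference, if we do end up trying Strahil's /etc/hosts suggestion (below)
first, a minimal sketch of what it would look like on each A node is this --
the 203.0.113.x addresses are placeholders, not the B nodes' real public IPs:

```
# /etc/hosts on each A-cluster node (cafs10, cafs20, cafs30)
# 203.0.113.x are placeholder addresses - substitute the B nodes' real public IPs
203.0.113.11   nvfs10.local
203.0.113.12   nvfs20.local
203.0.113.13   nvfs30.local
```

If that made the .local names resolve (and the firewall allowed the Gluster
ports), we could presumably then re-try the geo-rep create with push-pem as
Aravinda suggested, e.g. "gluster volume geo-replication gvol0
nvfs10.example.com::gvol0 create ssh-port 8822 push-pem", rather than
rebuilding the B cluster.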
Thank you.
On Tue, 3 Mar 2020 at 18:07, Strahil Nikolov <hunter86_bg at yahoo.com>
wrote:
> On March 3, 2020 4:13:38 AM GMT+02:00, David Cunningham <dcunningham at voisonics.com> wrote:
> >Hello,
> >
> >Thanks for that. When we re-tried with push-pem from cafs10 (on the
> >A/master cluster) it failed with "Unable to mount and fetch slave volume
> >details." and in the logs we see:
> >
> >[2020-03-03 02:07:42.614911] E [name.c:258:af_inet_client_get_remote_sockaddr] 0-gvol0-client-0: DNS resolution failed on host nvfs10.local
> >[2020-03-03 02:07:42.638824] E [name.c:258:af_inet_client_get_remote_sockaddr] 0-gvol0-client-1: DNS resolution failed on host nvfs20.local
> >[2020-03-03 02:07:42.664493] E [name.c:258:af_inet_client_get_remote_sockaddr] 0-gvol0-client-2: DNS resolution failed on host nvfs30.local
> >
> >These .local addresses are the LAN addresses that B/slave nodes nvfs10,
> >nvfs20, and nvfs30 replicate with. It seems that the A/master needs to be
> >able to contact those addresses. Is that right? If it is then we'll need to
> >re-do the B cluster to replicate using publicly accessible IP addresses
> >instead of their LAN.
> >
> >Thank you.
> >
> >
> >On Mon, 2 Mar 2020 at 20:53, Aravinda VK <aravinda at kadalu.io> wrote:
> >
> >> This looks like a setup issue to me. Copying the SSH keys manually is not
> >> required.
> >>
> >> A command prefix is required when the key is added to the authorized_keys
> >> file on each remote node, and that prefix will not be present if the SSH
> >> keys are added manually.
> >>
> >> Geo-rep specifies /nonexisting/gsyncd in the command to make sure it
> >> connects via the actual command specified in the authorized_keys file; in
> >> your case, since that prefix is missing, Geo-replication ends up actually
> >> looking for the gsyncd command at the /nonexisting/gsyncd path.
> >>
> >> Please try with the push-pem option during the Geo-rep create command.
> >>
> >> --
> >> regards
> >> Aravinda Vishwanathapura
> >> https://kadalu.io
> >>
> >>
> >> On 02-Mar-2020, at 6:03 AM, David Cunningham <dcunningham at voisonics.com> wrote:
> >>
> >> Hello,
> >>
> >> We've set up geo-replication but it isn't actually syncing. The scenario
> >> is that we have two GFS clusters. Cluster A has nodes cafs10, cafs20, and
> >> cafs30, replicating with each other over a LAN. Cluster B has nodes
> >> nvfs10, nvfs20, and nvfs30, also replicating with each other over a LAN.
> >> We are geo-replicating data from the A cluster to the B cluster over the
> >> internet. SSH key access is set up, allowing all the A nodes
> >> password-less access to root on nvfs10.
> >>
> >> Geo-replication was set up using these commands, run on cafs10:
> >>
> >> gluster volume geo-replication gvol0 nvfs10.example.com::gvol0 create ssh-port 8822 no-verify
> >> gluster volume geo-replication gvol0 nvfs10.example.com::gvol0 config remote-gsyncd /usr/lib/x86_64-linux-gnu/glusterfs/gsyncd
> >> gluster volume geo-replication gvol0 nvfs10.example.com::gvol0 start
> >>
> >> However, after a very short period of the status being "Initializing..."
> >> the status then sits on "Passive":
> >>
> >> # gluster volume geo-replication gvol0 nvfs10.example.com::gvol0 status
> >> MASTER NODE    MASTER VOL    MASTER BRICK                        SLAVE USER    SLAVE                        SLAVE NODE      STATUS     CRAWL STATUS    LAST_SYNCED
> >> ------------------------------------------------------------------------------------------------------------------------------------------------------------------
> >> cafs10         gvol0         /nodirectwritedata/gluster/gvol0    root          nvfs10.example.com::gvol0    nvfs30.local    Passive    N/A             N/A
> >> cafs30         gvol0         /nodirectwritedata/gluster/gvol0    root          nvfs10.example.com::gvol0    N/A             Created    N/A             N/A
> >> cafs20         gvol0         /nodirectwritedata/gluster/gvol0    root          nvfs10.example.com::gvol0    N/A             Created    N/A             N/A
> >>
> >> So my questions are:
> >> 1. Why does the status on cafs10 mention "nvfs30.local"? That's the LAN
> >> address that nvfs10 uses to replicate with nvfs30. It's not accessible
> >> from the A cluster, and I didn't use it when configuring geo-replication.
> >> 2. Why does geo-replication sit in Passive status?
> >>
> >> Thanks very much for any assistance.
> >>
> >>
> >> On Tue, 25 Feb 2020 at 15:46, David Cunningham <dcunningham at voisonics.com> wrote:
> >>
> >>> Hi Aravinda and Sunny,
> >>>
> >>> Thank you for the replies. We have 3 replicating nodes on the master
> >>> side, and want to geo-replicate their data to the remote slave side. As
> >>> I understand it, if the master node on which the geo-replication create
> >>> command was run goes down, then another node will take over pushing
> >>> updates to the remote slave. Is that right?
> >>>
> >>> We have already taken care of adding all master nodes' SSH keys to the
> >>> remote slave's authorized_keys externally, so we won't include the
> >>> push-pem part of the create command.
> >>>
> >>> Mostly I wanted to confirm the geo-replication behaviour on the
> >>> replicating master nodes if one of them goes down.
> >>>
> >>> Thank you!
> >>>
> >>>
> >>> On Tue, 25 Feb 2020 at 14:32, Aravinda VK <aravinda at kadalu.io> wrote:
> >>>
> >>>> Hi David,
> >>>>
> >>>>
> >>>> On 25-Feb-2020, at 3:45 AM, David Cunningham <dcunningham at voisonics.com> wrote:
> >>>>
> >>>> Hello,
> >>>>
> >>>> I've a couple of questions on geo-replication that hopefully someone
> >>>> can help with:
> >>>>
> >>>> 1. If there are multiple nodes in a cluster on the master side (pushing
> >>>> updates to the geo-replication slave), which node actually does the
> >>>> pushing? Does GlusterFS decide this automatically?
> >>>>
> >>>>
> >>>> Once the Geo-replication session is started, one worker is started for
> >>>> each Master brick. Each worker identifies the changes that happened on
> >>>> its respective brick and syncs those changes via a mount. This way the
> >>>> load is distributed among the Master nodes. In the case of a Replica
> >>>> sub-volume, one worker in the Replica group becomes Active and
> >>>> participates in the syncing; the other bricks in that Replica group
> >>>> remain Passive. A Passive worker becomes Active if the previously
> >>>> Active brick goes down (this is because all Replica bricks have the
> >>>> same set of changes, so syncing from each worker would be redundant).
> >>>>
> >>>>
> >>>> 2. With regard to copying SSH keys, presumably the SSH keys of all
> >>>> master nodes should be authorized on the geo-replication client side?
> >>>>
> >>>>
> >>>> The Geo-replication session is established between one master node and
> >>>> one remote node. If the Geo-rep create command is successful, then:
> >>>>
> >>>> - SSH keys are generated on all master nodes
> >>>> - Public keys from all master nodes are copied to the initiator Master node
> >>>> - Public keys are copied to the Remote node specified in the create command
> >>>> - Master public keys are distributed to all nodes of the remote Cluster
> >>>>   and added to the respective ~/.ssh/authorized_keys
> >>>>
> >>>> After a successful Geo-rep create command, any Master node can connect
> >>>> to any remote node via ssh.
> >>>>
> >>>> Security: a command prefix is added when the public key is added to the
> >>>> remote node's authorized_keys file, so that anyone who gains access
> >>>> using this key can only run the gsyncd command.
> >>>>
> >>>> ```
> >>>> command=gsyncd ssh-key...
> >>>> ```
> >>>>
> >>>>
> >>>>
> >>>> Thanks for your help.
> >>>>
> >>>> --
> >>>> David Cunningham, Voisonics Limited
> >>>> http://voisonics.com/
> >>>> USA: +1 213 221 1092
> >>>> New Zealand: +64 (0)28 2558 3782
> >>>>
> >>>> --
> >>>> regards
> >>>> Aravinda Vishwanathapura
> >>>> https://kadalu.io
> >>>>
> >>>>
> >>>
> >>> --
> >>> David Cunningham, Voisonics Limited
> >>> http://voisonics.com/
> >>> USA: +1 213 221 1092
> >>> New Zealand: +64 (0)28 2558 3782
> >>>
> >>
> >>
> >> --
> >> David Cunningham, Voisonics Limited
> >> http://voisonics.com/
> >> USA: +1 213 221 1092
> >> New Zealand: +64 (0)28 2558 3782
> >>
> >>
> >>
>
> Hey David,
>
> Why don't you set the B cluster's hostnames in /etc/hosts of all A cluster
> nodes?
>
> Maybe you won't need to rebuild the whole B cluster.
>
> I guess the A cluster nodes need to be able to reach all nodes from the B
> cluster, so you might need to change the firewall settings.
>
>
> Best Regards,
> Strahil Nikolov
>
--
David Cunningham, Voisonics Limited
http://voisonics.com/
USA: +1 213 221 1092
New Zealand: +64 (0)28 2558 3782