thr3ads.net - Gluster users - [Gluster-users] Gluster and bonding [Feb 2019]

Jorick Astrego

2019-Feb-25 12:44 UTC

[Gluster-users] Gluster and bonding

Hi,

Well no, mode 5 and mode 6 also have fault tolerance and don't need any
special switch config.

Quick google search:

https://serverfault.com/questions/734246/does-balance-alb-and-balance-tlb-support-fault-tolerance

    Bonding Mode 5 (balance-tlb) works by looking at all the devices in
    the bond, and sending out the slave with the least current traffic
    load. Traffic is only received by one slave (the "primary slave").
    If a slave is lost, that slave is not considered for transmission,
    so this mode is fault-tolerant.

    Bonding Mode 6 (balance-alb) works as above, except incoming ARP
    requests are intercepted by the bonding driver, and the bonding
    driver generates ARP replies so that external hosts are tricked into
    sending their traffic into one of the other bonding slaves instead
    of the primary slave. If many hosts in the same broadcast domain
    contact the bond, then traffic should balance roughly evenly into
    all slaves.

    If a slave is lost in Mode 6, then it may take some time for a
    remote host to time out its ARP table entry and send a new ARP
    request. A TCP or SCTP retransmission tends to lead to an ARP request
    fairly quickly, but a UDP datagram does not, and will rely on the
    usual ARP table refresh. So Mode 6 /is/ fault tolerant, but
    convergence on slave loss may take some time depending on the Layer
    4 protocol used.

    If you are worried about fast fault tolerance, then consider using
    Mode 4 (802.3ad aka LACP) which negotiates link aggregation between
    the bond and the switch, and constantly updates the link status
    between the aggregation partners. Mode 4 also has configurable load
    balance hashing so is better for in-order delivery of TCP streams
    compared to Mode 5 or Mode 6.
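
A minimal sketch of the Mode 5 transmit choice described in the quote above: pick the slave with the least load relative to its speed, and skip any slave that has lost link. This is a toy model, not the kernel driver's code; the `Slave` class, `pick_tx_slave` function, and the traffic figures are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Slave:
    name: str
    speed_mbps: int     # link speed (the real driver obtains this via ethtool)
    load_mbps: float    # current transmit load (illustrative numbers)
    link_up: bool = True


def pick_tx_slave(slaves):
    """Return the slave with link up and the lowest load relative to speed."""
    candidates = [s for s in slaves if s.link_up]
    if not candidates:
        raise RuntimeError("no slave with link up")
    return min(candidates, key=lambda s: s.load_mbps / s.speed_mbps)


bond = [Slave("eth0", 1000, 400.0), Slave("eth1", 1000, 150.0)]

# With both links up, the less loaded slave is chosen.
first_choice = pick_tx_slave(bond).name

# After eth1 loses link it is no longer considered, so traffic moves to eth0.
bond[1].link_up = False
after_failure = pick_tx_slave(bond).name

print(first_choice, after_failure)
```

The point of the sketch is only that a failed slave drops out of the candidate set, which is why the mode counts as fault tolerant on the transmit side.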

https://wiki.linuxfoundation.org/networking/bonding

  * *balance-tlb or 5*
    Adaptive transmit load balancing: channel bonding that does not
    require any special switch support. The outgoing traffic is
    distributed according to the current load (computed relative to the
    speed) on each slave. Incoming traffic is received by the current
    slave. *If the receiving slave fails, another slave takes over the
    MAC address of the failed receiving slave.*
      o Prerequisite:
         1. Ethtool support in the base drivers for retrieving the speed
            of each slave.
  * *balance-alb or 6*
    Adaptive load balancing: *includes balance-tlb plus receive load
    balancing* (rlb) for IPV4 traffic, and does not require any special
    switch support. The receive load balancing is achieved by ARP
    negotiation.
      o The bonding driver intercepts the ARP Replies sent by the local
        system on their way out and overwrites the source hardware
        address with the unique hardware address of one of the slaves in
        the bond such that different peers use different hardware
        addresses for the server.
      o Receive traffic from connections created by the server is also
        balanced. When the local system sends an ARP Request the bonding
        driver copies and saves the peer's IP information from the ARP
        packet.
      o When the ARP Reply arrives from the peer, its hardware address
        is retrieved and the bonding driver initiates an ARP reply to
        this peer assigning it to one of the slaves in the bond.
      o A problematic outcome of using ARP negotiation for balancing is
        that each time that an ARP request is broadcast it uses the
        hardware address of the bond. Hence, peers learn the hardware
        address of the bond and the balancing of receive traffic
        collapses to the current slave. This is handled by sending
        updates (ARP Replies) to all the peers with their individually
        assigned hardware address such that the traffic is
        redistributed. Receive traffic is also redistributed when a new
        slave is added to the bond and when an inactive slave is
        re-activated. The receive load is distributed sequentially
        (round robin) among the group of highest speed slaves in the bond.
      o When a link is reconnected or a new slave joins the bond the
        receive traffic is redistributed among all active slaves in the
        bond by initiating ARP Replies with the selected mac address to
        each of the clients. The updelay parameter (detailed below) must
        be set to a value equal or greater than the switch's forwarding
        delay so that the ARP Replies sent to the peers will not be
        blocked by the switch.
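
The receive-side assignment in the balance-alb description above can be sketched roughly as follows: peers are handed slave MAC addresses in round-robin order among the fastest slaves that have link, and the mapping is recomputed when a slave goes away or comes back. This is a simplified model under stated assumptions, not the driver's implementation; `assign_peers`, the MAC addresses, and the peer IPs are made up for illustration, and the ARP Replies that would announce each assignment are not modelled.

```python
from itertools import cycle


def assign_peers(peers, slaves):
    """Map each peer IP to the MAC address of one slave.

    slaves is a list of (mac, speed_mbps, link_up) tuples. Only the
    highest-speed slaves with link up are used, in round-robin order.
    """
    live = [s for s in slaves if s[2]]
    if not live:
        return {}
    top_speed = max(speed for _, speed, _ in live)
    fastest = [mac for mac, speed, _ in live if speed == top_speed]
    rr = cycle(fastest)
    return {peer: next(rr) for peer in peers}


peers = ["10.0.0.11", "10.0.0.12", "10.0.0.13", "10.0.0.14"]
slaves = [
    ("aa:aa:aa:aa:aa:01", 1000, True),
    ("aa:aa:aa:aa:aa:02", 1000, True),
]

# Both slaves up: peers alternate between the two MAC addresses.
before = assign_peers(peers, slaves)

# Second slave loses link: every peer is reassigned to the remaining slave.
slaves[1] = ("aa:aa:aa:aa:aa:02", 1000, False)
after = assign_peers(peers, slaves)

print(before)
print(after)
```

In the real mode, a peer only starts using its new MAC once it has processed the updated ARP information, which is where the convergence delay discussed in this thread comes from.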

On 2/25/19 1:16 PM, Martin Toth wrote:
> Hi Alex,
>
> you have to use bond mode 4 (LACP - 802.3ad) in order to achieve
> redundancy of cables/ports/switches. I suppose this is what you want.
>
> BR,
> Martin
>
>> On 25 Feb 2019, at 11:43, Alex K <rightkicktech at gmail.com> wrote:
>>
>> Hi All,
>>
>> I was asking if it is possible to have the two separate cables
>> connected to two different physical switches. When trying mode6 or
>> mode1 in this setup gluster was refusing to start the volumes, giving
>> me "transport endpoint is not connected".
>>
>> server1: cable1 ---------------- switch1 --------------------- server2: cable1
>>                                     |
>> server1: cable2 ---------------- switch2 --------------------- server2: cable2
>>
>> Both switches are connected with each other also. This is done to
>> achieve redundancy for the switches.
>> When disconnecting cable2 from both servers, then gluster was happy.
>> What could be the problem?
>>
>> Thanx,
>> Alex
>>
>>
>> On Mon, Feb 25, 2019 at 11:32 AM Jorick Astrego <jorick at netbulae.eu> wrote:
>>
>>     Hi,
>>
>>     We use bonding mode 6 (balance-alb) for GlusterFS traffic
>>
>>     https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.4/html/administration_guide/network4
>>
>>         Preferred bonding mode for Red Hat Gluster Storage client is
>>         mode 6 (balance-alb), this allows client to transmit writes
>>         in parallel on separate NICs much of the time.
>>
>>     Regards,
>>
>>     Jorick Astrego
>>
>>     On 2/25/19 5:41 AM, Dmitry Melekhov wrote:
>>>     On 23.02.2019 19:54, Alex K wrote:
>>>>     Hi all,
>>>>
>>>>     I have a replica 3 setup where each server was configured with
>>>>     dual interfaces in mode 6 bonding. All cables were connected to
>>>>     one common network switch.
>>>>
>>>>     To add redundancy to the switch and avoid it being a single point
>>>>     of failure, I connected the second cable of each server to a
>>>>     second switch. This turned out not to work: gluster was refusing
>>>>     to start the volume, logging "transport endpoint is
>>>>     disconnected", although all nodes were able to reach each other
>>>>     (ping) in the storage network. I switched to mode 1
>>>>     (active/passive) and initially it worked, but following a reboot
>>>>     of the whole cluster the same issue appeared. Gluster is not
>>>>     starting the volumes.
>>>>
>>>>     Isn't active/passive supposed to work like that? Can one have
>>>>     such a redundant network setup, or are there any other
>>>>     recommended approaches?
>>>>
>>>
>>>     Yes, we use LACP, I guess this is mode 4 (we use teamd). It
>>>     is, no doubt, the best way.
>>>
>>>
>>>>     Thanx,
>>>>     Alex
>>>>
>>>>     _______________________________________________
>>>>     Gluster-users mailing list
>>>>     Gluster-users at gluster.org
>>>>     https://lists.gluster.org/mailman/listinfo/gluster-users



Met vriendelijke groet, With kind regards,

Jorick Astrego

Netbulae Virtualization Experts 

----------------

	Tel: 053 20 30 270 	info at netbulae.eu 	Staalsteden 4-3A 	KvK 08198180
 	Fax: 053 20 30 271 	www.netbulae.eu 	7547 TA Enschede 	BTW NL821234584B01

----------------


Martin Toth

2019-Feb-25 13:22 UTC

[Gluster-users] Gluster and bonding

How long does it take your devices (using mode 5 or 6; ALB is preferred for
GlusterFS) to take over the MAC? This can result in your error,
"transport endpoint is not connected", because there are some timeouts within
gluster set by default.
I am using LACP and it works without any problem. Can you share your mode 5 / 6
configuration?

Thanks.
Martin
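
One way to collect the bonding state being asked for here is the status file the Linux bonding driver exposes under /proc/net/bonding/. The sketch below parses the mode and per-slave MII status out of such text. The `SAMPLE` string is an abbreviated, illustrative excerpt rather than output from the poster's systems, and `parse_bond_status` and the bond name `bond0` are assumptions made for the example.

```python
SAMPLE = """\
Bonding Mode: adaptive load balancing
MII Status: up

Slave Interface: eth0
MII Status: up

Slave Interface: eth1
MII Status: down
"""


def parse_bond_status(text):
    """Return (bonding mode, {slave name: MII status}) from status text."""
    mode = None
    slaves = {}
    current = None  # slave section currently being read, if any
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or line without a "key: value" shape
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            mode = value
        elif key == "Slave Interface":
            current = value
        elif key == "MII Status" and current is not None:
            # The bond-level "MII Status" appears before any slave section
            # and is skipped because no slave has been seen yet.
            slaves[current] = value
    return mode, slaves


mode, slaves = parse_bond_status(SAMPLE)
print(mode, slaves)
```

On a real host the text would come from something like `open("/proc/net/bonding/bond0").read()`, with the bond name adjusted to the local setup.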
