Hi Atin,
You're right in saying that if it's activated, then all nodes should have it
activated.
What I find strange is that when glusterfsd has problems communicating with the
other peers, the single node with issues isn't considered "not connected"
and thus expelled from the cluster somehow; in my case it caused a complete hang
of the trusted storage pool.
And to emphasise this, pinging was no problem, as it uses small packets anyway, so
jumbo frames were not used at all. Enabling jumbo frames on the interface and
switches is only a way to tell the TCP/IP stack that it can send larger packets,
but it doesn't have to.
Or am I mistaken in thinking that the TCP/IP stack controls whether to send the
bigger packets and that glusterfsd has no control over that?
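
To illustrate the point about ping: a default ping never exercises the jumbo
path, but one with a full-size payload and the don't-fragment bit set does.
Below is a minimal sketch, assuming Linux iputils ping (where "-M do" forbids
fragmentation) and IPv4 without IP options. The helper names and the host
name "gluster02" are made up for illustration; nothing here is part of Gluster.

```python
import subprocess
from typing import List

IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, ICMP echo request header


def icmp_payload_for_mtu(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    payload = mtu - IPV4_HEADER - ICMP_HEADER
    if payload < 0:
        raise ValueError("MTU too small for an IPv4 ICMP echo: %d" % mtu)
    return payload


def ping_command(host: str, mtu: int = 9000, count: int = 3) -> List[str]:
    """Build a ping command that only succeeds if the path carries `mtu`.

    Assumes Linux iputils ping: -M do sets the don't-fragment bit,
    -s sets the payload size, -c the number of echo requests.
    """
    return ["ping", "-M", "do", "-s", str(icmp_payload_for_mtu(mtu)),
            "-c", str(count), host]


def path_carries_mtu(host: str, mtu: int = 9000) -> bool:
    """Run the ping; True if at least one full-size reply came back."""
    result = subprocess.run(ping_command(host, mtu),
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode == 0
```

For a 9000-byte MTU this gives a payload of 8972 bytes, so calling
path_carries_mtu("gluster02") amounts to running
"ping -M do -s 8972 -c 3 gluster02". With a switch port that discards jumbo
frames I would expect that to fail while a plain ping still succeeds.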
Met vriendelijke groet / kind regards,
Sander Zijlstra
| Linux Engineer | SURFsara | Science Park 140 | 1098XG Amsterdam | T +31 (0)6
43 99 12 47 | sander.zijlstra at surfsara.nl | www.surfsara.nl |
Regular day off on friday
> On 15 Oct 2015, at 08:24, Atin Mukherjee <amukherj at redhat.com> wrote:
>
>
>
> On 10/14/2015 05:09 PM, Sander Zijlstra wrote:
>> LS,
>>
>> I recently reconfigured one of my gluster nodes and forgot to update
>> the MTU size on the switch while I did configure the host with jumbo frames.
>>
>> The result was that the complete cluster had communication issues.
>>
>> All systems are part of a distributed striped volume with a replica
>> size of 2 but still the cluster was completely unusable until I updated the
>> switch port to accept jumbo frames rather than to discard them.
> This is expected. When enabling the network components to communicate
> with TCP jumbo frames in a Gluster Trusted Storage Pool, you'd need to
> ensure that all the network components, such as switches and nodes, are
> configured properly. I think with this setting you'd fail to ping the
> other nodes in the pool. So that could be a verification step before
> you set the cluster up.
>>
>> The symptoms were:
>>
>> - Gluster clients had a very hard time reading the volume information
>>   and thus couldn't do any filesystem ops on them.
>> - The glusterfs servers could see each other (peer status) and a volume
>>   info command was ok, but a volume status command would not return or would
>>   return a "staging failed" error.
>>
>> I know MTU size mixing and don't-fragment bits can screw up a lot, but
>> why wasn't that gluster peer just discarded from the cluster, so that not all
>> clients kept on communicating with it and causing all sorts of errors?
> To answer this question, peer status & volume info are local operations
> and don't incur any network traffic, so in this very same case you might
> see peer status showing all the nodes as connected although there is a
> breakage. OTOH, in the status command the originator node communicates
> with the other peers, and hence it fails there.
>
> HTH,
> Atin
>>
>> I use GlusterFS 3.6.2 at the moment...
>>
>> Kind regards
>> Sander