thr3ads.net - Ocfs2 devel - [Ocfs2-devel] [PATCH] ocfs2: re-queue AST or BAST if sending is failed to improve the reliability [Aug 2017]


Joseph Qi

2017-Aug-23 03:34 UTC

[Ocfs2-devel] [PATCH] ocfs2: re-queue AST or BAST if sending is failed to improve the reliability

On 17/8/23 10:23, Junxiao Bi wrote:
> On 08/10/2017 06:49 PM, Changwei Ge wrote:
>> Hi Joseph,
>>
>>
>> On 2017/8/10 17:53, Joseph Qi wrote:
>>> Hi Changwei,
>>>
>>> On 17/8/9 23:24, ge changwei wrote:
>>>> Hi
>>>>
>>>>
>>>> On 2017/8/9 ??7:32, Joseph Qi wrote:
>>>>> Hi,
>>>>>
>>>>> On 17/8/7 15:13, Changwei Ge wrote:
>>>>>> Hi,
>>>>>>
>>>>>> In current code, while flushing AST, we don't handle an exception that
>>>>>> sending AST or BAST is failed.
>>>>>> But it is indeed possible that AST or BAST is lost due to some kind of
>>>>>> networks fault.
>>>>>>
>>>>> Could you please describe this issue more clearly? It is better analyze
>>>>> issue along with the error message and the status of related nodes.
>>>>> IMO, if network is down, one of the two nodes will be fenced. So what's
>>>>> your case here?
>>>>>
>>>>> Thanks,
>>>>> Joseph
>>>> I have posted the status of related lock resource in my preceding email.
>>>> Please check them out.
>>>>
>>>> Moreover, network is not down forever even not longer than threshold to
>>>> be fenced.
>>>> So no node will be fenced.
>>>>
>>>> This issue happens in terrible network environment. Some messages may be
>>>> abandoned by switch due to various conditions.
>>>> And even frequent and fast link up and down will also cause this issue.
>>>>
>>>> In a nutshell, re-queuing AST and BAST is crucial when link between
>>>> nodes recover quickly. It prevents cluster from hanging.
>>> So you mean the tcp packet is lost due to connection reset? IIRC,
>> Yes, it's something like that exception which I think is deserved to be
>> fixed within OCFS2.
>>> Junxiao has posted a patchset to fix this issue.
>>> If you are using the way of re-queuing, how to make sure the original
>>> message is *truly* lost and the same ast/bast won't be sent twice?
>> With regards to TCP layer, if it returns error to OCFS2, packets must
>> not be sent successfully. So no node will obtain such an AST or BAST.
> Right, but not only AST/BAST, other messages pending in tcp queue will
> also lost if tcp return error to ocfs2, this can also caused hung.
> Besides, your fix may introduce duplicated ast/bast message Joseph
> mentioned.
> Ocfs2 depends tcp a lot, it can't work well if tcp return error to it.
> To fix it, maybe ocfs2 should maintain its own message queue and ack
> messages while not depend on TCP.
>
Agree. Or we can add a sequence number to distinguish duplicate messages.
With this, we can simply resend a message if sending fails.

Thanks,
Joseph
> Thanks,
> Junxiao.
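
The sequence-number idea from the reply above can be modelled in a few lines. This is only a user-space sketch in Python, not ocfs2 code: the `Sender` and `Receiver` classes, the `transport` callable and the use of `ConnectionError` are all hypothetical stand-ins for the o2net send path. It also assumes each sender finishes (or gives up on) one message before sending the next, so the receiver only needs to remember the highest sequence number seen per sender.

```python
from dataclasses import dataclass, field


@dataclass
class Receiver:
    """Receiving node: remembers the highest sequence handled per sender."""
    last_seen: dict = field(default_factory=dict)   # sender id -> last sequence
    delivered: list = field(default_factory=list)   # payloads actually acted on

    def handle(self, sender: int, seq: int, payload: str) -> bool:
        """Act on a message once; return False if it is a duplicate."""
        if seq <= self.last_seen.get(sender, 0):
            return False  # a resend of a message that did arrive earlier
        self.last_seen[sender] = seq
        self.delivered.append(payload)
        return True


class Sender:
    """Sending node: tags each message with a sequence and resends on error."""

    def __init__(self, node_id, transport):
        self.node_id = node_id
        # Hypothetical transport: callable(sender, seq, payload) that raises
        # ConnectionError when the send is reported as failed.
        self.transport = transport
        self.next_seq = 1

    def send(self, payload, max_attempts=3):
        """Send one message, reusing the same sequence on every retry."""
        seq = self.next_seq
        self.next_seq += 1
        for _ in range(max_attempts):
            try:
                self.transport(self.node_id, seq, payload)
                return True
            except ConnectionError:
                continue  # retry with the same sequence number
        return False
```

Because a retry reuses the original sequence number, the receiver can drop the second copy even when the first one was delivered and only the error report was misleading, which is the duplicate AST/BAST case raised earlier in the thread.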

Gang He

2017-Aug-23 04:47 UTC


[Ocfs2-devel] [PATCH] ocfs2: re-queue AST or BAST if sending is failed to improve the reliability

>>> 
> 
> On 17/8/23 10:23, Junxiao Bi wrote:
>> On 08/10/2017 06:49 PM, Changwei Ge wrote:
>>> Hi Joseph,
>>>
>>>
>>> On 2017/8/10 17:53, Joseph Qi wrote:
>>>> Hi Changwei,
>>>>
>>>> On 17/8/9 23:24, ge changwei wrote:
>>>>> Hi
>>>>>
>>>>>
>>>>> On 2017/8/9 ??7:32, Joseph Qi wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 17/8/7 15:13, Changwei Ge wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> In current code, while flushing AST, we don't handle an exception that
>>>>>>> sending AST or BAST is failed.
>>>>>>> But it is indeed possible that AST or BAST is lost due to some kind of
>>>>>>> networks fault.
>>>>>>>
>>>>>> Could you please describe this issue more clearly? It is better analyze
>>>>>> issue along with the error message and the status of related nodes.
>>>>>> IMO, if network is down, one of the two nodes will be fenced. So what's
>>>>>> your case here?
>>>>>>
>>>>>> Thanks,
>>>>>> Joseph
>>>>> I have posted the status of related lock resource in my preceding email.
>>>>> Please check them out.
>>>>>
>>>>> Moreover, network is not down forever even not longer than threshold to
>>>>> be fenced.
>>>>> So no node will be fenced.
>>>>>
>>>>> This issue happens in terrible network environment. Some messages may be
>>>>> abandoned by switch due to various conditions.
>>>>> And even frequent and fast link up and down will also cause this issue.
>>>>>
>>>>> In a nutshell, re-queuing AST and BAST is crucial when link between
>>>>> nodes recover quickly. It prevents cluster from hanging.
>>>> So you mean the tcp packet is lost due to connection reset? IIRC,
>>> Yes, it's something like that exception which I think is deserved to be
>>> fixed within OCFS2.
>>>> Junxiao has posted a patchset to fix this issue.
>>>> If you are using the way of re-queuing, how to make sure the original
>>>> message is *truly* lost and the same ast/bast won't be sent twice?
>>> With regards to TCP layer, if it returns error to OCFS2, packets must
>>> not be sent successfully. So no node will obtain such an AST or BAST.
>> Right, but not only AST/BAST, other messages pending in tcp queue will
>> also lost if tcp return error to ocfs2, this can also caused hung.
>> Besides, your fix may introduce duplicated ast/bast message Joseph
>> mentioned.
>> Ocfs2 depends tcp a lot, it can't work well if tcp return error to it.
>> To fix it, maybe ocfs2 should maintain its own message queue and ack
>> messages while not depend on TCP.
>>
> Agree. Or we can add a sequence to distinguish duplicate message. Under
> this, we can simply resend message if fails.
Looks like we need to make the message stateless.
Maybe we can refer to GFS2 to see whether it has considered this issue.

Thanks
Gang
> 
> Thanks,
> Joseph
>  
>> Thanks,
>> Junxiao.
> 
> _______________________________________________
> Ocfs2-devel mailing list
> Ocfs2-devel at oss.oracle.com 
> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
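
The alternative raised in the quoted text, where ocfs2 keeps its own message queue and acknowledgements rather than relying on TCP alone, could look roughly like the following. This is a hedged Python model and not kernel code: `AckedQueue`, its method names and the cumulative-ack behaviour are assumptions made for illustration. The point it shows is that every message stays in a pending set until the peer acknowledges it, so anything still unacknowledged after a connection reset can be sent again.

```python
from collections import OrderedDict


class AckedQueue:
    """Sender-side queue that keeps each message until the peer acks it."""

    def __init__(self):
        self.pending = OrderedDict()  # sequence -> payload, awaiting ack
        self.next_seq = 1

    def enqueue(self, payload):
        """Record a message as pending and return its sequence number."""
        seq = self.next_seq
        self.next_seq += 1
        self.pending[seq] = payload
        return seq

    def ack(self, seq):
        """Cumulative ack: forget every pending message with sequence <= seq."""
        for s in [s for s in self.pending if s <= seq]:
            del self.pending[s]

    def unacked(self):
        """Messages to resend after a reconnect, oldest first."""
        return list(self.pending.items())
```

After a reconnect, the sender would replay `unacked()` in order. Combined with receiver-side duplicate detection by sequence number, a replayed message that had in fact arrived would be dropped rather than handled twice.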
