Hi all,

As far as we know, ocfs2/o2net is not a reliable messaging mechanism. Messages can get lost when a TCP socket connection is suddenly shut down. The only user of o2net is ocfs2/dlm, so a lost message can make ocfs2/dlm hang (missing AST and ASSERT MASTER). Sometimes it also leaves ocfs2/dlm waiting forever for DLM recovery to complete, which never happens: the target node is still heartbeating, so no DLM recovery procedure is launched.

I think the cases above are reason enough to make the current ocfs2/o2net more reliable. I already have a draft design for it, and it does require changing o2net behavior.

To accomplish this, we tag each o2net message with a sequence number, ::msg_seq, so that the receiver can tell whether a newly arrived message is a duplicate. ::msg_seq also serves as the key for looking up the structure described below in a red-black tree.

A brand-new structure named *Message Holder* is added to o2net. It is responsible for storing _handle_status_.

When TCP has to shut down or reset for an unknown reason, we lose the packets in the send or receive buffer, but o2net still keeps track of those messages. This gives o2net a chance to re-send them once the TCP connection is established again.

The diagram below shows how it works:

SEND                                   RECV
send message
tag message header with ::msg_seq
                                       search for Message Holder with
                                       ::msg_seq
                                       NOT FOUND - insert one
                                       (FOUND - means a duplicated one)
                                       handle message
                                       store status into Message Holder
                                       send back status
instruct RECV to remove MH
                                       notify SEND that MH is already
                                       removed
return to caller

I look forward to your comments, especially from @Mark, @Joseph and @Junxiao.

Thanks,
Changwei.
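The RECV column of the diagram can be sketched roughly as follows. This is only an illustration of the duplicate check, not o2net code: the names `MessageHolder`, `Receiver`, `on_message` and `on_remove_holder` are made up for the example, and a plain Python dict stands in for the red-black tree the proposal mentions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class MessageHolder:
    """Saved handler status for one ::msg_seq (hypothetical layout)."""
    msg_seq: int
    status: Optional[int] = None


@dataclass
class Receiver:
    handler: Callable[[bytes], int]
    # Assumption: a dict replaces the red-black tree keyed by ::msg_seq.
    holders: Dict[int, MessageHolder] = field(default_factory=dict)

    def on_message(self, msg_seq: int, payload: bytes) -> int:
        holder = self.holders.get(msg_seq)
        if holder is not None and holder.status is not None:
            # FOUND: a duplicate, so reply with the saved status
            # instead of running the handler a second time.
            return holder.status
        if holder is None:
            # NOT FOUND: insert a holder before handling the message.
            holder = MessageHolder(msg_seq)
            self.holders[msg_seq] = holder
        # Handle the message and store the status in the holder.
        holder.status = self.handler(payload)
        return holder.status

    def on_remove_holder(self, msg_seq: int) -> bool:
        """SEND has the status; drop the holder. True if one existed."""
        return self.holders.pop(msg_seq, None) is not None


handled: List[bytes] = []


def demo_handler(payload: bytes) -> int:
    handled.append(payload)
    return len(payload)


recv = Receiver(demo_handler)
first = recv.on_message(7, b"lock")
resent = recv.on_message(7, b"lock")   # same ::msg_seq, e.g. after reconnect
removed = recv.on_remove_holder(7)
```

In this sketch the re-sent message gets the same status back and the handler runs only once; the holder stays in memory until SEND asks for its removal.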
Hello Changwei,

Based on your description, this makes sense. I use the fs/dlm kernel module, and it looks stable. Have you compared the two DLM implementations? Maybe they can learn from each other.

Thanks,
Gang
> _______________________________________________
> Ocfs2-devel mailing list
> Ocfs2-devel at oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
On 2017/11/16 1:49, Changwei Ge wrote:
> When TCP has to shutdown or reset due to unknown reason, although we
> lose the packets in send or receive buffer, o2net still manages those
> messages. This gives a chance to o2net to re-send the messages once TCP
> connection is established again.

This sounds like a good idea. Some questions.

So the sender keeps the pending messages (to send) and re-sends them when necessary.

> Below diagram demonstrates how it works:
>
> SEND                                   RECV
> send message
> tag message header with ::msg_seq
>                                        search for Message Holder with
>                                        ::msg_seq
>                                        NOT FOUND - insert one
>                                        (FOUND - means a duplicated one)
>                                        handle message
>                                        store status into Message Holder
>                                        send back status

I am not clear about the receiver's response. What if the Message Holder is FOUND? Does the saved status still apply at that point? Why?
For example:

The sender sends a message asking which node is the owner of a lock.
The receiver handles the message, and the response is node X.
A network issue happens and the sender doesn't get the response.
The owner of that lock migrates to node X2.
The network recovers.
The sender re-sends the message.
The receiver sends back the saved status, node X, but the owner is now X2.

I am not quite sure whether the above example can actually happen, but you may need to prove that the stale status still applies. This is my biggest concern.

> instruct RECV to remove MH
>                                        notify SEND that MH is already
>                                        removed

So another round of network messages? What if sending the instruction fails due to a network issue? And this will almost double the network overhead.

thanks,
wengang
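The stale-status sequence in the example above can be modelled in a few lines. This is a toy model only, not ocfs2/dlm behavior: the lock name, the `query_owner` function and the two dicts are invented for illustration, and whether the real DLM can reach this state is exactly the open question raised in the message.

```python
# Current owner of each lock resource (hypothetical name "lockres-A").
lock_owner = {"lockres-A": "X"}

# Receiver-side saved replies, keyed by ::msg_seq.
saved_status = {}


def query_owner(msg_seq, lock):
    """Answer an owner query, replaying the saved reply for a duplicate."""
    if msg_seq in saved_status:
        return saved_status[msg_seq]   # duplicate: reply from the holder
    reply = lock_owner[lock]
    saved_status[msg_seq] = reply
    return reply


lost_reply = query_owner(1, "lockres-A")    # reply "X" is lost on the wire
lock_owner["lockres-A"] = "X2"              # ownership migrates meanwhile
resent_reply = query_owner(1, "lockres-A")  # re-send with the same msg_seq
fresh_reply = query_owner(2, "lockres-A")   # a new msg_seq sees the new owner
```

Here `resent_reply` is still "X" while `fresh_reply` is "X2", which is the mismatch the example describes.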
On 2017/11/16 17:49, Changwei Ge wrote:
> Hi all,
> As far as we know, ocfs2/o2net is not a reliable message mechanism.
> Messages might get lost due to a sudden TCP socket connection shutdown.

Hi Changwei,

Junxiao has already solved the situation you mentioned. With commit c43c363def04cdaed0d9e26dae846081f55714e7, the connection is not shut down until the node is fenced, so I don't understand the scenario you describe with a TCP socket connection shutdown. Can you give a specific description? Thank you.

In addition, as far as I know, TCP is reliable: it re-sends messages after a retransmission timeout. So as long as o2net doesn't actively shut down the socket, TCP will re-send the messages for us.

Thanks,
Yiwen Jiang.
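For reference, the sender-side half of the idea discussed in this thread (keep each message until its status comes back, and re-send whatever is still pending once the connection is re-established) might look roughly like this. It is a sketch under assumptions, not the proposed patch: `Sender`, `on_status`, `on_reconnect` and the `transmit` callback are hypothetical names, and a list stands in for the TCP connection.

```python
from typing import Callable, Dict, List, Tuple


class Sender:
    """Keeps every message until the receiver's status has arrived."""

    def __init__(self, transmit: Callable[[int, bytes], None]) -> None:
        self.transmit = transmit            # stand-in for the socket write
        self.next_seq = 1
        self.pending: Dict[int, bytes] = {}  # msg_seq -> payload, in send order

    def send(self, payload: bytes) -> int:
        msg_seq = self.next_seq
        self.next_seq += 1
        # Remember the message first, so it survives a lost socket buffer.
        self.pending[msg_seq] = payload
        self.transmit(msg_seq, payload)
        return msg_seq

    def on_status(self, msg_seq: int) -> None:
        # Status received: the message no longer needs to be re-sent.
        self.pending.pop(msg_seq, None)

    def on_reconnect(self) -> None:
        # Re-send everything still unanswered, reusing the old msg_seq so
        # the receiver can recognise duplicates.
        for msg_seq, payload in list(self.pending.items()):
            self.transmit(msg_seq, payload)


wire: List[Tuple[int, bytes]] = []
sender = Sender(lambda seq, data: wire.append((seq, data)))
a = sender.send(b"assert master")
b = sender.send(b"ast")
sender.on_status(a)      # status for the first message arrived
sender.on_reconnect()    # connection was reset before the second was answered
```

After the reconnect only the unanswered message is transmitted again, with its original sequence number.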