thr3ads.net - Ocfs2 devel - [Ocfs2-devel] Do you know this issue? thanks [Aug 2015]

Joseph Qi

2015-Aug-04 09:13 UTC

[Ocfs2-devel] Do you know this issue? thanks

Hi Gang,

On 2015/8/4 11:21, Gang He wrote:
> Hi Joseph,
> 
> Thanks for your good explanation. I have one more question.
> 
> 
>>>>
>> Hi Gang,
>> On 2015/8/3 17:28, Gang He wrote:
>>> Hello guys,
>>>
>>> I went through the OCFS2 journal and JBD2 code, and I have one question,
>>> as below.
>>> Suppose several nodes are running and one node (node A) suddenly
>>> crashes; another node (node B) will then recover node A's journal
>>> records. But here there looks to be a problem: if node B has changed a
>>> file, and node A also changed the same file, then node B will replay
>>> these changed meta buffers. The JBD2 recovery code will memcpy the
>>> journaled meta buffer into node B's memory, so this inode's meta buffer
>>> will be replaced by node A's journal record, but the in-memory inode
>>> structure will not reflect that. Will this cause a problem? I feel that
>>> my guess must be wrong, since this problem looks too obvious, but can
>>> anyone help explain how this is handled when a running node tries to
>>> recover a crashed node's journal?
>>>
>> Please note that a node can update an inode only after it has got
>> the cluster lock. And if the lock level held by another node is not
>> compatible, that node will downconvert first, which will do the checkpoint.
>> So I don't think the issue you described really exists.
> You mean that if Node A tries to change the same file while Node B is
> changing (or has just changed) it, Node A must wait until Node B finishes
> the checkpoint for these meta buffers; then Node A will re-read these meta
> buffers from the shared disk and get the lock. Is my understanding right?
> If yes, how is the inode meta buffer reflected in the in-memory inode
> structure?
> There is a case: Node A has read a file, then Node B changes the same file,
> writes the journal records to the log file (the meta buffers are not yet
> flushed to the file system) and crashes. At this moment, Node A is
> replaying the journal records and a user is trying to access/change this
> file. What will happen? Will the in-memory inode be inconsistent with the
> just-recovered meta buffer? It looks a little complicated.
Node A reads a file (takes the inode lock at level PR), then Node B changes
the same file (takes the inode lock at level EX). When Node B takes the inode
EX lock, Node A has to downconvert to NL because PR and EX are incompatible.
So the inode cache on Node A is invalid now.
And Node A can access the file only after Node B has been recovered
successfully (because the lock is held by Node B).
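
The lock-level behaviour described above can be sketched roughly as follows.
This is only a toy model in Python, not OCFS2 or DLM code: the names Level,
Node, InodeLock, compatible, cache_valid and dirty_journal are all
hypothetical, and the crash and journal-recovery path is not modelled. It
assumes that NL is compatible with every level, PR only with PR and NL, and
EX only with NL, and it shows an EX holder checkpointing before it
downconverts and a PR holder losing its cached inode when it drops to NL.

```python
from enum import Enum


class Level(Enum):
    NL = 0  # null: lock not held, cached inode data must not be trusted
    PR = 1  # protected read: shared among readers
    EX = 2  # exclusive: a single writer


def compatible(held, wanted):
    """NL is compatible with everything, PR with PR and NL, EX only with NL."""
    if held is Level.NL or wanted is Level.NL:
        return True
    return held is Level.PR and wanted is Level.PR


class Node:
    def __init__(self, name):
        self.name = name
        self.level = Level.NL
        self.cache_valid = False    # is the in-memory inode trustworthy?
        self.dirty_journal = False  # journaled metadata not yet checkpointed
        self.log = []

    def downconvert(self, target):
        # An EX holder checkpoints first, so the shared disk is current
        # before any other node is allowed to read it.
        if self.level is Level.EX and self.dirty_journal:
            self.log.append("checkpoint")
            self.dirty_journal = False
        if target is Level.NL:
            self.cache_valid = False
            self.log.append("invalidate cache")
        self.level = target


class InodeLock:
    def __init__(self, nodes):
        self.nodes = nodes

    def acquire(self, requester, wanted):
        for other in self.nodes:
            if other is not requester and not compatible(other.level, wanted):
                # Highest level the holder may keep alongside the request.
                target = Level.PR if wanted is Level.PR else Level.NL
                other.downconvert(target)
        requester.level = wanted
        requester.cache_valid = True  # metadata is re-read under the new lock
        requester.log.append("granted " + wanted.name)


node_a, node_b = Node("A"), Node("B")
lock = InodeLock([node_a, node_b])
lock.acquire(node_a, Level.PR)   # Node A reads the file
lock.acquire(node_b, Level.EX)   # Node B changes the file
node_b.dirty_journal = True      # Node B has journaled, unflushed metadata
print(node_a.level.name, node_a.cache_valid)  # NL False
lock.acquire(node_a, Level.PR)   # Node A reads again; Node B checkpoints first
print(node_b.log)
```

In this model Node A never reads stale metadata: by the time it holds PR
again, Node B has checkpointed and Node A's old cache has already been
thrown away.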
> Thanks
> Gang   
>  
> 
>>
>> Thanks
>> Joseph
>>>
>>> Thanks
>>> Gang

Gang He

2015-Aug-05 02:20 UTC

[Ocfs2-devel] Do you know this issue? thanks

Hi Joseph,

Thanks a lot. One more question.

>>> 
> Hi Gang,
> 
> On 2015/8/4 11:21, Gang He wrote:
>> Hi Joseph,
>> 
>> Thanks for your good explanation. I have one more question.
>> 
>> 
>>>>>
>>> Hi Gang,
>>> On 2015/8/3 17:28, Gang He wrote:
>>>> Hello guys,
>>>>
>>>> I went through the OCFS2 journal and JBD2 code, and I have one question,
>>>> as below.
>>>> Suppose several nodes are running and one node (node A) suddenly
>>>> crashes; another node (node B) will then recover node A's journal
>>>> records. But here there looks to be a problem: if node B has changed a
>>>> file, and node A also changed the same file, then node B will replay
>>>> these changed meta buffers. The JBD2 recovery code will memcpy the
>>>> journaled meta buffer into node B's memory, so this inode's meta buffer
>>>> will be replaced by node A's journal record, but the in-memory inode
>>>> structure will not reflect that. Will this cause a problem? I feel that
>>>> my guess must be wrong, since this problem looks too obvious, but can
>>>> anyone help explain how this is handled when a running node tries to
>>>> recover a crashed node's journal?
>>>>
>>> Please note that a node can update an inode only after it has got
>>> the cluster lock. And if the lock level held by another node is not
>>> compatible, that node will downconvert first, which will do the checkpoint.
>>> So I don't think the issue you described really exists.
>> You mean that if Node A tries to change the same file while Node B is
>> changing (or has just changed) it, Node A must wait until Node B finishes
>> the checkpoint for these meta buffers; then Node A will re-read these meta
>> buffers from the shared disk and get the lock. Is my understanding right?
>> If yes, how is the inode meta buffer reflected in the in-memory inode
>> structure?
>> There is a case: Node A has read a file, then Node B changes the same file,
>> writes the journal records to the log file (the meta buffers are not yet
>> flushed to the file system) and crashes. At this moment, Node A is
>> replaying the journal records and a user is trying to access/change this
>> file. What will happen? Will the in-memory inode be inconsistent with the
>> just-recovered meta buffer? It looks a little complicated.
>> 
> Node A reads a file (takes the inode lock at level PR), then Node B changes
> the same file (takes the inode lock at level EX). When Node B takes the inode
> EX lock, Node A has to downconvert to NL because PR and EX are incompatible.
> So the inode cache on Node A is invalid now.
> And Node A can access the file only after Node B has been recovered
> successfully (because the lock is held by Node B).

The answer looks reasonable. Just one question: how does Node A re-acquire
the file (inode) lock after Node B has crashed?
Since Node B has crashed, it can no longer do anything, so how does Node A
get the file cluster lock again? Based on a timeout? Or on the journal
recovery of Node B by another node (which may or may not be Node A)? I ask
because I suspect that the journal records do not include any DLM lock
related information.


Thanks
Gang
> 
>> Thanks
>> Gang   
>>  
>> 
>>>
>>> Thanks
>>> Joseph
>>>>
>>>> Thanks
>>>> Gang
