Wengang Wang
2019-May-20 23:18 UTC
[Ocfs2-devel] [PATCH v3] fs/ocfs2: fix race in ocfs2_dentry_attach_lock
On 2019/5/20 15:55, Wengang Wang wrote:
> Hi Joseph,
>
> On 2019/5/19 18:35, Joseph Qi wrote:
>> Hi Wengang,
>>
>> On 19/5/18 00:10, Wengang Wang wrote:
>>> ocfs2_dentry_attach_lock() can be executed in parallel threads against the
>>> same dentry. Make that race safe.
>>> The race is like this:
>>>
>>>             thread A                               thread B
>>>
>>> (A1) enter ocfs2_dentry_attach_lock,
>>> seeing dentry->d_fsdata is NULL,
>>> and no alias found by
>>> ocfs2_find_local_alias, so kmalloc
>>> a new ocfs2_dentry_lock structure
>>> to local variable "dl", dl1
>>>
>>>                .....
>>>
>>>                                     (B1) enter ocfs2_dentry_attach_lock,
>>>                                     seeing dentry->d_fsdata is NULL,
>>>                                     and no alias found by
>>>                                     ocfs2_find_local_alias so kmalloc
>>>                                     a new ocfs2_dentry_lock structure
>>>                                     to local variable "dl", dl2.
>>>
>>>                                                    ......
>>>
>>> (A2) set dentry->d_fsdata with dl1,
>>> call ocfs2_dentry_lock() and increase
>>> dl1->dl_lockres.l_ro_holders to 1 on
>>> success.
>>>               ......
>>>
>>>                                     (B2) set dentry->d_fsdata with dl2
>>>                                     call ocfs2_dentry_lock() and increase
>>>                                     dl2->dl_lockres.l_ro_holders to 1 on
>>>                                     success.
>>>
>>>                                                   ......
>>>
>>> (A3) call ocfs2_dentry_unlock()
>>> and decrease
>>> dl2->dl_lockres.l_ro_holders to 0
>>> on success.
>>>              ....
>>>
>>>                                     (B3) call ocfs2_dentry_unlock(),
>>>                                     decreasing
>>>                                     dl2->dl_lockres.l_ro_holders, but
>>>                                     see it's zero now, panic
>>>
>>> Signed-off-by: Wengang Wang <wen.gang.wang at oracle.com>
>>> Reported-by: Daniel Sobe <daniel.sobe at nxp.com>
>>> Tested-by: Daniel Sobe <daniel.sobe at nxp.com>
>>> Reviewed-by: Changwei Ge <gechangwei at live.cn>
>> Thanks for the detailed description.
>> IIUC, you are trying to identify the race case and free the second
>> dentry lock.
> Yes.
>> So would the following be more clear? Something like:
>>
>> if (unlikely(dentry->d_fsdata && !alias)) {
>>     dl_attached = true;

Oh, seems "dl_attached" means d_fsdata is set with a dentry lock.

Well, this name may not reflect the race. It can't tell whether d_fsdata
is set with a newly allocated dentry lock or with the one from the alias
dentry either, but it's set to true when d_fsdata is set with a new lock.
Also, at the beginning of the function we check whether d_fsdata is set,
and in the d_fsdata-set case we don't set dl_attached, so I get the
feeling that the variable is inconsistent.

But I don't care much about this. If you feel this way is better, choose
a better name to replace dl_free_on_race.

thanks,
wengang

>> } else {
>>     ...
>> }
>>
>> ...
>>
>> if (unlikely(dl_attached)) {
>>     ocfs2_lock_res_free(&dl->dl_lockres);
>>     kfree(dl);
>>     iput(inode);
>>     return 0;
>> }
> Seems your idea is to rename the variable "dl_free_on_race" to
> "dl_attached".
>
> I think "dl_free_on_race" is more meaningful than "dl_attached" :D
> "dl_free_on_race" includes the following meanings:
> 0) is a dentry lock
> 1) needs to be freed
> 2) on race condition only
>
> By this name, we know it's a dentry lock that needs to be freed when
> the race condition hits.
>
> I don't exactly understand what "dl_attached" stands for, but from its
> name, I am guessing the following:
> 0) it's a dentry lock
> 1) it's attached to something
>
> I don't think the dentry lock is attached anywhere yet; it's a local
> variable with no other reference so far. And why would we want to free
> an "attached" object?
> Can you share your thoughts on "dl_attached"?
>
> thanks,
> wengang
>
>> Thanks,
>> Joseph
>>
>>> ---
>>> v3: add Reviewed-by, Reported-by and Tested-by only
>>>
>>> v2: 1) removed lock on dentry_attach_lock at the first access of
>>>        dentry->d_fsdata since it helps very little.
>>>     2) do cleanups before freeing the duplicated dl
>>>     3) return after freeing the duplicated dl found.
>>> ---
>>>  fs/ocfs2/dcache.c | 16 ++++++++++++++--
>>>  1 file changed, 14 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fs/ocfs2/dcache.c b/fs/ocfs2/dcache.c
>>> index 290373024d9d..0d220a66297e 100644
>>> --- a/fs/ocfs2/dcache.c
>>> +++ b/fs/ocfs2/dcache.c
>>> @@ -230,6 +230,7 @@ int ocfs2_dentry_attach_lock(struct dentry *dentry,
>>>      int ret;
>>>      struct dentry *alias;
>>>      struct ocfs2_dentry_lock *dl = dentry->d_fsdata;
>>> +    struct ocfs2_dentry_lock *dl_free_on_race = NULL;
>>>
>>>      trace_ocfs2_dentry_attach_lock(dentry->d_name.len, dentry->d_name.name,
>>>                         (unsigned long long)parent_blkno, dl);
>>> @@ -310,10 +311,21 @@ int ocfs2_dentry_attach_lock(struct dentry *dentry,
>>>
>>>  out_attach:
>>>      spin_lock(&dentry_attach_lock);
>>> -    dentry->d_fsdata = dl;
>>> -    dl->dl_count++;
>>> +    /* d_fsdata could be set by parallel thread */
>>> +    if (unlikely(dentry->d_fsdata && !alias)) {
>>> +        dl_free_on_race = dl;
>>> +    } else {
>>> +        dentry->d_fsdata = dl;
>>> +        dl->dl_count++;
>>> +    }
>>>      spin_unlock(&dentry_attach_lock);
>>>
>>> +    if (unlikely(dl_free_on_race)) {
>>> +        iput(dl_free_on_race->dl_inode);
>>> +        ocfs2_lock_res_free(&dl_free_on_race->dl_lockres);
>>> +        kfree(dl_free_on_race);
>>> +        return 0;
>>> +    }
>>>      /*
>>>       * This actually gets us our PRMODE level lock. From now on,
>>>       * we'll have a notification if one of these names is
>>>
> _______________________________________________
> Ocfs2-devel mailing list
> Ocfs2-devel at oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
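
The interleaving described in the commit message (A1, B1, A2, B2, A3, B3) can be replayed in a small single-threaded userspace model. This is only a sketch under stated assumptions: the demo_* names are hypothetical stand-ins rather than the real ocfs2 structures, the spinlock is omitted because the steps run in a fixed order, and a -1 return stands in for the panic the commit message reports.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the real ocfs2 structures. */
struct demo_lock {
	int count;	/* models dl_count */
	int ro_holders;	/* models dl_lockres.l_ro_holders */
};

struct demo_dentry {
	struct demo_lock *fsdata;	/* models dentry->d_fsdata */
};

/*
 * Steps A2/B2. In the kernel the check-and-set runs under
 * spin_lock(&dentry_attach_lock); this replay is single-threaded.
 * Returns 1 if dl was attached, 0 if it was a duplicate and was freed.
 */
int demo_attach(struct demo_dentry *dentry, struct demo_lock *dl, int with_fix)
{
	if (with_fix && dentry->fsdata) {
		free(dl);		/* lost the race: drop the duplicate */
		return 0;
	}
	dentry->fsdata = dl;		/* without the fix this overwrites dl1 */
	dl->count++;
	dl->ro_holders++;		/* models a successful ocfs2_dentry_lock() */
	return 1;
}

/* Steps A3/B3. Returns -1 where the commit message reports a panic. */
int demo_unlock(struct demo_dentry *dentry)
{
	struct demo_lock *dl = dentry->fsdata;

	assert(dl != NULL);
	if (dl->ro_holders == 0)
		return -1;		/* holder count would underflow */
	dl->ro_holders--;
	return 0;
}

/* Replay A1, B1, A2, B2, A3, B3 in order; 0 means no underflow was hit. */
int demo_replay(int with_fix)
{
	struct demo_dentry dentry = { NULL };
	struct demo_lock *dl1 = calloc(1, sizeof(*dl1));	/* A1 */
	struct demo_lock *dl2 = calloc(1, sizeof(*dl2));	/* B1 */
	int a_attached, b_attached, ret = 0;

	if (!dl1 || !dl2) {
		free(dl1);
		free(dl2);
		return -2;		/* allocation failure, unrelated to the race */
	}

	a_attached = demo_attach(&dentry, dl1, with_fix);	/* A2 */
	b_attached = demo_attach(&dentry, dl2, with_fix);	/* B2 */
	if (!b_attached)
		dl2 = NULL;		/* already freed inside demo_attach() */

	if (a_attached && demo_unlock(&dentry))			/* A3 */
		ret = -1;
	if (b_attached && demo_unlock(&dentry))			/* B3 */
		ret = -1;

	free(dl1);
	free(dl2);
	return ret;
}
```

Calling demo_replay(0) models the code before the patch and returns -1, because both unlocks land on dl2. Calling demo_replay(1) models the patched check and returns 0, because thread B frees its duplicate and never reaches the lock/unlock pair.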
Joseph Qi
2019-May-21 01:26 UTC
[Ocfs2-devel] [PATCH v3] fs/ocfs2: fix race in ocfs2_dentry_attach_lock
On 19/5/21 07:18, Wengang Wang wrote:
>
> On 2019/5/20 15:55, Wengang Wang wrote:
>> Hi Joseph,
>>
>> On 2019/5/19 18:35, Joseph Qi wrote:
>>> Hi Wengang,
>>>
>>> On 19/5/18 00:10, Wengang Wang wrote:
>>>> ocfs2_dentry_attach_lock() can be executed in parallel threads against the
>>>> same dentry. Make that race safe.
>>>> The race is like this:
>>>>
>>>>             thread A                               thread B
>>>>
>>>> (A1) enter ocfs2_dentry_attach_lock,
>>>> seeing dentry->d_fsdata is NULL,
>>>> and no alias found by
>>>> ocfs2_find_local_alias, so kmalloc
>>>> a new ocfs2_dentry_lock structure
>>>> to local variable "dl", dl1
>>>>
>>>>                .....
>>>>
>>>>                                     (B1) enter ocfs2_dentry_attach_lock,
>>>>                                     seeing dentry->d_fsdata is NULL,
>>>>                                     and no alias found by
>>>>                                     ocfs2_find_local_alias so kmalloc
>>>>                                     a new ocfs2_dentry_lock structure
>>>>                                     to local variable "dl", dl2.
>>>>
>>>>                                                    ......
>>>>
>>>> (A2) set dentry->d_fsdata with dl1,
>>>> call ocfs2_dentry_lock() and increase
>>>> dl1->dl_lockres.l_ro_holders to 1 on
>>>> success.
>>>>               ......
>>>>
>>>>                                     (B2) set dentry->d_fsdata with dl2
>>>>                                     call ocfs2_dentry_lock() and increase
>>>>                                     dl2->dl_lockres.l_ro_holders to 1 on
>>>>                                     success.
>>>>
>>>>                                                   ......
>>>>
>>>> (A3) call ocfs2_dentry_unlock()
>>>> and decrease
>>>> dl2->dl_lockres.l_ro_holders to 0
>>>> on success.
>>>>              ....
>>>>
>>>>                                     (B3) call ocfs2_dentry_unlock(),
>>>>                                     decreasing
>>>>                                     dl2->dl_lockres.l_ro_holders, but
>>>>                                     see it's zero now, panic
>>>>
>>>> Signed-off-by: Wengang Wang <wen.gang.wang at oracle.com>
>>>> Reported-by: Daniel Sobe <daniel.sobe at nxp.com>
>>>> Tested-by: Daniel Sobe <daniel.sobe at nxp.com>
>>>> Reviewed-by: Changwei Ge <gechangwei at live.cn>
>>> Thanks for the detailed description.
>>> IIUC, you are trying to identify the race case and free the second
>>> dentry lock.
>> Yes.
>>> So would the following be more clear? Something like:
>>>
>>> if (unlikely(dentry->d_fsdata && !alias)) {
>>>     dl_attached = true;
>
> Oh, seems "dl_attached" means d_fsdata is set with a dentry lock.

Yes, it means we've already attached the dentry lock.
We can add some comments before setting it to true for code readability.

>
> Well, this name may not reflect the race. It can't tell whether d_fsdata
> is set with a newly allocated dentry lock or with the one from the alias
> dentry either, but it's set to true when d_fsdata is set with a new lock.
> Also, at the beginning of the function we check whether d_fsdata is set,
> and in the d_fsdata-set case we don't set dl_attached, so I get the
> feeling that the variable is inconsistent.
>
> But I don't care much about this. If you feel this way is better, choose
> a better name to replace dl_free_on_race.

I don't mean 'dl_free_on_race' is not good. I just don't want to introduce
another dentry lock pointer here, since it's in fact the local dl itself.

Thanks,
Joseph

>
> thanks,
> wengang
>
>
>>> } else {
>>>     ...
>>> }
>>>
>>> ...
>>>
>>> if (unlikely(dl_attached)) {
>>>     ocfs2_lock_res_free(&dl->dl_lockres);
>>>     kfree(dl);
>>>     iput(inode);
>>>     return 0;
>>> }
>> Seems your idea is to rename the variable "dl_free_on_race" to
>> "dl_attached".
>>
>> I think "dl_free_on_race" is more meaningful than "dl_attached" :D
>> "dl_free_on_race" includes the following meanings:
>> 0) is a dentry lock
>> 1) needs to be freed
>> 2) on race condition only
>>
>> By this name, we know it's a dentry lock that needs to be freed when
>> the race condition hits.
>>
>> I don't exactly understand what "dl_attached" stands for, but from its
>> name, I am guessing the following:
>> 0) it's a dentry lock
>> 1) it's attached to something
>>
>> I don't think the dentry lock is attached anywhere yet; it's a local
>> variable with no other reference so far. And why would we want to free
>> an "attached" object?
>> Can you share your thoughts on "dl_attached"?
>>
>> thanks,
>> wengang
>>
>>> Thanks,
>>> Joseph
>>>
>>>> ---
>>>> v3: add Reviewed-by, Reported-by and Tested-by only
>>>>
>>>> v2: 1) removed lock on dentry_attach_lock at the first access of
>>>>        dentry->d_fsdata since it helps very little.
>>>>     2) do cleanups before freeing the duplicated dl
>>>>     3) return after freeing the duplicated dl found.
>>>> ---
>>>>  fs/ocfs2/dcache.c | 16 ++++++++++++++--
>>>>  1 file changed, 14 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/fs/ocfs2/dcache.c b/fs/ocfs2/dcache.c
>>>> index 290373024d9d..0d220a66297e 100644
>>>> --- a/fs/ocfs2/dcache.c
>>>> +++ b/fs/ocfs2/dcache.c
>>>> @@ -230,6 +230,7 @@ int ocfs2_dentry_attach_lock(struct dentry *dentry,
>>>>      int ret;
>>>>      struct dentry *alias;
>>>>      struct ocfs2_dentry_lock *dl = dentry->d_fsdata;
>>>> +    struct ocfs2_dentry_lock *dl_free_on_race = NULL;
>>>>
>>>>      trace_ocfs2_dentry_attach_lock(dentry->d_name.len, dentry->d_name.name,
>>>>                         (unsigned long long)parent_blkno, dl);
>>>> @@ -310,10 +311,21 @@ int ocfs2_dentry_attach_lock(struct dentry *dentry,
>>>>
>>>>  out_attach:
>>>>      spin_lock(&dentry_attach_lock);
>>>> -    dentry->d_fsdata = dl;
>>>> -    dl->dl_count++;
>>>> +    /* d_fsdata could be set by parallel thread */
>>>> +    if (unlikely(dentry->d_fsdata && !alias)) {
>>>> +        dl_free_on_race = dl;
>>>> +    } else {
>>>> +        dentry->d_fsdata = dl;
>>>> +        dl->dl_count++;
>>>> +    }
>>>>      spin_unlock(&dentry_attach_lock);
>>>>
>>>> +    if (unlikely(dl_free_on_race)) {
>>>> +        iput(dl_free_on_race->dl_inode);
>>>> +        ocfs2_lock_res_free(&dl_free_on_race->dl_lockres);
>>>> +        kfree(dl_free_on_race);
>>>> +        return 0;
>>>> +    }
>>>>      /*
>>>>       * This actually gets us our PRMODE level lock. From now on,
>>>>       * we'll have a notification if one of these names is
>>>>
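
Joseph's alternative, a boolean flag with the cleanup done on the local dl itself, can be sketched in the same userspace style. The flag_* names are hypothetical, the refs counter is an assumption that models an inode reference taken earlier in the function and dropped by iput(), and the locking is only indicated in comments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the real kernel structures. */
struct flag_inode {
	int refs;			/* models the inode reference count */
};

struct flag_lock {
	int count;			/* models dl_count */
	struct flag_inode *inode;	/* models dl_inode */
};

struct flag_dentry {
	struct flag_lock *fsdata;	/* models dentry->d_fsdata */
};

/*
 * Attach a new lock unless one is already attached.
 * Returns 0 on success (attached or duplicate dropped), -1 on allocation failure.
 */
int flag_attach(struct flag_dentry *dentry, struct flag_inode *inode)
{
	struct flag_lock *dl;
	bool dl_attached = false;

	dl = calloc(1, sizeof(*dl));
	if (!dl)
		return -1;
	inode->refs++;			/* assumed: reference taken before attaching */
	dl->inode = inode;

	/* kernel: spin_lock(&dentry_attach_lock); */
	if (dentry->fsdata) {
		/* a parallel caller has already attached its dentry lock */
		dl_attached = true;
	} else {
		dentry->fsdata = dl;
		dl->count++;
	}
	/* kernel: spin_unlock(&dentry_attach_lock); */

	if (dl_attached) {
		/* no second pointer needed: the duplicate is the local dl */
		free(dl);
		inode->refs--;		/* models iput(inode) */
	}
	return 0;
}
```

Both variants free the same object. The difference is only whether the race is recorded in a second pointer (dl_free_on_race) or in a flag alongside the existing local dl, which is the naming question the thread is debating.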