Heming Zhao
2022-Aug-08 12:09 UTC
[Ocfs2-devel] [PATCH 1/4] ocfs2: Fix freeing uninitialized resource on ocfs2_dlm_shutdown
On Mon, Aug 08, 2022 at 02:51:12PM +0800, Joseph Qi wrote:> > > On 7/30/22 9:14 AM, Heming Zhao wrote: > > On local mount mode, there is no dlm resource initalized. If > > ocfs2_mount_volume() fails in ocfs2_find_slot(), error handling > > flow will call ocfs2_dlm_shutdown(), then does dlm resource > > cleanup job, which will trigger kernel crash. > > > > Fixes: 0737e01de9c4 ("ocfs2: ocfs2_mount_volume does cleanup job before > > return error") > > Should be put at the same line.OK> > > Signed-off-by: Heming Zhao <heming.zhao at suse.com> > > --- > > fs/ocfs2/dlmglue.c | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/fs/ocfs2/dlmglue.c b/fs/ocfs2/dlmglue.c > > index 801e60bab955..1438ac14940b 100644 > > --- a/fs/ocfs2/dlmglue.c > > +++ b/fs/ocfs2/dlmglue.c > > @@ -3385,6 +3385,9 @@ int ocfs2_dlm_init(struct ocfs2_super *osb) > > void ocfs2_dlm_shutdown(struct ocfs2_super *osb, > > int hangup_pending) > > { > > + if (ocfs2_mount_local(osb)) > > + return; > > + > > IMO, we have to do part of ocfs2_dlm_shutdown() jobs such as > ocfs2_lock_res_free(), which will remove lockres from d_lockres_tracking > added by ocfs2_xxx_lock_res_init(). >ocfs2_dlm_shutdown does the cleanup job for ocfs2_dlm_init. This patch fixed crash in local mount error flow. In local mount mode, ocfs2_dlm_init does nothing, which should make ocfs2_dlm_shutdown do nothing. And I checked all calling ocfs2_dlm_shutdown cases: 1. mount flow: ocfs2_fill_super + xxx =fails=> label:out_super (checked, work fine) | + ocfs2_mount_volume =fails=> label:out_debugfs (checked, work fine) | | | + xxx =fails=> cleanup everything before returning | + xxx =fails=> label:out_dismout At this time, dlm has been init successfully, we can call all lines of ocfs2_dlm_shutdown. 2. ocfs2_dismount_volume => 'osb->cconn' is true. this MUST be dlm successfully init case. everything looks fine. In previous mail/patch: [PATCH] test error handling during mount stage, I may forget to test local mount mode. So this crash didn't be triggered.> Before commit 0737e01de9c4, it seems this issue also exists since > osb->cconn is already set under local mount mode.Yes. The bug exists since local mount feature was introduced, commit number: c271c5c22b0a7ca45fda15f1f4d258bca36a5b94. I will change 'Fixes' on next version. (Hope 'CC' list takes effect for this mail. -_-# ) Thanks, Heming
Joseph Qi
2022-Aug-10 01:31 UTC
[Ocfs2-devel] [PATCH 1/4] ocfs2: Fix freeing uninitialized resource on ocfs2_dlm_shutdown
On 8/8/22 8:09 PM, Heming Zhao wrote:> On Mon, Aug 08, 2022 at 02:51:12PM +0800, Joseph Qi wrote: >> >> >> On 7/30/22 9:14 AM, Heming Zhao wrote: >>> On local mount mode, there is no dlm resource initalized. If >>> ocfs2_mount_volume() fails in ocfs2_find_slot(), error handling >>> flow will call ocfs2_dlm_shutdown(), then does dlm resource >>> cleanup job, which will trigger kernel crash. >>> >>> Fixes: 0737e01de9c4 ("ocfs2: ocfs2_mount_volume does cleanup job before >>> return error") >> >> Should be put at the same line. > OK > >> >>> Signed-off-by: Heming Zhao <heming.zhao at suse.com> >>> --- >>> fs/ocfs2/dlmglue.c | 3 +++ >>> 1 file changed, 3 insertions(+) >>> >>> diff --git a/fs/ocfs2/dlmglue.c b/fs/ocfs2/dlmglue.c >>> index 801e60bab955..1438ac14940b 100644 >>> --- a/fs/ocfs2/dlmglue.c >>> +++ b/fs/ocfs2/dlmglue.c >>> @@ -3385,6 +3385,9 @@ int ocfs2_dlm_init(struct ocfs2_super *osb) >>> void ocfs2_dlm_shutdown(struct ocfs2_super *osb, >>> int hangup_pending) >>> { >>> + if (ocfs2_mount_local(osb)) >>> + return; >>> + >> >> IMO, we have to do part of ocfs2_dlm_shutdown() jobs such as >> ocfs2_lock_res_free(), which will remove lockres from d_lockres_tracking >> added by ocfs2_xxx_lock_res_init(). >> > > ocfs2_dlm_shutdown does the cleanup job for ocfs2_dlm_init. > This patch fixed crash in local mount error flow. > In local mount mode, ocfs2_dlm_init does nothing, which should make > ocfs2_dlm_shutdown do nothing. >Not exactly, even in local mount we have to initialize lockres like super lock.> And I checked all calling ocfs2_dlm_shutdown cases: > 1. mount flow: > > ocfs2_fill_super > + xxx =fails=> label:out_super (checked, work fine) > | > + ocfs2_mount_volume =fails=> label:out_debugfs (checked, work fine) > | | > | + xxx =fails=> cleanup everything before returning > | > + xxx =fails=> label:out_dismout > At this time, dlm has been init successfully, we can call all lines > of ocfs2_dlm_shutdown. > > > 2. ocfs2_dismount_volume => 'osb->cconn' is true. > this MUST be dlm successfully init case. everything looks fine. > > In previous mail/patch: [PATCH] test error handling during mount stage, I may > forget to test local mount mode. So this crash didn't be triggered. > >> Before commit 0737e01de9c4, it seems this issue also exists since >> osb->cconn is already set under local mount mode. > > Yes. The bug exists since local mount feature was introduced, commit number: > c271c5c22b0a7ca45fda15f1f4d258bca36a5b94. I will change 'Fixes' on next version. >This patch can be a separate fix and we don't have to bind with nocluster mount feature, even though you've triggered it by related local mount. Thanks, Joseph> (Hope 'CC' list takes effect for this mail. -_-# ) > > Thanks, > Heming