Joseph Qi
2022-Jun-14 03:27 UTC
[Ocfs2-devel] [PATCH 1/1] ocfs2: fix ocfs2_find_slot repeats alloc same slot issue
On 6/14/22 10:59 AM, heming.zhao at suse.com wrote:> On 6/13/22 23:43, Junxiao Bi wrote: >> >>> ? 2022?6?13????1:48?heming.zhao at suse.com ??? >>> >>> ?On 6/13/22 16:21, Joseph Qi wrote: >>>>> On 6/13/22 3:59 PM, heming.zhao at suse.com wrote: >>>>> On 6/12/22 22:16, Joseph Qi wrote: >>>>>> Hi, >>>>>> >>>>>> Why can't use local mount? I don't remember if we discuss about this. >>>>>> >>>>> Sorry, I can't follow your question. >>>>> Do you mean why revert commit 912f655d78c5? >>>>> >>>>> or you are interest with the feature local mount? >>>>> the local mount is created by mkfs.ocfs2, which can't be converted to clustered. >>>>> see mkfs.ocfs2(8) '-M' option. >>>>> >>>> What Junxiao's main concern is data corruption, so I'm afraid we have to >>>> introduce an ondisk feature bit to prevent mixed nocluster and cluster >>>> mount, similar to local mount. >>> >>> this patch defined two new variants/flags: >>> #define OCFS2_SLOTMAP_CLUSTER?? 1 >>> #define OCFS2_SLOTMAP_NOCLUSTER 2 >>> >>> (I expect) Under OCFS2_SLOTMAP_CLUSTER, for clustered mount mode and for compatibility, >>> anything doesn't need to be changed. >>> >>> OCFS2_SLOTMAP_NOCLUSTER will introduce a new value for slotmap area. >>> this new value only take effect after a successfully nocluster mount. >>> (pls fix me), existed kernel/user space code don't do any special handle for >>> noclustered mount mode in slotmap area. So the new value is also compatibility. >>> >>> And the patch can also prevent mixed mount, the related code is in ocfs2_find_slot(). >>> code logic: >>> - noclustered mount condition: slotmap is empty or already mounted with noclustered >>> - clustered mount condition: slotmap is empty or already mounted with clustered. >>> - all other conditions will be denied. >> Finding slot required reading slot map and then? update slot map. It is not atomic , you can?t prevent mixed mount until you have a cluster lock. > > Could I say your mentioned use case is invalid. > I believe all (yes, *all*) the ocfs2 users use this fs under cluster mode in > their product env. > The nocluster mount is only used on maintained env (eg. backup, fsck). > > We only concern two ways: > 1. user forgets to unmount (eg crash) before using another mount mode. > 2. when ocfs2 volume is working, which should deny volume is mounted by another mode. > ?? this may happened user mistakenly runs command or script. > > Base on above 1 & 2, Junxiao above mentioned use case only happens on one scenario: > the volume doesn't be mounted by any node, then user mistakenly mounts volume > with [no]cluster mode at same time. I believe this use case is invalid. And I guess > gfs2 may also has the same issue. > > With this patch, ocfs2 has more sanity check ability than other fs, eg: xfs, ext4. > SUSE HA stack with corosync+pacemaker also supports running xfs/ext4 with A/P mode. > The xfs/ext4 never have detecting mount status code, Junxiao mentioned mixed mount > can also happens on these fs. How do xfs/ext4/HA maintainers handle it? Under these > fs mounting behavior, these fields maintainers also treat the mixed mount as > invalid use case and ignore handle it.Ext4 has a feature mmp (multiple mount protection) to do this. More details please refer to: https://ext4.wiki.kernel.org/index.php/Ext4_Disk_Layout Thanks, Joseph
heming.zhao at suse.com
2022-Jun-14 12:13 UTC
[Ocfs2-devel] [PATCH 1/1] ocfs2: fix ocfs2_find_slot repeats alloc same slot issue
On 6/14/22 11:27, Joseph Qi wrote:> > > On 6/14/22 10:59 AM, heming.zhao at suse.com wrote: >> On 6/13/22 23:43, Junxiao Bi wrote: >>> >>>> ? 2022?6?13????1:48?heming.zhao at suse.com ??? >>>> >>>> ?On 6/13/22 16:21, Joseph Qi wrote: >>>>>> On 6/13/22 3:59 PM, heming.zhao at suse.com wrote: >>>>>> On 6/12/22 22:16, Joseph Qi wrote: >>>>>>> Hi, >>>>>>> >>>>>>> Why can't use local mount? I don't remember if we discuss about this. >>>>>>> >>>>>> Sorry, I can't follow your question. >>>>>> Do you mean why revert commit 912f655d78c5? >>>>>> >>>>>> or you are interest with the feature local mount? >>>>>> the local mount is created by mkfs.ocfs2, which can't be converted to clustered. >>>>>> see mkfs.ocfs2(8) '-M' option. >>>>>> >>>>> What Junxiao's main concern is data corruption, so I'm afraid we have to >>>>> introduce an ondisk feature bit to prevent mixed nocluster and cluster >>>>> mount, similar to local mount. >>>> >>>> this patch defined two new variants/flags: >>>> #define OCFS2_SLOTMAP_CLUSTER?? 1 >>>> #define OCFS2_SLOTMAP_NOCLUSTER 2 >>>> >>>> (I expect) Under OCFS2_SLOTMAP_CLUSTER, for clustered mount mode and for compatibility, >>>> anything doesn't need to be changed. >>>> >>>> OCFS2_SLOTMAP_NOCLUSTER will introduce a new value for slotmap area. >>>> this new value only take effect after a successfully nocluster mount. >>>> (pls fix me), existed kernel/user space code don't do any special handle for >>>> noclustered mount mode in slotmap area. So the new value is also compatibility. >>>> >>>> And the patch can also prevent mixed mount, the related code is in ocfs2_find_slot(). >>>> code logic: >>>> - noclustered mount condition: slotmap is empty or already mounted with noclustered >>>> - clustered mount condition: slotmap is empty or already mounted with clustered. >>>> - all other conditions will be denied. >>> Finding slot required reading slot map and then? update slot map. It is not atomic , you can?t prevent mixed mount until you have a cluster lock. >> >> Could I say your mentioned use case is invalid. >> I believe all (yes, *all*) the ocfs2 users use this fs under cluster mode in >> their product env. >> The nocluster mount is only used on maintained env (eg. backup, fsck). >> >> We only concern two ways: >> 1. user forgets to unmount (eg crash) before using another mount mode. >> 2. when ocfs2 volume is working, which should deny volume is mounted by another mode. >> ?? this may happened user mistakenly runs command or script. >> >> Base on above 1 & 2, Junxiao above mentioned use case only happens on one scenario: >> the volume doesn't be mounted by any node, then user mistakenly mounts volume >> with [no]cluster mode at same time. I believe this use case is invalid. And I guess >> gfs2 may also has the same issue. >> >> With this patch, ocfs2 has more sanity check ability than other fs, eg: xfs, ext4. >> SUSE HA stack with corosync+pacemaker also supports running xfs/ext4 with A/P mode. >> The xfs/ext4 never have detecting mount status code, Junxiao mentioned mixed mount >> can also happens on these fs. How do xfs/ext4/HA maintainers handle it? Under these >> fs mounting behavior, these fields maintainers also treat the mixed mount as >> invalid use case and ignore handle it. > > Ext4 has a feature mmp (multiple mount protection) to do this. > More details please refer to: > https://ext4.wiki.kernel.org/index.php/Ext4_Disk_Layout >Hi Joseph, I checked mmp feature related code, it can help ext4 to avoid multiple mount issue. From my code reading: - the mmp feature did mmp check when mouting. - the mmp feature will start a kernel thread (kmmpd) to monitor mount status. - kmmpd does detecting underlying dev IOs task. We could re-use mmp feature idea for ocfs2, do you plan to add it? If yes, I could take this feature porting job. Thanks, Heming