Heming Zhao
2023-Feb-24 08:03 UTC
[Ocfs2-devel] report BUG: io_uring triggers umount error
On 2/24/23 3:52 PM, Joseph Qi wrote:
>
>
> On 2/24/23 3:48 PM, Heming Zhao via Ocfs2-devel wrote:
>> On 2/24/23 2:54 PM, Joseph Qi wrote:
>>> I can reproduce this in my local VM.
>>> I've traced ocfs2_dismount_volume and found that it hasn't been called.
>>> So EBUSY is returned in the VFS layer. I guess something goes wrong when
>>> doing a copy with linked SQEs (a normal copy seems to have no problem).
>>>
>>
>> I am inclined to agree with you. I also tested the liburing example apps
>> on an ext4 partition, and everything looks fine.
>>
>> I used the bpftrace one-liner below; the retval is '3'.
>>   bpftrace -e 'kr:mnt_get_count{printf("%d\n", retval);}'
>>
>> It corresponds to the flow: path_umount() => do_umount => mnt_get_count (gets '3')
>>
> Yes, that's the place that returns EBUSY.
> So the problem seems to be that mntget/mntput do not match in this case.
>

I'm not familiar with setting up a kernel bisect environment. I tried last
year's openSUSE Tumbleweed (with kernel 5.16.2), and this umount issue does not
occur there. So it is possible that an ocfs2 commit introduced this issue.

Thanks,
Heming

>
>>
>>
>>>
>>> On 2/24/23 8:32 AM, Heming Zhao wrote:
>>>> Hello List,
>>>>
>>>> I found a weird bug on ocfs2. I am busy with other work; if anyone has
>>>> time, he/she could fix it. This bug blocks the fstests generic/013 test
>>>> case, and also blocks fstests from running the later test cases.
>>>>
>>>> How to trigger:
>>>> ```
>>>> git clone git://git.kernel.dk/liburing.git
>>>> cd liburing
>>>> make
>>>> cd examples
>>>> mount -t ocfs2 /dev/sda /mnt
>>>> cp /etc/hosts /mnt/a
>>>> ./link-cp /mnt/a /mnt/b
>>>> umount /mnt
>>>> ```
>>>>
>>>> umount triggers this error message:
>>>> ```
>>>> # umount /mnt
>>>> umount: /mnt: target is busy.
>>>> ```
>>>>
>>>> The umount error can only be triggered by a liburing write operation.
>>>>
>>>> Thanks,
>>>> Heming
>>
>>
>> _______________________________________________
>> Ocfs2-devel mailing list
>> Ocfs2-devel at oss.oracle.com
>> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
On 2/24/23 4:03 PM, Heming Zhao wrote:
> On 2/24/23 3:52 PM, Joseph Qi wrote:
>>
>>
>> On 2/24/23 3:48 PM, Heming Zhao via Ocfs2-devel wrote:
>>> On 2/24/23 2:54 PM, Joseph Qi wrote:
>>>> I can reproduce this in my local VM.
>>>> I've traced ocfs2_dismount_volume and found that it hasn't been called.
>>>> So EBUSY is returned in the VFS layer. I guess something goes wrong when
>>>> doing a copy with linked SQEs (a normal copy seems to have no problem).
>>>>
>>>
>>> I am inclined to agree with you. I also tested the liburing example apps
>>> on an ext4 partition, and everything looks fine.
>>>
>>> I used the bpftrace one-liner below; the retval is '3'.
>>>   bpftrace -e 'kr:mnt_get_count{printf("%d\n", retval);}'
>>>
>>> It corresponds to the flow: path_umount() => do_umount => mnt_get_count (gets '3')
>>>
>> Yes, that's the place that returns EBUSY.
>> So the problem seems to be that mntget/mntput do not match in this case.
>>
>
> I'm not familiar with setting up a kernel bisect environment. I tried last
> year's openSUSE Tumbleweed (with kernel 5.16.2), and this umount issue does not
> occur there. So it is possible that an ocfs2 commit introduced this issue.
>

You can check out each mainline version, like Linux 6.0, 6.1, ..., and try to
check whether it can be reproduced.

I've tried tracing mntget/mntput using the following bpftrace script; the
link-cp output shows it misses a fput.

#include <linux/mount.h>
#include <linux/string.h>

kprobe:mntget
{
        $n = ((struct vfsmount *)arg0)->mnt_sb->s_type->name;
        if (!strncmp(str($n), "ocfs2", 5)) {
                @[comm] += 1;
                printf("%s", kstack);
        }
}

kprobe:mntput
{
        $n = ((struct vfsmount *)arg0)->mnt_sb->s_type->name;
        if (!strncmp(str($n), "ocfs2", 5)) {
                @[comm] += 1;
                printf("%s", kstack);
        }
}

Joseph
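
Since the script above increments the same counter in both probes, the map alone does not show whether gets and puts balance. One possible way to see the imbalance is to post-process a log of events. The following is only a sketch under assumptions: it expects one line per event in the form `get <comm>` or `put <comm>`, which is a hypothetical format (the probes would have to print it, for example with `printf("get %s\n", comm);`), and the helper name `mnt_balance` is made up for illustration.

```shell
#!/bin/sh
# mnt_balance: hypothetical helper, not part of bpftrace or liburing.
# Reads "get <comm>" / "put <comm>" lines on stdin (an assumed log format)
# and prints the net mntget-minus-mntput count for each command name.
# Lines that start with anything else (e.g. bpftrace banners) are ignored.
mnt_balance() {
    awk '
        $1 == "get" { net[$2] += 1 }   # one more reference taken
        $1 == "put" { net[$2] -= 1 }   # one reference dropped
        END { for (c in net) printf "%s %d\n", c, net[c] }
    ' | sort   # awk iteration order is unspecified, so sort for stable output
}

# Example use (trace.bt is a hypothetical script printing the format above):
#   bpftrace trace.bt | mnt_balance
```

A non-zero net count is only a hint. A put may run under a different command name than the matching get (for instance in a worker thread), and references that are still legitimately held at the time the trace stops also show up as a positive count.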