Hello,

Using Samba 4.12.1 with kernel 5.6.7 on Gentoo Linux, I wanted to test the
new vfs_io_uring module.

* Stand-alone Samba server acting as NAS for home setup.
* /media/usb-backup is a mounted read-only btrfs filesystem.
* Samba share exported as:

===== smb.conf ====
[global]
    log level = 1
    workgroup = WORKGROUP
    netbios name = NAS
    server string = Samba Server
    server role = standalone server
    hosts allow = 192.168.0. 127.
    interfaces = lan
    max protocol = SMB3_11

    log file = /var/log/samba/%I.log
    max log size = 10240

    security = user
    passdb backend = tdbsam
    wins support = yes
    dns proxy = yes

[usb-backup]
    comment = USB Backup - Media files
    path = /media/usb-backup
    writeable = no
    browseable = yes
    read only = yes
    create mask = 0664
    directory mask = 0775
    guest only = Yes
    guest ok = Yes
    force user = nasuser
    force group = nas
    vfs objects = btrfs, io_uring (also tried without the btrfs module)
====

* Connected from a Windows 10 computer over 1G ethernet.
* Copy data using Windows Explorer and FastCopy(1) from the Samba share to a
  local disk.
* Verify the sha-256 sum on the files.

From what I can see there is data corruption on many of the files. The
sha-256 sums do not match. I copied the same files many times and the data
corruption occurs within minutes. The total data set is about 800GB.

When I disable io_uring, no data corruption occurs. I verified this with
multiple TBs of data transferred and no corruption was detected.

Samba log files contain no errors, nor does dmesg. The Windows client detects
no errors when copying the files.

I'd really like to debug this problem and find the cause. io_uring seems to
give a massive I/O performance boost.

* net-fs/samba-4.12.1
* sys-devel/gcc-9.3.0
* sys-libs/glibc-2.30-r8
* sys-kernel/linux-headers-5.6
* Linux-kernel-5.6.7-gentoo

Compile info:
* CFLAGS="-O2 -march=native -pipe"
* CPU: "AMD Athlon 3000G with Radeon Vega Graphics"
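For anyone wanting to repeat the verification step, here is a minimal sketch of a whole-file SHA-256 comparison between the source tree and the copied tree. It is not the tool used in the report above (that was FastCopy plus a separate checksum pass). The function names and the 1 MiB read size are assumptions chosen for illustration.

```python
import hashlib
from pathlib import Path


def sha256_file(path, bufsize=1 << 20):
    """Return the hex SHA-256 of a file, reading it in bufsize pieces."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            block = f.read(bufsize)
            if not block:
                break
            h.update(block)
    return h.hexdigest()


def mismatched_files(src_root, dst_root):
    """Yield relative paths that are missing or differ under dst_root."""
    src_root, dst_root = Path(src_root), Path(dst_root)
    for src in sorted(src_root.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(src_root)
        dst = dst_root / rel
        # A missing copy counts as a mismatch, as does a differing digest.
        if not dst.is_file() or sha256_file(src) != sha256_file(dst):
            yield rel
```

To avoid re-reading through the code path under test, the source side would ideally be hashed locally on the server rather than over the share.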
On Sun, Apr 26, 2020 at 11:51:42AM +0200, A L via samba wrote:
> Hello,
>
> Using Samba 4.12.1 with kernel 5.6.7 on Gentoo Linux, I wanted to test the
> new vfs_io_uring module.
>
> * Stand-alone Samba server acting as NAS for home setup.
> * /media/usb-backup is a mounted read-only btrfs filesystem.
> * Samba share exported as:

Thanks SO MUCH for testing this!

Can you log this information as a bug at bugzilla.samba.org
so we can properly track it?

Cheers,

Jeremy.
On Sun, Apr 26, 2020 at 11:51:42AM +0200, A L via samba wrote:
>
> * Connected from a Windows 10 computer over 1G ethernet.
> * Copy data using Windows Explorer and FastCopy(1) from the Samba share to a
>   local disk.
> * Verify the sha-256 sum on the files.
>
> From what I can see there is data corruption on many of the files. Sha-256
> does not match. I copied the same files many times and the data corruption
> occurs within minutes. The total data set is about 800GB.

Can you do checksums on file fragments so we can discover
at what offset (if non-zero) the corruption occurs?
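One possible way to do the per-fragment checksums requested above is sketched below. It hashes fixed-size fragments of a known-good file and of the copied file, then reports the byte offsets where they differ. The function names and the default fragment size are assumptions for illustration only; nothing here is a Samba or kernel tool.

```python
import hashlib


def fragment_digests(path, fragment_size=1 << 20):
    """Yield (offset, hex SHA-256) for each fragment_size piece of the file."""
    with open(path, "rb") as f:
        offset = 0
        while True:
            data = f.read(fragment_size)
            if not data:
                break
            yield offset, hashlib.sha256(data).hexdigest()
            offset += len(data)


def corrupt_fragments(good_path, bad_path, fragment_size=1 << 20):
    """Return the sorted start offsets of fragments that differ."""
    good = dict(fragment_digests(good_path, fragment_size))
    bad = dict(fragment_digests(bad_path, fragment_size))
    # An offset present in only one file (length mismatch) is also reported.
    offsets = sorted(good.keys() | bad.keys())
    return [off for off in offsets if good.get(off) != bad.get(off)]
```

Running it again with a smaller fragment size (for example 4096) on a file already known to be bad narrows down the offset, and may show whether the damaged ranges line up with any particular block size.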
On 2020-04-26 19:46, Jeremy Allison via samba wrote:
> On Sun, Apr 26, 2020 at 11:51:42AM +0200, A L via samba wrote:
>> * Connected from a Windows 10 computer over 1G ethernet.
>> * Copy data using Windows Explorer and FastCopy(1) from the Samba share
>>   to a local disk.
>> * Verify the sha-256 sum on the files.
>>
>> From what I can see there is data corruption on many of the files.
>> Sha-256 does not match. I copied the same files many times and the data
>> corruption occurs within minutes. The total data set is about 800GB.
> Can you do checksums on file fragments so we can discover at what
> offset (if non-zero) the corruption occurs.

Yes, I will check this.

I saw a patch on the kernel mailing list about possible corruption during
re-scheduling. I wonder if this is the problem I am hitting. I'll run some
more tests with this patch.

https://www.spinics.net/lists/io-uring/msg01706.html

Regards,
Anders