Hi, Michael

I have hit an ext4 error when using vhost_scsi on an arm64 platform, and
I suspect it is a vhost_scsi issue.

Ext4 error when testing virtio_scsi & vhost_scsi


No issue:
1. virtio_scsi, ext4
2. vhost_scsi & virtio_scsi, ext2
3. Instead of vhost, I also tried loopback, with no problem.
Using loopback, the host can use the new block device, while vhost is used
by the guest (qemu).
http://www.linux-iscsi.org/wiki/Tcm_loop
Testing directly in the host, I did not find any ext4 error.


Have issue:
1. vhost_scsi & virtio_scsi, ext4
a. iblock
b. fileio, with the file located in /tmp (RAM), so no real device is involved.

2. I have tried 4.7-rc2 and 4.5-rc1 on the D02 board; both have the issue.
Since I need a kvm-specific patch for D02, I cannot freely switch
to an older version.

3. I also tested ext4 with the journal disabled:
mkfs.ext4 -O ^has_journal /dev/sda


Do you have any suggestions?

Thanks

On Tue, Jul 19, 2016 at 4:21 PM, Zhangfei Gao <zhangfei.gao at gmail.com> wrote:
> On Tue, Jul 19, 2016 at 3:56 PM, Zhangfei Gao <zhangfei.gao at gmail.com> wrote:
>> Dear Ted
>>
>> On Wed, Jul 13, 2016 at 12:43 AM, Theodore Ts'o <tytso at mit.edu> wrote:
>>> On Tue, Jul 12, 2016 at 03:14:38PM +0800, Zhangfei Gao wrote:
>>>> Some update:
>>>>
>>>> If I test with ext2, there is no problem with iblock.
>>>> If I test with ext4, ext4_mb_generate_buddy reports an error when
>>>> removing files after reboot.
>>>>
>>>>
>>>> root@(none)$ rm test
>>>> [ 21.006549] EXT4-fs error (device sda): ext4_mb_generate_buddy:758: group 18, block bitmap and bg descriptor inconsistent: 26464 vs 25600 free clusters
>>>> [ 21.008249] JBD2: Spotted dirty metadata buffer (dev = sda, blocknr = 0). There's a risk of filesystem corruption in case of system crash.
>>>>
>>>> Are there any special notes on using ext4 in qemu?
>>>
>>> Ext4 has more runtime consistency checking than ext2. So just because
>>> ext4 complains doesn't mean that there isn't a problem with the file
>>> system; it just means that ext4 is more likely to notice before you
>>> lose user data.
>>>
>>> So if you test with ext2, try running e2fsck afterwards, to make sure
>>> the file system is consistent.
>>>
>>> Given that I'm regularly testing ext4 using kvm, and I haven't seen
>>> anything like this in a very long time, I suspect the problem is with
>>> your SCSI code, and not with ext4.
>>>
>>
>> Do you know the possible reason for this error?
>>
>> I have tried 4.7-rc2; the same issue exists.
>> It can be reproduced with fileio and iblock as the backstore.
>> It happens more easily in qemu with this sequence:
>> qemu -> mount -> dd xx -> umount -> mount -> rm xx, then the error may
>> happen; no need to reboot.
>>
>> ramdisk does not trigger the error, since it only does malloc and memcpy
>> and never goes through the blk layer.
>>
>> I also tried creating a file in /tmp and using it as fileio; this also
>> reproduces. So no real device is involved.
>>
>> like:
>> cd /tmp
>> dd if=/dev/zero of=test bs=1M count=1024; sync;
>> targetcli
>> #targetcli
>> (targetcli) /> cd backstores/fileio
>> (targetcli) /> create name=file_backend file_or_dev=/tmp/test size=1G
>> (targetcli) /> cd /vhost
>> (targetcli) /> create wwn=naa.60014052cc816bf4
>> (targetcli) /> cd naa.60014052cc816bf4/tpgt1/luns
>> (targetcli) /> create /backstores/fileio/file_backend
>> (targetcli) /> cd /
>> (targetcli) /> saveconfig
>> (targetcli) /> exit
>>
>> /work/qemu.git/aarch64-softmmu/qemu-system-aarch64 \
>> -enable-kvm -nographic -kernel Image \
>> -device vhost-scsi-pci,wwpn=naa.60014052cc816bf4 \
>> -m 512 -M virt -cpu host \
>> -append "earlyprintk console=ttyAMA0 mem=512M"
>>
>> in qemu:
>> mkfs.ext4 /dev/sda
>> mount /dev/sda /mnt/
>> sync; date; dd if=/dev/zero of=/mnt/test bs=1M count=100; sync; date;
>>
>> Running the dd test then produces some errors.
>> log like:
>> root@(none)$ sync; date; dd if=/dev/zero of=test bs=1M count=100; sync;; date;
>> [ 1789.917963] sbc_parse_cdb cdb[0]=0x35
>> [ 1789.922000] fd_execute_sync_cache immed=0
>> Tue Jul 19 07:26:12 UTC 2016
>> [ 200.712879] EXT4-fs error (device sda) [ 1790.191770] sbc_parse_cdb
>> cdb[0]=0x2a
>> in ext4_reserve_inode_write:5362[ 1790.198382] fd_execute_rw
>> : Corrupt filesystem
>> [ 200.729001] EXT4-fs error (device sda) [ 1790.207843] sbc_parse_cdb
>> cdb[0]=0x2a
>> in ext4_reserve_inode_write:5362[ 1790.214495] fd_execute_rw
>> : Corrupt filesystem
>>
>> It looks like the error usually happens after SYNCHRONIZE CACHE, but I am
>> not sure it always happens after sync cache.
>>
> It does not always happen after SYNCHRONIZE CACHE.
>
> Just tried in qemu: mount -> dd xx -> umount -> mount -> rm xx
> ram based (/tmp/test), no reboot.
>
> root@(none)$ cd /mnt
> root@(none)$ ls
> [ 301.444966] sbc_parse_cdb cdb[0]=0x28
> [ 301.449003] fd_execute_rw
> lost+found test
> root@(none)$ rm test
> [ 304.281920] sbc_parse_cdb cdb[0]=0x28
> [ 304.285955] fd_execute_rw
> [ 118.002338] EXT4-fs error (device sda):[ 304.290685] gzf sbc_parse_cdb cdb[0
> ]=0x28
> ext4_mb_generate_buddy:758: gro[ 304.296737] gzf fd_execute_rw
> up 3, block bitmap and bg descri[ 304.304099] sbc_parse_cdb cdb[0]=0x28
> ptor inconsistent: 21504 vs 2143[ 304.309322] fd_execute_rw
> 9 free clusters
> [ 118.015903] JBD2: Spotted dirty metadata buffer (dev = sda, blocknr = 0). There's a risk of filesystem corruption in case of system crash.
> root@(none)$
>
> Thanks
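For reference when reading the host-side sbc_parse_cdb lines in the logs above: the cdb[0] values are SCSI operation codes. The following is a minimal Python sketch, not part of the original report, that labels them. The opcode names come from the SCSI block command set; the function name decode_opcodes and the sample lines are illustrative assumptions.

```python
import re

# SCSI operation codes that appear in the logs of this thread.
SCSI_OPCODES = {
    0x28: "READ(10)",
    0x2A: "WRITE(10)",
    0x35: "SYNCHRONIZE CACHE(10)",
}

# Matches the debug print "cdb[0]=0xNN" and captures the two hex digits.
CDB_RE = re.compile(r"cdb\[0\]=0x([0-9a-fA-F]{2})")


def decode_opcodes(log_text):
    """Return the opcode names in the order they appear in log_text."""
    names = []
    for match in CDB_RE.finditer(log_text):
        opcode = int(match.group(1), 16)
        names.append(SCSI_OPCODES.get(opcode, f"unknown (0x{opcode:02x})"))
    return names


# Hypothetical sample: three un-interleaved lines modelled on the logs.
sample = """\
[ 1789.917963] sbc_parse_cdb cdb[0]=0x35
[ 1790.191770] sbc_parse_cdb cdb[0]=0x2a
[  301.444966] sbc_parse_cdb cdb[0]=0x28
"""

print(decode_opcodes(sample))
```

Read this way, the "rm test" run shows only 0x28 (READ(10)) commands before the ext4 error, which seems consistent with the observation that the error does not always follow SYNCHRONIZE CACHE. Lines where the console output split "cdb[0" and "]=0x28" across a line break would need to be rejoined before matching.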
Hi!

On Wed 27-07-16 15:58:55, Zhangfei Gao wrote:
> Hi, Michael
>
> I have hit an ext4 error when using vhost_scsi on an arm64 platform, and
> I suspect it is a vhost_scsi issue.
>
> Ext4 error when testing virtio_scsi & vhost_scsi
>
>
> No issue:
> 1. virtio_scsi, ext4
> 2. vhost_scsi & virtio_scsi, ext2
> 3. Instead of vhost, I also tried loopback, with no problem.
> Using loopback, the host can use the new block device, while vhost is used
> by the guest (qemu).
> http://www.linux-iscsi.org/wiki/Tcm_loop
> Testing directly in the host, I did not find any ext4 error.
>
>
>
> Have issue:
> 1. vhost_scsi & virtio_scsi, ext4
> a. iblock
> b. fileio, with the file located in /tmp (RAM), so no real device is involved.
>
> 2. I have tried 4.7-rc2 and 4.5-rc1 on the D02 board; both have the issue.
> Since I need a kvm-specific patch for D02, I cannot freely switch
> to an older version.
>
> 3. I also tested ext4 with the journal disabled:
> mkfs.ext4 -O ^has_journal /dev/sda
>
>
> Do you have any suggestions?

So can you mount the filesystem with errors=remount-ro to avoid clobbering
the fs after the problem happens? And then run e2fsck on the problematic
filesystem and send the output here?

Honza
-- 
Jan Kara <jack at suse.com>
SUSE Labs, CR
Hi, Jan

On Wed, Jul 27, 2016 at 11:56 PM, Jan Kara <jack at suse.cz> wrote:
> Hi!
>
> On Wed 27-07-16 15:58:55, Zhangfei Gao wrote:
>> Hi, Michael
>>
>> I have hit an ext4 error when using vhost_scsi on an arm64 platform, and
>> I suspect it is a vhost_scsi issue.
>>
>> Ext4 error when testing virtio_scsi & vhost_scsi
>>
>>
>> No issue:
>> 1. virtio_scsi, ext4
>> 2. vhost_scsi & virtio_scsi, ext2
>> 3. Instead of vhost, I also tried loopback, with no problem.
>> Using loopback, the host can use the new block device, while vhost is used
>> by the guest (qemu).
>> http://www.linux-iscsi.org/wiki/Tcm_loop
>> Testing directly in the host, I did not find any ext4 error.
>>
>>
>>
>> Have issue:
>> 1. vhost_scsi & virtio_scsi, ext4
>> a. iblock
>> b. fileio, with the file located in /tmp (RAM), so no real device is involved.
>>
>> 2. I have tried 4.7-rc2 and 4.5-rc1 on the D02 board; both have the issue.
>> Since I need a kvm-specific patch for D02, I cannot freely switch
>> to an older version.
>>
>> 3. I also tested ext4 with the journal disabled:
>> mkfs.ext4 -O ^has_journal /dev/sda
>>
>>
>> Do you have any suggestions?
>
> So can you mount the filesystem with errors=remount-ro to avoid clobbering
> the fs after the problem happens? And then run e2fsck on the problematic
> filesystem and send the output here?
>

Tested twice; logs pasted below.
Both runs used fileio, with the file located in the host ramfs /tmp.
Before running e2fsck, I unmounted /dev/sda.

1.
root@(none)$ mount -o errors=remount-ro /dev/sda /mnt
[ 22.812053] EXT4-fs (sda): mounted filesystem with ordered data mode. Opts: errors=remount-ro
$ rm /mnt/test
[ 108.388905] EXT4-fs error (device sda) in ext4_reserve_inode_write:5362: Corrupt filesystem
[ 108.406930] Aborting journal on device sda-8.
[ 108.414120] EXT4-fs (sda): Remounting filesystem read-only
[ 108.414847] EXT4-fs error (device sda) in ext4_dirty_inode:5487: IO failure
[ 108.423571] EXT4-fs error (device sda) in ext4_free_blocks:4904: Journal has aborted
[ 108.431919] EXT4-fs error (device sda) in ext4_reserve_inode_write:5362: Corrupt filesystem
[ 108.440269] EXT4-fs error (device sda) in ext4_reserve_inode_write:5362: Corrupt filesystem
[ 108.448568] EXT4-fs error (device sda) in ext4_ext_remove_space:3058: IO failure
[ 108.456917] EXT4-fs error (device sda) in ext4_ext_truncate:4657: Corrupt filesystem
[ 108.465267] EXT4-fs error (device sda) in ext4_reserve_inode_write:5362: Corrupt filesystem
[ 108.473567] EXT4-fs error (device sda) in ext4_truncate:4150: IO failure
[ 108.481917] EXT4-fs error (device sda) in ext4_reserve_inode_write:5362: Corrupt filesystem
root@(none)$ e2fsck /dev/sda
e2fsck 1.42.9 (28-Dec-2013)
/dev/sda is mounted.
e2fsck: Cannot continue, aborting.

root@(none)$ umount /mnt
[ 260.756250] EXT4-fs error (device sda): ext4_put_super:837: Couldn't clean up the journal
root@(none)$ umount /mnt e2fsck /dev/sda
e2fsck 1.42.9 (28-Dec-2013)
ext2fs_open2: Bad magic number in super-block
e2fsck: Superblock invalid, trying backup blocks...
Superblock needs_recovery flag is clear, but journal has data.
Recovery flag not set in backup superblock, so running journal anyway.

/dev/sda: recovering journal
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Free blocks count wrong for group #1 (32703, counted=8127).
Fix<y>? yes
Free blocks count wrong for group #2 (32768, counted=31744).
Fix<y>? yes
Free blocks count wrong (249509, counted=223909).
Fix<y>? yes
Free inodes count wrong for group #0 (8181, counted=8180).
Fix<y>? yes
Free inodes count wrong (65525, counted=65524).
Fix<y>? yes

/dev/sda: ***** FILE SYSTEM WAS MODIFIED *****
/dev/sda: 12/65536 files (8.3% non-contiguous), 38235/262144 blocks
root@(none)$

2.
root@(none)$ rm /mnt/test
[ 71.021484] EXT4-fs error (device sda) in ext4_reserve_inode_write:5362: Corrupt filesystem
[ 71.044959] Aborting journal on device sda-8.
[ 71.052152] EXT4-fs (sda): Remounting filesystem read-only
[ 71.052833] EXT4-fs error (device sda) in ext4_dirty_inode:5487: IO failure
[ 71.061600] EXT4-fs error (device sda) in ext4_free_blocks:4904: Journal has aborted
[ 71.069948] EXT4-fs error (device sda) in ext4_reserve_inode_write:5362: Corrupt filesystem
[ 71.078296] EXT4-fs error (device sda) in ext4_reserve_inode_write:5362: Corrupt filesystem
[ 71.086597] EXT4-fs error (device sda) in ext4_ext_remove_space:3058: IO failure
[ 71.094946] EXT4-fs error (device sda) in ext4_ext_truncate:4657: Corrupt filesystem
[ 71.103296] EXT4-fs error (device sda) in ext4_reserve_inode_write:5362: Corrupt filesystem
[ 71.111595] EXT4-fs error (device sda) in ext4_truncate:4150: IO failure
[ 71.119946] EXT4-fs error (device sda) in ext4_reserve_inode_write:5362: Corrupt filesystem
root@(none)$ e2fsck /dev/sda
e2fsck 1.42.9 (28-Dec-2013)
/dev/sda is mounted.
e2fsck: Cannot continue, aborting.

root@(none)$ umount /mnt/
[ 92.103221] EXT4-fs error (device sda): ext4_put_super:837: Couldn't clean up the journal
root@(none)$ umount /mnt/ e2fsck /dev/sda
e2fsck 1.42.9 (28-Dec-2013)
ext2fs_open2: Bad magic number in super-block
e2fsck: Superblock invalid, trying backup blocks...
Superblock needs_recovery flag is clear, but journal has data.
Recovery flag not set in backup superblock, so running journal anyway.

/dev/sda: recovering journal
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Free blocks count wrong for group #1 (32703, counted=8127).
Fix<y>? yes
Free blocks count wrong for group #2 (32768, counted=31744).
Fix<y>? yes
Free blocks count wrong (249509, counted=223909).
Fix<y>? yes
Free inodes count wrong for group #0 (8181, counted=8180).
Fix<y>? yes
Free inodes count wrong (65525, counted=65524).
Fix<y>? yes

/dev/sda: ***** FILE SYSTEM WAS MODIFIED *****
/dev/sda: 12/65536 files (8.3% non-contiguous), 38235/262144 blocks
root@(none)$

Thanks
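The free-block corrections reported by e2fsck above can be summed with a little arithmetic. This is a minimal Python sketch, not part of the original report. It assumes a 4 KiB block size (inferred from 262144 blocks on the 1G backing file, not stated in the log), and the helper name free_block_deltas is hypothetical.

```python
import re

# Per-group line: "Free blocks count wrong for group #1 (32703, counted=8127)."
GROUP_RE = re.compile(
    r"Free blocks count wrong for group #(\d+) \((\d+), counted=(\d+)\)"
)
# Filesystem-wide line: "Free blocks count wrong (249509, counted=223909)."
TOTAL_RE = re.compile(r"Free blocks count wrong \((\d+), counted=(\d+)\)")

# Assumption: 1 GiB device / 262144 blocks = 4096 bytes per block.
BLOCK_SIZE = 4096


def free_block_deltas(e2fsck_output):
    """Return ({group: recorded - counted}, total recorded - counted)."""
    per_group = {
        int(group): int(recorded) - int(counted)
        for group, recorded, counted in GROUP_RE.findall(e2fsck_output)
    }
    total = TOTAL_RE.search(e2fsck_output)
    total_delta = int(total.group(1)) - int(total.group(2)) if total else None
    return per_group, total_delta


# The three free-block lines copied from the e2fsck output in this message.
sample = """\
Free blocks count wrong for group #1 (32703, counted=8127).
Free blocks count wrong for group #2 (32768, counted=31744).
Free blocks count wrong (249509, counted=223909).
"""

per_group, total_delta = free_block_deltas(sample)
print(per_group, total_delta, total_delta * BLOCK_SIZE / 2**20, "MiB")
```

Under the 4 KiB assumption, the two group deltas (24576 and 1024 blocks) add up to the filesystem-wide delta of 25600 blocks, which is 100 MiB. That is the size of the dd test file (bs=1M count=100) described earlier in the thread, if the same file was used in these runs.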