Hi,

I'm using ext3 filesystems in embedded devices (storage is on 512MB or 1GB CF cards). A typical development cycle would see the filesystem created on the desktop PC running Linux 2.4 (e.g. RedHat 9). The CF card would be installed in the hardware and Linux 2.4 (e.g. Montavista Pro 3.1, on PPC) would boot from the CF.

Recently I tried a Linux 2.6 desktop (CentOS) for the same task and found problems. Specifically, the embedded device won't boot from the CF anymore. Since we use several partitions it's possible to boot from an old partition. We can then mount the new partition, but attempts to write to it fail and the partition is remounted read-only.

Here are the logs associated with those operations:

boot:

kjournald starting. Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly.
Freeing unused kernel memory: 212k init
?attempt to access beyond end of device
03:02: rw=0, want=841835629, limit=151200
attempt to access beyond end of device
03:02: rw=0, want=841835629, limit=151200
Kernel panic: No init found. Try passing init= option to kernel.
<0>Rebooting in 180 seconds..

mount/write:

e2fsck 1.35 (28-Feb-2004)
/dev/hda2 has gone 36663 days without being checked, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/hda2: 2297/37848 files (1.9% non-contiguous), 101563/151200 blocks
...
kjournald starting. Commit interval 5 seconds
EXT3 FS 2.4-0.9.19, 19 August 2002 on ide0(3,2), internal journal
EXT3-fs: mounted filesystem with ordered data mode.
/dev/hda2 on /file-system/root2 type ext3 (rw,noatime,errors=remount-ro)
...
# rm -rf /file-system/root2/*
EXT3-fs error (device ide0(3,2)): ext3_free_blocks: Freeing blocks not in datazone - block = 1752392034, count = 1
Aborting journal on device ide0(3,2).
Remounting filesystem read-only
ext3_reserve_inode_write: aborting transaction: Journal has aborted in __ext3_jdEXT3-fs error (device ide0(3,2)) in ext3_truncate: Journal has aborted
ext3_reserve_inode_write: aborting transaction: Journal has aborted in __ext3_jdEXT3-fs error (device ide0(3,2)) in ext3_orphan_del: Journal has aborted
ext3_reserve_inode_write: aborting transaction: Journal has aborted in __ext3_jdEXT3-fs error (device ide0(3,2)) in ext3_delete_inode: Journal has aborted
rm: cannot unlink `/file-system/root2/bin/chroot': Read-only file system
rm: cannot unlink `/file-system/root2/bin/run-parts': Read-only file system
rm: cannot unlink `/file-system/root2/bin/tempfile': Read-only file system

Looking at the versions, on the 2.4 desktop I have e2fsprogs-1.32-6, on embedded I have e2fsprogs-1.27-1. On the 2.6 desktop it's e2fsprogs-1.35-12.

I built e2fsprogs-1.38 for the desktop and the result was the same.

I used dumpe2fs on the working and non-working filesystems and found that the newer FS has different features:

< Filesystem features: has_journal filetype sparse_super
> Filesystem features: has_journal resize_inode filetype sparse_super

After writing to a new FS on the desktop a further feature is added:

< Filesystem features: has_journal resize_inode filetype sparse_super
> Filesystem features: has_journal ext_attr resize_inode filetype sparse_super

I'm not convinced the features are relevant though, because if I mkfs with -O to restrict the features, the result is the same. I wonder if it could be an endianness issue?

What should I do to investigate this further? Are there known incompatibilities with ext3 between different kernels? And are there any tricks I can use in 2.6 to make a 2.4-compatible filesystem?

Thanks in advance for any help,

-Cam
--
camilo at mesias.co.uk <--
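For anyone repeating the feature comparison above, a small sketch along these lines might help. It only assumes the "Filesystem features:" line format shown in the dumpe2fs output quoted in this message; the function name parse_features is made up for the illustration.

```python
def parse_features(line: str) -> set:
    """Split a 'Filesystem features:' line from dumpe2fs into a set of names."""
    # Everything after the first colon is a whitespace-separated feature list.
    _, _, names = line.partition(":")
    return set(names.split())


# The two lines below are copied from the dumpe2fs comparison in this message.
working = parse_features(
    "Filesystem features: has_journal filetype sparse_super")
failing = parse_features(
    "Filesystem features: has_journal ext_attr resize_inode filetype sparse_super")

# Features present only on the filesystem made on the 2.6 desktop.
print(sorted(failing - working))  # ['ext_attr', 'resize_inode']
```

This only reports which feature flags differ; it says nothing about whether either flag is the cause of the failure.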
Andreas Dilger
2005-Sep-20 13:26 UTC
ext3 incompatability between linux 2.4/ppc and linux 2.6/x86
On Sep 20, 2005 12:47 +0100, Cam wrote:
> Looking at the versions, on the 2.4 desktop I have e2fsprogs-1.32-6, on
> embedded I have e2fsprogs-1.27-1. On the 2.6 desktop it's e2fsprogs-1.35-12.
>
> I built e2fsprogs-1.38 for the desktop and the result was the same.
>
> I used dumpe2fs on the working and non-working filesystems and found
> that the newer FS has different features:
>
> < Filesystem features: has_journal filetype sparse_super
> > Filesystem features: has_journal resize_inode filetype sparse_super

The resize_inode feature is relatively new, but _should_ be harmless for a kernel that doesn't understand it (it is just a file in the filesystem). That said, it is quite unlikely that you will ever need this for embedded systems, so you can turn it off at mke2fs time or afterward with tune2fs with "-O ^resize_inode".

> After writing to a new FS on the desktop a further feature is added,
>
> < Filesystem features: has_journal resize_inode filetype sparse_super
> > Filesystem features: has_journal ext_attr resize_inode filetype
> sparse_super

The ext_attr feature is probably from SELinux. This can be a problem for older kernels (quite sadly, as there is a "feature" which slipped in under the radar). The problem is that SELinux added support for EAs on symlinks, but this confuses older kernels into thinking that a fast symlink (stored in the inode) has an external block, so it is (wrongly) treated as a slow symlink. The older kernel then tries to decode the EA as a symlink. I don't know if this is causing your problem though.

I'm not sure if there is some way to prevent SELinux from tagging all of the files in the filesystem or not (e.g. mount option or other).
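The symlink confusion described above can be modelled in a few lines. The following Python sketch is only an illustration and not the kernel code: the Inode class, both function names, the 1 KiB block size and the EA block number are assumptions made for the example. It relies on i_blocks being counted in 512-byte units, so a single external EA block adds block_size / 512 to it.

```python
from dataclasses import dataclass

SECTOR_SIZE = 512  # i_blocks is counted in 512-byte units


@dataclass
class Inode:
    """Simplified stand-in for the two inode fields that matter here."""
    i_blocks: int    # allocated space, in 512-byte sectors
    i_file_acl: int  # block number of the external EA block, 0 if none


def is_fast_symlink_old(inode: Inode) -> bool:
    # Older-style check: any allocated block at all means "slow symlink".
    return inode.i_blocks == 0


def is_fast_symlink_ea_aware(inode: Inode, block_size: int) -> bool:
    # Discount the EA block (if there is one) before testing for data blocks.
    ea_sectors = block_size // SECTOR_SIZE if inode.i_file_acl else 0
    return inode.i_blocks - ea_sectors == 0


# A short symlink whose only allocated block is an EA block.
# Block size (1024) and block number (1234) are made-up example values.
labelled = Inode(i_blocks=2, i_file_acl=1234)

print(is_fast_symlink_old(labelled))             # False
print(is_fast_symlink_ea_aware(labelled, 1024))  # True
```

With the older-style check the labelled symlink is treated as slow, so the target text stored in the inode would be read as block numbers, whereas the EA-aware check still classifies it as fast.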
There is a trivial change to the ext3 code to fix this for your embedded platform: add ext3_inode_is_fast_symlink() to check for i_file_acl, and change ext3_read_inode() to use this instead of just checking i_blocks.

> I'm not convinced the features are relevant though because if I mkfs
> with -O to restrict the features, the result is the same. I wonder if it
> could be an endianness issue?

Note that in newer e2fsprogs you need to use "mke2fs -O none -O {features}" to clear the default feature set. Also, it isn't clear whether this will prevent SELinux from enabling the ext_attr feature.

I would initially suspect an endian issue, but none of the values printed in the error messages appear to be byte-swapped values. They instead look like ASCII values (e.g. "md-2" and "bash").

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
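The ASCII observation is easy to check. The sketch below packs the two block numbers from the error messages as 32-bit little-endian values (ext2/ext3 on-disk fields are little-endian) and prints the resulting bytes as text. The helper name le32_as_ascii is made up for this example.

```python
import struct


def le32_as_ascii(value: int) -> str:
    """Show the 4 bytes a 32-bit value occupies on disk (little-endian) as text."""
    raw = struct.pack("<I", value)
    # Replace non-printable bytes with '.' so the output is always readable.
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


# Block numbers taken from the "want=" and "block =" fields in the logs.
for value in (841835629, 1752392034):
    print(value, hex(value), le32_as_ascii(value))

# 841835629 0x322d646d md-2
# 1752392034 0x68736162 bash
```

Both values decode to printable text, which fits the idea that string data is being interpreted as block numbers rather than that valid block numbers are being byte-swapped.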