Hello,

I use kernel 2.4.7 patched with the corresponding ext3 patch. The problem is that at startup I get a panic in the kernel. I started to debug util-linux and found that mount dies with a segmentation fault when trying to open the /etc/mtab file. The segmentation fault occurs on the open() call inside the mtab_is_writable() procedure:

    printf("mtab_writable: pass 1\n");
    if (ret == -1) {
        int fd;
        if (!MOUNTED)
            printf("mtab_writable: MOUNTED=null\n");
        fd = open("/etc/passwd", O_RDWR | O_CREAT, 0644);
        printf("mtab_writable: could open /etc/passwd with no probs\n");
        printf("mtab_writable: opening file=%s\n", MOUNTED);
        if (fd >= 0)
            close(fd);
        fd = open(MOUNTED, O_RDWR | O_CREAT, 0644);
        printf("mtab_writable: opened fd\n");
        if (fd >= 0) {
            printf("mtab_writable: closing fd\n");
            close(fd);
            ret = 1;
        } else
            ret = 0;
    }
    printf("mtab_writable: pass 2\n");

As the output I get:

    mtab_writable: could open /etc/passwd with no probs
    mtab_writable: opening file=/etc/mtab
    invalid operand 0000
    CPU: 0
    EIP: 0010:[<c0127304>]
    EFLAGS: 00010082
    .........
    Process mount (pid: 41, stackpage=ce355000)
    ........

This only happens with the /etc/mtab file; as you can see, the file /etc/passwd (and many others) opens fine. The problem occurs after using ext3 for a few days and then turning off the machine. I also noted that the panic occurs only during the startup scripts. I am not sure it should be called a "panic", because the kernel does not crash completely: it just can't mount any more filesystems, but otherwise keeps operating.

If I start up the machine without mounting the damaged filesystem, I do not get a kernel panic at all when I mount it later; I just cannot access the "mtab" file:

    bash-2.04# ls -l mtab
    ls: mtab: Input/output error
    bash-2.04# cd /

Let me tell you how I got into this situation, because it is not very common. I do not boot from a hard disk; I boot from a Flash device which has its own mini-filesystem. During the boot process I mount 5 ext3 partitions, and one of them is /etc.
I found an error in my startup script: when I do

    mount -t ext3 /dev/hda5 /etc

I forgot to pass the -n option to the "mount" command, so /etc is replaced by the partition on the hard disk and /etc/mtab gets corrupted. I believe this could happen because the file was accessed on some occasion before "mount" by the ext3 code and placed in the cache, so a mount without -n may write corrupted data ... This only happens with /etc/mtab; other files keep their integrity no matter how many hard resets I do. I don't know whether mounting with the -n option will correct this problem (I have to test), but in the meantime I would like to report this bug.

If you want to see my damaged /etc partition, download it from here:

    http://www.quazartecnologia.com/bad-fs.img.gz

(675K compressed, 52 MB uncompressed). Uncompress it and run

    dd if=bad-fs.img of=/dev/hdaX

Now, I don't know, maybe this problem has been fixed already? I just can't move to the latest kernel right now. Is it possible to update 2.4.7 with the latest ext3 sources? How can I do this?

Nikolai
Nikolai Vladychevski wrote:
>
> ...
> As the output I get:
> mtab_writable: could open /etc/passwd with no probs
> mtab_writable: opening file=/etc/mtab
> invalid operand 0000
> CPU: 0
> EIP: 0010:[<c0127304>]
> EFLAGS: 00010082
> .........
> Process mount (pid: 41, stackpage=ce355000)
> ........

Was there more text associated with this crash? Please send it all.

> This only happens with /etc/mtab file, as you can see the file
> /etc/passwd (and many others) open fine. The problem
> occurs after using ext3 for a few days and then turning off the machine. I
> also noted that the panic occurs only during startup scripts. I am not
> sure if it is called "panic", because kernel is not crashing completely,
> it is just can't mount any more filesystems but is operating.
>
> If I start up the machine and do not mount the damaged filesystem I do not
> get a kernel panic at all when I mount it later, I just can not access
> "mtab" file:
>
> bash-2.04# ls -l mtab
> ls: mtab: Input/output error
> bash-2.04# cd /

I ran your filesystem image through fsck:

    akpm-1:/home/akpm> 0 e2fsck -f /dev/loop0
    e2fsck 1.22, 22-Jun-2001 for EXT2 FS 0.5b, 95/08/09
    Pass 1: Checking inodes, blocks, and sizes
    Pass 2: Checking directory structure
    Entry 'mtab' in / (2) has deleted/unused inode 69.  Clear<y>? no

    Pass 3: Checking directory connectivity
    Pass 4: Checking reference counts
    Pass 5: Checking group summary information
    Block bitmap differences:  -4817
    Fix<y>? no

    Inode bitmap differences:  -69
    Fix<y>? no

    /dev/loop0: 291/12880 files (1.0% non-contiguous), 6542/51407 blocks

Somehow, the inode number for /etc/mtab (which is held in the directory) has been set to a bad value: 69. fsck will be able to repair this. Current ext3 simply complains about the inode (IO error) and keeps going OK.

> Now, I don't know, maybe this problem has been fixed already? I just can't
> move to the latest kernel right now. Is it possible to update 2.4.7 with
> the latest ext3 sources? How can I do this?
>

I'd like to know exactly what the panic is before saying. But unfortunately a lot of header files changed between 2.4.7 and current, along with things like the quota API and memory allocation masks. It'd be a lot of work to make recent ext3 work on 2.4.7.
Andrew Morton writes:
> Nikolai Vladychevski wrote:
>>
>> ...
>> As the output I get:
>> mtab_writable: could open /etc/passwd with no probs
>> mtab_writable: opening file=/etc/mtab
>> invalid operand 0000
>> CPU: 0
>> EIP: 0010:[<c0127304>]
>> EFLAGS: 00010082
>> .........
>> Process mount (pid: 41, stackpage=ce355000)
>> ........
>
> Was there more text associated with this crash? Please send
> it all.

OK, I didn't include all of it because I had to transcribe it from the monitor, but here it goes:

    invalid operand: 0000
    CPU: 0
    EIP: 0010:[<c0127304>]
    EFLAGS: 00010082
    eax: 0000001b ebx: cf68e000 ecx: 00000001 edx: c025fe08
    esi: 00000000 edi: c141fd40 ebp: cf6c7240 esp: ce355ed0
    ds: 018 es: 0018 ss: 0018
    Process mount (pid: 41, stackpage=ce355000)
    Stack:
    c021fb8f c021fc2b 0000048d cf68e005 cf6c7240 00008043 cf68e005 c012793f
    c141fd40 cf6c7240 cf68e005 cf5af1c0 00000001 00000246 cc26b440 ffffffd8
    00008043 ce355f7c c013a5d1 c141fd40 cf68e005 ce355f84 cc26b440 00000001
    Call Trace: [<c012793f>] [<c013a5d1>] [<c012793f>] [<c012eaa4>] [<c012ed93>]
    [<c0106f77>]
    Code: 0f 0b 83 c4 0c 8b 5d 14 83 fb ff 74 35 eb 0d 90 90 90 90 90
Hi,

On Wed, Oct 24, 2001 at 11:17:55PM +0000, Nikolai Vladychevski wrote:
> OK, I didn't include all of it because I had to transcribe it from the monitor, but
> here it goes:

> invalid operand: 0000
> CPU: 0
> EIP: 0010:[<c0127304>]
> EFLAGS: 00010082
> eax: 0000001b ebx: cf68e000 ecx: 00000001 edx: c025fe08
> esi: 00000000 edi: c141fd40 ebp: cf6c7240 esp: ce355ed0
> ds: 018 es: 0018 ss: 0018
> Process mount (pid: 41, stackpage=ce355000)
> Stack:
> c021fb8f c021fc2b 0000048d cf68e005 cf6c7240 00008043 cf68e005 c012793f
> c141fd40 cf6c7240 cf68e005 cf5af1c0 00000001 00000246 cc26b440 ffffffd8
> 00008043 ce355f7c c013a5d1 c141fd40 cf68e005 ce355f84 cc26b440 00000001
> Call Trace: [<c012793f>] [<c013a5d1>] [<c012793f>] [<c012eaa4>] [<c012ed93>]
> [<c0106f77>]
> Code: 0f 0b 83 c4 0c 8b 5d 14 83 fb ff 74 35 eb 0d 90 90 90 90 90

"Code: 0f 0b" is a BUG() assertion. Can you run this trace through ksymoops to make it useful for debugging? Is it reproducible for you?

Thanks,
Stephen