Hello,
I use kernel 2.4.7 patched with the corresponding ext3 patch. The problem I
have is that I get a kernel panic at startup. I started
to debug util-linux and found that mount gets a segmentation fault when
trying to open the /etc/mtab file.
The segmentation fault occurs on the open() call inside the mtab_is_writable()
procedure:
    printf("mtab_writable: pass 1\n");
    if (ret == -1) {
        int fd;
        if (!MOUNTED) printf("mtab_writable: MOUNTED=null\n");
        fd = open("/etc/passwd", O_RDWR | O_CREAT, 0644);
        printf("mtab_writable: could open /etc/passwd with no probs\n");
        printf("mtab_writable: opening file=%s\n", MOUNTED);
        if (fd >= 0) close(fd);
        fd = open(MOUNTED, O_RDWR | O_CREAT, 0644);
        printf("mtab_writable: opened fd\n");
        if (fd >= 0) {
            printf("mtab_writable: closing fd\n");
            close(fd);
            ret = 1;
        } else
            ret = 0;
    }
    printf("mtab_writable: pass 2\n");
As the output I get:
mtab_writable: could open /etc/passwd with no probs
mtab_writable: opening file=/etc/mtab
invalid operand 0000
CPU: 0
EIP: 0010:[<c0127304>]
EFLAGS: 00010082
.........
Process mount (pid: 41, stackpage=ce355000)
........
This only happens with the /etc/mtab file; as you can see, the file
/etc/passwd (and many others) opens fine. The problem
occurs after using ext3 for a few days and then turning off the machine. I
also noted that the panic occurs only during the startup scripts. I am not
sure if it should be called a "panic", because the kernel does not crash
completely: it just can't mount any more filesystems, but it keeps running.
If I start up the machine and do not mount the damaged filesystem, I do not
get a kernel panic at all when I mount it later; I just cannot access the
"mtab" file:
bash-2.04# ls -l mtab
ls: mtab: Input/output error
bash-2.04# cd /
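To separate the failing open() from the rest of mount, the same call can be tried from a tiny standalone helper. This is only a sketch: probe_writable is a hypothetical name, not a util-linux function, and it assumes a POSIX system. It performs the open() with the same flags as the debug code above and returns the errno value instead of printing.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of util-linux): perform the same open()
 * that the debug code does and report the result.
 * Returns 0 if the file could be opened (or created) read-write,
 * otherwise the errno value set by open().
 */
int probe_writable(const char *path)
{
    int fd;

    assert(path != NULL);

    fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return errno;   /* e.g. EIO, EROFS, ENOENT */

    close(fd);
    return 0;
}
```

A caller could print the result with strerror(probe_writable("/etc/mtab")). Given the "Input/output error" that ls reports above, one would expect EIO on the damaged filesystem when it is mounted late, but that is an inference and has not been tested here.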
Let me tell you how I got into this situation, because it is not very
common. I do not boot from a hard disk; I boot from a Flash device which
has its own mini-filesystem. During the boot process I mount 5 ext3
partitions, and one of them is /etc. I found an error in my startup
script: when I do mount -t ext3 /dev/hda5 /etc, I forget to pass the -n option
to the "mount" command, so the new /etc directory is replaced with a
partition
on the hard disk and /etc/mtab gets corrupted. I believe this could happen
because the file was accessed at some point before the "mount" by the
ext3 code
and placed in the cache, so a mount without -n may write corrupted data ...
This only happens with /etc/mtab; other files keep their integrity no
matter how many hard resets I do.
I don't know if mounting with the -n option will correct this problem (I have
to test), but in the meantime I would like to report this bug.
If you want to see my damaged /etc partition, download from here:
http://www.quazartecnologia.com/bad-fs.img.gz
(675K compressed, 52 MB uncompressed)
Uncompress it and run dd if=bad-fs.img of=/dev/hdaX
Now, I don't know, maybe this problem has been fixed already? I just
can't
move to the latest kernel right now. Is it possible to update 2.4.7 with
the latest ext3 sources? How can I do this?
Nikolai
Nikolai Vladychevski wrote:
>
> ...
> As the output I get:
> mtab_writable: could open /etc/passwd with no probs
> mtab_writable: opening file=/etc/mtab
> invalid operand 0000
> CPU: 0
> EIP: 0010:[<c0127304>]
> EFLAGS: 00010082
> .........
> Process mount (pid: 41, stackpage=ce355000)
> ........

Was there more text associated with this crash? Please send it all.

> This only happens with /etc/mtab file, as you can see the file
> /etc/passwd (and many others) open fine. The problem
> occurs after using ext3 for a few days and then turning off the machine. I
> also noted that the panic occurs only during startup scripts. I am not
> sure if it is called "panic", because kernel is not crashing completely,
> it just can't mount any more filesystems but is operating.
>
> If I start up the machine and do not mount the damaged filesystem I do not
> get a kernel panic at all when I mount it later, I just can not access
> "mtab" file:
>
> bash-2.04# ls -l mtab
> ls: mtab: Input/output error
> bash-2.04# cd /

I ran your filesystem image through fsck:

akpm-1:/home/akpm> 0 e2fsck -f /dev/loop0
e2fsck 1.22, 22-Jun-2001 for EXT2 FS 0.5b, 95/08/09
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Entry 'mtab' in / (2) has deleted/unused inode 69. Clear<y>? no

Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences: -4817
Fix<y>? no

Inode bitmap differences: -69
Fix<y>? no

/dev/loop0: 291/12880 files (1.0% non-contiguous), 6542/51407 blocks

Somehow, the inode number for /etc/mtab (which is held in the directory)
has been set to a bad value: 69. fsck will be able to repair this.
Current ext3 simply complains about the inode (IO error) and keeps
going OK.

> Now, I don't know, maybe this problem has been fixed already? I just can't
> move to the latest kernel right now. Is it possible to update 2.4.7 with
> the latest ext3 sources? How can I do this?
>

I'd like to know exactly what the panic is before saying. But
unfortunately a lot of header files changed between 2.4.7 and
current, as did things like the quota API and memory allocation
masks. It'd be a lot of work to make recent ext3 work on 2.4.7.
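For readers who want to look at the bit behind the "Inode bitmap differences: -69" line in the raw image, the following sketch computes where an inode's bit lives in the ext2/ext3 inode bitmaps. The struct and function names are hypothetical, and the inodes-per-group value is not given anywhere in this thread; it has to be read from the superblock of the actual image.

```c
#include <assert.h>

/* Position of one inode's bit in the ext2/ext3 inode bitmaps. */
struct inode_bit {
    unsigned long group; /* block group that holds the inode */
    unsigned long byte;  /* byte offset inside that group's inode bitmap */
    unsigned int  bit;   /* bit inside that byte, 0 = least significant */
};

/*
 * Hypothetical helper. ext2/ext3 inode numbers start at 1, so inode N is
 * entry (N - 1) overall. inodes_per_group corresponds to the superblock
 * field s_inodes_per_group and must come from the real filesystem.
 */
struct inode_bit locate_inode_bit(unsigned long ino,
                                  unsigned long inodes_per_group)
{
    struct inode_bit pos;
    unsigned long index;

    assert(ino >= 1);
    assert(inodes_per_group > 0);

    index = (ino - 1) % inodes_per_group; /* index inside the group */
    pos.group = (ino - 1) / inodes_per_group;
    pos.byte = index / 8;
    pos.bit = (unsigned int)(index % 8);
    return pos;
}
```

As an illustration only, with an assumed 1840 inodes per group, locate_inode_bit(69, 1840) yields group 0, byte 8, bit 4. As far as I understand e2fsck's notation, the leading minus means the bit is set on disk although the check found the inode unused.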
Andrew Morton writes:
> Nikolai Vladychevski wrote:
>>
>> ...
>> As the output I get:
>> mtab_writable: could open /etc/passwd with no probs
>> mtab_writable: opening file=/etc/mtab
>> invalid operand 0000
>> CPU: 0
>> EIP: 0010:[<c0127304>]
>> EFLAGS: 00010082
>> .........
>> Process mount (pid: 41, stackpage=ce355000)
>> ........
>
> Was there more text associated with this crash? Please send
> it all.

OK, I didn't post it in full because I had to transcribe it from the
monitor, but here it goes:

invalid operand: 0000
CPU: 0
EIP: 0010:[<c0127304>]
EFLAGS: 00010082
eax: 0000001b ebx: cf68e000 ecx: 00000001 edx: c025fe08
esi: 00000000 edi: c141fd40 ebp: cf6c7240 esp: ce355ed0
ds: 018 es: 0018 ss: 0018
Process mount (pid: 41, stackpage=ce355000)
Stack:
c021fb8f c021fc2b 0000048d cf68e005 cf6c7240 00008043 cf68e005 c012793f
c141fd40 cf6c7240 cf68e005 cf5af1c0 00000001 00000246 cc26b440 ffffffd8
00008043 ce355f7c c013a5d1 c141fd40 cf68e005 ce355f84 cc26b440 00000001
Call Trace: [<c012793f>] [<c013a5d1>] [<c012793f>] [<c012eaa4>] [<c012ed93>]
[<c0106f77>]
Code: 0f 0b 83 c4 0c 8b 5d 14 83 fb ff 74 35 eb 0d 90 90 90 90 90
Hi,

On Wed, Oct 24, 2001 at 11:17:55PM +0000, Nikolai Vladychevski wrote:

> OK, I didn't post it in full because I had to transcribe it from the
> monitor, but here it goes:

> invalid operand: 0000
> CPU: 0
> EIP: 0010:[<c0127304>]
> EFLAGS: 00010082
> eax: 0000001b ebx: cf68e000 ecx: 00000001 edx: c025fe08
> esi: 00000000 edi: c141fd40 ebp: cf6c7240 esp: ce355ed0
> ds: 018 es: 0018 ss: 0018
> Process mount (pid: 41, stackpage=ce355000)
> Stack:
> c021fb8f c021fc2b 0000048d cf68e005 cf6c7240 00008043 cf68e005 c012793f
> c141fd40 cf6c7240 cf68e005 cf5af1c0 00000001 00000246 cc26b440 ffffffd8
> 00008043 ce355f7c c013a5d1 c141fd40 cf68e005 ce355f84 cc26b440 00000001
> Call Trace: [<c012793f>] [<c013a5d1>] [<c012793f>] [<c012eaa4>] [<c012ed93>]
> [<c0106f77>]
> Code: 0f 0b 83 c4 0c 8b 5d 14 83 fb ff 74 35 eb 0d 90 90 90 90 90

Code: 0f 0b is a BUG() assertion. Can you run this trace through
ksymoops to make it useful for debugging? Is it reproducible for
you?

Thanks,
 Stephen
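The check Stephen applies by eye, whether the "Code:" line of an oops begins with the bytes 0f 0b (the x86 ud2 opcode, which triggers the "invalid operand" trap), can be sketched as a small helper. The function name is hypothetical, and the parser only handles the plain space-separated hex format shown in this thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if an oops "Code:" line starts with the
 * bytes 0f 0b (x86 ud2), 0 otherwise. Only the first two bytes after the
 * "Code:" prefix are examined.
 */
int code_starts_with_ud2(const char *line)
{
    unsigned int b0, b1;

    assert(line != NULL);

    if (strncmp(line, "Code:", 5) != 0)
        return 0;                       /* not a Code: line */
    if (sscanf(line + 5, "%x %x", &b0, &b1) != 2)
        return 0;                       /* fewer than two hex bytes */

    return b0 == 0x0f && b1 == 0x0b;
}
```

This only recognises the opcode. Finding out which assertion fired still needs the symbol decoding that ksymoops provides.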