Hello,

We have about 20-30 embedded machines running Linux 2.4.13 with the
ext3-2.4-0.9.13-2413 patch applied. These are PowerPC 7410 based systems.

I am getting reports of Oopses happening during either removes (rm) or
copies (cp). Most reports state that issuing the same command once the
system boots back up does not cause it to crash. I guess it's something
else leading up to the crash.

Here is the ksymoops output of one of the crashes. Any input/help would be
appreciated. This is my first time using ksymoops, so let me know if I
should run it with different options.

Btw: Could anyone tell me what the warning line below means? Thanks.

Oops: Exception in kernel mode, sig: 4
NIP: C00803F4 XER: 00000000 LR: C00804D8 SP: C6DC3C90 REGS: c6dc3be0 TRAP: 0700    Not tainted
Using defaults from ksymoops -t elf32-powerpc -a powerpc:common
MSR: 00089032 EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 11
TASK = c6dc2000[386] 'cp' Last syscall: 5
last math c6dc2000 last altivec 00000000
GPR00: 00000001 C6DC3C90 C6DC2000 C05E1000 00000000 00000000 00000001 00004B31
GPR08: 00002219 00000001 00006219 00000001 44000042 018211F0 00000000 00000000
GPR16: 7FFFF864 00000000 C0190000 C0190000 00000100 C0190000 C0190000 C6DC3DA8
GPR24: 00000000 C7751C80 00004B31 C05E1000 C05E1000 00000001 00000000 00000000
Call backtrace:
C0190000 C007E2D8 C006BE04 C006ED18 C006F1C0 C0025424 C00589AC
C00701C8 C0058BC4 C00398DC C004BE94 C003AD70 C003B25C C0003DFC
01819404 0180295C 01803E08 01804590 01801CF4 01802564 016DBA08
00000000
Warning (Oops_read): Code line not seen, dumping what data is available

>>NIP; c00803f4 <journal_check_used_features+28/7c>   <====
Trace; c0190000 <Unused_offset+7894/1c258>
Trace; c007e2d8 <journal_revoke+40/358>
Trace; c006be04 <ext3_forget+f4/184>
Trace; c006ed18 <ext3_free_branches+11c/288>
Trace; c006f1c0 <ext3_truncate+33c/4c0>
Trace; c0025424 <vmtruncate+148/160>
Trace; c00589ac <inode_setattr+34/108>
Trace; c00701c8 <ext3_setattr+144/190>
Trace; c0058bc4 <notify_change+90/13c>
Trace; c00398dc <do_truncate+108/1cc>
Trace; c004be94 <open_namei+650/7e8>
Trace; c003ad70 <filp_open+58/84>
Trace; c003b25c <sys_open+4c/fc>
Trace; c0003dfc <ret_from_syscall_1+0/b4>
Trace; 01819404 Before first symbol
Trace; 0180295c Before first symbol
Trace; 01803e08 Before first symbol
Trace; 01804590 Before first symbol
Trace; 01801cf4 Before first symbol
Trace; 01802564 Before first symbol
Trace; 016dba08 Before first symbol
Trace; 00000000 Before first symbol

A little more information about our system: MPC7410, 256MB of RAM; the
drive is a SanDisk CF+ 512MB drive.

The root partition is mounted as ext3 (output of tune2fs -l):

tune2fs 1.24a (02-Sep-2001)
Filesystem volume name:   /
Last mounted on:          <not available>
Filesystem UUID:          1e9b04dc-f9ef-463e-bc4c-69f1d8506edb
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal filetype sparse_super
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              122400
Block count:              489384
Reserved block count:     24469
Free blocks:              165035
Free inodes:              102699
First block:              1
Block size:               1024
Fragment size:            1024
Blocks per group:         8192
Fragments per group:      8192
Inodes per group:         2040
Inode blocks per group:   255
Last mount time:          Sun Nov 11 06:50:09 2001
Last write time:          Sun Nov 11 06:50:09 2001
Mount count:              5
Maximum mount count:      24
Last checked:             Wed Nov 7 03:46:45 2001
Check interval:           15552000 (6 months)
Next check after:         Mon May 6 04:46:45 2002
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               128
Journal UUID:             <none>
Journal inode:            8
Journal device:           0x0000
First orphan inode:       0
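
Since the reports come from several machines, it may help to reduce each ksymoops output to its chain of resolved symbols so that reports can be compared side by side. The following is only a rough Python sketch: the function names `call_chain` and `group_reports` are made up for illustration, and the pattern assumes the `>>NIP;` / `Trace;` line format shown above.

```python
import re
from collections import Counter

# Matches resolved ksymoops entries such as:
#   >>NIP; c00803f4 <journal_check_used_features+28/7c>
#   Trace; c007e2d8 <journal_revoke+40/358>
# Unresolved entries ("Before first symbol") have no <...> part and are skipped.
ENTRY = re.compile(
    r"(>>NIP|Trace);\s+([0-9a-fA-F]{8})\s+<(\w+)\+([0-9a-fA-F]+)/([0-9a-fA-F]+)>"
)


def call_chain(ksymoops_text):
    """Return the resolved symbol names in order of appearance (NIP first)."""
    return tuple(m.group(3) for m in ENTRY.finditer(ksymoops_text))


def group_reports(reports):
    """Count how many reports share each distinct call chain."""
    return Counter(call_chain(text) for text in reports)
```

Feeding the text of each saved report to `group_reports` gives a count per distinct chain, which is one quick way to see whether the machines are all dying in the same place.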
Hi,

On Tue, Nov 13, 2001 at 04:41:36PM -0800, Paul White wrote:

> We have about 20-30 embedded machines running Linux 2.4.13 with the
> ext3-2.4-0.9.13-2413 patch applied. These are PowerPC 7410 based systems.
>
> I am getting reports of Oopses happening during either removes (rm) or
> copies (cp). Most reports state that issuing the same command once the
> system boots back up does not cause it to crash. I guess it's something
> else leading up to the crash.

Is it always the same oops?

> Btw: Could anyone tell me what the warning line below means? Thanks.

> >>NIP; c00803f4 <journal_check_used_features+28/7c>   <====
> Trace; c0190000 <Unused_offset+7894/1c258>
> Trace; c007e2d8 <journal_revoke+40/358>
> Trace; c006be04 <ext3_forget+f4/184>
> Trace; c006ed18 <ext3_free_branches+11c/288>
> Trace; c006f1c0 <ext3_truncate+33c/4c0>
> Trace; c0025424 <vmtruncate+148/160>

This is _really_ weird. That is an enormously simple function, which gets
called all the time to check journal capabilities when we are about to
write a revoke record into a journal. That happens any time we delete any
metadata, so both truncating a file and overwriting an old file with new
data will go down this path, every time.

The function only tests a few simple data structures in memory, so by the
time it gets called, the damage must already have been done.

If we could see some of the other oopses, it would help in determining any
pattern here.

Thanks,
 Stephen
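
To illustrate why a crash inside such a check is surprising: a capability test of this kind typically reduces to a handful of bitmask comparisons against fields already held in memory. The Python sketch below is a user-space model only, not the kernel source. The class, the field names, and the constant's value are assumptions chosen for illustration.

```python
from dataclasses import dataclass

# Assumed bit value, for illustration only.
FEATURE_INCOMPAT_REVOKE = 0x1


@dataclass
class JournalSuperblock:
    """Hypothetical stand-in for the in-memory journal superblock."""
    feature_compat: int = 0
    feature_ro_compat: int = 0
    feature_incompat: int = 0


def check_used_features(sb, compat=0, ro=0, incompat=0):
    """Return True if every requested feature bit is already set in sb."""
    return (
        (sb.feature_compat & compat) == compat
        and (sb.feature_ro_compat & ro) == ro
        and (sb.feature_incompat & incompat) == incompat
    )
```

A check of this shape has no loops and allocates nothing, so a fault inside it is more plausibly explained by a bad pointer or earlier corruption than by the check's own logic.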