thr3ads.net - Ext3 users - Kernel Ooops probably in conjunction with lvm [Oct 2001]

Wiktor Wodecki

2001-Oct-05 11:17 UTC

Kernel Ooops probably in conjunction with lvm

Hello,

I've got a rather strange setup over here with about 6 ext3 partitions, one
50gig lvm partition and one 2 gig software-raid1 partition running under a
heavily patched 2.4.10 kernel (various netfilter patches, freeswan, current lvm
patch (1.0.1-rc4) and current ext3 patch). Furthermore I have all of the
necessary filesystem tools, so don't tell me to upgrade :-)

The machine was under heavy load (7.89) when the crash happened, with an
updatedb, various makes and some tar processes running on the same partition
(lvm). I can reproduce this by putting the partition under heavy load again
(lots of tars and finds should do the trick). The oops traced
below didn't crash my system, only the updatedb find command, but I have also
had oopses that crashed my box completely (it even stopped answering echo
request packets!). I haven't figured out yet how to capture those oopses to a
floppy or something, so if one of you knows, please tell me, so I can provide
some more information.
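
The reproduction recipe above (several tars and finds running against the same partition at once) could be scripted roughly as follows. This is only a sketch: the mount point `/mnt/lvm`, the process counts, and both function names are hypothetical and do not come from the report.

```python
import subprocess

def build_load_commands(target_dir, n_tar=4, n_find=4):
    """Return argv lists for n_tar tar runs and n_find find runs on target_dir."""
    cmds = []
    for _ in range(n_tar):
        # Archive the tree to stdout; the caller discards the output.
        cmds.append(["tar", "-cf", "-", target_dir])
    for _ in range(n_find):
        # Walk the whole tree, which exercises directory lookups.
        cmds.append(["find", target_dir, "-type", "f"])
    return cmds

def run_load(cmds):
    """Start every command concurrently, wait for all, return the exit codes."""
    procs = [
        subprocess.Popen(c, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        for c in cmds
    ]
    return [p.wait() for p in procs]

# Hypothetical usage (not executed here):
#   run_load(build_load_commands("/mnt/lvm", n_tar=6, n_find=6))
```

Running it in a loop on a test machine, never on one holding data you care about, should approximate the load described.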

Anyway, here is one ksymoops output:

************************************
ksymoops 2.4.3 on i686 2.4.10.  Options used
     -v /boot/vmlinuz (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.10/ (default)
     -m /boot/System.map (default)

Warning: You did not tell me where to find symbol information.  I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc.  ksymoops -h explains the options.

/usr/bin/nm: /boot/vmlinuz: File format not recognized
Error (pclose_local): read_nm_symbols pclose failed 0x100
Warning (read_vmlinux): no kernel symbols in vmlinux, is /boot/vmlinuz a valid
vmlinux file?
Warning (compare_maps): mismatch on symbol partition_name  , ksyms_base says
c023f9e0, System.map says c014c090.  Ignoring ksyms_base entry
Unable to handle kernel NULL pointer dereference at virtual address 00000906
c0152ddb
*pde = 00000000
Oops: 0000
CPU:    0
EIP:    0010:[ext3_find_entry+475/788]
EFLAGS: 00010287
eax: 00000905   ebx: 00000900   ecx: 00000005   edx: 00001900
esi: c38eeb60   edi: c38eeb60   ebp: 00000000   esp: c4e83eb0
ds: 0018   es: 0018   ss: 0018
Process find (pid: 27143, stackpage=c4e83000)
Stack: fffffff4 c38eeb60 c38eeb60 c48fed60 c4e83efc 00000005 c38eebbc cf8afc94
       c0d1c980 00000000 00000000 00000000 00000001 00000000 00000000 00000000
       cf7d4400 cf8afc94 00000000 cf8afc94 00000246 00000000 c0d1c980 c0d1c9e4
Call Trace: [d_alloc+27/348] [ext3_lookup+39/124] [real_lookup+83/196]
[path_walk+1201/1736] [getname+93/156]
Code: 0f b6 43 06 39 c1 75 4d 83 3b 00 74 48 8b 74 24 18 8b 4c 24
Using defaults from ksymoops -t elf32-i386 -a i386

Code;  00000000 Before first symbol
00000000 <_EIP>:
Code;  00000000 Before first symbol
   0:   0f b6 43 06               movzbl 0x6(%ebx),%eax
Code;  00000004 Before first symbol
   4:   39 c1                     cmp    %eax,%ecx
Code;  00000006 Before first symbol
   6:   75 4d                     jne    55 <_EIP+0x55> 00000054 Before first symbol
Code;  00000008 Before first symbol
   8:   83 3b 00                  cmpl   $0x0,(%ebx)
Code;  0000000a Before first symbol
   b:   74 48                     je     55 <_EIP+0x55> 00000054 Before first symbol
Code;  0000000c Before first symbol
   d:   8b 74 24 18               mov    0x18(%esp,1),%esi
Code;  00000010 Before first symbol
  11:   8b 4c 24 00               mov    0x0(%esp,1),%ecx


3 warnings and 1 error issued.  Results may not be reliable.
************************************
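
For readers unfamiliar with the `EIP: 0010:[ext3_find_entry+475/788]` notation in the trace: it gives the symbol name, the offset of the faulting instruction within that function, and the function's size. The sketch below assumes both numbers are decimal byte counts (which is consistent with the raw EIP `c0152ddb` landing on an aligned function start); `decode_eip` is a hypothetical helper, not part of ksymoops.

```python
import re

def decode_eip(eip_addr, symbolic):
    """Split 'name+offset/size' and derive the function's start and end address.

    Assumes offset and size are printed in decimal, as in the trace above.
    """
    m = re.fullmatch(r"(\w+)\+(\d+)/(\d+)", symbolic)
    if m is None:
        raise ValueError("unrecognised symbol format: %r" % symbolic)
    name = m.group(1)
    offset = int(m.group(2))
    size = int(m.group(3))
    start = eip_addr - offset          # address of the function's first byte
    return name, start, start + size   # end is one past the last byte

name, start, end = decode_eip(0xC0152DDB, "ext3_find_entry+475/788")
print(name, hex(start), hex(end))  # ext3_find_entry 0xc0152c00 0xc0152f14
```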

Yes, /boot/vmlinuz *is* a valid kernel file and yes I also tried
/usr/src/linux/vmlinux, same results.

The crashes started yesterday, the day I upgraded my system from 2.4.8 to
2.4.10. I was using lvm patch 1.0.1-rc1 then, and have switched to 1.0.1-rc4
now. I'm not really sure whose fault it is, but since there were some debug
statements about ext3 in the oops message, I thought I'd send it here first.

I hope you can use this information; if something's missing, please tell me.
I'm on the list.

-- 
Regards,

Wiktor Wodecki      |    http://johoho.eggheads.org
wodecki@gmx.de      |    IRC: Johoho@IrcNET

Stephen C. Tweedie

2001-Oct-05 13:32 UTC

Re: Kernel Ooops probably in conjunction with lvm

Hi,

On Fri, Oct 05, 2001 at 01:17:45PM +0200, Wiktor Wodecki wrote:
 > I've got a rather strange setup over here with about 6 ext3 partitions,
 > one 50gig lvm partition and one 2 gig software-raid1 partition running
 > under a heavily patched 2.4.10 kernel (various netfilter patches,
 > freeswan, current lvm patch (1.0.1-rc4) and current ext3 patch).
 > Furthermore I have all of the necessary filesystem tools, so don't
 > tell me to upgrade :-)

Is this a large-memory (>=1GB) box?  It appears that we've got a
buffer_head whose "b_data" is 0x900, which indicates that the buffer
is a highmem one.  Highmem buffers should not be used for filesystem
metadata: if ext3 is being given such a buffer, it's a core VFS fault
(and the VFS changed substantially in this area in 2.4.10).  There have
been fixes to related code since 2.4.10, but it's also entirely
possible that it's an LVM interaction which is causing the problem.
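
The 0x900 figure can be cross-checked against the oops itself: `ebx` holds 00000900, the faulting instruction is `movzbl 0x6(%ebx),%eax`, and the reported fault address is 00000906. The sketch below assumes the usual ext2/ext3 on-disk directory entry layout (4-byte inode, 2-byte rec_len, 1-byte name_len, 1-byte file_type, then the name); the class name is illustrative and the layout is stated from memory of the kernel headers, not taken from this thread.

```python
import ctypes

class Ext3DirEntry2(ctypes.LittleEndianStructure):
    """Assumed layout of an ext3 directory entry (illustrative only)."""
    _fields_ = [
        ("inode", ctypes.c_uint32),      # offset 0
        ("rec_len", ctypes.c_uint16),    # offset 4
        ("name_len", ctypes.c_uint8),    # offset 6
        ("file_type", ctypes.c_uint8),   # offset 7
        ("name", ctypes.c_char * 255),   # offset 8
    ]

EBX = 0x900  # register value from the oops, i.e. the bogus data pointer

# Address touched when reading name_len through that pointer.
fault_addr = EBX + Ext3DirEntry2.name_len.offset
print(hex(fault_addr))  # 0x906
```

Under that assumption the crash is a read of a directory entry's name_len through a pointer of 0x900, which fits a directory block whose data pointer is not a valid kernel virtual address.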

Once ext3-0.9.11 is out for 2.4.11-pre*, I'd suggest giving that a try
and seeing if you can reproduce this.  I'm 99% sure it's not an ext3
fault, though --- the footprint is clearly a highmem buffer_head
occurring where we don't expect ever to see one.

Cheers,
 Stephen
