I'm using Red Hat 7.3 with software RAID. It was crashing from time to time, so I decided to investigate. After I ran some stress tests on it, it died with a kernel message (see below). It might not be ext3 related, but since I was running Red Hat with ext2 for some time without problems (not on the same machine), I thought it might have something to do with the fs.

Any ideas? I'll try converting to ext2 to see if it solves the problem.

Feb 4 23:57:37 mail kernel: Unable to handle kernel paging request at virtual address 10000004
Feb 4 23:57:37 mail kernel: printing eip:
Feb 4 23:57:37 mail kernel: c013b4c6
Feb 4 23:57:37 mail kernel: *pde = 00000000
Feb 4 23:57:37 mail kernel: Oops: 0000
Feb 4 23:57:37 mail kernel: autofs 8139too mii ide-cd cdrom usb-uhci usbcore ext3 jbd raid1
Feb 4 23:57:37 mail kernel: CPU: 0
Feb 4 23:57:37 mail kernel: EIP: 0010:[<c013b4c6>] Not tainted
Feb 4 23:57:37 mail kernel: EFLAGS: 00010206
Feb 4 23:57:37 mail kernel:
Feb 4 23:57:37 mail kernel: EIP is at get_hash_table [kernel] 0x66 (2.4.18-24.7.x)
Feb 4 23:57:37 mail kernel: eax: d3fa0000 ebx: 00000002 ecx: d3fa7818 edx: 10000000
Feb 4 23:57:37 mail kernel: esi: 0012d4ec edi: 00000901 ebp: 0000000e esp: cabafe08
Feb 4 23:57:37 mail kernel: ds: 0018 es: 0018 ss: 0018
Feb 4 23:57:37 mail kernel: Process tar (pid: 3115, stackpage=cabaf000)
Feb 4 23:57:37 mail kernel: Stack: 00001e06 00000000 00001000 00000000 00000000 c013c18b 00000901 0012d4ec
Feb 4 23:57:37 mail kernel: 00001000 0000000c 00001000 c013c498 c0ddb480 00000000 c013c4b5 c6e2de80
Feb 4 23:57:37 mail kernel: c6e2de80 c0c7b000 cabafe60 00001000 0000001a c6e2de80 d4824e54 d2102ec0
Feb 4 23:57:37 mail kernel: Call Trace: [<c013c18b>] unmap_underlying_metadata [kernel] 0x1b (0xcabafe1c))
Feb 4 23:57:37 mail kernel: [<c013c498>] __block_prepare_write [kernel] 0xe8 (0xcabafe34))
Feb 4 23:57:37 mail kernel: [<c013c4b5>] __block_prepare_write [kernel] 0x105 (0xcabafe40))
Feb 4 23:57:37 mail kernel: [<d4824e54>] ext3_mark_iloc_dirty [ext3] 0x24 (0xcabafe60))
Feb 4 23:57:37 mail kernel: [<d481a810>] .rodata.str1.1 [jbd] 0x30 (0xcabafe74))
Feb 4 23:57:37 mail kernel: [<c013cd75>] block_prepare_write [kernel] 0x25 (0xcabafe88))
Feb 4 23:57:37 mail kernel: [<d4822a20>] ext3_get_block [ext3] 0x0 (0xcabafe9c))
Feb 4 23:57:37 mail kernel: [<d481238d>] journal_start_R79e68e08 [jbd] 0x7d (0xcabafea8))
Feb 4 23:57:37 mail kernel: [<d4822efe>] ext3_prepare_write [ext3] 0x7e (0xcabafeb8))
Feb 4 23:57:37 mail kernel: [<d4822a20>] ext3_get_block [ext3] 0x0 (0xcabafec8))
Feb 4 23:57:37 mail kernel: [<c012c98d>] generic_file_write [kernel] 0x4cd (0xcabafeec))
Feb 4 23:57:37 mail kernel: [<c01158db>] __wake_up [kernel] 0x3b (0xcabaff3c))
Feb 4 23:57:37 mail kernel: [<d4820b02>] ext3_file_write [ext3] 0x22 (0xcabaff5c))
Feb 4 23:57:37 mail kernel: [<c013a036>] sys_write [kernel] 0x96 (0xcabaff7c))
Feb 4 23:57:37 mail kernel: [<c0109f18>] do_IRQ [kernel] 0x88 (0xcabaffa0))
Feb 4 23:57:37 mail kernel: [<c0109f3c>] do_IRQ [kernel] 0xac (0xcabaffa8))
Feb 4 23:57:37 mail kernel: [<c010896b>] system_call [kernel] 0x33 (0xcabaffc0))
Feb 4 23:57:37 mail kernel:
Feb 4 23:57:37 mail kernel:
Feb 4 23:57:37 mail kernel: Code: 39 72 04 89 d1 75 f3 0f b7 42 08 3b 44 24 20 75 e9 66 39 7a

Yavor

Hi,

On Tue, 2003-02-04 at 23:38, Yavor Shahpasov wrote:

> I'm using Red Hat 7.3 with software RAID. It was crashing from time to time,
> so I decided to investigate. After I ran some stress tests on it, it died
> with a kernel message (see below). It might not be ext3 related, but since
> I was running Red Hat with ext2 for some time without problems (not on the
> same machine), I thought it might have something to do with the fs.

> Any ideas?

Hardware problem, most likely bad memory:

> Feb 4 23:57:37 mail kernel: Unable to handle kernel paging request at virtual address 10000004
> Feb 4 23:57:37 mail kernel: EIP is at get_hash_table [kernel] 0x66
> Feb 4 23:57:37 mail kernel: eax: d3fa0000 ebx: 00000002 ecx: d3fa7818 edx: 10000000

That's the single most common oops footprint pointing to bad memory: an error in get_hash_table, with a single bit flipped in one of the registers (%edx=0x10000000) and a paging error trying to reference that as a struct (virtual address 0x10000004).

Your first port of call here is memtest86.

Cheers,
Stephen
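
For anyone wanting to apply the same check to their own oops, here is a minimal Python sketch of the pattern described above: look for a register holding exactly one set bit, with the faulting address sitting a small offset above it. The helper names (parse_oops, single_bit, suspects) and the MAX_OFFSET cutoff are made up for illustration; they are not part of any kernel tool. A match is only a hint that bad memory is worth ruling out, not a diagnosis.

```python
import re

# Fault address and register dump copied from the oops in this thread.
OOPS = """\
Unable to handle kernel paging request at virtual address 10000004
eax: d3fa0000 ebx: 00000002 ecx: d3fa7818 edx: 10000000
esi: 0012d4ec edi: 00000901 ebp: 0000000e esp: cabafe08
"""

# Assumption: a struct field offset is small, so the fault address should
# lie within this many bytes above the suspect register value.
MAX_OFFSET = 0x100


def parse_oops(text):
    """Return (fault_address, {register: value}) parsed from oops text."""
    fault = int(re.search(r"virtual address ([0-9a-f]{8})", text).group(1), 16)
    regs = {name: int(val, 16)
            for name, val in re.findall(r"\b(e[a-z]{2}): ([0-9a-f]{8})", text)}
    return fault, regs


def single_bit(value):
    """True if exactly one bit is set in value."""
    return value != 0 and value & (value - 1) == 0


def suspects(text):
    """Registers with one set bit that sit just below the fault address."""
    fault, regs = parse_oops(text)
    return [(name, val, fault - val)
            for name, val in regs.items()
            if single_bit(val) and 0 <= fault - val < MAX_OFFSET]


if __name__ == "__main__":
    for name, val, off in suspects(OOPS):
        print(f"{name}=0x{val:08x} has one bit set; "
              f"fault address is {name}+0x{off:x}")
```

On the dump above this flags only edx: ebx (0x00000002) also has a single bit set, but it is nowhere near the faulting address, so it is skipped.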