Since I don't have a better place to report this, I'm doing it here:
Kernel 2.4.20-ac1
# uptime
16:31:00 up 2 days, 22:10, 7 users, load average: 1.89, 2.20, 2.99
This is our main mailbox server. We're running ext3:
# cat /proc/mounts
rootfs / rootfs rw 0 0
/dev/root / ext3 rw 0 0
/proc /proc proc rw 0 0
/dev/sdb5 /boot ext3 rw 0 0
/dev/sda6 /home ext3 rw,noatime,nosuid 0 0
/dev/sdb8 /tmp ext3 rw 0 0
/dev/sdb7 /var ext3 rw,noatime 0 0
/dev/sdc6 /copymail ext3 rw,noatime,nosuid 0 0
/dev/sdb9 /extra ext3 rw,noatime 0 0
/dev/hda6 /mnt/hda ext3 rw,noatime 0 0
none /dev/pts devpts rw 0 0
All of a sudden I noticed that pop3d (Courier) wasn't running
anymore. I checked and restarted it. /var/log/messages showed that
authpam (a component of Courier) had died:
...
Dec 5 00:00:07 postamt1 httpd: httpd startup succeeded
Dec 5 15:48:35 postamt1 kernel: swap_dup: Bad swap file entry 516a444c
Dec 5 15:48:36 postamt1 kernel: swap_dup: Bad swap file entry 7741314a
Dec 5 15:48:36 postamt1 kernel: Page has mapping still set. This is a serious situation. However if you
Dec 5 15:48:36 postamt1 kernel: are using the NVidia binary only module please report this bug to
Dec 5 15:48:36 postamt1 kernel: NVidia and not to the linux kernel mailinglist.
Dec 5 15:48:36 postamt1 kernel: kernel BUG at page_alloc.c:106!
Dec 5 15:48:36 postamt1 kernel: invalid operand: 0000
Dec 5 15:48:36 postamt1 kernel: CPU: 0
Dec 5 15:48:36 postamt1 kernel: EIP: 0010:[__free_pages_ok+94/800] Not tainted
Dec 5 15:48:36 postamt1 kernel: EFLAGS: 00010282
Dec 5 15:48:36 postamt1 kernel: eax: 00000033 ebx: c1a599b8 ecx: f78f8000 edx: f78f9f64
Dec 5 15:48:36 postamt1 kernel: esi: ef44b128 edi: 00000000 ebp: 2f507541 esp: e2277df4
Dec 5 15:48:36 postamt1 kernel: ds: 0018 es: 0018 ss: 0018
Dec 5 15:48:36 postamt1 kernel: Process authpam (pid: 20623, stackpage=e2277000)
Dec 5 15:48:36 postamt1 kernel: Stack: c0221400 c02213a0 c0221340 ffffffff 080496d1 c4d16228 00000040 00003000
Dec 5 15:48:36 postamt1 kernel: c0125950 c4d16180 00000004 00002000 ef44b128 00003000 2f507541 c0122cef
Dec 5 15:48:36 postamt1 kernel: c1a599b8 00000001 00000000 0804b000 d0b0e080 08048000 00000000 00003000
Dec 5 15:48:36 postamt1 kernel: Call Trace: [set_page_dirty+80/96] [zap_page_range+447/656] [fput+188/224] [exit_mmap+186/304] [do_coredump+210/222]
Dec 5 15:48:36 postamt1 kernel: [mmput+55/96] [do_exit+145/528] [collect_signal+150/224] [do_signal+495/604] [__mmdrop+47/52] [do_exit+515/528]
Dec 5 15:48:36 postamt1 kernel: [sys_munmap+52/80] [do_invalid_op+0/160] [signal_return+20/24]
Dec 5 15:48:36 postamt1 kernel:
Dec 5 15:48:36 postamt1 kernel: Code: 0f 0b 6a 00 27 13 22 c0 83 c4 0c 89 d8 2b 05 30 3e 2c c0 69
Dec 5 15:53:20 postamt1 authdaemond.mysql: authdaemon: modules="authcustom authcram authuserdb authmysql authpam", daemons=5
# This is where I restarted Courier!
Dec 5 15:53:53 postamt1 kernel: <2>EXT3-fs error (device sd(8,6)): ext3_free_blocks: Freeing blocks not in datazone - block = 1094928743, count = 1
Dec 5 15:53:53 postamt1 kernel: EXT3-fs error (device sd(8,6)): ext3_free_blocks: Freeing blocks not in datazone - block = 1095059815, count = 1
Dec 5 15:53:53 postamt1 kernel: EXT3-fs error (device sd(8,6)): ext3_free_blocks: Freeing blocks not in datazone - block = 1094996087, count = 1
Dec 5 15:53:53 postamt1 kernel: EXT3-fs error (device sd(8,6)): ext3_free_blocks: Freeing blocks not in datazone - block = 1380209241, count = 1
Dec 5 15:53:53 postamt1 kernel: EXT3-fs error (device sd(8,6)): ext3_free_blocks: Freeing blocks not in datazone - block = 1095060801, count = 1
...
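For reference, the device tag in the EXT3-fs errors can be matched against
the /proc/mounts output above. This is only a minimal sketch: it assumes the
conventional Linux SCSI disk numbering (major 8, 16 minors per disk), and the
helper names scsi_name and mount_point are made up for illustration.

```python
import re

# Excerpt of the /proc/mounts output quoted earlier in this mail.
MOUNTS = """\
/dev/sdb5 /boot ext3 rw 0 0
/dev/sda6 /home ext3 rw,noatime,nosuid 0 0
/dev/sdb8 /tmp ext3 rw 0 0
/dev/sdb7 /var ext3 rw,noatime 0 0
/dev/sdc6 /copymail ext3 rw,noatime,nosuid 0 0
/dev/sdb9 /extra ext3 rw,noatime 0 0
"""


def scsi_name(tag):
    """Translate a tag like 'sd(8,6)' into a device node like '/dev/sda6'.

    Assumption: major 8 is the first SCSI disk major and each disk owns
    16 consecutive minors (minor 0 = whole disk, 1..15 = partitions).
    """
    match = re.fullmatch(r"sd\((\d+),(\d+)\)", tag)
    if match is None:
        raise ValueError("unexpected device tag: %r" % tag)
    major, minor = int(match.group(1)), int(match.group(2))
    if major != 8:
        raise ValueError("this sketch only handles SCSI disk major 8")
    disk, partition = divmod(minor, 16)
    name = "/dev/sd" + chr(ord("a") + disk)
    return name + (str(partition) if partition else "")


def mount_point(device, mounts_text):
    """Return the mount point of device in /proc/mounts-style text, or None."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == device:
            return fields[1]
    return None


if __name__ == "__main__":
    device = scsi_name("sd(8,6)")
    print(device, mount_point(device, MOUNTS))
```

Under that assumption sd(8,6) is /dev/sda6, which is mounted on /home here.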
# cat /proc/swaps
Filename          Type    Size      Used    Priority
/copymail/SWAP    file    2047992   0       -2
As you can see, we swap to a file on an ext3 filesystem rather than to a
partition. My idea was to run
sync
swapoff
sync
swapon
and that seemed to clear things up.
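The sequence above could be scripted roughly as follows. This is a sketch
only: the swap file path is taken from /proc/swaps, but passing it as an
argument to swapoff and swapon is an assumption, since the exact invocations
are not recorded here. The function names are hypothetical, and the default
is a dry run that only prints the commands.

```python
import subprocess

SWAP_FILE = "/copymail/SWAP"  # path as listed in /proc/swaps


def swap_cycle_commands(swap_file):
    """Return the command sequence: flush, disable swap, flush, re-enable."""
    return [
        ["sync"],
        ["swapoff", swap_file],
        ["sync"],
        ["swapon", swap_file],
    ]


def run_swap_cycle(swap_file, dry_run=True):
    """Print each step (dry_run) or execute it, stopping at the first failure."""
    for argv in swap_cycle_commands(swap_file):
        if dry_run:
            print(" ".join(argv))
        else:
            # Needs root; check=True raises CalledProcessError on failure,
            # so swapon is never attempted after a failed swapoff.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_swap_cycle(SWAP_FILE, dry_run=True)
```

Stopping at the first failure matters mostly for swapoff, which has to pull
any swapped-out pages back into RAM before it can succeed.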
Questions:
* Is this an ext3 problem or a VM problem?
* How can I track this down more systematically?
* Is an fsck recommended?
* Should I report this to lkml?
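On tracking this down: one cheap check is whether the bogus numbers in the
log are really text. The sketch below tests whether each 32-bit value
consists only of printable ASCII bytes. If so, that might hint at memory
being overwritten with data rather than at damage on disk, but that
interpretation is an assumption, not a diagnosis. The helper name as_ascii
is made up for illustration; the values are copied from the log above.

```python
import struct

# "Bad swap file entry" values from the swap_dup messages.
SWAP_ENTRIES = [0x516A444C, 0x7741314A]

# Block numbers from the ext3_free_blocks messages.
BLOCK_NUMBERS = [1094928743, 1095059815, 1094996087, 1380209241, 1095060801]


def as_ascii(value):
    """Return the four bytes of value (most significant byte first) as a
    string if every byte is printable ASCII, otherwise None."""
    raw = struct.pack(">I", value)
    if all(0x20 <= byte <= 0x7E for byte in raw):
        return raw.decode("ascii")
    return None


if __name__ == "__main__":
    for value in SWAP_ENTRIES + BLOCK_NUMBERS:
        print("%#010x -> %r" % (value, as_ascii(value)))
```

All seven values quoted above pass this check (the first swap entry reads
"QjDL", for example). On little-endian x86 the bytes sit in memory in the
reverse order.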
# cat /proc/meminfo
           total:       used:       free:     shared:    buffers:     cached:
Mem:   1056690176  1043947520    12742656           0    72699904   726638592
Swap:  2097143808           0  2097143808
MemTotal: 1031924 kB
MemFree: 12444 kB
MemShared: 0 kB
Buffers: 70996 kB
Cached: 709608 kB
SwapCached: 0 kB
Active: 551256 kB
Inact_dirty: 287348 kB
Inact_clean: 77224 kB
Inact_target: 183164 kB
HighTotal: 131072 kB
HighFree: 1024 kB
LowTotal: 900852 kB
LowFree: 11420 kB
SwapTotal: 2047992 kB
SwapFree: 2047992 kB
Committed_AS: 159316 kB
# cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 6
model : 4
model name : AMD Athlon(tm) Processor
stepping : 2
cpu MHz : 1202.771
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr syscall mmxext 3dnowext 3dnow
bogomips : 2398.61
--
Ralf Hildebrandt (Im Auftrag des Referat V a) Ralf.Hildebrandt@charite.de
Charite Campus Mitte Tel. +49 (0)30-450 570-155
Referat V a - Kommunikationsnetze - Fax. +49 (0)30-450 570-916
Microsoft Vaccine 2000 is configuring your immune system. This may
take a few minutes. If your body stops responding for a long time and
there is no brain activity please die. Setup will continue after you
are reborn.