Hi all,
I just had an enlightening experience.
One of our servers died early this morning. When I rebooted it, it crashed
again about 5 seconds after finishing the boot, with the same oops
message. Trying to figure out what was causing it, I booted into single-user
mode and, after some more crashes, concluded that just starting postfix is
enough to crash the machine within seconds. The oops is attached.
Digging further, I rm -rf'ed /var/spool/postfix/maildrop (and now I'm banging
my head on the wall for this), recreated it with the same permissions, and
voila, the box was up and running.
If memory serves me right, there were about 25 files, most of them zero
size, some of them with some content. Most of them looked OK, but postsuper
was complaining about two of them: bogus file name: maildrop/820675.380,
bogus file name: maildrop/889323.372.
The size of the directory inode itself was about 1.3MB.
My theory goes like this:
There is some fundamental flaw in the Linux kernel that can be triggered by
some foo file. What exactly foo stands for here I don't know (yet). I'll be
sure to save the maildrop directory when this happens next time :)
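For next time, something along these lines could preserve the evidence
before anything gets deleted. This is only a rough sketch, not something I
have run on the affected box: the function name snapshot_directory and the
archive path are made up for illustration, and it assumes the directory can
still be read without triggering the oops (it may be safer to run it with
the mail system stopped or from a rescue boot).

```python
import os
import tarfile


def snapshot_directory(src, archive_path):
    """Record per-file metadata for src, then archive the whole directory.

    Returns a list of (name, inode, size, mode) tuples, sorted by name.
    Hypothetical helper; archive_path should live outside src.
    """
    listing = []
    for name in sorted(os.listdir(src)):
        # lstat, so a symlink is reported as itself rather than its target
        st = os.lstat(os.path.join(src, name))
        listing.append((name, st.st_ino, st.st_size, oct(st.st_mode)))

    # Store entries under the directory's own name, e.g. "maildrop/..."
    top = os.path.basename(os.path.normpath(src))
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(src, arcname=top)
    return listing
```

Calling snapshot_directory("/var/spool/postfix/maildrop",
"/root/maildrop-snapshot.tar.gz") would keep both the file contents and the
inode numbers and sizes, so the two files postsuper complained about could
be examined later.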
I've been battling this problem for a month and a half here and am only now
starting to get the pieces together. See my initial postings at
http://www.ussg.iu.edu/hypermail/linux/kernel/0304.0/1773.html and
https://listman.redhat.com:443/archives/valhalla-list/2003-April/msg00192.html,
where I describe the very same oops on two different machines with different
kernels. Oh, one of those boxen was running sendmail, so the problem is not
postfix specific.
I entered this into Red Hat Bugzilla as #90938; more info there.
If there's anyone willing to help me figure this out ...
--
Jure Pecar