Pasi Kärkkäinen
2008-Sep-16 07:38 UTC
[CentOS] netfilter kernel crash in ip_ct_refresh_acct / ip_conntrack with centos 5.x
Hello!

Has anyone seen this netfilter kernel crash?

Images from the console of the crashed firewall:
http://pasik.reaktio.net/centos5-kernel-crash/

Firewall is HP DL360 G4 server running CentOS 5.x 32 bit.

I've seen this firewall crashing multiple times, but I only started investigating it lately..

It has happened using CentOS 5.0, 5.1 and now also with 5.2. I'm not sure if it was the same bug earlier, but at least the last two times (with CentOS 5.2) it has been the same, see screenshots.

Last lines of the console output:

EIP: [<f8af2c5c>] __ip_ct_refresh_acct+0xa1/0x129 [ip_conntrack] SS:ESP 0068:c0724e4c
<0>Kernel panic - not syncing: Fatal exception in interrupt

At the moment firewall is running CentOS 5.2, Linux kernel 2.6.18-92.1.10.el5.centos.plus.

Any tips how to resolve this?

-- Pasi
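For anyone comparing their own console output against the EIP line above: it follows the usual oops layout of faulting address, then symbol+offset/size, then the owning module in brackets. The following is a minimal sketch, assuming that layout, of pulling those fields apart so reports can be matched. The names `EIP_RE` and `parse_eip_line` are made up for this illustration and are not part of any kernel tool.

```python
import re

# Assumed layout: "EIP: [<addr>] symbol+0xoffset/0xsize [module]".
# The trailing "[module]" part is treated as optional.
EIP_RE = re.compile(
    r"EIP:\s*\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>[\w-]+)\])?"
)


def parse_eip_line(line):
    """Return the fields of an oops EIP line as a dict, or None if no match."""
    m = EIP_RE.search(line)
    if m is None:
        return None
    addr = int(m.group("addr"), 16)
    offset = int(m.group("offset"), 16)
    return {
        "address": addr,                  # faulting instruction address
        "symbol": m.group("symbol"),      # function containing that address
        "offset": offset,                 # bytes into the function
        "size": int(m.group("size"), 16), # total size of the function
        "module": m.group("module"),      # owning module, if printed
        "symbol_start": addr - offset,    # load address of the function
    }


line = ("EIP: [<f8af2c5c>] __ip_ct_refresh_acct+0xa1/0x129 "
        "[ip_conntrack] SS:ESP 0068:c0724e4c")
info = parse_eip_line(line)
print(info["symbol"], hex(info["offset"]), info["module"])
```

Two crashes that report the same symbol, offset and module are likely the same fault, which is a quick way to check whether a report in a bug tracker matches this one.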
Akemi Yagi
2008-Sep-16 11:22 UTC
[CentOS] netfilter kernel crash in ip_ct_refresh_acct / ip_conntrack with centos 5.x
On Tue, Sep 16, 2008 at 12:38 AM, Pasi Kärkkäinen <pasik at iki.fi> wrote:
> Hello!
>
> Has anyone seen this netfilter kernel crash?
>
> Images from the console of the crashed firewall:
> http://pasik.reaktio.net/centos5-kernel-crash/
>
> Firewall is HP DL360 G4 server running CentOS 5.x 32 bit.
>
> I've seen this firewall crashing multiple times, but I only started investigating it lately..
>
> It has happened using CentOS 5.0, 5.1 and now also with 5.2. I'm not sure if
> it was the same bug earlier, but at least the last two times (with CentOS 5.2)
> it has been the same, see screenshots.
>
> Last lines of the console output:
>
> EIP: [<f8af2c5c>] __ip_ct_refresh_acct+0xa1/0x129 [ip_conntrack] SS:ESP 0068:c0724e4c
> <0>Kernel panic - not syncing: Fatal exception in interrupt
>
> At the moment firewall is running CentOS 5.2, Linux kernel 2.6.18-92.1.10.el5.centos.plus.
>
> Any tips how to resolve this?

You might want to look at this upstream bugzilla:

https://bugzilla.redhat.com/show_bug.cgi?id=456664

There is a quick fix in comment #24 that you can try out without having to rebuild kernel.

Akemi
Pasi Kärkkäinen
2008-Sep-16 16:51 UTC
[CentOS] Re: netfilter kernel crash in ip_ct_refresh_acct / ip_conntrack with centos 5.x
On Tue, Sep 16, 2008 at 12:12:45PM -0400, Jan Engelhardt wrote:
>
> On Tuesday 2008-09-16 03:38, Pasi Kärkkäinen wrote:
> >
> >Has anyone seen this netfilter kernel crash?
> >
> >Images from the console of the crashed firewall:
> >http://pasik.reaktio.net/centos5-kernel-crash/
>
> There have been some crashes reported; and some even fixed.
> As such you are encouraged to try a (much) newer kernel
> because 2.6.18 is outside the scope of the -stable team.
>

Yeah I know it's old.. but it's the only standard/supported kernel in Red Hat Enterprise Linux 5 (and CentOS 5).

I guess the best option would be to somehow figure out what the actual bug is, get (or make) a patch for it, and send it for inclusion in RHEL 5.x errata kernels..

-- Pasi
Jake Holmquist
2008-Sep-19 19:55 UTC
[CentOS] netfilter kernel crash in ip_ct_refresh_acct / ip_conntrack with centos 5.x
> Hello!
>
> Has anyone seen this netfilter kernel crash?
>
> Images from the console of the crashed firewall:
> http://pasik.reaktio.net/centos5-kernel-crash/
>
> Firewall is HP DL360 G4 server running CentOS 5.x 32 bit.
>
> I've seen this firewall crashing multiple times, but I only started investigating it lately..
>
> It has happened using CentOS 5.0, 5.1 and now also with 5.2. I'm not sure if
> it was the same bug earlier, but at least the last two times (with CentOS 5.2)
> it has been the same, see screenshots.
>
> Last lines of the console output:
>
> EIP: [<f8af2c5c>] __ip_ct_refresh_acct+0xa1/0x129 [ip_conntrack] SS:ESP 0068:c0724e4c
> <0>Kernel panic - not syncing: Fatal exception in interrupt
>
> At the moment firewall is running CentOS 5.2, Linux kernel 2.6.18-92.1.10.el5.centos.plus.
>
> Any tips how to resolve this?

Take a look here:

https://bugzilla.redhat.com/show_bug.cgi?id=433661

Looks like a test kernel is available....

We've been having this problem for quite some time - actually moved our production box to RHEL 4.x

Jake