On Monday, September 26, 2011 2:06:16 am Eugene Grosbein wrote:
> Hi!
>
> I use several SuperMicro boxes with an integrated IPMI card.
> http://www.supermicro.com/products/system/1U/5016/SYS-5016T-MTF.cfm
>
> FreeBSD 8.2 sometimes hung after panics in the past, so I use IPMI's
> watchdog, and generally it works nicely with a 5-minute timeout. The card
> is detected as follows:
>
> ipmi0: <IPMI System Interface> on isa0
> ipmi0: KCS mode found at io 0xca2 alignment 0x1 on isa
> ipmi0: IPMI device rev. 1, firmware rev. 1.07, version 2.0
> ipmi0: Number of channels 2
> ipmi0: Attached watchdog
>
> Sometimes the ipmi driver writes "KCS errors" to the system logs, which I
> ignore as they seem harmless. However, one of my boxes was suddenly
> rebooted by the watchdog after the following errors were written to the
> console:
>
> ipmi0: KCS: Reply address mismatch
> ipmi0: KCS error: 01
> ipmi0: KCS: Reply address mismatch
> ipmi0: KCS error: 01
> ipmi0: KCS: Command mismatch
> ipmi0: KCS error: 01
> ipmi0: KCS: Reply address mismatch
> ipmi0: KCS error: 01
> ipmi0: KCS: Reply address mismatch
> ipmi0: KCS error: 01
> ipmi0: KCS: Command mismatch
> ipmi0: KCS error: 01
> ipmi0: KCS: Reply address mismatch
> ipmi0: KCS error: 01
> ipmi0: KCS: Reply address mismatch
> ipmi0: KCS error: 01
> ipmi0: KCS: Reply address mismatch
> ipmi0: KCS error: 01
> ipmi0: Failed to reset watchdog
> ipmi0: KCS: Command mismatch
> ipmi0: KCS error: 01
>
> It seems the driver lost the ability to contact the IPMI watchdog timer,
> and that was the reason for the reboot.
>
> What can be done to avoid such resets in the future?

Hmm, it looks like the IPMI BMC wedged in some fashion. The driver tries to
reset the KCS interface when it encounters an error, and from your log it
didn't unwedge even after several resets. In that case there isn't a lot we
can do, since we can't talk to the watchdog to turn it off.
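
As a rough illustration of the behaviour described above (retry the watchdog
command, resetting the interface after each error, and give up if the BMC
never recovers), here is a minimal Python sketch. It is a simulation only,
not the FreeBSD ipmi(4) driver code: the class WedgedBMC, the function
pat_watchdog, the command string, and the retry count of 3 are all
hypothetical names and values chosen for the example.

```python
class WedgedBMC:
    """Hypothetical stand-in for a BMC; not the real ipmi(4) driver."""

    def __init__(self, recovers_after=None):
        self.resets = 0
        # Number of interface resets needed before the BMC answers again.
        # None means it stays wedged forever (assumed worst case).
        self.recovers_after = recovers_after

    def reset_interface(self):
        """Simulate resetting the KCS interface after an error."""
        self.resets += 1

    def send(self, command):
        """Simulate sending one command; raise while the BMC is wedged."""
        healthy = (self.recovers_after is not None
                   and self.resets >= self.recovers_after)
        if not healthy:
            raise IOError("KCS error: 01")
        return "ok"


def pat_watchdog(bmc, max_retries=3):
    """Try to reset the watchdog timer, resetting the interface on error.

    Returns True on success, False if every attempt failed.
    """
    for _ in range(max_retries):
        try:
            bmc.send("reset_watchdog_timer")
            return True
        except IOError:
            bmc.reset_interface()
    return False
```

With WedgedBMC() the function returns False after three failed attempts,
which corresponds to the "Failed to reset watchdog" case in the log: the
hardware timer keeps counting and eventually reboots the box. With
WedgedBMC(recovers_after=2) the third attempt succeeds.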
--
John Baldwin