I have 4 HP DL145G2 boxes (dual Opteron). I recently upgraded them from
6.2-STABLE to 7.0-RELEASE using cvsup on each, compiling world + kernel.

The upgrade was fairly painless on two machines, and has broken the other
two machines.

On boot, the broken machines hang somewhere after kbdc and psm, and before
serial driver initialization.

Specifically, I see:
psm0: unable to allocate IRQ
psmcpnp0: PS/2 mouse port irq 12 on acpi0
pm0: ps2 mouse irq 12 on atkbdc0
ioapic0: routing intpin 12 (ISA IRQ 12) to vector 57
psm0: giant-locked
psm0: thread
psm0: model intellimouse explorer....
psm0: config; 0000000000 flags: 000000008, packet size:4
psm0: syncmask:08, syncbits:00

After that, it hangs.

I'm using a 7.0 world, mergemastered, and a kernel compiled on the broken
system, and also a kernel compiled on one of the working systems and copied
over, with the same config.

(attached)

dmesg from one of the working systems also attached -- same config modulo
RAM and CPU speed.

If I boot using the 6.2 kernel and 7.0 userland, I can ssh in and some
things work, but others don't.

I'm using an IPKVM vs. serial console, so this is slightly more difficult to
debug, plus I have to ask someone to reboot the machine, vs. using a power
cycler.

Any help in debugging this would be most appreciated.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: dmesg.yesterday
Type: application/octet-stream
Size: 7257 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20080421/11d93bbd/dmesg.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: METACOLO
Type: application/octet-stream
Size: 676 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20080421/11d93bbd/METACOLO.obj

On Monday 21 April 2008 03:23:04 pm Ryan Lackey wrote:
> I have 4 HP DL145G2 boxes (dual Opteron). I recently upgraded them from
> 6.2-STABLE to 7.0-RELEASE using cvsup on each, compiling world + kernel.
>
> The upgrade was fairly painless on two machines, and has broken the other
> two machines.
>
> On boot, the broken machines hang somewhere after kbdc and psm, and before
> serial driver initialization.
>
> Specifically, I see:
> psm0: unable to allocate IRQ
> psmcpnp0: PS/2 mouse port irq 12 on acpi0
> pm0: ps2 mouse irq 12 on atkbdc0
> ioapic0: routing intpin 12 (ISA IRQ 12) to vector 57
> psm0: giant-locked
> psm0: thread
> psm0: model intellimouse explorer....
> psm0: config; 0000000000 flags: 000000008, packet size:4
> psm0: syncmask:08, syncbits:00
>
> After that, it hangs.

I would add more printfs to figure out exactly where it dies. I would start
by seeing if it makes it out of the psm driver. If so, then I would start
adding printfs to the new-bus code in sys/kern/subr_bus.c to see if drivers
are probing when it hangs.

--
John Baldwin

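The printf approach suggested above can be sketched roughly as follows. This is a userland illustration only, not code from the psm driver or subr_bus.c: the names format_checkpoint, CHECKPOINT and example_attach are hypothetical and chosen just for this example. In the kernel you would call the kernel's own printf(9) at the points of interest instead; the idea is the same, namely that the last checkpoint to appear on the console brackets where the hang occurs.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not part of FreeBSD): format one checkpoint line
 * into buf. Returns the value of snprintf(), i.e. the number of characters
 * the full message needs, excluding the terminating NUL.
 */
int
format_checkpoint(char *buf, size_t len, const char *func, int line,
    const char *msg)
{
	return (snprintf(buf, len, "CHECKPOINT %s:%d: %s", func, line, msg));
}

/*
 * Print a checkpoint and flush immediately, so the message is visible
 * even if the program hangs on the very next statement.
 */
#define CHECKPOINT(msg) do {						\
	char cp_buf[128];						\
	format_checkpoint(cp_buf, sizeof(cp_buf), __func__, __LINE__,	\
	    (msg));							\
	fprintf(stderr, "%s\n", cp_buf);				\
	fflush(stderr);							\
} while (0)

/* Stand-in for a driver attach routine; purely illustrative. */
int
example_attach(void)
{
	CHECKPOINT("entered attach");
	/* ... the code suspected of hanging would run here ... */
	CHECKPOINT("leaving attach");
	return (0);
}
```

If "entered attach" shows up but "leaving attach" never does, the hang is between the two checkpoints; moving them closer together on each rebuild narrows it down. If both appear, the hang is after this routine, which is the point at which it makes sense to move on to the bus code.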
On Mon, 2008-04-21 at 12:23 -0700, Ryan Lackey wrote:
> I have 4 HP DL145G2 boxes (dual Opteron). I recently upgraded them from
> 6.2-STABLE to 7.0-RELEASE using cvsup on each, compiling world + kernel.
>
> The upgrade was fairly painless on two machines, and has broken the other
> two machines.
>
> On boot, the broken machines hang somewhere after kbdc and psm, and before
> serial driver initialization.
>
> Specifically, I see:
> psm0: unable to allocate IRQ
> psmcpnp0: PS/2 mouse port irq 12 on acpi0
> pm0: ps2 mouse irq 12 on atkbdc0
> ioapic0: routing intpin 12 (ISA IRQ 12) to vector 57
> psm0: giant-locked
> psm0: thread
> psm0: model intellimouse explorer....
> psm0: config; 0000000000 flags: 000000008, packet size:4
> psm0: syncmask:08, syncbits:00
>
> After that, it hangs.
>
> I'm using a 7.0 world, mergemastered, and a kernel compiled on the broken
> system, and also a kernel compiled on one of the working systems and copied
> over, with the same config.
>
> (attached)
>
> dmesg from one of the working systems also attached -- same config modulo
> RAM and CPU speed.
>
> If I boot using the 6.2 kernel and 7.0 userland, I can ssh in and some
> things work, but others don't.

That's unsurprising really, and is exactly why the upgrade instructions say
to reboot with the new kernel before installing the new world. Doing that
would have caught this problem before it was too late.

> I'm using an IPKVM vs. serial console, so this is slightly more difficult to
> debug, plus I have to ask someone to reboot the machine, vs. using a power
> cycler.
>
> Any help in debugging this would be most appreciated.

OK, recompile the kernel, adding the following options:

    options KDB
    options DDB

When the machine hangs, hit Ctrl-Alt-Escape. This may or may not drop you
into the debugger, depending on exactly how the machine is hanging (if it
doesn't, your only option may well be to try to get serial access to the
machine).

Assuming it does get you into the debugger, send the output of "bt" to the
list. With that, hopefully there will be enough information to diagnose
this issue.

Thanks,

Gavin