Marc G. Fournier
2009-Apr-14 20:13 UTC
7.1-STABLE Sun Mar 29 01:06:46 ADT 2009 Locks up ...
Hi ...

Over the past little while, two of my servers have suddenly started to hang ... servers that, up until this started, have been reasonably rock solid ... they are generally within a day of each other for source code, and the hardware on both is pretty much identical (HP Proliant DL360 servers) ...

I have a serial console configured on both so that I can do CR ~ ^b to get to DDB ... except, when it hangs, all I get is:

"KDB: enter: Break sequence on console"

And it hangs there, no prompt.

I set up a simple script (see attached) to run every 5 minutes that gathers various pieces of info that I think are pertinent, but most likely doesn't cover everything ...

Whenever this happens, on either machine, vmstat shows data *like* this (notice the high procs -> w values?):

 procs      memory         page                         disks   faults           cpu
   r   b  w      avm   fre   flt   re pi  po    fr   sr da0 pa0   in    sy    cs us sy id
 165 106  2 12699168 33840  3080   38  2   2  3082 1623   0   0  337 36961  4731 18  7 75
  64  75  4 12761744 23084 46809  623 65  43 19307  116 334   0 1189 83674 11708 70 20 10
   1  68 25 12773980 23068 11036 3003  9  36  4055  116 282   0 1336 78346 14869 56 16 28
   0  71 25 12774236 23084   186  769  1   5    18   80 249   0  609  9298  5894  5  5 91
   5  90 31 12747296 23352   626 2546  5 104  1147  368 281   0 1536 40945 19980  6  5 90

Where procs -> w just seems to keep rising ... note that the output for vmstat *5 minutes before* shows:

 procs      memory          page                            disks   faults         cpu
   r   b  w      avm    fre   flt   re pi  po    fr      sr da0 pa0  in    sy   cs us sy id
  35 121  0 12414692  90552  3080   32  2   1  3090    1403   0   0 337 37022 4730 18  7 75
  31  93  0 12314408  62024 36550  414 46   6 34285      27 563   0 916 94851 8813 67 33  0
  43 179  0 12270932  23080 24035  101 41  12 13887      36 375   0 766 61969 6945 69 23  7
  92  44  0 12265524 119804  2122 2028  1  32 13051 1096092 205   0 558 19460 4561 19 50 32
  38  34  0 12330068  89140 30758  103 39 119 37037 2837365 165   0 773 92041 7111 47 53  0

I have one QEMU VPS running on this box, with kqemu running the latest kernel module ... but the other machine experiencing the same issue is only running FreeBSD jails ...

Both servers are running SCHED_4BSD, if that matters any ... ?

I'm at a loss as to what to look at / for next ... pointers would be greatly appreciated ...

I have the various output files that the script generates available if anyone thinks they would be useful ...

thank you ...

Marc G. Fournier Hub.Org Hosting Solutions S.A. (http://www.hub.org)
Email . scrappy@hub.org MSN . scrappy@hub.org
Yahoo . yscrappy Skype: hub.org ICQ . 7615664
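The rising procs -> w column in the post can be checked mechanically. What follows is only a minimal sketch in Python, not the attached collection script (which is not part of the archive); the function names and the 19-column layout are assumptions taken from the two vmstat samples quoted in the post. As far as the FreeBSD vmstat(8) man page describes it, w counts runnable or short-sleeping threads that have been swapped out.

```python
# Column names for the 19-field vmstat rows quoted in the post. The two "sy"
# columns (faults and cpu) are renamed so they can be told apart.
COLUMNS = ["r", "b", "w", "avm", "fre", "flt", "re", "pi", "po", "fr", "sr",
           "da0", "pa0", "in", "sy_faults", "cs", "us", "sy_cpu", "id"]

# Rows copied from the post: the sample taken around the hang ...
DURING_HANG = """\
165 106 2 12699168 33840 3080 38 2 2 3082 1623 0 0 337 36961 4731 18 7 75
64 75 4 12761744 23084 46809 623 65 43 19307 116 334 0 1189 83674 11708 70 20 10
1 68 25 12773980 23068 11036 3003 9 36 4055 116 282 0 1336 78346 14869 56 16 28
0 71 25 12774236 23084 186 769 1 5 18 80 249 0 609 9298 5894 5 5 91
5 90 31 12747296 23352 626 2546 5 104 1147 368 281 0 1536 40945 19980 6 5 90
"""

# ... and the sample taken five minutes earlier.
FIVE_MINUTES_BEFORE = """\
35 121 0 12414692 90552 3080 32 2 1 3090 1403 0 0 337 37022 4730 18 7 75
31 93 0 12314408 62024 36550 414 46 6 34285 27 563 0 916 94851 8813 67 33 0
43 179 0 12270932 23080 24035 101 41 12 13887 36 375 0 766 61969 6945 69 23 7
92 44 0 12265524 119804 2122 2028 1 32 13051 1096092 205 0 558 19460 4561 19 50 32
38 34 0 12330068 89140 30758 103 39 119 37037 2837365 165 0 773 92041 7111 47 53 0
"""


def parse_vmstat_rows(text):
    """Return one dict per numeric vmstat row; header lines are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != len(COLUMNS) or not all(f.isdigit() for f in fields):
            continue  # not a data row in the assumed 19-column layout
        rows.append(dict(zip(COLUMNS, map(int, fields))))
    return rows


def w_values(rows):
    """Extract the procs -> w column in sample order."""
    return [row["w"] for row in rows]


def w_is_climbing(values):
    """True if w never drops between samples and ends higher than it started."""
    never_drops = all(a <= b for a, b in zip(values, values[1:]))
    return never_drops and values[-1] > values[0]


if __name__ == "__main__":
    for label, sample in (("during hang", DURING_HANG),
                          ("5 minutes before", FIVE_MINUTES_BEFORE)):
        values = w_values(parse_vmstat_rows(sample))
        print(label, values, "climbing" if w_is_climbing(values) else "flat")
```

A check like this could be run against each five-minute capture so that a climbing w is noticed before the box stops responding.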
Hi Marc and List,
I had similar issues with FreeBSD 7.2-PRERELEASE. The server (ZFS, NFS) seems to hang at intervals of about 8 hours. The kernel is still there, but no connections can be made to NFS/SSH, and login on the local console doesn't seem to work due to incredible slowness. Breaking to the debugger takes a moment but works. (Compiling the kernel with WITNESS didn't help.)

The server had been solid before with a 7-STABLE kernel from around 19 October 2008.

I have now added these lines to /boot/loader.conf:
hw.pci.enable_msi=0
hw.pci.enable_msix=0
This disables Message Signaled Interrupts (MSI), which are used by the 3ware twa driver and the igb network driver on our server.

With this change the server ran for 3 days with no hangs. I then enabled MSI again and had a hang within 24 hours. I disabled it again, and the server has now been online without an issue for 6 days.

I'm not 100% sure yet whether this really is the sole source of the problems (e.g. workload might be another factor), but I guess it's worth a try to check whether it helps you too.

If this is a known problem, if there are any other hints for solving it, or if the server configuration just seems wrong, I would appreciate the feedback.
regards,
Martin
pciconf (with msi):
hostb0@pci0:0:0:0: class=0x060000 card=0xa28015d9 chip=0x40038086
rev=0x20 hdr=0x00
cap 01[50] = powerspec 3 supports D0 D3 current D0
cap 05[58] = MSI supports 2 messages
cap 10[6c] = PCI-Express 2 root port
pcib1@pci0:0:1:0: class=0x060400 card=0xa28015d9 chip=0x40218086
rev=0x20 hdr=0x01
cap 01[50] = powerspec 3 supports D0 D3 current D0
cap 05[58] = MSI supports 2 messages
cap 10[6c] = PCI-Express 2 root port
cap 0d[b0] = PCI Bridge card=0xa28015d9
pcib2@pci0:0:3:0: class=0x060400 card=0xa28015d9 chip=0x40238086
rev=0x20 hdr=0x01
cap 01[50] = powerspec 3 supports D0 D3 current D0
cap 05[58] = MSI supports 2 messages
cap 10[6c] = PCI-Express 2 root port
cap 0d[b0] = PCI Bridge card=0xa28015d9
pcib3@pci0:0:5:0: class=0x060400 card=0xa28015d9 chip=0x40258086
rev=0x20 hdr=0x01
cap 01[50] = powerspec 3 supports D0 D3 current D0
cap 05[58] = MSI supports 2 messages
cap 10[6c] = PCI-Express 2 root port
cap 0d[b0] = PCI Bridge card=0xa28015d9
pcib4@pci0:0:7:0: class=0x060400 card=0xa28015d9 chip=0x40278086
rev=0x20 hdr=0x01
cap 01[50] = powerspec 3 supports D0 D3 current D0
cap 05[58] = MSI supports 2 messages
cap 10[6c] = PCI-Express 2 root port
cap 0d[b0] = PCI Bridge card=0xa28015d9
pcib8@pci0:0:9:0: class=0x060400 card=0xa28015d9 chip=0x40298086
rev=0x20 hdr=0x01
cap 01[50] = powerspec 3 supports D0 D3 current D0
cap 05[58] = MSI supports 2 messages
cap 10[6c] = PCI-Express 2 root port
cap 0d[b0] = PCI Bridge card=0xa28015d9
none0@pci0:0:15:0: class=0x088000 card=0xa28015d9 chip=0x402f8086
rev=0x20 hdr=0x00
cap 01[50] = powerspec 3 supports D0 D3 current D0
cap 11[58] = MSI-X supports 4 messages in map 0x10
cap 10[6c] = PCI-Express 2 type 0
hostb1@pci0:0:16:0: class=0x060000 card=0xa28015d9 chip=0x40308086
rev=0x20 hdr=0x00
hostb2@pci0:0:16:1: class=0x060000 card=0xa28015d9 chip=0x40308086
rev=0x20 hdr=0x00
hostb3@pci0:0:16:2: class=0x060000 card=0xa28015d9 chip=0x40308086
rev=0x20 hdr=0x00
hostb4@pci0:0:16:3: class=0x060000 card=0xa28015d9 chip=0x40308086
rev=0x20 hdr=0x00
hostb5@pci0:0:16:4: class=0x060000 card=0xa28015d9 chip=0x40308086
rev=0x20 hdr=0x00
hostb6@pci0:0:17:0: class=0x060000 card=0xa28015d9 chip=0x40318086
rev=0x20 hdr=0x00
hostb7@pci0:0:21:0: class=0x060000 card=0xa28015d9 chip=0x40358086
rev=0x20 hdr=0x00
hostb8@pci0:0:21:1: class=0x060000 card=0xa28015d9 chip=0x40358086
rev=0x20 hdr=0x00
hostb9@pci0:0:22:0: class=0x060000 card=0xa28015d9 chip=0x40368086
rev=0x20 hdr=0x00
hostb10@pci0:0:22:1: class=0x060000 card=0xa28015d9 chip=0x40368086
rev=0x20 hdr=0x00
pcib9@pci0:0:28:0: class=0x060400 card=0xa28015d9 chip=0x26908086
rev=0x09 hdr=0x01
cap 10[40] = PCI-Express 1 root port
cap 05[80] = MSI supports 1 message
cap 0d[90] = PCI Bridge card=0xa28015d9
cap 01[a0] = powerspec 2 supports D0 D3 current D0
uhci0@pci0:0:29:0: class=0x0c0300 card=0xa28015d9 chip=0x26888086
rev=0x09 hdr=0x00
uhci1@pci0:0:29:1: class=0x0c0300 card=0xa28015d9 chip=0x26898086
rev=0x09 hdr=0x00
uhci2@pci0:0:29:2: class=0x0c0300 card=0xa28015d9 chip=0x268a8086
rev=0x09 hdr=0x00
ehci0@pci0:0:29:7: class=0x0c0320 card=0xa28015d9 chip=0x268c8086
rev=0x09 hdr=0x00
cap 01[50] = powerspec 2 supports D0 D3 current D0
cap 0a[58] = EHCI Debug Port at offset 0xa0 in map 0x14
pcib10@pci0:0:30:0: class=0x060401 card=0xa28015d9 chip=0x244e8086
rev=0xd9 hdr=0x01
cap 0d[50] = PCI Bridge card=0xa28015d9
isab0@pci0:0:31:0: class=0x060100 card=0xa28015d9 chip=0x26708086
rev=0x09 hdr=0x00
atapci0@pci0:0:31:1: class=0x01018a card=0xa28015d9 chip=0x269e8086
rev=0x09 hdr=0x00
atapci1@pci0:0:31:2: class=0x010601 card=0xa28015d9 chip=0x26818086
rev=0x09 hdr=0x00
cap 01[70] = powerspec 2 supports D0 D3 current D0
cap 12[a8] = unknown
none1@pci0:0:31:3: class=0x0c0500 card=0xa28015d9 chip=0x269b8086
rev=0x09 hdr=0x00
twa0@pci0:1:0:0: class=0x010400 card=0x100413c1 chip=0x100413c1
rev=0x01 hdr=0x00
cap 01[40] = powerspec 2 supports D0 D1 D2 D3 current D0
cap 05[50] = MSI supports 32 messages, 64 bit
cap 10[70] = PCI-Express 1 legacy endpoint
pcib5@pci0:4:0:0: class=0x060400 card=0xa28015d9 chip=0x35008086
rev=0x01 hdr=0x01
cap 10[44] = PCI-Express 1 upstream port
cap 01[70] = powerspec 2 supports D0 D3 current D0
cap 0d[80] = PCI Bridge card=0xa28015d9
pcib7@pci0:4:0:3: class=0x060400 card=0xa28015d9 chip=0x350c8086
rev=0x01 hdr=0x01
cap 10[44] = PCI-Express 1 PCI bridge
cap 01[6c] = powerspec 2 supports D0 D3 current D0
cap 0d[80] = PCI Bridge card=0xa28015d9
cap 07[d8] = PCI-X bridge supports
pcib6@pci0:5:0:0: class=0x060400 card=0xa28015d9 chip=0x35108086
rev=0x01 hdr=0x01
cap 10[44] = PCI-Express 1 downstream port
cap 05[60] = MSI supports 1 message, 64 bit
cap 01[70] = powerspec 2 supports D0 D3 current D0
cap 0d[80] = PCI Bridge card=0xa28015d9
twa1@pci0:6:0:0: class=0x010400 card=0x100413c1 chip=0x100413c1
rev=0x01 hdr=0x00
cap 01[40] = powerspec 2 supports D0 D1 D2 D3 current D0
cap 05[50] = MSI supports 32 messages, 64 bit
cap 10[70] = PCI-Express 1 legacy endpoint
igb0@pci0:8:0:0: class=0x020000 card=0x10a715d9 chip=0x10a78086
rev=0x02 hdr=0x00
cap 01[40] = powerspec 2 supports D0 D3 current D0
cap 05[50] = MSI supports 1 message, 64 bit
cap 11[60] = MSI-X supports 10 messages in map 0x1c enabled
cap 10[a0] = PCI-Express 2 endpoint
igb1@pci0:8:0:1: class=0x020000 card=0x10a715d9 chip=0x10a78086
rev=0x02 hdr=0x00
cap 01[40] = powerspec 2 supports D0 D3 current D0
cap 05[50] = MSI supports 1 message, 64 bit
cap 11[60] = MSI-X supports 10 messages in map 0x1c enabled
cap 10[a0] = PCI-Express 2 endpoint
vgapci0@pci0:10:1:0: class=0x030000 card=0xa28015d9 chip=0x515e1002
rev=0x02 hdr=0x00
cap 01[50] = powerspec 2 supports D0 D1 D2 D3 current D0
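To see at a glance which devices in a listing like the one above advertise MSI or MSI-X, the capability lines can be filtered with a few lines of Python. This is a rough sketch under assumptions: it expects text in the same shape as the pciconf output pasted above (a `name@pci0:...` device line followed by `cap ...` lines), the function name is made up for illustration, and the sample is a three-device excerpt copied from the listing.

```python
import re

# A device line looks like "twa0@pci0:1:0:0: class=..."; capture the name.
DEVICE_RE = re.compile(r"^(\w+)@pci\d+:")
# A capability line looks like "cap 05[50] = MSI supports ..."; only MSI and
# MSI-X capabilities are captured, everything else is ignored.
CAP_RE = re.compile(r"^cap\s+[0-9a-f]{2}\[[0-9a-f]+\]\s*=\s*(MSI-X|MSI)\b(.*)$")

# Excerpt copied from the pciconf listing in this message.
SAMPLE = """\
twa0@pci0:1:0:0: class=0x010400 card=0x100413c1 chip=0x100413c1
rev=0x01 hdr=0x00
cap 01[40] = powerspec 2 supports D0 D1 D2 D3 current D0
cap 05[50] = MSI supports 32 messages, 64 bit
cap 10[70] = PCI-Express 1 legacy endpoint
igb0@pci0:8:0:0: class=0x020000 card=0x10a715d9 chip=0x10a78086
rev=0x02 hdr=0x00
cap 01[40] = powerspec 2 supports D0 D3 current D0
cap 05[50] = MSI supports 1 message, 64 bit
cap 11[60] = MSI-X supports 10 messages in map 0x1c enabled
cap 10[a0] = PCI-Express 2 endpoint
vgapci0@pci0:10:1:0: class=0x030000 card=0xa28015d9 chip=0x515e1002
rev=0x02 hdr=0x00
cap 01[50] = powerspec 2 supports D0 D1 D2 D3 current D0
"""


def msi_capabilities(pciconf_text):
    """Map device name -> list of (capability, enabled) for MSI and MSI-X."""
    devices = {}
    current = None
    for raw in pciconf_text.splitlines():
        line = raw.strip()
        dev = DEVICE_RE.match(line)
        if dev:
            current = dev.group(1)
            continue
        cap = CAP_RE.match(line)
        if cap and current is not None:
            # The listing marks an active capability with the word "enabled".
            enabled = "enabled" in cap.group(2).split()
            devices.setdefault(current, []).append((cap.group(1), enabled))
    return devices


if __name__ == "__main__":
    for name, caps in msi_capabilities(SAMPLE).items():
        print(name, caps)
```

In the excerpt only igb0 shows an MSI-X capability marked as enabled; devices without any MSI capability, such as vgapci0, are left out of the result.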
vmstat -i (with msi):
interrupt                   total       rate
irq1: atkbd0                    2          0
irq14: ata0                   216          0
irq17: atapci1             172855        200
irq23: ehci0                   12          0
irq48: twa0                  1472          1
irq54: twa1                  1895          2
cpu0: timer               1722548       1998
irq256: igb0                  772          0
irq257: igb0                 2673          3
irq258: igb0                  485          0
irq259: igb0                 2121          2
irq260: igb0                 1319          1
irq261: igb0                    2          0
cpu1: timer               1714417       1988
cpu2: timer               1713997       1988
cpu3: timer               1714220       1988
Total                     7049006       8177
vmstat -i (without msi):
interrupt                   total       rate
irq1: atkbd0                    2          0
irq14: ata0                   216          0
irq17: atapci1             210359        536
irq23: ehci0                   11          0
irq48: twa0                  1331          3
irq54: twa1                  1751          4
irq56: igb0                  3733          9
cpu0: timer                783575       1998
cpu1: timer                775435       1977
cpu2: timer                775251       1977
cpu3: timer                775364       1977
Total                     3327028       8487
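The two `vmstat -i` captures can be compared programmatically to confirm that the loader.conf change took effect. The following is a hedged sketch in Python: the helper names are invented for illustration, and the sample strings are abbreviated copies of the two tables above (only the igb0 and twa lines plus the Total line are kept, so the totals are not re-summed here).

```python
# Abbreviated copies of the two captures above (igb0 and twa lines only).
WITH_MSI = """\
interrupt total rate
irq48: twa0 1472 1
irq54: twa1 1895 2
irq256: igb0 772 0
irq257: igb0 2673 3
irq258: igb0 485 0
irq259: igb0 2121 2
irq260: igb0 1319 1
irq261: igb0 2 0
Total 7049006 8177
"""

WITHOUT_MSI = """\
interrupt total rate
irq48: twa0 1331 3
irq54: twa1 1751 4
irq56: igb0 3733 9
Total 3327028 8487
"""


def parse_vmstat_i(text):
    """Return ({source: (total, rate)}, reported_total) from `vmstat -i` text."""
    sources = {}
    reported_total = None
    for line in text.splitlines():
        # The last two whitespace-separated fields are total and rate; the
        # source name itself may contain a space ("irq256: igb0").
        parts = line.rsplit(None, 2)
        if len(parts) != 3 or not (parts[1].isdigit() and parts[2].isdigit()):
            continue  # header or blank line
        name, total, rate = parts[0].strip(), int(parts[1]), int(parts[2])
        if name == "Total":
            reported_total = total
        else:
            sources[name] = (total, rate)
    return sources, reported_total


def vectors_for(sources, device):
    """List the interrupt sources (e.g. 'irq256') attached to one device."""
    return [name.split(":")[0] for name in sources
            if name.endswith(": " + device)]


if __name__ == "__main__":
    for label, text in (("with msi", WITH_MSI), ("without msi", WITHOUT_MSI)):
        sources, _ = parse_vmstat_i(text)
        print(label, "igb0 ->", vectors_for(sources, "igb0"))
```

With MSI enabled, igb0 is spread over the irq256 to irq261 vectors; with MSI disabled it falls back to the single irq56 line, while twa0 and twa1 report irq48 and irq54 in both captures.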
dmesg (without msi):
Copyright (c) 1992-2009 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 7.2-PRERELEASE #6: Mon Apr 13 13:30:07 CEST 2009
adm...@space.neurobiopsychologie.Uni-Osnabrueck.DE:/usr/obj/usr/src/sys/SPACE
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(R) CPU E5410 @ 2.33GHz (2327.51-MHz K8-class CPU)
Origin = "GenuineIntel" Id = 0x10676 Stepping = 6
Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
Features2=0xce3bd<SSE3,RSVD2,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,DCA,<b19>>
AMD Features=0x20100800<SYSCALL,NX,LM>
AMD Features2=0x1<LAHF>
Cores per package: 4
usable memory = 4280475648 (4082 MB)
avail memory = 4107509760 (3917 MB)
ACPI APIC Table: <PTLTD APIC >
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
cpu0 (BSP): APIC ID: 0
cpu1 (AP): APIC ID: 1
cpu2 (AP): APIC ID: 2
cpu3 (AP): APIC ID: 3
ioapic0 <Version 2.0> irqs 0-23 on motherboard
ioapic1 <Version 2.0> irqs 24-47 on motherboard
ioapic2 <Version 2.0> irqs 48-71 on motherboard
kbd1 at kbdmux0
acpi0: <PTLTD XSDT> on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0
acpi_hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff
on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 900
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> irq 48 at device 1.0 on pci0
pci1: <ACPI PCI bus> on pcib1
3ware device driver for 9000 series storage controllers, version:
3.70.05.001
twa0: <3ware 9000 series Storage Controller> port 0x2000-0x20ff mem
0xd8000000-0xd9ffffff,0xdc100000-0xdc100fff irq 48 at device 0.0 on
pci1
twa0: [ITHREAD]
twa0: INFO: (0x04: 0x0001): Controller reset occurred: resets=3
twa0: INFO: (0x15: 0x1300): Controller details:: Model 9650SE-8LPML, 8
ports, Firmware FE9X 4.06.00.004, BIOS BE9X 4.05.00.015
pcib2: <ACPI PCI-PCI bridge> irq 50 at device 3.0 on pci0
pci2: <ACPI PCI bus> on pcib2
pcib3: <ACPI PCI-PCI bridge> irq 52 at device 5.0 on pci0
pci3: <ACPI PCI bus> on pcib3
pcib4: <ACPI PCI-PCI bridge> irq 54 at device 7.0 on pci0
pci4: <ACPI PCI bus> on pcib4
pcib5: <ACPI PCI-PCI bridge> irq 54 at device 0.0 on pci4
pci5: <ACPI PCI bus> on pcib5
pcib6: <ACPI PCI-PCI bridge> irq 54 at device 0.0 on pci5
pci6: <ACPI PCI bus> on pcib6
twa1: <3ware 9000 series Storage Controller> port 0x3000-0x30ff mem
0xda000000-0xdbffffff,0xdc400000-0xdc400fff irq 54 at device 0.0 on
pci6
twa1: [ITHREAD]
twa1: INFO: (0x04: 0x0001): Controller reset occurred: resets=3
twa1: INFO: (0x15: 0x1300): Controller details:: Model 9650SE-8LPML, 8
ports, Firmware FE9X 4.06.00.004, BIOS BE9X 4.05.00.015
pcib7: <ACPI PCI-PCI bridge> at device 0.3 on pci4
pci7: <ACPI PCI bus> on pcib7
pcib8: <ACPI PCI-PCI bridge> irq 56 at device 9.0 on pci0
pci8: <ACPI PCI bus> on pcib8
igb0: <Intel(R) PRO/1000 Network Connection version - 1.4.1> port
0x4000-0x401f mem 0xdc020000-0xdc03ffff,0xdc000000-0xdc01ffff,
0xdc080000-0xdc083fff irq 56 at device 0.0 on pci8
igb0: [FILTER]
igb0: Ethernet address: 00:30:48:c2:35:76
igb1: <Intel(R) PRO/1000 Network Connection version - 1.4.1> port
0x4020-0x403f mem 0xdc060000-0xdc07ffff,0xdc040000-0xdc05ffff,
0xdc084000-0xdc087fff irq 70 at device 0.1 on pci8
igb1: [FILTER]
igb1: Ethernet address: 00:30:48:c2:35:77
pci0: <base peripheral> at device 15.0 (no driver attached)
pcib9: <ACPI PCI-PCI bridge> irq 16 at device 28.0 on pci0
pci9: <ACPI PCI bus> on pcib9
uhci0: <Intel 631XESB/632XESB/3100 USB controller USB-1> port
0x1800-0x181f irq 20 at device 29.0 on pci0
uhci0: [GIANT-LOCKED]
uhci0: [ITHREAD]
usb0: <Intel 631XESB/632XESB/3100 USB controller USB-1> on uhci0
usb0: USB revision 1.0
uhub0: <Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb0
uhub0: 2 ports with 2 removable, self powered
uhci1: <Intel 631XESB/632XESB/3100 USB controller USB-2> port
0x1820-0x183f irq 21 at device 29.1 on pci0
uhci1: [GIANT-LOCKED]
uhci1: [ITHREAD]
usb1: <Intel 631XESB/632XESB/3100 USB controller USB-2> on uhci1
usb1: USB revision 1.0
uhub1: <Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb1
uhub1: 2 ports with 2 removable, self powered
uhci2: <Intel 631XESB/632XESB/3100 USB controller USB-3> port
0x1840-0x185f irq 22 at device 29.2 on pci0
uhci2: [GIANT-LOCKED]
uhci2: [ITHREAD]
usb2: <Intel 631XESB/632XESB/3100 USB controller USB-3> on uhci2
usb2: USB revision 1.0
uhub2: <Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb2
uhub2: 2 ports with 2 removable, self powered
ehci0: <Intel 63XXESB USB 2.0 controller> mem 0xdc704000-0xdc7043ff
irq 23 at device 29.7 on pci0
ehci0: [GIANT-LOCKED]
ehci0: [ITHREAD]
usb3: EHCI version 1.0
usb3: companion controllers, 2 ports each: usb0 usb1 usb2
usb3: <Intel 63XXESB USB 2.0 controller> on ehci0
usb3: USB revision 2.0
uhub3: <Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1> on usb3
uhub3: 6 ports with 6 removable, self powered
ums0: <Peppercon AG Multidevice, class 0/0, rev 2.00/0.01, addr 2> on
uhub3
ums0: 3 buttons and Z dir.
ukbd0: <Peppercon AG Multidevice, class 0/0, rev 2.00/0.01, addr 2> on
uhub3
kbd2 at ukbd0
pcib10: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci10: <ACPI PCI bus> on pcib10
vgapci0: <VGA-compatible display> port 0x5000-0x50ff mem
0xd0000000-0xd7ffffff,0xdc200000-0xdc20ffff irq 18 at device 1.0 on
pci10
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel 63XXESB2 UDMA100 controller> port
0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x1860-0x186f at device 31.1 on
pci0
ata0: <ATA channel 0> on atapci0
ata0: [ITHREAD]
atapci1: <Intel AHCI controller> port 0x18b0-0x18b7,0x18a8-0x18ab,
0x18a0-0x18a7,0x1874-0x1877,0x1880-0x189f mem 0xdc704400-0xdc7047ff
irq 17 at device 31.2 on pci0
atapci1: [ITHREAD]
atapci1: AHCI Version 01.10 controller with 6 ports detected
ata2: <ATA channel 0> on atapci1
ata2: [ITHREAD]
ata3: <ATA channel 1> on atapci1
ata3: [ITHREAD]
ata4: <ATA channel 2> on atapci1
ata4: [ITHREAD]
ata5: <ATA channel 3> on atapci1
ata5: [ITHREAD]
ata6: <ATA channel 4> on atapci1
ata6: [ITHREAD]
ata7: <ATA channel 5> on atapci1
ata7: [ITHREAD]
pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
acpi_button0: <Power Button> on acpi0
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: [ITHREAD]
psm0: model IntelliMouse, device ID 3
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10
on acpi0
sio0: type 16550A
sio0: [FILTER]
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
sio1: [FILTER]
fdc0: <floppy drive controller> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on
acpi0
fdc0: does not respond
device_attach: fdc0 attach returned 6
cpu0: <ACPI CPU> on acpi0
ACPI Error (psargs-0459): [\\_SB_.BCMD] Namespace lookup failure, AE_NOT_FOUND
ACPI Error (psparse-0626): Method parse/execution failed [\\_PR_.CPU0._OSC] (Node 0xffffff0001608c20), AE_NOT_FOUND
ACPI Error (psparse-0626): Method parse/execution failed [\\_PR_.CPU0._PDC] (Node 0xffffff0001608c40), AE_NOT_FOUND
ACPI Error (psargs-0459): [\\_SB_.BCMD] Namespace lookup failure, AE_NOT_FOUND
ACPI Error (psparse-0626): Method parse/execution failed [\\_PR_.CPU0._OSC] (Node 0xffffff0001608c20), AE_NOT_FOUND
coretemp0: <CPU On-Die Thermal Sensors> on cpu0
est0: <Enhanced SpeedStep Frequency Control> on cpu0
p4tcc0: <CPU Frequency Thermal Control> on cpu0
cpu1: <ACPI CPU> on acpi0
coretemp1: <CPU On-Die Thermal Sensors> on cpu1
est1: <Enhanced SpeedStep Frequency Control> on cpu1
p4tcc1: <CPU Frequency Thermal Control> on cpu1
cpu2: <ACPI CPU> on acpi0
coretemp2: <CPU On-Die Thermal Sensors> on cpu2
est2: <Enhanced SpeedStep Frequency Control> on cpu2
p4tcc2: <CPU Frequency Thermal Control> on cpu2
cpu3: <ACPI CPU> on acpi0
coretemp3: <CPU On-Die Thermal Sensors> on cpu3
est3: <Enhanced SpeedStep Frequency Control> on cpu3
p4tcc3: <CPU Frequency Thermal Control> on cpu3
fdc0: <floppy drive controller> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on
acpi0
fdc0: does not respond
device_attach: fdc0 attach returned 6
ipmi0: <IPMI System Interface> on isa0
ipmi0: KCS mode found at io 0xca2 alignment 0x1 on isa
orm0: <ISA Option ROMs> at iomem 0xc0000-0xcafff,0xcb000-0xcd7ff,
0xcd800-0xcf7ff,0xcf800-0xcffff on isa0
ppc0: cannot reserve I/O port range
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on
isa0
Timecounters tick every 1.000 msec
acd0: DVDROM <DVD-ROM UJDA780/1.50> at ata0-slave UDMA33
ad4: 238475MB <Seagate ST3250310NS SN06> at ata2-master SATA150
ad6: 238475MB <Seagate ST3250310NS SN06> at ata3-master SATA300
ipmi0: IPMI device rev. 1, firmware rev. 1.2, version 2.0
ipmi0: Number of channels 8
ipmi0: Attached watchdog
da0 at twa0 bus 0 target 0 lun 0
da0: <AMCC 9650SE-8LP DISK 4.06> Fixed Direct Access SCSI-5 device
da0: 100.000MB/s transfers
da0: 715245MB (1464821760 512 byte sectors: 255H 63S/T 91180C)
da1 at twa0 bus 0 target 1 lun 0
da1: <AMCC 9650SE-8LP DISK 4.06> Fixed Direct Access SCSI-5 device
da1: 100.000MB/s transfers
da1: 715245MB (1464821760 512 byte sectors: 255H 63S/T 91180C)
da2 at twa0 bus 0 target 2 lun 0
da2: <AMCC 9650SE-8LP DISK 4.06> Fixed Direct Access SCSI-5 device
da2: 100.000MB/s transfers
da2: 715245MB (1464821760 512 byte sectors: 255H 63S/T 91180C)
da3 at twa0 bus 0 target 3 lun 0
da3: <AMCC 9650SE-8LP DISK 4.06> Fixed Direct Access SCSI-5 device
da3: 100.000MB/s transfers
da3: 715245MB (1464821760 512 byte sectors: 255H 63S/T 91180C)
da4 at twa0 bus 0 target 4 lun 0
da4: <AMCC 9650SE-8LP DISK 4.06> Fixed Direct Access SCSI-5 device
da4: 100.000MB/s transfers
da4: 715245MB (1464821760 512 byte sectors: 255H 63S/T 91180C)
da5 at twa0 bus 0 target 5 lun 0
da5: <AMCC 9650SE-8LP DISK 4.06> Fixed Direct Access SCSI-5 device
da5: 100.000MB/s transfers
da5: 715245MB (1464821760 512 byte sectors: 255H 63S/T 91180C)
da6 at twa0 bus 0 target 6 lun 0
da6: <AMCC 9650SE-8LP DISK 4.06> Fixed Direct Access SCSI-5 device
da6: 100.000MB/s transfers
da6: 715245MB (1464821760 512 byte sectors: 255H 63S/T 91180C)
da7 at twa0 bus 0 target 7 lun 0
da7: <AMCC 9650SE-8LP DISK 4.06> Fixed Direct Access SCSI-5 device
da7: 100.000MB/s transfers
da7: 715245MB (1464821760 512 byte sectors: 255H 63S/T 91180C)
da8 at twa1 bus 0 target 0 lun 0
da8: <AMCC 9650SE-8LP DISK 4.06> Fixed Direct Access SCSI-5 device
da8: 100.000MB/s transfers
da8: 715245MB (1464821760 512 byte sectors: 255H 63S/T 91180C)
da9 at twa1 bus 0 target 1 lun 0
da9: <AMCC 9650SE-8LP DISK 4.06> Fixed Direct Access SCSI-5 device
da9: 100.000MB/s transfers
da9: 715245MB (1464821760 512 byte sectors: 255H 63S/T 91180C)
da10 at twa1 bus 0 target 2 lun 0
da10: <AMCC 9650SE-8LP DISK 4.06> Fixed Direct Access SCSI-5 device
da10: 100.000MB/s transfers
da10: 715245MB (1464821760 512 byte sectors: 255H 63S/T 91180C)
da11 at twa1 bus 0 target 3 lun 0
da11: <AMCC 9650SE-8LP DISK 4.06> Fixed Direct Access SCSI-5 device
da11: 100.000MB/s transfers
da11: 715245MB (1464821760 512 byte sectors: 255H 63S/T 91180C)
da12 at twa1 bus 0 target 4 lun 0
da12: <AMCC 9650SE-8LP DISK 4.06> Fixed Direct Access SCSI-5 device
da12: 100.000MB/s transfers
da12: 715245MB (1464821760 512 byte sectors: 255H 63S/T 91180C)
da13 at twa1 bus 0 target 5 lun 0
da13: <AMCC 9650SE-8LP DISK 4.06> Fixed Direct Access SCSI-5 device
da13: 100.000MB/s transfers
da13: 715245MB (1464821760 512 byte sectors: 255H 63S/T 91180C)
da14 at twa1 bus 0 target 6 lun 0
da14: <AMCC 9650SE-8LP DISK 4.06> Fixed Direct Access SCSI-5 device
da14: 100.000MB/s transfers
da14: 715245MB (1464821760 512 byte sectors: 255H 63S/T 91180C)
da15 at twa1 bus 0 target 7 lun 0
da15: <AMCC 9650SE-8LP DISK 4.06> Fixed Direct Access SCSI-5 device
da15: 100.000MB/s transfers
da15: 715245MB (1464821760 512 byte sectors: 255H 63S/T 91180C)
SMP: AP CPU #1 Launched!
SMP: AP CPU #2 Launched!
SMP: AP CPU #3 Launched!