On 09/16/16 at 09:18P, Slawa Olhovchenkov wrote:
> On Thu, Sep 15, 2016 at 12:06:33PM +0300, Slawa Olhovchenkov wrote:
>
> > On Thu, Sep 15, 2016 at 11:59:38AM +0300, Konstantin Belousov wrote:
> >
> > > On Thu, Sep 15, 2016 at 12:35:04AM +0300, Slawa Olhovchenkov wrote:
> > > > On Sun, Sep 04, 2016 at 06:46:12PM -0700, hiren panchasara wrote:
> > > >
> > > > > On 09/05/16 at 12:57P, Slawa Olhovchenkov wrote:
> > > > > > I am trying to use 11.0 on a dual E5-2620 (no X2APIC).
> > > > > > Under high network load, and maybe some additional conditions, the
> > > > > > system goes into an unresponsive state -- no reaction to network or
> > > > > > console (USB IPMI emulation). INVARIANTS gives too high an overhead.
> > > > > > Is there some way to debug this?
> > > > >
> > > > > Can you panic it from console to get to db> to get backtrace and other
> > > > > info when it goes unresponsive?
> > > >
> > > > The ipmi console doesn't respond (chassis power diag gets no reaction);
> > > > login on the sol console is stuck on *tcp.
> > >
> > > Is the 'login' you reference the ipmi client state, or do you mean
> > > login(1) on the wedged host?
> >
> > on the wedged host
> >
> > > If the BMC stops responding simultaneously with the host, I would suspect
> > > hardware platform issues instead of a software problem. Do you have a
> > > dedicated LAN port for the BMC?
> >
> > Yes.
> > But the BMC emulates a USB keyboard, and this may be a lock inside the USB
> > system.
> > "ipmi console doesn't respond" must be read as "ipmi console is running
> > and attached, but the system doesn't react to keypresses on this console".
> > At the same moment the system responds to `enter` on the ipmi sol console,
> > but after entering `root`, login is stuck in the '*tcp' state (I think
> > this is NIS related).
>
> ~^B doesn't break to the debugger.
> But I can log in to the sol console.

You can probably:
debug.kdb.enter: set to enter the debugger

or force a panic and get vmcore:
debug.kdb.panic: set to panic the kernel

Cheers,
Hiren
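For reference, the two sysctls named above are normally set with sysctl(8) from a root shell. A minimal sketch of that step, assuming a kernel built with KDB/DDB and, for the panic case, a configured dump device; the helper names `kdb_sysctl_command` and `trigger` are made up for illustration, and by default nothing is executed:

```python
import subprocess


def kdb_sysctl_command(action):
    """Build the sysctl(8) argv for debug.kdb.enter or debug.kdb.panic."""
    if action not in ("enter", "panic"):
        raise ValueError("action must be 'enter' or 'panic'")
    # Writing 1 to the sysctl requests the debugger entry or the panic.
    return ["sysctl", f"debug.kdb.{action}=1"]


def trigger(action, dry_run=True):
    """Return the command; run it only when dry_run is False (needs root)."""
    argv = kdb_sysctl_command(action)
    if not dry_run:
        # Assumption: run on the wedged FreeBSD host itself, as root.
        subprocess.run(argv, check=True)
    return argv


# Dry run: only show what would be executed.
print(" ".join(trigger("enter")))
print(" ".join(trigger("panic")))
```

Whether the command can still be started on a host that is already wedged is a separate question; this only shows the form of the invocation.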
On Fri, Sep 16, 2016 at 11:30:53AM -0700, hiren panchasara wrote:
> On 09/16/16 at 09:18P, Slawa Olhovchenkov wrote:
> > ~^B don't break to debuger.
> > But I can login to sol console.
>
> You can probably:
> debug.kdb.enter: set to enter the debugger
>
> or force a panic and get vmcore:
> debug.kdb.panic: set to panic the kernel

I am still waiting for pmcstat to exit.

Oh, for an NMI the command is not `ipmitool chassis power diag` but
`ipmitool power diag`!
But the debugger is not entered:

^C^C^C^C^C^C^C^CNMI ... going to debugger
NMI ... going to debugger
NMI ... going to debugger
NMI ... going to debugger
NMI ... going to debugger
NMI ... going to debugger
NMI ... going to debugger
NMI ... going to debugger
NMI ... going to debugger
NMI ... going to debugger
NMI ... going to debugger
NMI ... going to debugger
load: 9.91  cmd: pmcstat 16878 [runnable] 5930.57r 0.00u 0.00s 0% 2940k
On Fri, Sep 16, 2016 at 11:30:53AM -0700, hiren panchasara wrote:
> On 09/16/16 at 09:18P, Slawa Olhovchenkov wrote:
> > ~^B don't break to debuger.
> > But I can login to sol console.
>
> You can probably:
> debug.kdb.enter: set to enter the debugger
>
> or force a panic and get vmcore:
> debug.kdb.panic: set to panic the kernel

I have reset this host.
PMC samples collected and decoded:

@ CPU_CLK_UNHALTED_CORE [4653445 samples]

51.86%  [2413083]  lock_delay @ /boot/kernel.VSTREAM/kernel
 100.0%  [2413083]  __rw_wlock_hard
  100.0%  [2413083]  tcp_tw_2msl_scan
   99.99%  [2412958]  pfslowtimo
    100.0%  [2412958]  softclock_call_cc
     100.0%  [2412958]  softclock
      100.0%  [2412958]  intr_event_execute_handlers
       100.0%  [2412958]  ithread_loop
        100.0%  [2412958]  fork_exit
   00.01%  [125]  tcp_twstart
    100.0%  [125]  tcp_do_segment
     100.0%  [125]  tcp_input
      100.0%  [125]  ip_input
       100.0%  [125]  swi_net
        100.0%  [125]  intr_event_execute_handlers
         100.0%  [125]  ithread_loop
          100.0%  [125]  fork_exit

09.43%  [438774]  _rw_runlock_cookie @ /boot/kernel.VSTREAM/kernel
 100.0%  [438774]  tcp_tw_2msl_scan
  99.99%  [438735]  pfslowtimo
   100.0%  [438735]  softclock_call_cc
    100.0%  [438735]  softclock
     100.0%  [438735]  intr_event_execute_handlers
      100.0%  [438735]  ithread_loop
       100.0%  [438735]  fork_exit
  00.01%  [39]  tcp_twstart
   100.0%  [39]  tcp_do_segment
    100.0%  [39]  tcp_input
     100.0%  [39]  ip_input
      100.0%  [39]  swi_net
       100.0%  [39]  intr_event_execute_handlers
        100.0%  [39]  ithread_loop
         100.0%  [39]  fork_exit

08.57%  [398970]  __rw_wlock_hard @ /boot/kernel.VSTREAM/kernel
 100.0%  [398970]  tcp_tw_2msl_scan
  99.99%  [398940]  pfslowtimo
   100.0%  [398940]  softclock_call_cc
    100.0%  [398940]  softclock
     100.0%  [398940]  intr_event_execute_handlers
      100.0%  [398940]  ithread_loop
       100.0%  [398940]  fork_exit
  00.01%  [30]  tcp_twstart
   100.0%  [30]  tcp_do_segment
    100.0%  [30]  tcp_input
     100.0%  [30]  ip_input
      100.0%  [30]  swi_net
       100.0%  [30]  intr_event_execute_handlers
        100.0%  [30]  ithread_loop
         100.0%  [30]  fork_exit

05.79%  [269224]  __rw_try_rlock @ /boot/kernel.VSTREAM/kernel
 100.0%  [269224]  tcp_tw_2msl_scan
  99.99%  [269203]  pfslowtimo
   100.0%  [269203]  softclock_call_cc
    100.0%  [269203]  softclock
     100.0%  [269203]  intr_event_execute_handlers
      100.0%  [269203]  ithread_loop
       100.0%  [269203]  fork_exit
  00.01%  [21]  tcp_twstart
   100.0%  [21]  tcp_do_segment
    100.0%  [21]  tcp_input
     100.0%  [21]  ip_input
      100.0%  [21]  swi_net
       100.0%  [21]  intr_event_execute_handlers
        100.0%  [21]  ithread_loop
         100.0%  [21]  fork_exit

05.35%  [249141]  _rw_wlock_cookie @ /boot/kernel.VSTREAM/kernel
 99.76%  [248543]  tcp_tw_2msl_scan
  99.99%  [248528]  pfslowtimo
   100.0%  [248528]  softclock_call_cc
    100.0%  [248528]  softclock
     100.0%  [248528]  intr_event_execute_handlers
      100.0%  [248528]  ithread_loop
       100.0%  [248528]  fork_exit
  00.01%  [15]  tcp_twstart
   100.0%  [15]  tcp_do_segment
    100.0%  [15]  tcp_input
     100.0%  [15]  ip_input
      100.0%  [15]  swi_net
       100.0%  [15]  intr_event_execute_handlers
        100.0%  [15]  ithread_loop
         100.0%  [15]  fork_exit
 00.24%  [598]  pfslowtimo
  100.0%  [598]  softclock_call_cc
   100.0%  [598]  softclock
    100.0%  [598]  intr_event_execute_handlers
     100.0%  [598]  ithread_loop
      100.0%  [598]  fork_exit
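
To put a single number on the profile above, the top-level entries can be summed. A rough sketch that parses only the five top-level lines copied from this post (callers omitted); the regular expressions and the `summarize` helper are illustrative and are not part of pmcstat:

```python
import re

# Top-level entries copied from the callgraph in this message.
PROFILE = """\
@ CPU_CLK_UNHALTED_CORE [4653445 samples]
51.86%  [2413083]  lock_delay @ /boot/kernel.VSTREAM/kernel
09.43%  [438774]  _rw_runlock_cookie @ /boot/kernel.VSTREAM/kernel
08.57%  [398970]  __rw_wlock_hard @ /boot/kernel.VSTREAM/kernel
05.79%  [269224]  __rw_try_rlock @ /boot/kernel.VSTREAM/kernel
05.35%  [249141]  _rw_wlock_cookie @ /boot/kernel.VSTREAM/kernel
"""

HEADER = re.compile(r"^@\s+(\S+)\s+\[(\d+) samples\]")
# Only top-level entries carry the "@ image" suffix.
ENTRY = re.compile(r"^\s*[\d.]+%\s+\[(\d+)\]\s+(\S+)\s+@\s+\S+")


def summarize(text):
    """Return (total samples, {function: samples}) for top-level entries."""
    total = 0
    per_function = {}
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            total = int(header.group(2))
            continue
        entry = ENTRY.match(line)
        if entry:
            per_function[entry.group(2)] = int(entry.group(1))
    return total, per_function


total, per_function = summarize(PROFILE)
lock_samples = sum(per_function.values())
lock_share = 100.0 * lock_samples / total
print(f"{lock_samples} of {total} samples ({lock_share:.2f}%) in rwlock paths")
```

By the callgraph, nearly all of these samples are reached through tcp_tw_2msl_scan (100% for four of the five entries, 99.76% for _rw_wlock_cookie), so roughly four fifths of the sampled CPU time is spent on that lock.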