Working on the Dell R420 today, got most of it working, even the
broadcom ethernet cards! However, I get the following when I reboot the
system:

Syncing disks, vnodes remaining...4 Sleeping thread (tid 100107, pid 9)
owns a non-sleepable lock
KDB: stack backtrace of thread 100107:
sched_switch() at sched_switch+0x19f
mi_switch() at mi_switch+0x208
sleepq_switch() at sleepq_switch+0xfc
sleepq_wait() at sleepq_wait+0x4d
_sleep() at _sleep+0x3f6
ipmi_submit_driver_request() at ipmi_submit_driver_request+0x97
ipmi_set_watchdog() at ipmi_set_watchdog+0xb1
ipmi_wd_event() at ipmi_wd_event+0x8f
kern_do_pat() at kern_do_pat+0x10f
sched_sync() at sched_sync+0x1ea
fork_exit() at fork_exit+0x135
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff869b172bb0, rbp = 0 ---
panic: sleeping thread
cpuid = 26
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
kdb_backtrace() at kdb_backtrace+0x37
panic() at panic+0x1d8
propagate_priority() at propagate_priority+0x223
turnstile_wait() at turnstile_wait+0x252
_mtx_lock_sleep() at _mtx_lock_sleep+0x124
_mtx_lock_flags() at _mtx_lock_flags+0xae
vn_syncer_add_to_worklist() at vn_syncer_add_to_worklist+0x3d
reassignbuf() at reassignbuf+0x12c
bdirty() at bdirty+0x50
softdep_disk_write_complete() at softdep_disk_write_complete+0x19f
bufdone_finish() at bufdone_finish+0x2d
bufdone() at bufdone+0x6c
g_io_schedule_up() at g_io_schedule_up+0xce
g_up_procbody() at g_up_procbody+0x72
fork_exit() at fork_exit+0x135
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff85d8a93bb0, rbp = 0 ---
Uptime: 1m59s
Dumping 1219 out of 24477 MB:panic: msleep
cpuid = 26
Uptime: 1m59s
panic: msleep
cpuid = 26
Uptime: 1m59s
[... the same three "panic: msleep" lines repeat many more times ...]
panic: msleep
Fatal double fault
rip = 0xffffffff807ac9d5
rsp = 0xffffff85d8a90000
rbp = 0xffffff85d8a90020
cpuid = 26; apic id = 2a
panic: double fault
cpuid = 26
Uptime: 1m59s
panic: msleep
cpuid = 26
Uptime: 1m59s
[... the same three "panic: msleep" lines repeat many more times ...]
Rebooting...
cpu_reset: Restarting BSP
cpu_reset_proxy: Stopped CPU 26

On Thursday, July 19, 2012 7:58:14 pm Sean Bruno wrote:
> Working on the Dell R420 today, got most of it working, even the
> broadcom ethernet cards! However, I get the following when I reboot the
> system:
>
> Syncing disks, vnodes remaining...4 Sleeping thread (tid 100107, pid 9)
> owns a non-sleepable lock
> KDB: stack backtrace of thread 100107:
> sched_switch() at sched_switch+0x19f
> mi_switch() at mi_switch+0x208
> sleepq_switch() at sleepq_switch+0xfc
> sleepq_wait() at sleepq_wait+0x4d
> _sleep() at _sleep+0x3f6
> ipmi_submit_driver_request() at ipmi_submit_driver_request+0x97
> ipmi_set_watchdog() at ipmi_set_watchdog+0xb1
> ipmi_wd_event() at ipmi_wd_event+0x8f
> kern_do_pat() at kern_do_pat+0x10f
> sched_sync() at sched_sync+0x1ea
> fork_exit() at fork_exit+0x135
> fork_trampoline() at fork_trampoline+0xe

Hmmm, the watchdog pat should probably happen without holding locks if
possible. This is related to the IPMI watchdog being special and wanting
to schedule a thread to work.

--
John Baldwin
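
To illustrate the point about patting the watchdog without holding locks, here is a minimal userspace sketch using pthreads rather than kernel mutexes. Every name in it (sync_lock, lock_held, pat_watchdog, syncer_pass) is a hypothetical stand-in and not the FreeBSD code; it only shows the general shape of dropping a lock around a call that may sleep and then retaking it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for sync_mtx; lock_held tracks ownership
 * so the sketch can check the locking rule (single-threaded demo). */
static pthread_mutex_t sync_lock = PTHREAD_MUTEX_INITIALIZER;
static bool lock_held = false;
static int pats_delivered = 0;

static void lock_sync(void)
{
    pthread_mutex_lock(&sync_lock);
    lock_held = true;
}

static void unlock_sync(void)
{
    lock_held = false;
    pthread_mutex_unlock(&sync_lock);
}

/* Stand-in for a watchdog pat that may block, as the IPMI driver
 * does in the backtrace. It must not run with the lock held. */
static void pat_watchdog(void)
{
    assert(!lock_held);
    pats_delivered++;
}

/* One pass of a syncer-like loop. Returns the total pats so far. */
static int syncer_pass(int first_printf)
{
    lock_sync();
    /* ... work on lock-protected state would happen here ... */
    if (first_printf == 0) {
        unlock_sync();      /* drop the lock before a call that may sleep */
        pat_watchdog();
        lock_sync();        /* retake it before touching shared state */
    }
    /* anything read before the unlock must be treated as stale here */
    unlock_sync();
    return pats_delivered;
}
```

Calling syncer_pass(0) pats the watchdog once with the lock released; syncer_pass(1) skips the pat and never drops the lock mid-pass.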

On Wednesday, August 01, 2012 6:48:48 pm Sean Bruno wrote:
> On Wed, 2012-08-01 at 05:53 -0700, John Baldwin wrote:
> > Index: vfs_subr.c
> > ==================================================================
> > --- vfs_subr.c (revision 238969)
> > +++ vfs_subr.c (working copy)
> > @@ -1868,8 +1868,11 @@ sched_sync(void)
> >                  continue;
> >              }
> >
> > -            if (first_printf == 0)
> > +            if (first_printf == 0) {
> > +                mtx_unlock(&sync_mtx);
> >                  wdog_kern_pat(WD_LASTVAL);
> > +                mtx_lock(&sync_mtx);
> > +            }
> >
> >          }
> >          if (!LIST_EMPTY(gslp)) {
> >
> >
> > --
> > John Baldwin
>
> This definitely makes the panic go away on reboot.

Attilio, does this change seem ok to you?

--
John Baldwin

On 8/2/12, John Baldwin <jhb@freebsd.org> wrote:
> On Wednesday, August 01, 2012 6:48:48 pm Sean Bruno wrote:
>> On Wed, 2012-08-01 at 05:53 -0700, John Baldwin wrote:
>> > Index: vfs_subr.c
>> > ==================================================================
>> > --- vfs_subr.c (revision 238969)
>> > +++ vfs_subr.c (working copy)
>> > @@ -1868,8 +1868,11 @@ sched_sync(void)
>> >                  continue;
>> >              }
>> >
>> > -            if (first_printf == 0)
>> > +            if (first_printf == 0) {
>> > +                mtx_unlock(&sync_mtx);
>> >                  wdog_kern_pat(WD_LASTVAL);
>> > +                mtx_lock(&sync_mtx);
>> > +            }
>> >
>> >          }
>> >          if (!LIST_EMPTY(gslp)) {
>> >
>> >
>> > --
>> > John Baldwin
>>
>> This definitely makes the panic go away on reboot.
>
> Attilio, does this change seem ok to you?

Thanks for asking me to review.
I think it is safe because we are going to use LIST_EMPTY() on the
global list anyway as next check.

Attilio

--
Peace can only be achieved by understanding - A. Einstein
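
The reasoning above (the list is checked again after the lock is retaken, so nothing stale is relied on) can be sketched in userspace C. This is only an illustration under assumptions: list_lock, pending_items, unlocked_window, and pass_sees_pending are hypothetical names, and a simple counter stands in for the kernel's global worklist.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins: list_lock plays the role of sync_mtx and
 * pending_items the role of the global worklist. */
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static size_t pending_items = 0;

/* Models other activity that can queue work while the lock is dropped. */
static void unlocked_window(size_t items_added)
{
    pthread_mutex_lock(&list_lock);
    pending_items += items_added;
    pthread_mutex_unlock(&list_lock);
}

/* Returns true if the list is non-empty when checked after relocking. */
static bool pass_sees_pending(size_t items_added_while_unlocked)
{
    bool nonempty;

    pthread_mutex_lock(&list_lock);
    pending_items = 0;                  /* pretend this pass drained the list */
    pthread_mutex_unlock(&list_lock);   /* lock dropped, e.g. for the pat */

    unlocked_window(items_added_while_unlocked);

    pthread_mutex_lock(&list_lock);
    nonempty = (pending_items != 0);    /* fresh check under the lock */
    pthread_mutex_unlock(&list_lock);
    return nonempty;
}
```

Because the emptiness test runs after the lock is reacquired, work queued during the unlocked window is still noticed.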

On Wednesday, August 01, 2012 6:48:48 pm Sean Bruno wrote:
> On Wed, 2012-08-01 at 05:53 -0700, John Baldwin wrote:
> > Index: vfs_subr.c
> > ==================================================================
> > --- vfs_subr.c (revision 238969)
> > +++ vfs_subr.c (working copy)
> > @@ -1868,8 +1868,11 @@ sched_sync(void)
> >                  continue;
> >              }
> >
> > -            if (first_printf == 0)
> > +            if (first_printf == 0) {
> > +                mtx_unlock(&sync_mtx);
> >                  wdog_kern_pat(WD_LASTVAL);
> > +                mtx_lock(&sync_mtx);
> > +            }
> >
> >          }
> >          if (!LIST_EMPTY(gslp)) {
> >
> >
> > --
> > John Baldwin
>
> This definitely makes the panic go away on reboot.

Do you have watchdogd enabled at all?

--
John Baldwin

On Thu, 2012-08-02 at 13:34 -0700, John Baldwin wrote:
> On Wednesday, August 01, 2012 6:48:48 pm Sean Bruno wrote:
> > On Wed, 2012-08-01 at 05:53 -0700, John Baldwin wrote:
> > > Index: vfs_subr.c
> > > ==================================================================
> > > --- vfs_subr.c (revision 238969)
> > > +++ vfs_subr.c (working copy)
> > > @@ -1868,8 +1868,11 @@ sched_sync(void)
> > >                  continue;
> > >              }
> > >
> > > -            if (first_printf == 0)
> > > +            if (first_printf == 0) {
> > > +                mtx_unlock(&sync_mtx);
> > >                  wdog_kern_pat(WD_LASTVAL);
> > > +                mtx_lock(&sync_mtx);
> > > +            }
> > >
> > >          }
> > >          if (!LIST_EMPTY(gslp)) {
> > >
> > >
> > > --
> > > John Baldwin
> >
> > This definitely makes the panic go away on reboot.
>
> Do you have watchdogd enabled at all?
>

No, we never had it enabled.

Sean