That's really strange. I never saw those kinds of deadlocks, but I did
notice that if I kept the CPU busy with distributed.net, I could keep the
full system lockups away for at least a week, if not longer.

Not to keep harping on it, but what worked for me was lowering the memory
speed. I'm at 11 days of uptime so far without anything keeping the CPU
busy. Before the change it would lock up within anywhere from an hour to a day.
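
For anyone who wants to try the keep-the-box-busy workaround quoted below
without a second console, here is a rough sketch. It assumes a POSIX sh;
run_with_load is a made-up helper name, and the cksum fallback is an
assumption added for systems that lack sha256(1). It only mirrors the
workaround; it does not explain or fix the deadlock.

```sh
#!/bin/sh
# Sketch: keep one CPU busy while a command runs, then stop the load.
# run_with_load is a hypothetical helper, not part of any FreeBSD tool.

run_with_load() {
    # Pick a hasher: sha256(1) as in the quoted workaround, else cksum(1)
    # (the fallback is an assumption, not something from the thread).
    if command -v sha256 >/dev/null 2>&1; then
        hasher=sha256
    else
        hasher=cksum
    fi

    # Background load: hash an endless stream of random bytes.
    "$hasher" </dev/urandom >/dev/null 2>&1 &
    load_pid=$!

    # Run the real command in the foreground and remember its exit status.
    "$@"
    status=$?

    # Stop the load generator and reap it.
    kill "$load_pid" 2>/dev/null
    wait "$load_pid" 2>/dev/null || true

    return "$status"
}
```

Usage would be something like: run_with_load make -C /usr/ports/net/samba47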
On Tue, Jan 30, 2018 at 4:39 PM Mike Tancsa <mike at sentex.net> wrote:
> On 1/30/2018 2:51 PM, Mike Tancsa wrote:
> >
> > And sadly, I am still able to hang the compile in about the same
> > place.
> > However, if I set
>
>
> OK, here is a sort of workaround. If I keep the box a little more busy,
> I can avoid whatever deadlock is going on. In another console I have
> cat /dev/urandom | sha256
> running while the build runs
>
> ... and I can compile net/samba47 from scratch without the compile
> hanging. This problem also happens on HEAD from today. Should I start
> a new thread on freebsd-current? Or just file a bug report?
> The compile worked 4 out of 4 times.
>
> ---Mike
>
> >
> > hw.lower_amd64_sharedpage=0
> >
> > it seems to hang in a different way. CTRL+t shows
> >
> > load: 0.43 cmd: python2.7 15736 [umtxn] 165.00r 14.46u 6.65s 0% 233600k
> > make[1]: Working in: /usr/ports/net/samba47
> > make: Working in: /usr/ports/net/samba47
> >
> >
> > # procstat -t 15736
> >   PID    TID COMM      TDNAME CPU PRI STATE WCHAN
> > 15736 100855 python2.7 -       -1 152 sleep usem
> > 15736 100956 python2.7 -       -1 124 sleep umtxn
> > 15736 100957 python2.7 -       -1 126 sleep umtxn
> > 15736 100958 python2.7 -       -1 124 sleep umtxn
> > 15736 100959 python2.7 -       -1 127 sleep umtxn
> > 15736 100960 python2.7 -       -1 126 sleep umtxn
> > 15736 100961 python2.7 -       -1 126 sleep umtxn
> > 15736 100962 python2.7 -       -1 126 sleep umtxn
> > 15736 100963 python2.7 -       -1 126 sleep umtxn
> > 15736 100964 python2.7 -       -1 127 sleep umtxn
> > 15736 100965 python2.7 -       -1 126 sleep umtxn
> > 15736 100966 python2.7 -       -1 126 sleep umtxn
> > 15736 100967 python2.7 -       -1 126 sleep umtxn
> >
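A quick way to summarize a listing like the one above is to count threads
per wait channel. The sketch below assumes live, unwrapped procstat -t
output (one thread per line, WCHAN as the last field). wchan_summary is a
hypothetical helper name, and a thread with an empty WCHAN would be counted
under its STATE value instead.

```sh
#!/bin/sh
# Sketch: count threads per wait channel from `procstat -t` output on stdin.
# wchan_summary is a hypothetical helper, not a procstat feature.

wchan_summary() {
    # Skip the header line, tally the last field (WCHAN) of each thread row,
    # then print "count wchan" pairs with the largest count first.
    awk 'NR > 1 && NF > 0 { count[$NF]++ }
         END { for (w in count) print count[w], w }' | sort -rn
}
```

Usage would be: procstat -t 15736 | wchan_summary
For the first listing that comes to 12 threads in umtxn and 1 in usem.
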
> > # procstat -kk 15736
> > PID TID COMM TDNAME KSTACK
> >
> > 15736 100855 python2.7 - mi_switch+0xf5
> > sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> > umtxq_sleep+0x143 do_sem2_wait+0x68a __umtx_op_sem2_wait+0x4b
> > amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15736 100956 python2.7 - mi_switch+0xf5
> > sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> > umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
> > amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15736 100957 python2.7 - mi_switch+0xf5
> > sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> > umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
> > amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15736 100958 python2.7 - mi_switch+0xf5
> > sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> > umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
> > amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15736 100959 python2.7 - mi_switch+0xf5
> > sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> > umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
> > amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15736 100960 python2.7 - mi_switch+0xf5
> > sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> > umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
> > amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15736 100961 python2.7 - mi_switch+0xf5
> > sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> > umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
> > amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15736 100962 python2.7 - mi_switch+0xf5
> > sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> > umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
> > amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15736 100963 python2.7 - mi_switch+0xf5
> > sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> > umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
> > amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15736 100964 python2.7 - mi_switch+0xf5
> > sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> > umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
> > amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15736 100965 python2.7 - mi_switch+0xf5
> > sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> > umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
> > amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15736 100966 python2.7 - mi_switch+0xf5
> > sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> > umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
> > amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15736 100967 python2.7 - mi_switch+0xf5
> > sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> > umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
> > amd64_syscall+0xa48 fast_syscall_common+0xfc
> >
> > If I kill the make, reboot and just type make, it completes after the
> > reboot. If after the reboot, I do an rm -R work, it will hang again.
> > With the default of
> > hw.lower_amd64_sharedpage: 1
> > post reboot,
> >
> > CTRL+T shows
> > load: 2.73 cmd: python2.7 15703 [usem] 40.92r 12.34u 3.45s 0% 233640k
> > make[1]: Working in: /usr/ports/net/samba47
> > make: Working in: /usr/ports/net/samba47
> >
> >
> >
> > root@amdtestr12:/home/mdtancsa # procstat -kk 15703
> > PID TID COMM TDNAME KSTACK
> >
> > 15703 100824 python2.7 - mi_switch+0xf5
> > sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> > umtxq_sleep+0x143 do_sem2_wait+0x68a __umtx_op_sem2_wait+0x4b
> > amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15703 100956 python2.7 - mi_switch+0xf5
> > sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> > umtxq_sleep+0x143 do_sem2_wait+0x68a __umtx_op_sem2_wait+0x4b
> > amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15703 100957 python2.7 - mi_switch+0xf5
> > sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> > umtxq_sleep+0x143 do_sem2_wait+0x68a __umtx_op_sem2_wait+0x4b
> > amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15703 100958 python2.7 - mi_switch+0xf5
> > sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> > umtxq_sleep+0x143 do_sem2_wait+0x68a __umtx_op_sem2_wait+0x4b
> > amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15703 100959 python2.7 - mi_switch+0xf5
> > sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> > umtxq_sleep+0x143 do_sem2_wait+0x68a __umtx_op_sem2_wait+0x4b
> > amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15703 100960 python2.7 - mi_switch+0xf5
> > sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> > umtxq_sleep+0x143 do_sem2_wait+0x68a __umtx_op_sem2_wait+0x4b
> > amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15703 100961 python2.7 - mi_switch+0xf5
> > sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> > umtxq_sleep+0x143 do_sem2_wait+0x68a __umtx_op_sem2_wait+0x4b
> > amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15703 100962 python2.7 - mi_switch+0xf5
> > sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> > umtxq_sleep+0x143 do_sem2_wait+0x68a __umtx_op_sem2_wait+0x4b
> > amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15703 100963 python2.7 - mi_switch+0xf5
> > sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> > umtxq_sleep+0x143 do_sem2_wait+0x68a __umtx_op_sem2_wait+0x4b
> > amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15703 100964 python2.7 - mi_switch+0xf5
> > sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> > umtxq_sleep+0x143 do_sem2_wait+0x68a __umtx_op_sem2_wait+0x4b
> > amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15703 100965 python2.7 - mi_switch+0xf5
> > sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> > umtxq_sleep+0x143 do_lock_umutex+0x885 __umtx_op_wait_umutex+0x48
> > amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15703 100966 python2.7 - mi_switch+0xf5
> > sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> > umtxq_sleep+0x143 do_sem2_wait+0x68a __umtx_op_sem2_wait+0x4b
> > amd64_syscall+0xa48 fast_syscall_common+0xfc
> > 15703 100967 python2.7 - mi_switch+0xf5
> > sleepq_catch_signals+0x405 sleepq_wait_sig+0xf _sleep+0x231
> > umtxq_sleep+0x143 do_sem2_wait+0x68a __umtx_op_sem2_wait+0x4b
> > amd64_syscall+0xa48 fast_syscall_common+0xfc
> > root@amdtestr12:/home/mdtancsa # procstat -t 15703
> >   PID    TID COMM      TDNAME CPU PRI STATE WCHAN
> > 15703 100824 python2.7 -       -1 152 sleep usem
> > 15703 100956 python2.7 -       -1 125 sleep usem
> > 15703 100957 python2.7 -       -1 127 sleep usem
> > 15703 100958 python2.7 -       -1 125 sleep usem
> > 15703 100959 python2.7 -       -1 125 sleep usem
> > 15703 100960 python2.7 -       -1 126 sleep usem
> > 15703 100961 python2.7 -       -1 126 sleep usem
> > 15703 100962 python2.7 -       -1 126 sleep usem
> > 15703 100963 python2.7 -       -1 126 sleep usem
> > 15703 100964 python2.7 -       -1 126 sleep usem
> > 15703 100965 python2.7 -       -1 126 sleep umtxn
> > 15703 100966 python2.7 -       -1 126 sleep usem
> > 15703 100967 python2.7 -       -1 125 sleep usem
> > root@amdtestr12:/home/mdtancsa #
> >
> >
> > ---Mike
> >
> >
> >>
> >>
> >> ------------------------------------------------------------------------
> >> r321608 | kib | 2017-07-27 01:37:07 -0700 (Thu, 27 Jul 2017) | 9 lines
> >>
> >> Use MFENCE to serialize RDTSC on non-Intel CPUs.
> >>
> >> Kernel already used the stronger barrier instruction for AMDs, correct
> >> the userspace fast gettimeofday() implementation as well.
> >>
> >>
> >>
> >> I did go back and look at the build runaways that I've occasionally seen
> >> on my AMD FX-8320E package builder. I haven't seen the python issue
> >> there, but have seen gmake get stuck in a sleeping state with a bunch of
> >> zombie offspring.
> >>
> >>
> >
> >
>
>
> --
> -------------------
> Mike Tancsa, tel +1 519 651 3400
> Sentex Communications, mike at sentex.net
> Providing Internet services since 1994 www.sentex.net
> Cambridge, Ontario Canada
> _______________________________________________
> freebsd-stable at freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe at freebsd.org"
>
--
Nimrod