On Sat, Jul 22, 2017 at 01:12:29PM -0700, Don Lewis wrote:
> On 21 Jul, G. Paul Ziemba wrote:
> >>Your best bet for a quick workaround for the stack overflow would be to
> >>rebuild the kernel with a larger value of KSTACK_PAGES. You can find
> >>the default in /usr/src/sys/<arch>/conf/NOTES.
I bumped it from the default 4 to 5 in /boot/loader.conf:
kern.kstack_pages=5
and that prevented this crash. Uptime 5.5 hours at this point (instead of
1.5 minutes).
So what's the downside of increasing kstack_pages? What if I made it 10?
I see comments elsewhere about reducing space for user-mode threads, but I'm
not sure what that means in practical terms, or whether there is some other
overarching tuning parameter that should also be increased.
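As I understand it (an assumption, not something from this thread), each kernel
thread gets its own wired, non-pageable kernel stack, so the main cost of raising
kstack_pages is wired memory that scales with thread count, plus larger contiguous
KVA allocations that may be harder to satisfy when address space is fragmented.
A back-of-the-envelope sketch, with a hypothetical thread count:

```python
# Rough wired-memory cost of raising kern.kstack_pages.
# PAGE_SIZE and the thread count are assumptions for illustration.
PAGE_SIZE = 4096          # amd64 page size, as noted later in the thread

def kstack_cost(kstack_pages, nthreads):
    """Wired memory consumed by kernel thread stacks, in bytes."""
    return kstack_pages * PAGE_SIZE * nthreads

# e.g. 2000 kernel threads (hypothetical load):
for pages in (4, 5, 10):
    mib = kstack_cost(pages, 2000) / (1 << 20)
    print(f"kstack_pages={pages}: {mib:.1f} MiB wired")
```

So even at kstack_pages=10 the absolute cost is modest on a machine with
gigabytes of RAM; the per-thread, always-wired nature of it is the reason
it isn't simply set large by default.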
> Page size is 4096.
Ah, I forgot to count the 2^0 bit.
> It's interesting that you are running into this on amd64. Usually i386
> is the problem child.
Maybe stack frames are bigger due to 64-bit variables? (And of course
we get paid mostly for adding code, not so much for removing it)
> >>It would probably be a good idea to compute the differences in the stack
> >>pointer values between adjacent stack frames to see if any of them are
> >>consuming an excessive amount of stack space.
For our collective amusement, I noted the stack pointer for each frame and
calculated the frame size and cumulative stack consumption (both in hex bytes).
If there is some other stack overhead not shown in the trace, I can see it going
over 0x4000:
Frame Stack Pointer sz cumu function
----- ------------- --- ---- ----------------
44 0xfffffe085cfa8a10 amd64_syscall
43 0xfffffe085cfa88b0 160 160 syscallenter
42 0xfffffe085cfa87f0 220 180 sys_execve
41 0xfffffe085cfa87c0 30 1B0 kern_execve
40 0xfffffe085cfa8090 730 8E0 do_execve
39 0xfffffe085cfa7ec0 1D0 AB0 namei
38 0xfffffe085cfa7d40 180 C30 lookup
37 0xfffffe085cfa7cf0 50 C80 VOP_LOOKUP
36 0xfffffe085cfa7c80 70 CF0 VOP_LOOKUP_APV
35 0xfffffe085cfa7650 630 1320 nfs_lookup
34 0xfffffe085cfa75f0 60 1380 VOP_ACCESS
33 0xfffffe085cfa7580 70 13F0 VOP_ACCESS_APV
32 0xfffffe085cfa7410 170 1560 nfs_access
31 0xfffffe085cfa7240 1D0 1730 nfs34_access_otw
30 0xfffffe085cfa7060 1E0 1910 nfsrpc_accessrpc
29 0xfffffe085cfa6fb0 B0 19C0 nfscl_request
28 0xfffffe085cfa6b20 490 1E50 newnfs_request
27 0xfffffe085cfa6980 1A0 1FF0 clnt_reconnect_call
26 0xfffffe085cfa6520 460 2450 clnt_vc_call
25 0xfffffe085cfa64c0 60 24B0 sosend
24 0xfffffe085cfa6280 240 26F0 sosend_generic
23 0xfffffe085cfa6110 170 2860 tcp_usr_send
22 0xfffffe085cfa5ca0 470 2CD0 tcp_output
21 0xfffffe085cfa5900 3A0 3070 ip_output
20 0xfffffe085cfa5880 80 30F0 looutput
19 0xfffffe085cfa5800 80 3170 if_simloop
18 0xfffffe085cfa57d0 30 31A0 netisr_queue
17 0xfffffe085cfa5780 50 31F0 netisr_queue_src
16 0xfffffe085cfa56f0 90 3280 netisr_queue_internal
15 0xfffffe085cfa56a0 50 32D0 swi_sched
14 0xfffffe085cfa5620 80 3350 intr_event_schedule_thread
13 0xfffffe085cfa55b0 70 33C0 sched_add
12 0xfffffe085cfa5490 120 34E0 sched_pickcpu
11 0xfffffe085cfa5420 70 3550 sched_lowest
10 0xfffffe085cfa52a0 180 36D0 cpu_search_lowest
9 0xfffffe085cfa52a0 0 36D0 cpu_search
8 0xfffffe085cfa5120 180 3850 cpu_search_lowest
7 0xfffffe085cfa5120 0 3850 cpu_search
6 0xfffffe085cfa4fa0 180 39D0 cpu_search_lowest
5 0xfffffe0839778f40 signal handler
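The arithmetic behind the table is just successive stack-pointer deltas. A small
script in that spirit, using a few frames from the trace above as sample data
(cumulative here is relative to the first frame in the sample, not to the top of
the whole trace):

```python
# Recompute per-frame size and cumulative stack use from (frame, SP, name)
# tuples taken from a ddb-style backtrace. Sample data: frames 39-35 above.
frames = [
    (39, 0xFFFFFE085CFA7EC0, "namei"),
    (38, 0xFFFFFE085CFA7D40, "lookup"),
    (37, 0xFFFFFE085CFA7CF0, "VOP_LOOKUP"),
    (36, 0xFFFFFE085CFA7C80, "VOP_LOOKUP_APV"),
    (35, 0xFFFFFE085CFA7650, "nfs_lookup"),
]

def frame_sizes(frames):
    """Yield (frame, sz, cumulative, name); sz is the SP delta to the caller."""
    top = frames[0][1]
    prev = top
    for num, sp, name in frames[1:]:
        yield num, prev - sp, top - sp, name
        prev = sp

for num, sz, cumu, name in frame_sizes(frames):
    print(f"{num:>5}  {sz:>4X} {cumu:>5X}  {name}")
```

The per-frame sizes it prints (180, 50, 70, 630) match the sz column above for
frames 38-35, which is a decent sanity check on the hand calculation.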
--
G. Paul Ziemba
FreeBSD unix:
4:36PM up 5:28, 8 users, load averages: 6.53, 7.79, 7.94