On Sun, 2019-08-25 at 15:03 +0300, Konstantin Belousov wrote:
> On Sun, Aug 25, 2019 at 12:40:22AM +0200, Trond Endrestøl wrote:
> > On Sun, 25 Aug 2019 01:28+0300, Konstantin Belousov wrote:
> >
> > > On Sun, Aug 25, 2019 at 12:19:43AM +0200, Trond Endrestøl wrote:
> > > > On Sat, 24 Aug 2019 23:41+0300, Konstantin Belousov wrote:
> > > > > > I tried changing command="/usr/sbin/${name}" to
> > > > > > command="/usr/bin/proccontrol -m aslr -s disable /usr/sbin/${name}" in
> > > > > > /etc/rc.d/ntpd, but that didn't go well.
> > > > >
> > > > > If you set kern.elf64.aslr.stack_gap to zero, does it help ?
> > > >
> > > > That helped. Thank you again.
> > >
> > > Can you verify if ntpd sets new rlimit(RLIMIT_STACK) for the main thread,
> > > and if yes, what this new limit is ?
> >
> > (gdb)
> > 5265 if (-1 == setrlimit(RLIMIT_STACK, &rl)) {
> > (gdb) print rl
> > $1 = {rlim_cur = 204800, rlim_max = 536870912}
>
> So they set the stack limit to 200K, am I right ? I suspect they do
> that because ntpd wires entire process address space, so 512M blows off
> all limits on wiring.
>
> I do not have a good idea how to make this behaviour compatible with
> the gap. Might be we can change the gap sizing parameter to KBs instead
> of percentage, and set the defaults in 64KB range.
>
> >
> > > aslr.stack_gap is the percentage for the gap on that stack, and since
> > > default size of the main stack limit is quite large 512M, even 3%
> > > (default gap upper limit) are whole 15M. If the new limit is less than
> > > 15M, there is a likely probability that only the gap is left after the
> > > rlimit(2) call, leaving no space for the program frames.
> > >
> > > At least this looks like a nice theory.

So is the problem here that before ntpd is running and has the chance
to call setrlimit(), aslr has already created a large stack gap? If
so, it seems to me that aslr and setrlimit(RLIMIT_STACK, ...) are never
going to work right together. Even if the default stack gap were much
smaller, code using RLIMIT_STACK is going to end up with a stack
smaller than it asked for, because the gap (which it has no way of
knowing about) uses up some or all of the limited space.

If the default gap were 64K or less, things would be much more likely
to work accidentally (and we might never have noticed this situation),
but they still wouldn't be working correctly. Is it possible for the
code on the kernel side to add the requested limit to the gap size to
generate a result that gives the caller the usable stack size they
asked for?

-- Ian
Konstantin Belousov
2019-Sep-10 17:48 UTC
ntpd doesn't like ASLR on stable/12 post-r350672
On Tue, Sep 10, 2019 at 10:50:33AM -0600, Ian Lepore wrote:
> On Sun, 2019-08-25 at 15:03 +0300, Konstantin Belousov wrote:
> > On Sun, Aug 25, 2019 at 12:40:22AM +0200, Trond Endrestøl wrote:
> > > On Sun, 25 Aug 2019 01:28+0300, Konstantin Belousov wrote:
> > >
> > > > On Sun, Aug 25, 2019 at 12:19:43AM +0200, Trond Endrestøl wrote:
> > > > > On Sat, 24 Aug 2019 23:41+0300, Konstantin Belousov wrote:
> > > > > > > I tried changing command="/usr/sbin/${name}" to
> > > > > > > command="/usr/bin/proccontrol -m aslr -s disable /usr/sbin/${name}" in
> > > > > > > /etc/rc.d/ntpd, but that didn't go well.
> > > > > >
> > > > > > If you set kern.elf64.aslr.stack_gap to zero, does it help ?
> > > > >
> > > > > That helped. Thank you again.
> > > >
> > > > Can you verify if ntpd sets new rlimit(RLIMIT_STACK) for the main thread,
> > > > and if yes, what this new limit is ?
> > >
> > > (gdb)
> > > 5265 if (-1 == setrlimit(RLIMIT_STACK, &rl)) {
> > > (gdb) print rl
> > > $1 = {rlim_cur = 204800, rlim_max = 536870912}
> >
> > So they set the stack limit to 200K, am I right ? I suspect they do
> > that because ntpd wires entire process address space, so 512M blows off
> > all limits on wiring.
> >
> > I do not have a good idea how to make this behaviour compatible with
> > the gap. Might be we can change the gap sizing parameter to KBs instead
> > of percentage, and set the defaults in 64KB range.
> >
> > >
> > > > aslr.stack_gap is the percentage for the gap on that stack, and since
> > > > default size of the main stack limit is quite large 512M, even 3%
> > > > (default gap upper limit) are whole 15M. If the new limit is less than
> > > > 15M, there is a likely probability that only the gap is left after the
> > > > rlimit(2) call, leaving no space for the program frames.
> > > >
> > > > At least this looks like a nice theory.
>
> So is the problem here that before ntpd is running and has the chance
> to call setrlimit(), aslr has already created a large stack gap? If
> so, it seems to me that aslr and setrlimit(RLIMIT_STACK, ...) are never
> going to work right together. Even if the default stack gap were much
> smaller, code using RLIMIT_STACK is going to end up with a stack
> smaller than it asked for because the gap it has no way of knowing
> about uses up some part (or all of) the limited space.

Sort of, yes. There is a UI problem with the control for the gap, and
I am not sure how to fix it.

>
> If the default gap were 64K or less, things would be much more likely
> to work accidentally (and we might never have noticed this situation),
> but they still wouldn't be working correctly. Is it possible for the
> code on the kernel side to add the requested limit to the gap size to
> generate a result that gives the caller the usable stack size they
> asked for?

I do not see a way to account for the gap in the RLIMIT_STACK adjustment.
It should be handled in cooperation with the user code. When a program
adjusts RLIMIT_STACK in such a radical way, e.g. setting it to 64k, it
must know a lot about the execution environment.