David Woodhouse via llvm-dev
2018-Feb-09 02:21 UTC
[llvm-dev] retpoline mitigation and 6.0
On Fri, 2018-02-09 at 01:18 +0000, David Woodhouse wrote:
> For now I'm just going to attempt to work around it like this in the
> kernel, so I can concentrate on the retpoline bits:
> http://david.woodhou.se/clang-percpu-hack.patch

32-bit doesn't boot. Built without CONFIG_RETPOLINE and with Clang 5.0
(and the above patch) it does. I'm rebuilding a Release build of
llvm/clang so that experimental kernel builds hopefully take less than
an hour, and will prod further in the morning.
David Woodhouse via llvm-dev
2018-Feb-09 08:26 UTC
[llvm-dev] retpoline mitigation and 6.0
On Fri, 2018-02-09 at 02:21 +0000, David Woodhouse wrote:
> On Fri, 2018-02-09 at 01:18 +0000, David Woodhouse wrote:
> > For now I'm just going to attempt to work around it like this in the
> > kernel, so I can concentrate on the retpoline bits:
> > http://david.woodhou.se/clang-percpu-hack.patch
>
> 32-bit doesn't boot. Built without CONFIG_RETPOLINE and with Clang 5.0
> (and the above patch) it does. I'm rebuilding a Release build of
> llvm/clang so that experimental kernel builds hopefully take less than
> an hour, and will prod further in the morning.

What is the intended ABI of __x86_indirect_thunk which I have been
calling the "ret-equivalent" retpoline? I see this happening
(I ♥ 'qemu -d in_asm')...

----------------
IN:
0xc136feea:  89 d8             movl     %ebx, %eax
0xc136feec:  89 f2             movl     %esi, %edx
0xc136feee:  8b 75 f0          movl     -0x10(%ebp), %esi
0xc136fef1:  89 f1             movl     %esi, %ecx
0xc136fef3:  ff 75 e0          pushl    -0x20(%ebp)
0xc136fef6:  e8 c5 f3 58 00    calll    0xc18ff2c0    # __x86_indirect_thunk

----------------
IN:
0xc18ff2c0:  c3                retl     # Early boot, so it hasn't been turned into a proper retpoline yet

----------------
IN:
0xc136fefb:  8d 34 7e          leal     (%esi, %edi, 2), %esi


(gdb) list *0xc136fef6
0xc136fef6 is in sort (lib/sort.c:87).
82                      if (c < n - size &&
83                                      cmp_func(base + c, base + c + size) < 0)
84                              c += size;
85                      if (cmp_func(base + r, base + c) >= 0)
86                              break;
87                      swap_func(base + r, base + c, size);
88              }
89      }
90
91      /* sort */

You're pushing the target (-0x20(%ebp)) onto the stack and then
*calling* __x86_indirect_thunk. So it looks like you're expecting
__x86_indirect_thunk to do something like

        call *4(%esp)
        ret

... except that final 'ret' still leaves the target address on the
stack, so there would also need to be a complicated dance, without
using any registers, to pop that too.

I expected the emitted code for a *call* using the thunk to look more
like

        jmp 2f
1:      pushl -0x20(%ebp)           # cmp_func
        jmp __x86_thunk_indirect    # jmp, not call
2:      call 1b                     # set up address for cmp_func to return to
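The mismatch described in this message can be illustrated with a toy stack
model. This is a minimal sketch, not code from the kernel or LLVM: the stack
is a Python list whose last element stands for the slot at (%esp), TARGET is
a made-up address for the indirect-call target, and RET_ADDR reuses the
address of the instruction after the call in the trace. The function names
are hypothetical labels for the two call-site shapes. In both cases the thunk
is modeled as the bare 'ret' seen during early boot.

```python
# Toy model of a 32-bit x86 stack: the last list element is the slot at (%esp).
# TARGET is an illustrative placeholder, not an address from a real build.

RET_ADDR = 0xC136FEFB   # instruction following the call site in the trace
TARGET = 0x1000         # hypothetical address of the indirect-call target


def ret(stack):
    """Model `ret`: pop the top of the stack and jump to that address."""
    return stack.pop()


def push_then_call(stack):
    """Call site seen in the trace: pushl target; calll thunk."""
    stack.append(TARGET)     # pushl -0x20(%ebp)
    stack.append(RET_ADDR)   # calll pushes the return address on top


def call_then_push_then_jmp(stack):
    """Call site expected in the message: call 1b; pushl target; jmp thunk."""
    stack.append(RET_ADDR)   # call 1b pushes the return address first
    stack.append(TARGET)     # pushl -0x20(%ebp) leaves the target on top


# Observed convention with a bare `ret` thunk.
observed = []
push_then_call(observed)
observed_next = ret(observed)

# Expected convention with the same bare `ret` thunk.
expected = []
call_then_push_then_jmp(expected)
expected_next = ret(expected)

print("observed:", hex(observed_next), [hex(x) for x in observed])
print("expected:", hex(expected_next), [hex(x) for x in expected])
```

In this model the observed convention resumes at the caller with the target
still sitting on the stack, which matches the trace continuing at 0xc136fefb
without ever entering the callee. The expected convention jumps to the target
and leaves only the return address behind, which is why a plain 'ret' works
as the thunk there.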
Chandler Carruth via llvm-dev
2018-Feb-09 08:45 UTC
[llvm-dev] retpoline mitigation and 6.0
On Fri, Feb 9, 2018 at 12:26 AM David Woodhouse <dwmw2 at infradead.org> wrote:
> On Fri, 2018-02-09 at 02:21 +0000, David Woodhouse wrote:
> > On Fri, 2018-02-09 at 01:18 +0000, David Woodhouse wrote:
> > > For now I'm just going to attempt to work around it like this in the
> > > kernel, so I can concentrate on the retpoline bits:
> > > http://david.woodhou.se/clang-percpu-hack.patch
> >
> > 32-bit doesn't boot. Built without CONFIG_RETPOLINE and with Clang 5.0
> > (and the above patch) it does. I'm rebuilding a Release build of
> > llvm/clang so that experimental kernel builds hopefully take less than
> > an hour, and will prod further in the morning.
>
> What is the intended ABI of __x86_indirect_thunk which I have been
> calling the "ret-equivalent" retpoline? I see this happening
> (I ♥ 'qemu -d in_asm')...
>
> ----------------
> IN:
> 0xc136feea:  89 d8             movl     %ebx, %eax
> 0xc136feec:  89 f2             movl     %esi, %edx
> 0xc136feee:  8b 75 f0          movl     -0x10(%ebp), %esi
> 0xc136fef1:  89 f1             movl     %esi, %ecx
> 0xc136fef3:  ff 75 e0          pushl    -0x20(%ebp)
> 0xc136fef6:  e8 c5 f3 58 00    calll    0xc18ff2c0    # __x86_indirect_thunk
>
> ----------------
> IN:
> 0xc18ff2c0:  c3                retl     # Early boot, so it hasn't been turned into a proper retpoline yet
>
> ----------------
> IN:
> 0xc136fefb:  8d 34 7e          leal     (%esi, %edi, 2), %esi
>
>
> (gdb) list *0xc136fef6
> 0xc136fef6 is in sort (lib/sort.c:87).
> 82                      if (c < n - size &&
> 83                                      cmp_func(base + c, base + c + size) < 0)
> 84                              c += size;
> 85                      if (cmp_func(base + r, base + c) >= 0)
> 86                              break;
> 87                      swap_func(base + r, base + c, size);
> 88              }
> 89      }
> 90
> 91      /* sort */
>
> You're pushing the target (-0x20(%ebp)) onto the stack and then
> *calling* __x86_indirect_thunk. So it looks like you're expecting
> __x86_indirect_thunk to do something like
>
>         call *4(%esp)
>         ret
>
> ... except that final 'ret' still leaves the target address on the
> stack, so there would also need to be a complicated dance, without
> using any registers, to pop that too.

Yeah, we expect a complicated dance to re-order the stack to get the
correct return address into the correct place. You can see the sequence in
the comments here:
https://github.com/llvm-project/llvm-project-20170507/blob/master/llvm/lib/Target/X86/X86RetpolineThunks.cpp#L179-L194

> I expected the emitted code for a *call* using the thunk to look more
> like
>
>         jmp 2f
> 1:      pushl -0x20(%ebp)           # cmp_func
>         jmp __x86_thunk_indirect    # jmp, not call
> 2:      call 1b                     # set up address for cmp_func to return to

Yeah, the specific goal was to minimize the code size footprint at the call
site even though it means a few more instructions in the thunk. Our pattern
also has a minor reduction in the dynamic branches taken at the cost of the
push/pop churn. There was briefly a discussion of a different instruction
sequence to minimize push/pop churn but it didn't end up happening.

Anyways, it appears that we have the first case where my suspicions were
borne out and we have somewhat reasonably different ABIs for some of the
thunks. How should we name them to distinguish things?
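The net effect of the stack re-ordering mentioned in this reply can be
sketched with a toy model. This is an assumption-laden illustration, not the
instruction sequence from X86RetpolineThunks.cpp: the real thunk has to
achieve the same result with machine instructions and inside the retpoline's
speculation-capture structure, none of which is modeled here. The stack is a
Python list whose last element stands for the slot at (%esp); ARG, TARGET,
RET_ADDR and reordering_thunk are hypothetical names and values.

```python
# Toy model: the stack is a list whose last element is the slot at (%esp).
# All addresses are illustrative placeholders, not values from a real build.

ARG = 0xAAAA        # hypothetical argument already pushed by the caller
TARGET = 0x1000     # hypothetical indirect-call target
RET_ADDR = 0x2000   # hypothetical address of the instruction after the call


def reordering_thunk(stack):
    """Hypothetical thunk for the push-then-call convention.

    On entry stack[-1] is the return address and stack[-2] is the target.
    Swap the two slots, then model `ret` by popping the new top of stack.
    """
    stack[-1], stack[-2] = stack[-2], stack[-1]
    return stack.pop()


# State on thunk entry after: pushl target; calll thunk
stack = [ARG, TARGET, RET_ADDR]
next_pc = reordering_thunk(stack)

print(hex(next_pc), [hex(x) for x in stack])
```

After the swap and 'ret', control transfers to the target with the return
address on top of the stack and the caller's argument directly beneath it,
which is the state an ordinary direct call would have produced. The cost is
the extra memory traffic inside the thunk, in exchange for a call site that
needs only a push and a call.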