thr3ads.net - llvm dev - [llvm-dev] [RFC] Zeroing Caller Saved Regs [Aug 2020]

Kees Cook via llvm-dev

2020-Aug-07 22:28 UTC

[llvm-dev] [RFC] Zeroing Caller Saved Regs

On Fri, Aug 7, 2020 at 1:18 AM David Chisnall
<David.Chisnall at cl.cam.ac.uk> wrote:
> I think it would be useful for the discussion to have a clear threat model
> that this intends to defend against and a rough analysis of the security
> benefits that this is believed to bring.

I view this as being even more about a ROP defense. Dealing with spill
slots is, IMO, a separate issue, more related to the auto-var-init
work (though that would be stack erasure on function exit, rather than
entry, which addresses a different set of issues). I think this thread
from the GCC list has some good details on the ROP defense:

https://gcc.gnu.org/pipermail/gcc-patches/2020-August/551607.html

-- 
Kees Cook

David Chisnall via llvm-dev

2020-Aug-10 10:34 UTC

Thanks,

On 07/08/2020 23:28, Kees Cook wrote:
> On Fri, Aug 7, 2020 at 1:18 AM David Chisnall
> <David.Chisnall at cl.cam.ac.uk> wrote:
>> I think it would be useful for the discussion to have a clear threat
>> model that this intends to defend against and a rough analysis of the
>> security benefits that this is believed to bring.
> 
> I view this as being even more about a ROP defense. Dealing with spill
> slots is, IMO, a separate issue, more related to the auto-var-init
> work (though that would be stack erasure on function exit, rather than
> entry, which addresses a different set of issues). I think this thread
> from the GCC list has some good details on the ROP defense:
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2020-August/551607.html
This link gives two motivations:

1. Reducing information leak (which I find unconvincing, because there's 
a lot more left on the stack than in caller-save registers).
2. Reducing ROP gadgets.

Unfortunately, for claim 2 they cite a paper that is behind a paywall,
so I can't easily see what that's doing and I'll have to guess what the
paper says:

Caller-save registers are intuitively useful in the first gadget in a 
ROP sequence, because the current frame will have put values into them 
(and so they are most likely to hold attacker-controlled values).  I can 
imagine quite easily a paper that shows that you break the first gadget 
in a chain with this mitigation.

It's possible that it would also significantly reduce the number of 
total gadgets if each ret is preceded by the zeroing sequence,
effectively denying the attacker the use of these registers.
Unfortunately, to be able to make arbitrary calls they would just need
one unguarded forward control-flow edge that loaded a function pointer 
and its arguments from the stack, and I can't imagine that such a gadget 
is absent from most nontrivial codebases.  I'd like to see an analysis 
of the gadgets remaining when this mitigation is used.
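
As a rough illustration of what such an analysis could look at, here is a minimal Python sketch that sorts gadget strings by whether the registers they load are caller-saved under the System V x86-64 ABI. The helper name `loads_survive_zeroing` is hypothetical, and the gadget strings are the ones quoted in the listing later in this thread. The model assumes the zeroing sequence clears every caller-saved register before each compiler-emitted ret. A real implementation would at least have to leave the return-value register alone. It also ignores memory side effects and unintended (misaligned) gadgets, so treat it as a sketch only.

```python
# Caller-saved (scratch) general-purpose registers in the System V x86-64 ABI.
CALLER_SAVED = {"rax", "rcx", "rdx", "rsi", "rdi", "r8", "r9", "r10", "r11"}

def popped_registers(gadget):
    """Return the registers a gadget loads from the stack via `pop`."""
    regs = []
    for insn in gadget.split(";"):
        parts = insn.split()
        if len(parts) == 2 and parts[0] == "pop":
            regs.append(parts[1])
    return regs

def loads_survive_zeroing(gadget):
    """Hypothetical helper: registers whose popped value would still be
    live after the ret, assuming all caller-saved registers are zeroed."""
    return [r for r in popped_registers(gadget) if r not in CALLER_SAVED]

gadgets = [
    "pop rsi ; ret",
    "pop rdi ; ret",
    "pop rdx ; ret",
    "pop rax ; pop rdx ; pop rbx ; ret",
]
survivors = {g: loads_survive_zeroing(g) for g in gadgets}
for g, regs in survivors.items():
    print(f"{g!r}: surviving loads = {regs}")
```

In this model only the rbx load survives, because rbx is callee-saved. Whether that matters in practice depends on which other gadgets remain, which is the analysis being asked for.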

I don't object to adding a flag that makes the Linux kernel slower but 
if it is being advertised as a security feature then I would like to see 
some evidence that it does something other than require automated attack
tools to pick a different set of gadgets to use.

David

Bill Wendling via llvm-dev

2020-Aug-12 21:44 UTC

On Mon, Aug 10, 2020 at 3:34 AM David Chisnall
<David.Chisnall at cl.cam.ac.uk> wrote:
>
> Thanks,
>
> On 07/08/2020 23:28, Kees Cook wrote:
> > On Fri, Aug 7, 2020 at 1:18 AM David Chisnall
> > <David.Chisnall at cl.cam.ac.uk> wrote:
> >> I think it would be useful for the discussion to have a clear threat
> >> model that this intends to defend against and a rough analysis of the
> >> security benefits that this is believed to bring.
> >
> > I view this as being even more about a ROP defense. Dealing with spill
> > slots is, IMO, a separate issue, more related to the auto-var-init
> > work (though that would be stack erasure on function exit, rather than
> > entry, which addresses a different set of issues). I think this thread
> > from the GCC list has some good details on the ROP defense:
> >
> > https://gcc.gnu.org/pipermail/gcc-patches/2020-August/551607.html
>
> This link gives two motivations:
>
> 1. Reducing information leak (which I find unconvincing, because there's
> a lot more left on the stack than in caller-save registers).
> 2. Reducing ROP gadgets.
>
> Unfortunately, for claim 2 they cite a paper that is behind a paywall,
> so I can't easily see what that's doing and I'll have to guess what the
> paper says:
>
> Caller-save registers are intuitively useful in the first gadget in a
> ROP sequence, because the current frame will have put values into them
> (and so they are most likely to hold attacker-controlled values).  I can
> imagine quite easily a paper that shows that you break the first gadget
> in a chain with this mitigation.
>
> It's possible that it would also significantly reduce the number of
> total gadgets if each ret is preceded by the zeroing sequence,
> effectively denying the attacker the use of these registers.
> Unfortunately, to be able to make arbitrary calls they would just need
> one unguarded forward control-flow edge that loaded a function pointer
> and its arguments from the stack, and I can't imagine that such a gadget
> is absent from most nontrivial codebases.  I'd like to see an analysis
> of the gadgets remaining when this mitigation is used.
>
> I don't object to adding a flag that makes the Linux kernel slower but
> if it is being advertised as a security feature then I would like to see
> some evidence that it does something other than require automated attack
> tools to pick a different set of gadgets to use.
>

After reading the paper they link to, I'm rethinking this feature. :-)

From what I can gather from the paper, they use a tool to determine
which scratch (caller-saved) registers are used in a function call.
They then use some type of instrumentation to zero out those scratch
registers. This can apparently break the chain. For example, in line
17 below, RDI will be zeroed out, as will RSI in line 19:

1: p = ''
2: p += pack('<Q', 0x0000000000401627) # pop rsi ; ret
3: p += pack('<Q', 0x00000000006ca080) # @ .data
4: p += pack('<Q', 0x00000000004784d6) # pop rax ; pop rdx ; pop rbx ; ret
5: p += '/bin//sh'
6: p += pack('<Q', 0x4141414141414141) # padding
7: p += pack('<Q', 0x4141414141414141) # padding
8: p += pack('<Q', 0x0000000000473f81) # mov qword ptr [rsi], rax ; ret
9: p += pack('<Q', 0x0000000000401627) # pop rsi ; ret
10: p += pack('<Q', 0x00000000006ca088) # @ .data + 8
11: p += pack('<Q', 0x0000000000425e3f) # xor rax, rax ; ret
12: p += pack('<Q', 0x0000000000473f81) # mov qword ptr [rsi], rax ; ret
13: p += pack('<Q', 0x00000000004784d6) # pop rax ; pop rdx ; pop rbx ; ret
14: p += p64(59) # execve syscall number
15: p += pack('<Q', 0x4141414141414141) # padding
16: p += pack('<Q', 0x4141414141414141) # padding
17: p += pack('<Q', 0x0000000000401506) # pop rdi ; ret
18: p += pack('<Q', 0x00000000006ca080) # @ .data
19: p += pack('<Q', 0x0000000000401627) # pop rsi ; ret
20: p += pack('<Q', 0x00000000006ca088) # @ .data + 8
21: p += pack('<Q', 0x0000000000442636) # pop rdx ; ret
22: p += pack('<Q', 0x00000000006ca088) # @ .data + 8
23: p += pack('<Q', 0x0000000000467175) # syscall ; ret

Their instrumentation is impractical though as it increases the
runtime by over 16x.

My guess is that inserting zeroing instructions right before the "ret"
instruction can disable some of the hacks we see with ROP:

   `pop rdi ; ret` becomes `pop rdi ; xor rdi, rdi ; ret`
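
A toy model may make that guess concrete. The following Python sketch interprets a tiny hypothetical gadget language (only `pop` and `ret`), runs the last part of the chain above (lines 13 to 22) with and without a zeroing sequence before each ret, and reports the register state that would reach the syscall gadget. The `run_chain` function, the register set, and the assumption that every caller-saved register is cleared are all illustrative. This is not how LLVM or GCC implement the feature.

```python
# Caller-saved general-purpose registers in the System V x86-64 ABI.
CALLER_SAVED = ("rax", "rcx", "rdx", "rsi", "rdi", "r8", "r9", "r10", "r11")

def run_chain(chain, zero_before_ret):
    """Run (gadget, stack_words) pairs on a toy machine; return registers.

    Each gadget is a string like "pop rdi ; ret". When zero_before_ret is
    true, every caller-saved register is cleared just before each ret,
    mimicking the proposed mitigation (an assumption of this sketch).
    """
    regs = {}
    for gadget, words in chain:
        stack = list(words)
        for insn in (i.strip() for i in gadget.split(";")):
            if insn.startswith("pop "):
                regs[insn.split()[1]] = stack.pop(0)
            elif insn == "ret" and zero_before_ret:
                for r in CALLER_SAVED:
                    regs[r] = 0
    return regs

DATA = 0x6CA080  # address of .data used in the listing above
PAD = 0x4141414141414141
chain = [
    ("pop rax ; pop rdx ; pop rbx ; ret", [59, PAD, PAD]),
    ("pop rdi ; ret", [DATA]),
    ("pop rsi ; ret", [DATA + 8]),
    ("pop rdx ; ret", [DATA + 8]),
]

plain = run_chain(chain, zero_before_ret=False)
hardened = run_chain(chain, zero_before_ret=True)
shown = ("rax", "rdi", "rsi", "rdx")
print("without zeroing:", {r: hex(plain[r]) for r in shown})
print("with zeroing:   ", {r: hex(hardened[r]) for r in shown})
```

Under these assumptions the syscall gadget would see zero in rax, rdi, rsi and rdx, so the execve setup no longer works as written, while the callee-saved rbx keeps its popped value.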

-bw
