Eric Dumazet
2017-Mar-16 16:58 UTC
[Bridge] [PATCH 07/17] net: convert sock.sk_refcnt from atomic_t to refcount_t
On Thu, 2017-03-16 at 17:28 +0200, Elena Reshetova wrote:
> refcount_t type and corresponding API should be
> used instead of atomic_t when the variable is used as
> a reference counter. This allows to avoid accidental
> refcounter overflows that might lead to use-after-free
> situations.

...

> static __always_inline void sock_hold(struct sock *sk)
> {
> -	atomic_inc(&sk->sk_refcnt);
> +	refcount_inc(&sk->sk_refcnt);
> }
>

While I certainly see the value of these refcount_t, we have a very
different behavior on these atomic_inc(), which were doing a single
inlined LOCK RMW on x86.

We now call an external function performing an atomic_read(), various
ops/tests, then atomic_cmpxchg_relaxed(), in a loop, losing the nice
x86 property of preventing livelocks.

This looks like a lot of bloat, just to be able to chase hypothetical
bugs in the kernel.

I would love to have a way to enable extra debugging when I want a
debug kernel, like LOCKDEP or KASAN.

By adding all this bloat, we assert that the Linux kernel is terminally
buggy and that every atomic_inc() we did was suspicious and needs to be
instrumented/validated at all times.
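
To make the contrast concrete, here is a minimal userspace sketch of
the two increments being compared, written against GCC's atomic
builtins rather than the kernel's atomic API; the function names and
the exact check ordering are illustrative, not the kernel's actual
lib/refcount.c code:

#include <stdbool.h>

/*
 * The atomic_inc() shape: a single inlined RMW, i.e. one "lock incl"
 * (or "lock xadd") instruction on x86, which always makes forward
 * progress.
 */
static void plain_atomic_inc(unsigned int *refs)
{
	__atomic_fetch_add(refs, 1, __ATOMIC_RELAXED);
}

/*
 * The refcount_inc() shape: load the counter, reject 0 (the object is
 * already being freed) and a saturated value, then try to commit the
 * increment with a cmpxchg, retrying whenever another CPU changed the
 * counter in between. That retry loop is the livelock exposure Eric
 * describes.
 */
static bool saturating_inc_not_zero(unsigned int *refs)
{
	unsigned int new, val = __atomic_load_n(refs, __ATOMIC_RELAXED);

	for (;;) {
		new = val + 1;

		if (!val)
			return false;	/* increment on 0: use-after-free */

		if (!new)
			return true;	/* saturated at UINT_MAX: stay pinned */

		/* On failure, __atomic_compare_exchange_n() refreshes
		 * val with the current counter value and we retry. */
		if (__atomic_compare_exchange_n(refs, &val, new, false,
						__ATOMIC_RELAXED,
						__ATOMIC_RELAXED))
			break;
	}

	return true;
}

Even when the loop succeeds on the first pass, the second variant adds
a load, branches, and a cmpxchg (and, in the kernel, an out-of-line
call), which is the code-size and latency delta being objected to.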
Kees Cook
2017-Mar-16 17:38 UTC
[Bridge] [PATCH 07/17] net: convert sock.sk_refcnt from atomic_t to refcount_t
On Thu, Mar 16, 2017 at 10:58 AM, Eric Dumazet <eric.dumazet at gmail.com> wrote:
> On Thu, 2017-03-16 at 17:28 +0200, Elena Reshetova wrote:
>> refcount_t type and corresponding API should be
>> used instead of atomic_t when the variable is used as
>> a reference counter. This allows to avoid accidental
>> refcounter overflows that might lead to use-after-free
>> situations.
>
> ...
>
>> static __always_inline void sock_hold(struct sock *sk)
>> {
>> -	atomic_inc(&sk->sk_refcnt);
>> +	refcount_inc(&sk->sk_refcnt);
>> }
>>
>
> While I certainly see the value of these refcount_t, we have a very
> different behavior on these atomic_inc(), which were doing a single
> inlined LOCK RMW on x86.

I think we can certainly investigate arch-specific ways to improve the
performance, but the consensus seemed to be that getting the
infrastructure in and doing the migration was the first set of steps.

> We now call an external function performing an atomic_read(), various
> ops/tests, then atomic_cmpxchg_relaxed(), in a loop, losing the nice
> x86 property of preventing livelocks.
>
> This looks like a lot of bloat, just to be able to chase hypothetical
> bugs in the kernel.
>
> I would love to have a way to enable extra debugging when I want a
> debug kernel, like LOCKDEP or KASAN.
>
> By adding all this bloat, we assert that the Linux kernel is terminally
> buggy and that every atomic_inc() we did was suspicious and needs to be
> instrumented/validated at all times.

This IS the assertion, unfortunately. With average 5-year lifetimes on
security flaws [1], and many of the last couple of years' public
exploits being refcount flaws [2], this is something we have to get
done. We need the default kernel to be much more self-protective, and
this is one of many places to make it happen.

I am, of course, biased, but I think the evidence of actual refcounting
attacks outweighs the theoretical performance cost of these changes. If
there is a realistic workload that shows actual problems, let's examine
it and find a solution for that as a separate part of this work,
without blocking this migration.

-Kees

[1] https://outflux.net/blog/archives/2016/10/18/security-bug-lifetime/
[2] http://kernsec.org/wiki/index.php/Bug_Classes/Integer_overflow

--
Kees Cook
Pixel Security
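
As a sketch of one such arch-friendly direction (and roughly the shape
the kernel's generic refcount_t implementation later moved toward):
keep the hot path at a single RMW and detect overflow after the fact
from the returned old value, letting the counter's sign bit act as the
overflow flag. The names and saturation policy below are illustrative,
not actual kernel code:

#include <limits.h>
#include <stdio.h>

#define REFCOUNT_SATURATED INT_MIN	/* sticky poison value */

/*
 * Full-speed increment: one atomic fetch-add, the same hot-path cost
 * as atomic_inc(). Treating the counter as signed means misuse shows
 * up in the returned old value and is caught by a single predictable
 * branch afterward. The saturation store in the cold path is racy on
 * purpose: it trades a bounded memory leak for a use-after-free.
 */
static void fast_refcount_inc(int *refs)
{
	int old = __atomic_fetch_add(refs, 1, __ATOMIC_RELAXED);

	/* old == 0: increment on a freed object (use-after-free);
	 * old == INT_MAX: this increment just overflowed;
	 * old < 0: counter already saturated/overflowed earlier. */
	if (old <= 0 || old == INT_MAX) {
		/* Pin the counter so it can never walk back to 0. */
		__atomic_store_n(refs, REFCOUNT_SATURATED, __ATOMIC_RELAXED);
		fprintf(stderr, "refcount saturated; leaking object\n");
	}
}

With this shape the delta against a bare atomic_inc() shrinks to one
well-predicted, normally-not-taken branch, while still turning an
overflow into a leak rather than an exploitable use-after-free.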