Kees Cook
2017-Mar-21 20:49 UTC
[Bridge] [PATCH 07/17] net: convert sock.sk_refcnt from atomic_t to refcount_t
On Mon, Mar 20, 2017 at 6:40 AM, Peter Zijlstra <peterz at infradead.org> wrote:
> On Mon, Mar 20, 2017 at 09:27:13PM +0800, Herbert Xu wrote:
>> On Mon, Mar 20, 2017 at 02:23:57PM +0100, Peter Zijlstra wrote:
>> >
>> > So what bench/setup do you want ran?
>>
>> You can start by counting how many cycles an atomic op takes
>> vs. how many cycles this new code takes.
>
> On what uarch?
>
> I think I tested hand coded asm version and it ended up about double the
> cycles for a cmpxchg loop vs the direct instruction on an IVB-EX (until
> the memory bus saturated, at which point they took the same). Newer
> parts will of course have different numbers,
>
> Can't we run some iperf on a 40gbe fiber loop or something? It would be
> very useful to have an actual workload we can run.

Yeah, this is exactly what I'd like to find as well. Just comparing
cycles between refcount implementations, while interesting, doesn't
show us real-world performance changes, which is what we need to
measure.

Is Eric's "20 concurrent 'netperf -t UDP_STREAM'" example (from
elsewhere in this email thread) real-world meaningful enough?

-Kees

--
Kees Cook
Pixel Security
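
For concreteness, here is a minimal userspace sketch (not from the thread)
of the cycle comparison being discussed: a plain atomic increment vs. a
cmpxchg loop with an overflow check, roughly what the refcount_t conversion
does on each increment. It assumes x86 (rdtsc via __rdtsc) and GCC/Clang
__atomic builtins; the iteration count and saturation constant are
illustrative only, not taken from the patches.

/*
 * Sketch only: uncontended cycles/op for a direct atomic increment vs. a
 * cmpxchg loop with a saturation check. Assumes x86 and __atomic builtins.
 */
#include <stdio.h>
#include <stdint.h>
#include <x86intrin.h>

#define ITERS 10000000UL
#define SATURATED UINT32_MAX	/* illustrative saturation value */

static unsigned int counter;

/* Direct atomic increment: compiles to a single lock-prefixed inc/add. */
static void plain_inc(void)
{
	__atomic_fetch_add(&counter, 1, __ATOMIC_RELAXED);
}

/* cmpxchg loop with a saturation check, roughly what a checked inc does. */
static void checked_inc(void)
{
	unsigned int old = __atomic_load_n(&counter, __ATOMIC_RELAXED);

	do {
		if (old == SATURATED)
			return;		/* refuse to overflow */
	} while (!__atomic_compare_exchange_n(&counter, &old, old + 1, 0,
					      __ATOMIC_RELAXED,
					      __ATOMIC_RELAXED));
}

static uint64_t bench(void (*fn)(void))
{
	uint64_t start = __rdtsc();

	for (unsigned long i = 0; i < ITERS; i++)
		fn();

	return (__rdtsc() - start) / ITERS;
}

int main(void)
{
	counter = 1;
	printf("plain atomic inc : ~%llu cycles/op\n",
	       (unsigned long long)bench(plain_inc));
	counter = 1;
	printf("cmpxchg-loop inc : ~%llu cycles/op\n",
	       (unsigned long long)bench(checked_inc));
	return 0;
}

Under contention the gap narrows as cache-line traffic dominates both
variants, which matches Peter's observation that the two converge once the
memory bus saturates; either way, such numbers say little about end-to-end
network throughput, which is the point being made here.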
Eric Dumazet
2017-Mar-21 21:23 UTC
[Bridge] [PATCH 07/17] net: convert sock.sk_refcnt from atomic_t to refcount_t
On Tue, 2017-03-21 at 13:49 -0700, Kees Cook wrote:
> Yeah, this is exactly what I'd like to find as well. Just comparing
> cycles between refcount implementations, while interesting, doesn't
> show us real-world performance changes, which is what we need to
> measure.
>
> Is Eric's "20 concurrent 'netperf -t UDP_STREAM'" example (from
> elsewhere in this email thread) real-world meaningful enough?

Not at all ;) This was targeting the specific change I had in mind for
ip_idents_reserve(), which is not used by TCP flows.

Unfortunately there is no good test simulating real-world workloads,
which mostly use TCP flows. Most synthetic tools you can find do not
use epoll(), and very often hit bottlenecks in other layers.

It looks like our suggestion to get kernel builds where atomic_inc() is
exactly an atomic_inc() has not even been discussed or implemented.
Coding this would take less time than running a typical Google kernel
qualification (roughly one month, thousands of hosts..., days of SWE).
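
A sketch (not from the thread) of what the "atomic_inc() being exactly an
atomic_inc()" build could look like: a config switch selecting either the
checked, cmpxchg-based implementation or a plain atomic fallback, so the
two kernels can be benchmarked against each other. The
CONFIG_REFCOUNT_CHECKED symbol is hypothetical here; mainline later
adopted a similar checked-vs-plain split.

/*
 * Sketch only: config-gated refcount_inc(), so an unchecked build costs
 * exactly one atomic_inc() per increment.
 */
#include <linux/atomic.h>

typedef struct refcount_struct {
	atomic_t refs;
} refcount_t;

#ifdef CONFIG_REFCOUNT_CHECKED
/* Checked variant: cmpxchg loop that saturates instead of overflowing. */
extern void refcount_inc(refcount_t *r);
#else
/* Unchecked variant: compiles to exactly an atomic_inc(). */
static inline void refcount_inc(refcount_t *r)
{
	atomic_inc(&r->refs);
}
#endif

With such a split in place, producing the comparison baseline described
above becomes a one-line Kconfig change rather than a new benchmark
harness or a full qualification run.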