Arnd Bergmann
2020-Jun-30 19:25 UTC
[PATCH 18/18] arm64: lto: Strengthen READ_ONCE() to acquire when CLANG_LTO=y
On Tue, Jun 30, 2020 at 7:39 PM Will Deacon <will at kernel.org> wrote:
> +#define __READ_ONCE(x) \
> +({ \
> + int atomic = 1; \
> + union { __unqual_scalar_typeof(x) __val; char __c[1]; } __u; \
> + typeof(&(x)) __x = &(x); \
> + switch (sizeof(x)) { \
...
> + atomic ? (typeof(x))__u.__val : (*(volatile typeof(x) *)__x); \
> +})

This expands (x) nine times (five in __unqual_scalar_typeof()), which can
lead to significant code bloat after preprocessing if something passes a
compound expression into READ_ONCE().
The compiler works it out eventually, but we've seen an actual slowdown
in compile speed from this recently, especially on clang.

I think if you move the

    typeof(&(x)) __x = &(x);

line first, all other instances can use typeof(*__x) instead of typeof(x)
and avoid this problem. Once we make gcc-4.9 the minimum version,
this could be further improved to

    __auto_type __x = &(x);

      Arnd
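As a rough illustration of the suggestion above, here is a minimal userspace sketch of a macro that takes the address of its argument first and then derives every later type from the pointer rather than from (x). READ_ONCE_SKETCH, struct sample and read_second() are hypothetical names made up for this example. This is not the kernel's __READ_ONCE(), it leaves out the size switch and the acquire handling from the patch, and it assumes a compiler with GNU statement expressions and __typeof__ (gcc or clang).

```c
#include <assert.h>

/*
 * Hypothetical stand-in for illustration only, not the kernel macro.
 * The first line of the body is the only place where (x) is expanded
 * (once for the type, once for the address). Everything after that
 * refers to __x, so a long compound argument is not pasted again.
 */
#define READ_ONCE_SKETCH(x)						\
({									\
	__typeof__(&(x)) __x = &(x);					\
	*(volatile __typeof__(*__x) *)__x;				\
})

/* Hypothetical type used to build a compound argument expression. */
struct sample {
	int value;
};

/* Passes a compound expression, as a READ_ONCE() caller might. */
static int read_second(struct sample *arr)
{
	return READ_ONCE_SKETCH(arr[1].value);
}
```

With this layout the argument text appears twice in the preprocessed output however many typeof() uses follow, instead of once per use.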
Will Deacon
2020-Jul-01 10:19 UTC
[PATCH 18/18] arm64: lto: Strengthen READ_ONCE() to acquire when CLANG_LTO=y
On Tue, Jun 30, 2020 at 09:25:03PM +0200, Arnd Bergmann wrote:
> On Tue, Jun 30, 2020 at 7:39 PM Will Deacon <will at kernel.org> wrote:
> > +#define __READ_ONCE(x) \
> > +({ \
> > + int atomic = 1; \
> > + union { __unqual_scalar_typeof(x) __val; char __c[1]; } __u; \
> > + typeof(&(x)) __x = &(x); \
> > + switch (sizeof(x)) { \
> ...
> > + atomic ? (typeof(x))__u.__val : (*(volatile typeof(x) *)__x); \
> > +})
>
> This expands (x) nine times (five in __unqual_scalar_typeof()), which can
> lead to significant code bloat after preprocessing if something passes a
> compound expression into READ_ONCE().
> The compiler works it out eventually, but we've seen an actual slowdown
> in compile speed from this recently, especially on clang.
>
> I think if you move the
>
>     typeof(&(x)) __x = &(x);
>
> line first, all other instances can use typeof(*__x) instead of typeof(x)
> and avoid this problem.

Cheers, I was only thinking about side-effects when I wrote this, but
bloating build time is very unpopular, so I'll go with your suggestion.

> Once we make gcc-4.9 the minimum version,
> this could be further improved to
>
>     __auto_type __x = &(x);

Is anybody working on moving to 4.9? I've seen the mails from Linus
championing it, but I thought there was a RHEL in support that people
might care about?

Will
Arnd Bergmann
2020-Jul-01 10:59 UTC
[PATCH 18/18] arm64: lto: Strengthen READ_ONCE() to acquire when CLANG_LTO=y
On Wed, Jul 1, 2020 at 12:19 PM Will Deacon <will at kernel.org> wrote:
> On Tue, Jun 30, 2020 at 09:25:03PM +0200, Arnd Bergmann wrote:
> > On Tue, Jun 30, 2020 at 7:39 PM Will Deacon <will at kernel.org> wrote:
> > Once we make gcc-4.9 the minimum version,
> > this could be further improved to
> >
> >     __auto_type __x = &(x);
>
> Is anybody working on moving to 4.9? I've seen the mails from Linus
> championing it, but I thought there was a RHEL in support that people
> might care about?

I don't think there was a serious discussion about it so far, and we only
just moved to gcc-4.8. I think moving to gnu11 (gcc-4.9 or clang) instead
of gnu99 has other benefits as well, so we may well want to do it anyway
when something else comes up.

For __auto_type(), we could do it like

#if (clang or gcc-4.9+)
#define auto_typeof(x) __auto_type
#else
#define auto_typeof(x) typeof(x)
#endif

which could be used in a lot of macros.

      Arnd
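One way the auto_typeof() idea above might be spelled out with real preprocessor checks is sketched below. The version test is an assumption: it treats any clang as supporting __auto_type (older clang releases may not) and checks the __GNUC__/__GNUC_MINOR__ pair for gcc-4.9 or later. READ_VIA_PTR and last_element() are hypothetical names used only to show the macro inside another macro. This is not taken from the kernel tree.

```c
#include <assert.h>

/*
 * Assumption: every clang the build cares about accepts __auto_type.
 * clang also defines __GNUC__ (as an old gcc version), so it has to
 * be tested separately from the gcc version check.
 */
#if defined(__clang__) || \
    (defined(__GNUC__) && \
     (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 9)))
#define auto_typeof(x) __auto_type
#else
#define auto_typeof(x) __typeof__(x)
#endif

/*
 * Hypothetical user of auto_typeof(): with __auto_type the argument
 * is expanded only once, in the initializer. The fallback branch
 * expands it twice, matching the typeof(&(x)) __x = &(x) form.
 */
#define READ_VIA_PTR(x)							\
({									\
	auto_typeof(&(x)) __p = &(x);					\
	*(volatile __typeof__(*__p) *)__p;				\
})

/* Hypothetical helper showing the macro with a compound argument. */
static long last_element(long *arr, int n)
{
	return READ_VIA_PTR(arr[n - 1]);
}
```

The macro ignores its argument on the __auto_type branch, which is what lets callers write the same declaration on both old and new compilers.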