thr3ads.net - llvm dev - [llvm-dev] Out-of-line atomics implementation ways [Oct 2020]

Pavel Iliin via llvm-dev

2020-Oct-15 16:52 UTC

[llvm-dev] Out-of-line atomics implementation ways

Greetings everyone,
I am working on Aarch64 LSE out-of-line atomics support in LLVM, porting this
GCC series: https://gcc.gnu.org/legacy-ml/gcc-patches/2019-09/msg01034.html
After local design experiments I've got some questions about
upstream-suitable ways of implementation. More specifically:

1. Pass to expand atomics into calls to library helper functions.
These helpers test for the presence of LSE instructions and dispatch to the
corresponding sequence of instructions.
There are 100 helpers resulting from various combinations of instruction = {
cas | swp | ldadd | ldset | ldclr | ldeor }, memory model = { relax, acq, rel,
acq_rel } and size = { 1, 2, 4, 8, 16 }.
I am considering two possibilities:
i.  Atomic Expand pass: add a new AtomicExpansionKind::OutOfLine and, if the
target sets it, expand atomics to RTLIB libcalls. This requires adding 100 new
"standardized" library names to RuntimeLibcalls.def and redefining them for
the AArch64 target to match the libgcc implementation (for example,
"cas4_relax" -> "__aarch64_cas4_relax").
ii. Lower the atomics in question later, during instruction selection: for
AArch64 out-of-line atomics targets, replace atomicrmw/cmpxchg with calls to
the __aarch64 helper libcalls. This needs no extension of the runtime library
calls, but it potentially leaves less opportunity for compiler optimizations
and appears to be more AArch64-specific.
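
The count of 100 can be checked with a short enumeration. What follows is a minimal C++ sketch, not code from the patch. It assumes that, as in the GCC series being ported, the 16-byte size exists only for cas, and that every helper name follows the `__aarch64_<op><size>_<model>` pattern of the `__aarch64_cas4_relax` example. The function name `outlineAtomicHelperNames` is hypothetical.

```cpp
#include <array>
#include <cassert>
#include <string>
#include <vector>

// Enumerate libgcc-style helper names such as "__aarch64_cas4_relax".
// Assumption: only cas has a 16-byte variant, so the total is
// 5 * 4 (cas) + 5 ops * 4 sizes * 4 models = 100.
std::vector<std::string> outlineAtomicHelperNames() {
  const std::array<const char *, 6> ops = {"cas",   "swp",   "ldadd",
                                           "ldset", "ldclr", "ldeor"};
  const std::array<int, 5> sizes = {1, 2, 4, 8, 16};
  const std::array<const char *, 4> models = {"relax", "acq", "rel",
                                              "acq_rel"};

  std::vector<std::string> names;
  for (const char *op : ops) {
    for (int size : sizes) {
      // Skip the 16-byte size for everything except cas (assumption above).
      if (size == 16 && std::string(op) != "cas")
        continue;
      for (const char *model : models)
        names.push_back("__aarch64_" + std::string(op) +
                        std::to_string(size) + "_" + model);
    }
  }
  assert(names.size() == 100 && "expected 100 helper names");
  return names;
}
```

Without the 16-byte restriction the full cross product would be 6 * 4 * 5 = 120 names, so the restriction is what makes the numbers in the message line up.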

2. Way of generating the helpers' code
To generate these helpers and the calls to them, GCC uses preprocessor macros
on the compiler side and 'foreach' make targets on the libgcc side.
As for LLVM, the compiler-rt builtins readme states: "Each function
is contained in its own file.  Each function has a corresponding unit test under
test/Unit." and compilation is controlled by CMake. In addition,
RuntimeLibcalls.def contains HANDLE_LIBCALL macros and would need to be
redesigned to support generating names at compile time.
So I have a choice of:
i.  Slightly change/review some LLVM concepts and generate the helpers on the fly.
ii. Prepare all the code locally and commit 100+ lines of names and files.

I would greatly appreciate any thoughts, suggestions and comments that would
help make this patch easy to upstream.
Thanks in advance,
Pavel

James Y Knight via llvm-dev

2020-Oct-15 19:44 UTC

On Thu, Oct 15, 2020 at 12:53 PM Pavel Iliin via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Greetings everyone,
> I am working on Aarch64 LSE out-of-line atomics support in LLVM, porting
> this GCC series:
> https://gcc.gnu.org/legacy-ml/gcc-patches/2019-09/msg01034.html
> After local design experiments I've got some questions about
> upstream-suitable ways of implementation. More specifically:

Great!

> 1. Pass to expand atomics to library helper functions calls.
> These helpers test for the presence of LSE instructions and dispatch to
> corresponding sequence of instructions.
> There are 100 helpers resulting from various combinations of instruction =
> { cas| swp | ldadd | ldset| ldclr| ldeor }, memory model = { relax, acq,
> rel, acq_rel } and size = {1, 2, 4 , 8, 16}.
> I am considering two possibilities:
> i.  Atomic Expand pass: add new AtomicExpansionKind::OutOfLine, and if it
> was set by target expand atomics to RTLIB libcalls. It will require to add
> 100 new "standardized" library names to RuntimeLibcalls.def and redefine
> them for Aarch64 target to comply with libgcc implementation ( like this :
> "cas4_relax" -> " __aarch64_cas4_relax" )
> ii. Lower atomics in question later on Instruction Selection pass: for
> Aarch64 out-of-line atomics targets replace atomicrmw/cmpxchg  to __aarch64
> helpers libcalls. Then there is no need for runtime library calls
> extension, however this approach potentially gives less opportunity for
> compiler optimizations and appears to be more aarch64 specific.

Acting like you have LSE instructions for AtomicExpandPass's shouldExpand*
functions, so that it doesn't do any IR expansion, and then handling the
resulting ATOMIC_LOAD_*/etc ISD nodes in isel is definitely the better
choice. Lowering to a runtime lib call in ISEL is simple and
straightforward, and exactly what you need.
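
The name-selection step of such a lowering can be sketched in plain C++. This is only an illustration under stated assumptions: the enums `AtomicOp` and `Ordering` and the function `outlineAtomicLibcallName` are hypothetical stand-ins, not LLVM types or APIs, and mapping seq_cst onto the acq_rel helper is an assumption based on the helper set having only four memory models.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the operation and ordering carried by an
// atomic node at instruction selection time; these are not LLVM types.
enum class AtomicOp { Cas, Swp, LdAdd, LdSet, LdClr, LdEor };
enum class Ordering { Relaxed, Acquire, Release, AcqRel, SeqCst };

// Pick the out-of-line helper name for one atomic operation.
std::string outlineAtomicLibcallName(AtomicOp op, unsigned sizeInBytes,
                                     Ordering ord) {
  static const char *const opNames[] = {"cas",   "swp",   "ldadd",
                                        "ldset", "ldclr", "ldeor"};
  // Assumption: 16-byte helpers exist only for cas.
  const bool validSize = sizeInBytes == 1 || sizeInBytes == 2 ||
                         sizeInBytes == 4 || sizeInBytes == 8 ||
                         (sizeInBytes == 16 && op == AtomicOp::Cas);
  assert(validSize && "no helper exists for this size");
  (void)validSize;

  // Default covers AcqRel; SeqCst is assumed to share the same helper.
  const char *model = "acq_rel";
  switch (ord) {
  case Ordering::Relaxed: model = "relax"; break;
  case Ordering::Acquire: model = "acq"; break;
  case Ordering::Release: model = "rel"; break;
  case Ordering::AcqRel:
  case Ordering::SeqCst: break;
  }
  return std::string("__aarch64_") + opNames[static_cast<int>(op)] +
         std::to_string(sizeInBytes) + "_" + model;
}
```

For instance, a relaxed 4-byte compare-and-swap would select `__aarch64_cas4_relax` under these assumptions.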

> 2. Way of generating helpers code
> To generate mentioned helpers and calls, preprocessor macros were used on
> gcc side and 'foreach' make targets on libgcc part of interface.
>
> Concerning LLVM, compiler-rt library builtins readme states: "Each
> function is contained in its own file.  Each function has a corresponding
> unit test under test/Unit." and compilation is controlled by cmake.

You'll definitely want to have each function end up in its own object-file.
Probably the simplest way to accomplish that is to have separate source
files for each function, each one containing simply some #defines to select
the variant, then #include "lse_impl.inc" -- or something along those
lines. Making the buildsystem responsible for that is likely also
possible...but meh...not sure it's worth it.
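
The "one shared implementation, many tiny stubs" idea can be imitated in a single translation unit. In the sketch below the macro `DEFINE_DEMO_SWP` stands in for the shared include file, and each invocation stands in for one stub source file. Everything here is an assumption for illustration: the `demo_` names are hypothetical (chosen so they cannot clash with real `__aarch64_*` symbols), and `std::atomic::exchange` replaces the actual LSE or fallback instruction sequences.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Stand-in for the body of a shared include file. In a multi-file layout
// each stub would instead set a few #defines (size, memory model) and then
// #include the shared file, producing exactly one function per object file.
#define DEFINE_DEMO_SWP(BITS, BYTES, MODEL, ORDER)                           \
  std::uint##BITS##_t demo_swp##BYTES##_##MODEL(                             \
      std::uint##BITS##_t value, std::atomic<std::uint##BITS##_t> *ptr) {    \
    return ptr->exchange(value, ORDER);                                      \
  }

// Each line below plays the role of one stub source file.
DEFINE_DEMO_SWP(32, 4, relax, std::memory_order_relaxed)
DEFINE_DEMO_SWP(64, 8, acq_rel, std::memory_order_acq_rel)

// Usage: every generated helper returns the old value and stores the new one.
inline void demoUsage() {
  std::atomic<std::uint32_t> word{1};
  std::uint32_t old = demo_swp4_relax(7, &word);
  assert(old == 1 && word.load() == 7);
  (void)old;
}
```

Keeping the variant selection in a handful of parameters means the per-function files stay trivial, which is the property the suggestion above relies on.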

> In addition, RuntimeLibcalls.def contains HANDLE_LIBCALL macros and would
> need to be redesigned to include compile time names generation.
>
No redesign would be needed here. This file is preprocessed; you can
#define an internal helper macro which calls HANDLE_LIBCALL multiple times,
if desirable. However, it's probably also fine to just list them all out.
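
A helper macro of that kind could look roughly like the following self-contained sketch. It does not reproduce the real RuntimeLibcalls.def: the macro names `OUTLINE_LIBCALLS` and `DEMO_LIBCALL_LIST`, the enumerator names, and the function `libcallName` are all hypothetical, and the list macro stands in for the contents of the .def file so that the example compiles on its own.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: one line per (operation, size) expands into four
// HANDLE_LIBCALL entries, one per memory model.
#define OUTLINE_LIBCALLS(ENUM, NAME)                                         \
  HANDLE_LIBCALL(ENUM##_RELAX, #NAME "_relax")                               \
  HANDLE_LIBCALL(ENUM##_ACQ, #NAME "_acq")                                   \
  HANDLE_LIBCALL(ENUM##_REL, #NAME "_rel")                                   \
  HANDLE_LIBCALL(ENUM##_ACQ_REL, #NAME "_acq_rel")

// Stand-in for the list that would live in the .def file.
#define DEMO_LIBCALL_LIST                                                    \
  OUTLINE_LIBCALLS(OUTLINE_ATOMIC_CAS4, __aarch64_cas4)                      \
  OUTLINE_LIBCALLS(OUTLINE_ATOMIC_SWP4, __aarch64_swp4)

// First expansion: HANDLE_LIBCALL yields the enumerators.
enum Libcall {
#define HANDLE_LIBCALL(code, name) code,
  DEMO_LIBCALL_LIST
#undef HANDLE_LIBCALL
  UNKNOWN_LIBCALL
};

// Second expansion: HANDLE_LIBCALL yields the symbol names.
static const char *const LibcallNames[] = {
#define HANDLE_LIBCALL(code, name) name,
    DEMO_LIBCALL_LIST
#undef HANDLE_LIBCALL
};

// Look up the symbol name for a libcall enumerator.
inline const char *libcallName(Libcall lc) {
  assert(lc < UNKNOWN_LIBCALL && "invalid libcall");
  return LibcallNames[lc];
}
```

Two list lines produce eight entries here; the same pattern would shrink 100 explicit entries to roughly one line per operation and size.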

> So I have a choice of:
> i.  Slightly change/review some LLVM concepts and generate helpers on the
> fly.
> ii. Prepare all code locally and commit 100+ lines of names and files.

Eli Friedman via llvm-dev

2020-Oct-15 20:05 UTC

The current precedent in the codebase is the __sync_* libcalls.  They have
essentially the semantics you want, except that they're all seq_cst.
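
The difference in ordering control can be seen at the source level with the GCC/Clang builtins. This is only a small sketch: the wrapper names `fetchAddSync` and `fetchAddRelaxed` are hypothetical, and whether either builtin becomes a library call or inline code depends on the target and compiler options.

```cpp
#include <cassert>
#include <cstdint>

// Legacy __sync builtin: there is no ordering argument, the operation
// always acts as a full barrier.
inline std::uint32_t fetchAddSync(std::uint32_t *p, std::uint32_t v) {
  return __sync_fetch_and_add(p, v);
}

// __atomic builtin: the caller chooses the memory order, which is the
// extra dimension the out-of-line helpers encode in their names.
inline std::uint32_t fetchAddRelaxed(std::uint32_t *p, std::uint32_t v) {
  return __atomic_fetch_add(p, v, __ATOMIC_RELAXED);
}

// Usage: both return the previous value and add to the stored one.
inline void demoFetchAdd() {
  std::uint32_t counter = 5;
  std::uint32_t old = fetchAddSync(&counter, 2);
  assert(old == 5 && counter == 7);
  old = fetchAddRelaxed(&counter, 3);
  assert(old == 7 && counter == 10);
  (void)old;
}
```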

On the LLVM side, I'd rather not have two ways to do the same thing, so
I'd prefer to extend the existing mechanism.  Adding 100 lines to
RuntimeLibcalls.def seems a bit unfortunate, but I think you can reduce that
using some C macros.

On the compiler-rt side, given the large number of functions, some CMake magic
might be appropriate.

-Eli

Pavel Iliin via llvm-dev

2020-Nov-10 14:24 UTC

Thank you for the useful advice.
Outline atomics are implemented in two patches:
https://reviews.llvm.org/D91156 - compiler-rt part
https://reviews.llvm.org/D91157 - llvm+clang part
I’ve tested them on an aarch64+lse Linux board (libc++ and libstdc++) and
checked compilation of Chromium and of a hacked AOSP build with an updated
libgcc that supports the outline atomics helpers.
(For full Android testing, libgcc should be updated to version 10 or replaced
by compiler-rt with my patch above.)
I would be very grateful for review and comments,
Pavel