thr3ads.net - llvm dev - [llvm-dev] [RFC][ARM] Add support for embedded position-independent code (ROPI/RWPI) [Dec 2015]

Oliver Stannard via llvm-dev

2015-Dec-04 13:46 UTC

[llvm-dev] [RFC][ARM] Add support for embedded position-independent code (ROPI/RWPI)

Hi,

We currently have a downstream patch (attached) which implements some new
addressing modes that enable position-independent code for small embedded
systems. Is this something that would be accepted upstream? I think the ARM
backend changes are fairly uncontroversial, but the clang changes introduce a
lot of ROPI/RWPI-specific changes in otherwise target-independent code. If the
clang changes are not acceptable, it is still possible to use just the ARM
backend changes (with a smaller clang patch for command-line options only), as
the C code which needs special lowering is rare, easy to work around and easy
for a linker to detect.

This patch (along with the corresponding ARM backend patch) adds support
for some new relocation models:
- Read-only position independence (ROPI): Code and read-only data are
  accessed PC-relative. The offsets between all code and RO data
  sections are known at static link time.
- Read-write position independence (RWPI): Read-write data is accessed
  relative to a static base register (R9). The offsets between all writable
  data sections are known at static link time.

These two modes are independent (they specify how different objects
should be addressed), so they can be used individually or together. They are
otherwise the same as the "static" relocation model, and are not compatible
with SysV-style PIC.
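
As a rough illustration of the RWPI idea only (not the code the backend
generates), the sketch below models the static base register as an ordinary
pointer and reaches writable data at fixed offsets from it. The struct, the
field names and the helper functions are all hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical layout of the writable data. The offsets of the fields
   are fixed at static link time; only the base address may move. */
struct rw_data {
    int counter;
    int limit;
};

/* Stand-in for the static base register (R9 in RWPI code). */
static unsigned char *static_base;

/* Simulate placing the RW region at a new address: only the base changes. */
static void set_static_base(void *region)
{
    static_base = region;
}

/* Load a field the way RWPI code would: static base plus a fixed offset. */
static int load_counter(void)
{
    int value;
    memcpy(&value, static_base + offsetof(struct rw_data, counter), sizeof value);
    return value;
}

/* Store to the same field through the static base. */
static void store_counter(int value)
{
    memcpy(static_base + offsetof(struct rw_data, counter), &value, sizeof value);
}
```

The accessors never contain an absolute address, so pointing the base at a
different copy of the data is enough to "relocate" every access.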

These modes are normally used by bare-metal systems or small real-time
operating systems. They are designed to avoid the need for a dynamic linker:
the only initialisation required is setting the static base register to an
appropriate value for RWPI code. They also minimise the size of the writable
portion of the executable, for systems with very limited RAM.

I have only added support to SelectionDAG, not FastISel, because FastISel is
currently disabled for bare-metal targets where these modes would be used.

On the clang side, the following command-line options are added:
  -fropi
  -frwpi
  -fropi-lowering
  -frwpi-lowering
The first two enable ROPI and RWPI modes, and the second two enable
lowering of static initialisers that are not compatible with ROPI/RWPI.
Most users will not need to use the second two options, as they are
turned on by default when the -fropi and -frwpi options are used. All of
these options have -fno-* equivalents.

In addition to passing the command-line options through to the backend,
clang must be changed to work around a limitation in these modes: since
there is no dynamic loader, if a variable is initialised to the address
of a global value, its initial value is not known at static link time.
For example:

  extern int i;
  int *p = &i; // Initial value unknown at static link time

SysV-style PIC solves this by having the dynamic linker fix up any
relocations on the data segment. Since these modes are trying to avoid
the need for a dynamic linker, we instead have the compiler emit code to
initialise these variables at startup time. These initialisers are
expected to be rare, so the dynamic initialisers will be smaller than
the equivalent dynamic linker plus relocation and symbol tables.
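
To make the lowering concrete, here is a hand-written approximation of what
a dynamic initialiser for the example could look like. It uses the GCC/clang
constructor attribute to run a function at startup; the function name is
hypothetical, and "i" is defined locally only so that the sketch links on
its own.

```c
/* Defined here only so the sketch is self-contained; in the example
   above, "i" is an extern declared in another translation unit. */
int i = 42;

/* Emitted without a static initialiser, in writable memory. */
int *p;

/* Hypothetical name for the compiler-generated startup initialiser.
   The address of "i" is computed at run time, once the code and data
   are at their final locations. */
__attribute__((constructor))
static void init_p(void)
{
    p = &i;
}
```

By the time main() runs, "p" holds the run-time address of "i", without any
data relocation having been processed by a loader.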

If a variable with an initialiser that needs lowering is declared with a
const-qualified type, we must emit it as a non-constant so that it gets
put into writable memory. I'm using the "externally_initialized" flag to
prevent the optimiser from being able to turn dynamic initialisers back
into static ones.

Making a variable non-const can cause a chain of variables to need
initialisers in RWPI-only mode. For example:

  extern int a;
  static int * const b = &a;
  static int * const * const c = &b;

Here, "c" looks like it does not need a dynamic init, because "b" is
declared const. However, "b" itself needs a dynamic init, so must be
made non-const, meaning that "c" now needs a dynamic init. My patch
handles this correctly, but there is a similar case where it does not:

  extern int a;
  static int * const b;
  static int * const * const c = &b;
  static int * const b = &a;

Due to the design of clang, the IR for "c" has already been emitted
(as a constant, with a static initialiser) when the initialiser for "b"
is parsed, making "c"'s initialiser wrong. I haven't been able to find a
good way to implement this properly, so for now I'm working around this
by enabling both the ROPI and RWPI lowering when in RWPI-only mode. This
means that "c" will be given a dynamic init, and making "b" non-constant
does not change anything.
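
For the first chain example (where "b" is initialised at its declaration),
the lowered result could be written by hand roughly as follows. This is a
sketch of the effect, not the IR the patch emits: the initialiser function
name and the read_through_c() helper are hypothetical, and "a" is defined
locally so the sketch is self-contained.

```c
/* Defined here only so the sketch is self-contained. */
int a = 5;

/* Originally: static int * const b = &a;
   It needs a dynamic init, so it loses its top-level const and is
   placed in writable memory. */
static int *b;

/* Originally: static int * const * const c = &b;
   "b" is now read-write data, so "c" needs a dynamic init as well. */
static int * const *c;

/* Hypothetical name for the generated startup initialiser. */
__attribute__((constructor))
static void init_b_and_c(void)
{
    b = &a;
    c = &b;
}

/* Read "a" through the chain c -> b -> a. */
static int read_through_c(void)
{
    return **c;
}
```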

I have added some new warnings for cases where an ABI mismatch between
two translation units could be caused by ROPI/RWPI. These are:
- Extern global variables with const-qualified incomplete types. These
  are assumed to be constant, but may be put in a writable section by
  the TU which defines them if they have a non-trivial constructor or
  mutable member.
- Externally-visible variables with const-qualified types, where
  initialiser lowering makes them non-const. Other translation units
  will not know that the lowering has happened, and access them as RO
  rather than RW data.

I have also prohibited using ROPI with C++ (the vtables and RTTI are
read-only data that must contain absolute pointers to other RO data),
but this can be overridden with -fallow-unsupported.

This also adds 3 new pre-defined macros for ARM targets:
  __APCS_FPIC
  __APCS_ROPI
  __APCS_RWPI

They are defined when building code with the -fpic, -fropi and -frwpi
options, respectively. __APCS_FPIC is also defined for AArch64 targets,
but the other two are not supported for AArch64. These macros are not
defined in the ACLE or any other standard; they are named to match the
macros defined by ARM Compiler 5.
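
A possible use of these macros in source code is sketched below. Only the
three macro names come from the patch; the pi_mode() helper and the strings
it returns are made up for illustration.

```c
/* Report which position-independence mode this translation unit was
   built with, based on the pre-defined macros described above. */
static const char *pi_mode(void)
{
#if defined(__APCS_ROPI) && defined(__APCS_RWPI)
    return "ropi+rwpi";
#elif defined(__APCS_ROPI)
    return "ropi";
#elif defined(__APCS_RWPI)
    return "rwpi";
#elif defined(__APCS_FPIC)
    return "fpic";
#else
    return "static";
#endif
}
```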
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ropi-rwpi-clang.patch
Type: application/octet-stream
Size: 69752 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20151204/e9c59bc8/attachment-0002.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ropi-rwpi-llvm.patch
Type: application/octet-stream
Size: 38205 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20151204/e9c59bc8/attachment-0003.obj>

Joerg Sonnenberger via llvm-dev

2015-Dec-04 15:39 UTC

On Fri, Dec 04, 2015 at 01:46:13PM -0000, Oliver Stannard via llvm-dev wrote:
> In addition to passing the command-line options through to the backend,
> clang must be changed to work around a limitation in these modes: since
> there is no dynamic loader, if a variable is initialised to the address
> of a global value, its initial value is not known at static link time.
> For example:
> 
>   extern int i;
>   int *p = &i; // Initial value unknown at static link time
> 
> SysV-style PIC solves this by having the dynamic linker fix up any
> relocations on the data segment. Since these modes are trying to avoid
> the need for a dynamic linker, we instead have the compiler emit code to
> initialise these variables at startup time. These initialisers are
> expected to be rare, so the dynamic initialisers will be smaller than
> the equivalent dynamic linker plus relocation and symbol tables.

You don't need a full-blown dynamic linker to handle that, just that the
linker creates output that can be appropriately referenced by the init
code. I don't think that dynamic initialisers will work correctly at
all, since you can access "i" in a separate module that doesn't know
about the initialiser at all.

Consider taking a look at how most dynamic linkers operate themselves in the
ELF world. One of the first things they do is relocate themselves by
processing their own relocation table and applying the fixups. This
doesn't involve symbol tables at all, just patching up addresses.
As such, I don't think such a transformation belongs in clang.
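
A minimal sketch of that kind of fixup loop, under simplifying assumptions:
the image is modelled as an array of words, each relocation entry is just
the index of a word holding a link-time address, and the function and
parameter names are hypothetical. Real ELF relocation records carry more
information than this.

```c
#include <stddef.h>
#include <stdint.h>

/* Patch every word listed in the relocation table by adding the
   difference between the link-time and run-time load addresses.
   No symbol lookup is involved. */
static void apply_relative_relocs(uintptr_t *image,
                                  const size_t *relocs,
                                  size_t count,
                                  uintptr_t load_bias)
{
    for (size_t k = 0; k < count; ++k)
        image[relocs[k]] += load_bias;
}
```

Words that are not listed in the table (plain data rather than addresses)
are left untouched.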

Joerg

Tim Northover via llvm-dev

2015-Dec-04 16:53 UTC

On 4 December 2015 at 05:46, Oliver Stannard via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> SysV-style PIC solves this by having the dynamic linker fix up any
> relocations on the data segment. Since these modes are trying to avoid
> the need for a dynamic linker, we instead have the compiler emit code to
> initialise these variables at startup time. These initialisers are
> expected to be rare, so the dynamic initialisers will be smaller than
> the equivalent dynamic linker plus relocation and symbol tables.

What does armcc do here? It's been a while but I thought it was part
of the scatter-loading initialisation, with some kind of compressed
representation in the final linked image.

Cheers.

Tim.

Oliver Stannard via llvm-dev

2015-Dec-04 17:41 UTC

> On 4 December 2015 at 05:46, Oliver Stannard via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> > SysV-style PIC solves this by having the dynamic linker fix up any
> > relocations on the data segment. Since these modes are trying to avoid
> > the need for a dynamic linker, we instead have the compiler emit code to
> > initialise these variables at startup time. These initialisers are
> > expected to be rare, so the dynamic initialisers will be smaller than
> > the equivalent dynamic linker plus relocation and symbol tables.
> 
> What does armcc do here? It's been a while but I thought it was part
> of the scatter-loading initialisation, with some kind of compressed
> representation in the final linked image.

armcc does the same thing as this patch: emit dynamic initialisers which get
called from .init_array at startup.

Oliver
