thr3ads.net - llvm dev - [llvm-dev] Proposal: arbitrary relocations in constant global initializers [Aug 2015]

Peter Collingbourne via llvm-dev

2015-Aug-26 19:41 UTC

[llvm-dev] Proposal: arbitrary relocations in constant global initializers

On Wed, Aug 26, 2015 at 11:49:46AM -0400, Rafael Espíndola wrote:
> This is pr10368.
> 
> Do we really need to support hard coded relocation numbers? Looks like
> the examples above have a representation as constant expressions:
> 
>  (sub (add (ptrtoint @foo)  0xeafffffe) cur_pos)
> 
> no?
I'm not sure if this would be sufficient. The R_ARM_JUMP24 relocation
on ARM has specific semantics to implement ARM/Thumb interworking; see
http://infocenter.arm.com/help/topic/com.arm.doc.ihi0044e/IHI0044E_aaelf.pdf
Note that R_ARM_CALL has the same operation but different semantics.
I suppose that we could try looking at the addend to decide which relocation
to use, but this would mean adding more complexity to the assembler (along
with any pattern matching that would need to be done). It seems simpler,
both conceptually and in the implementation, for the client to directly say
what it wants in the object file.
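
To make the relocation arithmetic concrete, here is a minimal Python sketch of how a linker might resolve R_ARM_JUMP24 against the placeholder word 0xeafffffe. It assumes a REL-style relocation whose addend is stored in the instruction itself, and it deliberately ignores the Thumb bit, veneers and PLT entries, which are exactly the parts that carry the interworking semantics. The function names are hypothetical and do not come from LLVM or any linker.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def apply_arm_jump24(insn, sym_addr, place):
    """Patch an ARM branch word for R_ARM_JUMP24 (ARM-state target only).

    Assumption: REL relocation, so the addend A is read from the imm24
    field. The Thumb bit T is treated as 0, so the result is S + A - P.
    """
    addend = sign_extend(insn & 0x00FFFFFF, 24) << 2  # addend in bytes
    offset = sym_addr + addend - place
    if offset % 4 != 0:
        raise ValueError("branch target is not word aligned")
    if not -(1 << 25) <= offset < (1 << 25):
        raise ValueError("branch target out of range")
    # Keep the condition and opcode bits, rewrite only the 24-bit field.
    return (insn & 0xFF000000) | ((offset >> 2) & 0x00FFFFFF)


PLACEHOLDER = 0xEAFFFFFE  # "b" with imm24 = -2, i.e. an addend of -8 bytes

print(hex(apply_arm_jump24(PLACEHOLDER, 0x8000, 0x8000)))  # branch to self
print(hex(apply_arm_jump24(PLACEHOLDER, 0x8100, 0x8000)))  # branch 0x100 bytes ahead
```

The addend of -8 compensates for the ARM program counter reading 8 bytes ahead of the branch. Note that the field holds the byte offset divided by four, masked to 24 bits.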

There's also the point that if @foo is defined outside the current linkage
unit, or refers to a Thumb function, the above expression in a constant
initializer would refer to the function's PLT entry or a shim, but in a
function it would refer to the function's actual address, so the evaluation
of this expression would depend on whether it was constant folded. (Although
on the other hand we might just declare that by using such a constant in a
global initializer that may be constant folded the client is asserting that
it doesn't care which address is used.)
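
As a toy illustration of the folding concern, the Python sketch below evaluates the quoted expression, exactly as written and modulo 2**32, with two different values for @foo: a PLT entry (as a static initializer would see it) and the function's real address (as code in a function body would see it). All addresses are invented for the example, and the helper name is hypothetical.

```python
MASK32 = 0xFFFFFFFF


def naive_branch_expr(foo_addr, cur_pos):
    """Evaluate (sub (add (ptrtoint @foo) 0xeafffffe) cur_pos) in 32 bits."""
    return (foo_addr + 0xEAFFFFFE - cur_pos) & MASK32


# Made-up addresses, for illustration only.
FOO_PLT = 0x00010400   # PLT entry the linker would use in an initializer
FOO_REAL = 0x00020000  # actual address of foo in another linkage unit
CUR_POS = 0x00010000   # location of the word being initialized

in_initializer = naive_branch_expr(FOO_PLT, CUR_POS)
in_function = naive_branch_expr(FOO_REAL, CUR_POS)

# The same IR-level expression yields two different values.
print(hex(in_initializer), hex(in_function))
```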
> Why do you need to be able to avoid them showing up in function
> bodies? It would be unusual but valid to pass the above value as an
> argument to a function.
This was part of the proposal mainly for the constant folding reasons mentioned
above, but if we did go with a reloc expression we'd need to encode the
original constant address in the reloc for PC-relative expressions, which
wouldn't be necessary if we disallow it.

Thanks,
-- 
Peter

Rafael Espíndola via llvm-dev

2015-Aug-26 19:53 UTC

> I'm not sure if this would be sufficient. The R_ARM_JUMP24 relocation
> on ARM has specific semantics to implement ARM/Thumb interworking; see
> http://infocenter.arm.com/help/topic/com.arm.doc.ihi0044e/IHI0044E_aaelf.pdf
> Note that R_ARM_CALL has the same operation but different semantics.
> I suppose that we could try looking at the addend to decide which relocation
> to use, but this would mean adding more complexity to the assembler (along
> with any pattern matching that would need to be done). It seems simpler,
> both conceptually and in the implementation, for the client to directly say
> what it wants in the object file.
>
> There's also the point that if @foo is defined outside the current linkage
> unit, or refers to a Thumb function, the above expression in a constant
> initializer would refer to the function's PLT entry or a shim, but in a
> function it would refer to the function's actual address, so the evaluation
> of this expression would depend on whether it was constant folded. (Although
> on the other hand we might just declare that by using such a constant in a
> global initializer that may be constant folded the client is asserting that
> it doesn't care which address is used.)
I am pretty sure there is a use for some target-specific expressions; my
concerns are:
* Using a target-specific expression when it could be represented in a
  target-independent way (possibly a bit more verbose).
* Using the raw relocation values instead of something like
  thumb_addr_delta. With a named expression, the semantics of each constant
  expression are still documented in the language reference.
>> Why do you need to be able to avoid them showing up in function
>> bodies? It would be unusual but valid to pass the above value as an
>> argument to a function.
>
> This was part of the proposal mainly for the constant folding reasons mentioned
> above, but if we did go with a reloc expression we'd need to encode the
> original constant address in the reloc for PC-relative expressions, which
> wouldn't be necessary if we disallow it.
Seems better to make it explicit IMHO.

BTW, about the assembly change: Please check what the binutils guys
think of it. We do have extensions, but it is nice to at least let
them know so that we don't end up with two independent solutions in
the future.

Cheers,
Rafael

Peter Collingbourne via llvm-dev

2015-Aug-26 22:29 UTC

On Wed, Aug 26, 2015 at 03:53:33PM -0400, Rafael Espíndola wrote:
> > I'm not sure if this would be sufficient. The R_ARM_JUMP24 relocation
> > on ARM has specific semantics to implement ARM/Thumb interworking; see
> > http://infocenter.arm.com/help/topic/com.arm.doc.ihi0044e/IHI0044E_aaelf.pdf
> > Note that R_ARM_CALL has the same operation but different semantics.
> > I suppose that we could try looking at the addend to decide which relocation
> > to use, but this would mean adding more complexity to the assembler (along
> > with any pattern matching that would need to be done). It seems simpler,
> > both conceptually and in the implementation, for the client to directly say
> > what it wants in the object file.
> >
> > There's also the point that if @foo is defined outside the current linkage
> > unit, or refers to a Thumb function, the above expression in a constant
> > initializer would refer to the function's PLT entry or a shim, but in a
> > function it would refer to the function's actual address, so the evaluation
> > of this expression would depend on whether it was constant folded. (Although
> > on the other hand we might just declare that by using such a constant in a
> > global initializer that may be constant folded the client is asserting that
> > it doesn't care which address is used.)
> 
> I am pretty sure there is use for some target specific expressions, my
> concerns are
> * Using a target specific expression when it could be represented in a
> target independent way (possibly a bit more verbose).
Well I don't think there's a target independent way to write an R_ARM_JUMP24
relocation, as there's no way to represent the PLT entry or interworking
shim in IR.
> * Using the raw relocation values, instead of something like
> thumb_addr_delta. With this the semantics of each constant expression
> are still documented in the language reference.
I guess there are two ways we can go here:

1) expose the raw relocation values
2) introduce new specific ConstantExpr subtypes for the target-specific things
we need

In this case I think we should do one or the other; I don't really think it's
worth adding a half measure of flexibility (e.g. providing a way to specify
the addend of an R_ARM_JUMP24 when it will pretty much always be the same).

I like option 1 because it's more general purpose and ultimately less of an
impedance mismatch between what the client wants and what appears in the
object file, and we can solve the documentation problem with reference to
the object file format documentation, but it would require our documentation
to depend on sometimes poorly documented object file formats.

Option 2 could look something like this (produces the same bytes as
"b some_label" in every object format when targeting ARM, or
"b.w some_label" when targeting Thumb):

i32 arm_b (void ()* @some_label)

and that would be easy to document on its own. The downside is that it's
pretty specific to my use case, but maybe that's ok.
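
For reference, here is a rough Python sketch of the bytes that "b some_label" and "b.w some_label" encode to once the distance to the label is known. It is only meant to show what a constant like arm_b would ultimately have to produce after linking. The function names are hypothetical, little-endian output is assumed, and nothing here reflects an actual LLVM implementation.

```python
import struct


def encode_arm_b(target, place):
    """Encode an unconditional ARM-state B located at `place`."""
    offset = target - (place + 8)  # ARM PC reads 8 bytes ahead
    if offset % 4 != 0 or not -(1 << 25) <= offset < (1 << 25):
        raise ValueError("target misaligned or out of range")
    word = 0xEA000000 | ((offset >> 2) & 0x00FFFFFF)
    return struct.pack("<I", word)


def encode_thumb_b_w(target, place):
    """Encode a 32-bit Thumb-2 B.W (encoding T4) located at `place`."""
    offset = target - (place + 4)  # Thumb PC reads 4 bytes ahead
    if offset % 2 != 0 or not -(1 << 24) <= offset < (1 << 24):
        raise ValueError("target misaligned or out of range")
    imm = (offset >> 1) & 0xFFFFFF  # bit layout: S:I1:I2:imm10:imm11
    s = (imm >> 23) & 1
    i1 = (imm >> 22) & 1
    i2 = (imm >> 21) & 1
    imm10 = (imm >> 11) & 0x3FF
    imm11 = imm & 0x7FF
    j1 = (i1 ^ 1) ^ s  # J1 = NOT(I1) XOR S
    j2 = (i2 ^ 1) ^ s  # J2 = NOT(I2) XOR S
    hw1 = 0xF000 | (s << 10) | imm10
    hw2 = 0x9000 | (j1 << 13) | (j2 << 11) | imm11
    return struct.pack("<HH", hw1, hw2)


# A branch to itself in each instruction set.
print(encode_arm_b(0x8000, 0x8000).hex())      # feffffea
print(encode_thumb_b_w(0x8000, 0x8000).hex())  # fff7febf
```

In an object file the field would normally hold a placeholder plus a relocation rather than a final offset; the sketch only covers the fully resolved case.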

Option 2 seems like it would be less implementation work, and doesn't require
any changes to the assembly format (and could ultimately be upgraded to
option 1 later if needed), so maybe it's best to start with that.
> >> Why do you need to be able to avoid them showing up in function
> >> bodies? It would be unusual but valid to pass the above value as an
> >> argument to a function.
> >
> > This was part of the proposal mainly for the constant folding reasons mentioned
> > above, but if we did go with a reloc expression we'd need to encode the
> > original constant address in the reloc for PC-relative expressions, which
> > wouldn't be necessary if we disallow it.
> 
> Seems better to make it explicit IMHO.
Okay, but if we do introduce a new constant kind, there doesn't seem to be
much point in teaching the backend to lower it in a function, other than
for completeness. If we can avoid having to do that, that seems preferable.
> BTW, about the assembly change: Please check what the binutils guys
> think of it. We do have extensions, but it is nice to at least let
> them know so that we don't end up with two independent solutions in
> the future.
Yes, if I ultimately go with option 1.

Thanks,
-- 
Peter
