Peter Collingbourne via llvm-dev
2016-Oct-07 22:11 UTC
[llvm-dev] Proposal: arbitrary relocations in constant global initializers
On Fri, Oct 7, 2016 at 2:48 PM, Evgenii Stepanov <eugeni.stepanov at gmail.com> wrote:

> On Fri, Oct 7, 2016 at 2:28 PM, Peter Collingbourne <peter at pcc.me.uk> wrote:
> > On Fri, Oct 7, 2016 at 1:55 PM, Evgenii Stepanov <eugeni.stepanov at gmail.com> wrote:
> >> On Fri, Oct 7, 2016 at 1:22 PM, Peter Collingbourne <peter at pcc.me.uk> wrote:
> >> > On Fri, Oct 7, 2016 at 12:20 PM, Evgenii Stepanov <eugeni.stepanov at gmail.com> wrote:
> >> >> I've tried implementing some of the alternatives mentioned in this
> >> >> thread, and so far I like this syntax the most:
> >> >>
> >> >>   i32 reloc (29, void ()* @f, 3925868544)
> >> >>   ; 29 = 0x1d = R_ARM_JUMP24
> >> >>   ; 3925868544 = 0xea000000
> >> >>
> >> >> Note the zeroes in the relocated data instead of 0xfffffe in the
> >> >> original proposal. This is aligned with the way LLVM emits
> >> >> relocations in the backend, and avoids encoding the addend in a
> >> >> relocation-specific way in the IR.
> >> >
> >> > I am confused by this statement. If the zeros aren't what appear in
> >> > the object file, it seems rather relocation specific to me.
> >>
> >> These bytes will always be zeroes, which makes them not relocation
> >> specific. Object file contents, on the other hand, are relocation
> >> specific. In particular the constant 0xfffffe is ARM_JUMP24 encoding
> >> for zero offset (from the start of the current instruction).
> >>
> >> Somehow I find this IR representation very natural - you've got data
> >> bytes for anything that's not relocated, and the target expression
> >> (possibly including addend).
> >
> > My point is that the addend mangling between the IR and the object file
> > would be relocation specific.
> >
> > What happens if I want to start using some new type of relocation? Will
> > I need to teach the MC layer about it?
>
> Yes. MC needs to know if it is pc-relative or not, at least.

Why?

> What's the benefit in bypassing MC completely?

To reduce complexity. Rather than teaching MC about every relocation to be
used with reloc, you can just teach the component that produces the reloc.

> >> >> Instead, the addend can be specified in the second argument with
> >> >> the regular IR expressions, like the following:
> >> >>
> >> >>   @w = internal global [3 x i32]
> >> >>     [i32 reloc (29, void ()* @f, 3925868544),
> >> >>      i32 reloc (29, [3 x i32]* @w, 3925868544),
> >> >>      i32 reloc (29, i32* getelementptr (i32, i32* bitcast ([3 x i32]*
> >> >>        @w to i32*), i32 1), 3925868544)
> >> >>     ], align 4
> >> >>
> >> >> we also get relocations for elements 1 and 2 of @w optimized out for
> >> >> free. If the "addend" (i.e. the third arg of reloc) was specified as
> >> >> 0xeafffffe, the backend would have had to decode this value first.
> >> >
> >> > I think it may be ok to allow non-global constants as the second
> >> > operand (the utility of this feature being the ability to freely RAUW
> >> > a global without worrying about reloc constants).
> >> >
> >> > This doesn't necessarily need to act as an alternative means of
> >> > specifying an addend, though. Instead, the backend could synthesise
> >> > local symbols to act as relocation targets. For example, your example
> >> > would conceptually translate to:
> >> >
> >> >   @w = internal global [3 x i32]
> >> >     [i32 reloc (29, void ()* @f, 3925868544),
> >> >      i32 reloc (29, [3 x i32]* @w, 3925868544),
> >> >      i32 reloc (29, i32* @dummy, 3925868544)]
> >> >
> >> >   @dummy = internal alias i32* getelementptr (i32, i32* bitcast ([3 x
> >> >     i32]* @w to i32*), i32 1)
> >> >
> >> > This way, you save yourself from needing to worry about manipulating
> >> > addends in the backend, the linker will take care of it for you.
> >>
> >> That's no worry at all, AsmPrinter::lowerConstant evaluates both
> >> constant expressions to MCExpr: w + 4.
> >
> > But you still need to worry about how "w + 4" is represented in the
> > object file.
>
> It's a relocation with target "w" and addend "4".

Someone needs to implement how to apply the addend 4 to the addend
0xea000000. That's what I meant by manipulating addends. You could do that
by relying on MC to do it (your proposal), or you can rely on the linker to
do it (my proposal).

> With my proposal, the frontend/middleend controls section data
> indirectly, meaning the actual final section data does not appear as
> an IR constant, but we can still get whatever constant we want. On the
> other hand, this representation is better for optimizations (instead
> of a magic constant 0xfffffe you have a transparent expression w+4).
>
> To optimize 0xfffffe representation, the backend would need to decode
> the constant, which is the new code that has to be written for any
> relocation you'd like to use in reloc(). And if we allow such
> optimizations, we may end up with section bytes that are different
> from the reloc() constant anyway.

Reloc constants are not meant to be "optimized", they are a means of
communicating from the compiler to the linker.

> Note that either way the frontend/middleend knows the size of the
> relocated object (jump table entry).
>
> >> Do you suggest we use this to limit "reloc" to accepting only
> >> GlobalValue as the second argument instead of an arbitrary Constant?
> >
> > No, we would accept your example and conceptually translate it into my
> > example.
> >
> >> >> On the other hand, it is possible for a constant expression in the
> >> >> IR to be lowered to something that is not a valid relocation target,
> >> >> and it is hard to detect this problem at the IR level.
> >> >
> >> > Right, this is of course a problem we already have for aliasees and
> >> > constant initializers.
> >> >
> >> >> Also, separating the addend from the section data allows the backend
> >> >> to choose between .rel and .rela representations.
> >> >
> >> > Do you have an example of a rela relocation which uses both r_addend
> >> > and the underlying value in the object file?
> >>
> >> The point of .rela is to allow addends that do not fit into the
> >> underlying value. Such addends can not be expressed as the third
> >> argument of reloc(), either. And IMHO the middleend should not worry
> >> about such details.
> >
> > Something has to worry about them at some point. If a frontend/pass is
> > creating relocations, then it will need to know at least vaguely which
> > addend it wants. If that's the case, we can make it the single component
> > responsible for worrying about the whole addend, rather than the
> > responsibility being diffuse over a number of components.
> >
> > Regarding width, I believe that no object format we support uses an
> > addend width wider than 64 bits, so we can just use a uint64_t.
>
> You mean as a fourth argument to reloc()?

No, I mean the third argument.

Peter

> > Peter
> >> > Peter
> >> >>
> >> >> On Wed, Aug 26, 2015 at 3:29 PM, Peter Collingbourne via llvm-dev
> >> >> <llvm-dev at lists.llvm.org> wrote:
> >> >> > On Wed, Aug 26, 2015 at 03:53:33PM -0400, Rafael Espíndola wrote:
> >> >> >> > I'm not sure if this would be sufficient. The R_ARM_JUMP24
> >> >> >> > relocation on ARM has specific semantics to implement ARM/Thumb
> >> >> >> > interworking; see
> >> >> >> >
> >> >> >> > http://infocenter.arm.com/help/topic/com.arm.doc.ihi0044e/IHI0044E_aaelf.pdf
> >> >> >> >
> >> >> >> > Note that R_ARM_CALL has the same operation but different
> >> >> >> > semantics.
> >> >> >> > I suppose that we could try looking at the addend to decide
> >> >> >> > which relocation to use, but this would mean adding more
> >> >> >> > complexity to the assembler (along with any pattern matching
> >> >> >> > that would need to be done). It seems simpler, both conceptually
> >> >> >> > and in the implementation, for the client to directly say what
> >> >> >> > it wants in the object file.
> >> >> >> >
> >> >> >> > There's also the point that if @foo is defined outside the
> >> >> >> > current linkage unit, or refers to a Thumb function, the above
> >> >> >> > expression in a constant initializer would refer to the
> >> >> >> > function's PLT entry or a shim, but in a function it would refer
> >> >> >> > to the function's actual address, so the evaluation of this
> >> >> >> > expression would depend on whether it was constant folded.
> >> >> >> > (Although on the other hand we might just declare that by using
> >> >> >> > such a constant in a global initializer that may be constant
> >> >> >> > folded the client is asserting that it doesn't care which
> >> >> >> > address is used.)
> >> >> >>
> >> >> >> I am pretty sure there is use for some target specific
> >> >> >> expressions, my concerns are
> >> >> >> * Using a target specific expression when it could be represented
> >> >> >>   in a target independent way (possibly a bit more verbose).
> >> >> >
> >> >> > Well I don't think there's a target independent way to write an
> >> >> > R_ARM_JUMP24 relocation, as there's no way to represent the PLT
> >> >> > entry or interworking shim in IR.
> >> >> >
> >> >> >> * Using the raw relocation values, instead of something like
> >> >> >>   thumb_addr_delta.
> >> >> >>   With this the semantics of each constant expression are still
> >> >> >>   documented in the language reference.
> >> >> >
> >> >> > I guess there are two ways we can go here:
> >> >> >
> >> >> > 1) expose the raw relocation values
> >> >> > 2) introduce new specific ConstantExpr subtypes for the
> >> >> >    target-specific things we need
> >> >> >
> >> >> > In this case I think we should do one or the other, I don't really
> >> >> > think it's worth adding a half measure of flexibility (e.g.
> >> >> > providing a way to specify the addend of a R_ARM_JUMP24 when it
> >> >> > will pretty much always be the same).
> >> >> >
> >> >> > I like option 1 because it's more general purpose and ultimately
> >> >> > less of an impedance mismatch between what the client wants and
> >> >> > what appears in the object file, and we can solve the documentation
> >> >> > problem with reference to the object file format documentation, but
> >> >> > it would require our documentation to depend on sometimes poorly
> >> >> > documented object file formats.
> >> >> >
> >> >> > Option 2 could look something like this (produces the same bytes as
> >> >> > "b some_label" in every object format when targeting ARM, or "b.w
> >> >> > some_label" when targeting Thumb):
> >> >> >
> >> >> >   i32 arm_b (void ()* @some_label)
> >> >> >
> >> >> > and that would be easy to document on its own. The downside is that
> >> >> > it's pretty specific to my use case, but maybe that's ok.
> >> >> >
> >> >> > 2 seems like it would be less implementation work, and doesn't
> >> >> > require any changes to the assembly format (and ultimately could be
> >> >> > upgraded to 1 later if needed), so maybe it's best to start with
> >> >> > that.
> >> >> >
> >> >> >> >> Why do you need to be able to avoid them showing up in function
> >> >> >> >> bodies?
> >> >> >> >> It would be unusual but valid to pass the above value as an
> >> >> >> >> argument to a function.
> >> >> >> >
> >> >> >> > This was part of the proposal mainly for the constant folding
> >> >> >> > reasons mentioned above, but if we did go with a reloc
> >> >> >> > expression we'd need to encode the original constant address in
> >> >> >> > the reloc for PC-relative expressions, which wouldn't be
> >> >> >> > necessary if we disallow it.
> >> >> >>
> >> >> >> Seems better to make it explicit IMHO.
> >> >> >
> >> >> > Okay, but if we do introduce a new constant kind, there doesn't
> >> >> > seem to be much point in teaching the backend to lower it in a
> >> >> > function, other than for completeness. If we can avoid having to do
> >> >> > that, that seems preferable.
> >> >> >
> >> >> >> BTW, about the assembly change: Please check what the binutils
> >> >> >> guys think of it. We do have extensions, but it is nice to at
> >> >> >> least let them know so that we don't end up with two independent
> >> >> >> solutions in the future.
> >> >> >
> >> >> > Yes if I ultimately go with 1.
> >> >> >
> >> >> > Thanks,
> >> >> > --
> >> >> > Peter
> >> >
> >> > --
> >> > Peter
> >
> > --
> > Peter

--
Peter
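The two constants argued over in this message, 0xea000000 and 0xeafffffe, differ only in the 24-bit immediate field of an ARM `b` instruction. The Python sketch below models that arithmetic. It is an illustration rather than LLVM or linker code: the function names are hypothetical, and it assumes ARM-state encoding, a REL-style in-place addend, and no Thumb interworking or range checks.

```python
# Toy model of the R_ARM_JUMP24 field arithmetic (names are hypothetical).
R_ARM_JUMP24 = 29         # 0x1d, as noted in the thread
B_OPCODE = 0xEA000000     # "b" with condition AL and a zeroed imm24 field

def encode_branch(offset_from_insn: int) -> int:
    """Encode an ARM-state `b` to a target offset_from_insn bytes away."""
    # In ARM state the PC reads as the instruction address plus 8.
    imm24 = ((offset_from_insn - 8) >> 2) & 0xFFFFFF
    return B_OPCODE | imm24

def implicit_addend(word: int) -> int:
    """Recover the REL-style addend stored in the imm24 field."""
    imm24 = word & 0xFFFFFF
    if imm24 & 0x800000:          # sign-extend the 24-bit field
        imm24 -= 1 << 24
    return imm24 << 2

def apply_rel_jump24(word: int, sym_addr: int, place: int) -> int:
    """Resolve (S + A) - P, ignoring the Thumb bit and overflow checks."""
    addend = implicit_addend(word)
    imm24 = ((sym_addr + addend - place) >> 2) & 0xFFFFFF
    return (word & 0xFF000000) | imm24

print(hex(encode_branch(0)))            # branch-to-self
print(implicit_addend(0xEAFFFFFE))      # addend carried by 0xfffffe
print(implicit_addend(0xEA000000))      # addend carried by the zeroed field
```

Under these assumptions, a branch to the instruction's own address encodes as 0xeafffffe because the stored field carries the -8 PC bias, while 0xea000000 carries an addend of zero.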
Evgenii Stepanov via llvm-dev
2016-Oct-07 22:24 UTC
[llvm-dev] Proposal: arbitrary relocations in constant global initializers
On Fri, Oct 7, 2016 at 3:11 PM, Peter Collingbourne <peter at pcc.me.uk> wrote:

>> It's a relocation with target "w" and addend "4".
>
> Someone needs to implement how to apply the addend 4 to the addend
> 0xea000000. That's what I meant by manipulating addends. You could do that
> by relying on MC to do it (your proposal), or you can rely on the linker to
> do it (my proposal).

It's already implemented in the same code that emits regular branches
on ARM (and 0xea000000 is not an addend; it's section data with space
(zeroes) for the target- and relocation-specific encoding of the
addend).

>> With my proposal, the frontend/middleend controls section data
>> indirectly, meaning the actual final section data does not appear as
>> an IR constant, but we can still get whatever constant we want. On the
>> other hand, this representation is better for optimizations (instead
>> of a magic constant 0xfffffe you have a transparent expression w+4).
>>
>> To optimize 0xfffffe representation, the backend would need to decode
>> the constant, which is the new code that has to be written for any
>> relocation you'd like to use in reloc(). And if we allow such
>> optimizations, we may end up with section bytes that are different
>> from the reloc() constant anyway.
>
> Reloc constants are not meant to be "optimized", they are a means of
> communicating from the compiler to the linker.

Why not? We can have both. We still control output bytes pretty well,
and we get a nice optimization in the case when jump offset can be
calculated by the compiler and the relocation can be omitted.

>> > Regarding width, I believe that no object format we support uses an
>> > addend width wider than 64 bits, so we can just use a uint64_t.
>>
>> You mean as a fourth argument to reloc()?
>
> No, I mean the third argument.

The third argument is not an addend. It's section data. For
R_ARM_JUMP24, for example, the relocation size is 3 bytes, without
0xea. Rela exists exactly for the case where both can not fit in a
fixed size space.
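The .rel versus .rela distinction raised in this message can be made concrete with the ELF32 record layouts. The Python sketch below is only an illustration: the helper names are hypothetical, it assumes little-endian ELF32 records, and it assumes the usual ARM convention in which the -8 PC bias is folded into the addend. It shows that a REL record has no addend slot, so the addend has to fit in the 24-bit field of the section word, while a RELA record carries a separate 32-bit r_addend.

```python
import struct

R_ARM_JUMP24 = 29  # 0x1d, as noted in the thread

def elf32_r_info(sym_index: int, rel_type: int) -> int:
    """ELF32 r_info packs the symbol index above an 8-bit relocation type."""
    return (sym_index << 8) | (rel_type & 0xFF)

def pack_rel(r_offset: int, sym_index: int, rel_type: int) -> bytes:
    """Elf32_Rel: r_offset, r_info. The addend lives in the section data."""
    return struct.pack("<II", r_offset, elf32_r_info(sym_index, rel_type))

def pack_rela(r_offset: int, sym_index: int, rel_type: int,
              r_addend: int) -> bytes:
    """Elf32_Rela: r_offset, r_info, plus an explicit signed r_addend."""
    return struct.pack("<IIi", r_offset,
                       elf32_r_info(sym_index, rel_type), r_addend)

def fold_addend_into_word(section_data: int, addend: int) -> int:
    """Store a REL addend in imm24; fail if it cannot be represented."""
    if addend % 4 != 0 or not -(1 << 25) <= addend < (1 << 25):
        raise ValueError("addend does not fit the 24-bit branch field")
    return (section_data & 0xFF000000) | ((addend >> 2) & 0xFFFFFF)

# "w + 4" with the -8 PC bias folded in gives an addend of -4.
rel_word = fold_addend_into_word(0xEA000000, 4 - 8)
print(hex(rel_word), len(pack_rel(4, 1, R_ARM_JUMP24)),
      len(pack_rela(4, 1, R_ARM_JUMP24, 4 - 8)))
```

In this model the top byte (0xea) is untouched section data and only the low three bytes are available to the relocation, which matches the "3 bytes, without 0xea" remark above.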
Peter Collingbourne via llvm-dev
2016-Oct-07 22:44 UTC
[llvm-dev] Proposal: arbitrary relocations in constant global initializers
On Fri, Oct 7, 2016 at 3:24 PM, Evgenii Stepanov <eugeni.stepanov at gmail.com> wrote:> On Fri, Oct 7, 2016 at 3:11 PM, Peter Collingbourne <peter at pcc.me.uk> > wrote: > > > > > > On Fri, Oct 7, 2016 at 2:48 PM, Evgenii Stepanov < > eugeni.stepanov at gmail.com> > > wrote: > >> > >> On Fri, Oct 7, 2016 at 2:28 PM, Peter Collingbourne <peter at pcc.me.uk> > >> wrote: > >> > On Fri, Oct 7, 2016 at 1:55 PM, Evgenii Stepanov > >> > <eugeni.stepanov at gmail.com> > >> > wrote: > >> >> > >> >> On Fri, Oct 7, 2016 at 1:22 PM, Peter Collingbourne <peter at pcc.me.uk > > > >> >> wrote: > >> >> > On Fri, Oct 7, 2016 at 12:20 PM, Evgenii Stepanov > >> >> > <eugeni.stepanov at gmail.com> wrote: > >> >> >> > >> >> >> I've tried implementing some of the alternatives mentioned in this > >> >> >> thread, and so far I like this syntax the most: > >> >> >> > >> >> >> i32 reloc (29, void ()* @f, 3925868544) > >> >> >> ; 29 = 0x1d = R_ARM_JUMP24 > >> >> >> ; 3925868544 = 0xea000000 > >> >> >> > >> >> >> Note the zeroes in the relocated data instead of 0xfffffe in the > >> >> >> original proposal. This is aligned with the way LLVM emits > >> >> >> relocations > >> >> >> in the backend, and avoids encoding the addend in a > >> >> >> relocation-specific way in the IR. > >> >> > > >> >> > > >> >> > I am confused by this statement. If the zeros aren't what appear in > >> >> > the > >> >> > object file, it seems rather relocation specific to me. > >> >> > >> >> These bytes will always be zeroes, which makes them not relocation > >> >> specific. > >> >> Object file contents, on the other hand, are relocation specific. In > >> >> particular the constant 0xfffffe is ARM_JUMP24 encoding for zero > >> >> offset (from the start of the current instruction). > >> >> > >> >> Somehow I find this IR representation very natural - you've got data > >> >> bytes for anything that's not relocated, and the target expression > >> >> (possibly including addend). 
> >> > > >> > > >> > My point is that the addend mangling between the IR and the object > file > >> > would be relocation specific. > >> > > >> > What happens if I want to start using some new type of relocation? > Will > >> > I > >> > need to teach the MC layer about it? > >> > >> Yes. MC needs to know if it is pc-relative or not, at least. > > > > > > Why? > > > >> > >> What's > >> the benefit in bypassing MC completely? > > > > > > To reduce complexity. Rather than teaching MC about every relocation to > be > > used with reloc, you can just teach the component that produces the > reloc. > > > >> > > >> >> > > >> >> >> > >> >> >> Instead, the addend can be > >> >> >> specified in the second argument with the regular IR expressions, > >> >> >> like > >> >> >> the following: > >> >> >> > >> >> >> @w = internal global [3 x i32] > >> >> >> [i32 reloc (29, void ()* @f, 3925868544), > >> >> >> i32 reloc (29, [3 x i32]* @w, 3925868544), > >> >> >> i32 reloc (29, i32* getelementptr (i32, i32* bitcast ([3 x > i32]* > >> >> >> @w to i32*), i32 1), 3925868544) > >> >> >> ], align 4 > >> >> >> > >> >> >> > >> >> >> > >> >> >> we also get relocations for elements 1 and 2 of @w optimized out > for > >> >> >> > >> >> >> free. If the "addend" (i.e. the third arg of reloc) was specified > as > >> >> >> > >> >> >> 0xeafffffe, the backend would have had to decode this value first. > >> >> > > >> >> > > >> >> > I think it may be ok to allow non-global constants as the second > >> >> > operand > >> >> > (the utility of this feature being the ability to freely RAUW a > >> >> > global > >> >> > without worrying about reloc constants). > >> >> > > >> >> > This doesn't necessarily need to act as an alternative means of > >> >> > specifying > >> >> > an addend, though. Instead, the backend could synthesise local > >> >> > symbols > >> >> > to > >> >> > act as relocation targets. 
> >> >> > For example, your example would conceptually translate to:
> >> >> >
> >> >> > @w = internal global [3 x i32]
> >> >> > [i32 reloc (29, void ()* @f, 3925868544),
> >> >> > i32 reloc (29, [3 x i32]* @w, 3925868544),
> >> >> > i32 reloc (29, i32* @dummy, 3925868544)]
> >> >> >
> >> >> > @dummy = internal alias i32* getelementptr (i32, i32* bitcast ([3 x i32]* @w
> >> >> > to i32*), i32 1)
> >> >> >
> >> >> > This way, you save yourself from needing to worry about manipulating addends
> >> >> > in the backend, the linker will take care of it for you.
> >> >>
> >> >> That's no worry at all, AsmPrinter::lowerConstant evaluates both
> >> >> constant expressions to MCExpr: w + 4.
> >> >
> >> >
> >> > But you still need to worry about how "w + 4" is represented in the object
> >> > file.
> >>
> >> It's a relocation with target "w" and addend "4".
> >
> >
> > Someone needs to implement how to apply the addend 4 to the addend
> > 0xea000000. That's what I meant by manipulating addends. You could do that
> > by relying on MC to do it (your proposal), or you can rely on the linker to
> > do it (my proposal).
>
> It's already implemented in the same code that emits regular branches
> on ARM (and 0xea000000 is not an addend; it's section data with space
> (zeroes) for the target- and relocation-specific encoding of the
> addend).

Okay, in your proposal it's "section data".

> >> With my proposal, the frontend/middleend controls section data
> >> indirectly, meaning the actual final section data does not appear as
> >> an IR constant, but we can still get whatever constant we want. On the
> >> other hand, this representation is better for optimizations (instead
> >> of a magic constant 0xfffffe you have a transparent expression w+4).
> >>
> >> To optimize 0xfffffe representation, the backend would need to decode
> >> the constant, which is the new code that has to be written for any
> >> relocation you'd like to use in reloc(). And if we allow such
> >> optimizations, we may end up with section bytes that are different
> >> from the reloc() constant anyway.
> >
> >
> > Reloc constants are not meant to be "optimized", they are a means of
> > communicating from the compiler to the linker.
>
> Why not? We can have both. We still control output bytes pretty well,
> and we get a nice optimization in the case when jump offset can be
> calculated by the compiler and the relocation can be omitted.

That optimization is at the cost of complexity in MC, and it doesn't really matter in the end because (1) these constants are rare and (2) the final linked executable or DSO will have resolved the relocations anyway.

> >> Note that either way the frontend/middleend knows the size of the
> >> relocated object (jump table entry).
> >>
> >> >
> >> >>
> >> >> Do you suggest we use this to limit "reloc" to accepting only
> >> >> GlobalValue as the second argument instead of an arbitrary Constant?
> >> >
> >> >
> >> > No, we would accept your example and conceptually translate it into my
> >> > example.
> >> >>
> >> >>
> >> >> >
> >> >> >> On the other hand, it is possible for a constant expression in the IR
> >> >> >> to be lowered to something that is not a valid relocation target, and
> >> >> >> it is hard to detect this problem at the IR level.
> >> >> >
> >> >> >
> >> >> > Right, this is of course a problem we already have for aliasees and
> >> >> > constant initializers.
> >> >> >
> >> >> >>
> >> >> >> Also, separating the addend from the section data allows the backend
> >> >> >> to choose between .rel and .rela representations.
> >> >> >
> >> >> >
> >> >> > Do you have an example of a rela relocation which uses both r_addend and
> >> >> > the underlying value in the object file?
> >> >>
> >> >> The point of .rela is to allow addends that do not fit into the
> >> >> underlying value. Such addends can not be expressed as the third
> >> >> argument of reloc(), either. And IMHO the middleend should not worry
> >> >> about such details.
> >> >
> >> >
> >> > Something has to worry about them at some point. If a frontend/pass is
> >> > creating relocations, then it will need to know at least vaguely which
> >> > addend it wants. If that's the case, we can make it the single component
> >> > responsible for worrying about the whole addend, rather than the
> >> > responsibility being diffuse over a number of components.
> >> >
> >> > Regarding width, I believe that no object format we support uses an addend
> >> > width wider than 64 bits, so we can just use a uint64_t.
> >>
> >> You mean as a fourth argument to reloc()?
> >
> >
> > No, I mean the third argument.
>
> The third argument is not an addend. It's section data. For
> R_ARM_JUMP24, for example, the relocation size is 3 bytes, without
> 0xea. Rela exists exactly for the case where both can not fit in a
> fixed size space.

My proposal would be to store it inline or in r_addend. That makes it an addend (I know that it isn't always added to directly, but that's what ELF calls it).

Peter

> >
> > Peter
> >
> >>
> >> >
> >> > Peter
> >> >>
> >> >> >
> >> >> > Peter
> >> >> >
> >> >> >>
> >> >> >> On Wed, Aug 26, 2015 at 3:29 PM, Peter Collingbourne via llvm-dev
> >> >> >> <llvm-dev at lists.llvm.org> wrote:
> >> >> >> > On Wed, Aug 26, 2015 at 03:53:33PM -0400, Rafael Espíndola wrote:
> >> >> >> >> > I'm not sure if this would be sufficient.
> >> >> >> >> > The R_ARM_JUMP24 relocation on ARM has specific semantics to
> >> >> >> >> > implement ARM/Thumb interworking; see
> >> >> >> >> >
> >> >> >> >> > http://infocenter.arm.com/help/topic/com.arm.doc.ihi0044e/IHI0044E_aaelf.pdf
> >> >> >> >> > Note that R_ARM_CALL has the same operation but different semantics.
> >> >> >> >> > I suppose that we could try looking at the addend to decide which
> >> >> >> >> > relocation to use, but this would mean adding more complexity to the
> >> >> >> >> > assembler (along with any pattern matching that would need to be done).
> >> >> >> >> > It seems simpler, both conceptually and in the implementation, for the
> >> >> >> >> > client to directly say what it wants in the object file.
> >> >> >> >> >
> >> >> >> >> > There's also the point that if @foo is defined outside the current
> >> >> >> >> > linkage unit, or refers to a Thumb function, the above expression in a
> >> >> >> >> > constant initializer would refer to the function's PLT entry or a shim,
> >> >> >> >> > but in a function it would refer to the function's actual address, so
> >> >> >> >> > the evaluation of this expression would depend on whether it was
> >> >> >> >> > constant folded. (Although on the other hand we might just declare that
> >> >> >> >> > by using such a constant in a global initializer that may be constant
> >> >> >> >> > folded the client is asserting that it doesn't care which address is
> >> >> >> >> > used.)
> >> >> >> >>
> >> >> >> >> I am pretty sure there is use for some target specific expressions,
> >> >> >> >> my concerns are
> >> >> >> >> * Using a target specific expression when it could be represented in a
> >> >> >> >> target independent way (possibly a bit more verbose).
> >> >> >> >
> >> >> >> > Well I don't think there's a target independent way to write an
> >> >> >> > R_ARM_JUMP24 relocation, as there's no way to represent the PLT entry or
> >> >> >> > interworking shim in IR.
> >> >> >> >
> >> >> >> >> * Using the raw relocation values, instead of something like
> >> >> >> >> thumb_addr_delta. With this the semantics of each constant expression
> >> >> >> >> are still documented in the language reference.
> >> >> >> >
> >> >> >> > I guess there are two ways we can go here:
> >> >> >> >
> >> >> >> > 1) expose the raw relocation values
> >> >> >> > 2) introduce new specific ConstantExpr subtypes for the target-specific
> >> >> >> > things we need
> >> >> >> >
> >> >> >> > In this case I think we should do one or the other, I don't really think
> >> >> >> > it's worth adding a half measure of flexibility (e.g. providing a way to
> >> >> >> > specify the addend of a R_ARM_JUMP24 when it will pretty much always be
> >> >> >> > the same).
> >> >> >> >
> >> >> >> > I like option 1 because it's more general purpose and ultimately less of
> >> >> >> > an impedance mismatch between what the client wants and what appears in
> >> >> >> > the object file, and we can solve the documentation problem with
> >> >> >> > reference to the object file format documentation, but it would require
> >> >> >> > our documentation to depend on sometimes poorly documented object file
> >> >> >> > formats.
> >> >> >> >
> >> >> >> > Option 2 could look something like this (produces the same bytes as
> >> >> >> > "b some_label" in every object format when targeting ARM, or
> >> >> >> > "b.w some_label" when targeting Thumb):
> >> >> >> >
> >> >> >> > i32 arm_b (void ()* @some_label)
> >> >> >> >
> >> >> >> > and that would be easy to document on its own. The downside is that it's
> >> >> >> > pretty specific to my use case, but maybe that's ok.
> >> >> >> >
> >> >> >> > 2 seems like it would be less implementation work, and doesn't require
> >> >> >> > any changes to the assembly format (and ultimately could be upgraded to 1
> >> >> >> > later if needed), so maybe it's best to start with that.
> >> >> >> >
> >> >> >> >> >> Why do you need to be able to avoid them showing up in function
> >> >> >> >> >> bodies? It would be unusual but valid to pass the above value as an
> >> >> >> >> >> argument to a function.
> >> >> >> >> >
> >> >> >> >> > This was part of the proposal mainly for the constant folding reasons
> >> >> >> >> > mentioned above, but if we did go with a reloc expression we'd need to
> >> >> >> >> > encode the original constant address in the reloc for PC-relative
> >> >> >> >> > expressions, which wouldn't be necessary if we disallow it.
> >> >> >> >>
> >> >> >> >> Seems better to make it explicit IMHO.
> >> >> >> >
> >> >> >> > Okay, but if we do introduce a new constant kind, there doesn't seem to
> >> >> >> > be much point in teaching the backend to lower it in a function, other
> >> >> >> > than for completeness. If we can avoid having to do that, that seems
> >> >> >> > preferable.
> >> >> >> >
> >> >> >> >> BTW, about the assembly change: Please check what the binutils guys
> >> >> >> >> think of it. We do have extensions, but it is nice to at least let
> >> >> >> >> them know so that we don't end up with two independent solutions in
> >> >> >> >> the future.
> >> >> >> >
> >> >> >> > Yes if I ultimately go with 1.
> >> >> >> >
> >> >> >> > Thanks,
> >> >> >> > --
> >> >> >> > Peter
> >> >> >> > _______________________________________________
> >> >> >> > LLVM Developers mailing list
> >> >> >> > llvm-dev at lists.llvm.org
> >> >> >> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >> >> >
> >> >> > --
> >> >> > --
> >> >> > Peter
> >> >
> >> > --
> >> > --
> >> > Peter
> >
> > --
> > --
> > Peter

--
--
Peter
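
For reference, the arithmetic behind the two constants argued over in this thread (0xeafffffe versus 0xea000000) can be sketched in a few lines of Python. This is only an illustration, assuming the usual ARM-state conventions: the PC reads as the instruction address plus 8, and the 24-bit branch field holds a signed displacement counted in words. The names `encode_imm24` and `decode_imm24` are made up for this sketch; they are not LLVM or MC APIs.

```python
# Hypothetical helpers, for illustration only (not part of LLVM/MC).
B_OPCODE = 0xEA000000      # "b" with condition AL, imm24 field left as zeroes
IMM24_MASK = 0x00FFFFFF    # low 24 bits hold the signed word displacement


def encode_imm24(offset_from_insn: int) -> int:
    """imm24 field for a target `offset_from_insn` bytes past the start
    of the branch instruction itself."""
    disp = offset_from_insn - 8          # assumption: PC reads as insn address + 8
    if disp % 4 != 0:
        raise ValueError("ARM branch targets must be word aligned")
    if not -(1 << 25) <= disp < (1 << 25):
        raise ValueError("displacement does not fit in 24 bits")
    return (disp >> 2) & IMM24_MASK      # two's complement, in words


def decode_imm24(word: int) -> int:
    """Inverse of encode_imm24: byte offset from the start of the instruction."""
    imm = word & IMM24_MASK
    if imm & 0x800000:                   # sign-extend the 24-bit field
        imm -= 1 << 24
    return (imm << 2) + 8


zero_offset = B_OPCODE | encode_imm24(0)   # branch to the instruction itself
plus_four = B_OPCODE | encode_imm24(4)     # branch to the next word
print(hex(zero_offset))  # 0xeafffffe
print(hex(plus_four))    # 0xeaffffff
```

Under these assumptions, 0xeafffffe is the opcode byte plus the relocation-specific encoding of "zero offset from the start of the instruction", while 0xea000000 is the same opcode byte with the field left empty for MC or the linker to fill in. The decode step is the kind of per-relocation logic that one side of the thread wants to keep out of the backend.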