Chris Lattner via llvm-dev
2016-Oct-27 04:48 UTC
[llvm-dev] RFC: Absolute or "fixed address" symbols as immediate operands
On Oct 26, 2016, at 1:34 AM, Peter Collingbourne <peter at pcc.me.uk> wrote:
> On Tue, Oct 25, 2016 at 10:48 PM, Chris Lattner <clattner at apple.com> wrote:
> Responding to both of your emails in one, sorry for the delay:
>
>> On Oct 25, 2016, at 11:20 AM, Peter Collingbourne <peter at pcc.me.uk> wrote:
>> I think there are a couple of additional considerations we should make here:
>> What are we trying to model? To me it's clear that GlobalConstant is for
>> modelling integers, not pointers. That alone may not necessarily be enough
>> to motivate a representational change, but…
> I understand where you’re coming from, but I think we’re modeling three
> different things, and disagreeing about how to clump them together. The
> three things I see in flight are:
>
> 1) typical globals that are laid out in some unknown way in the address
>    space.
> 2) globals that may be tied to a specific knowable address range due to a
>    limited compilation model (e.g. a deeply embedded core) that fits into
>    an immediate range (e.g. 0…255, 0…65536, etc).
> 3) Immediates that are treated as symbolic from CFI’s perspective (so they
>    can’t just be used as a literal immediate) that are resolved at link
>    time, but are known to have limited range.
>
> There is also “4) immediates with an obvious known value”, but those are
> obviously ConstantInts and not interesting to discuss here.
>
> The design I’m arguing for is to clump #2 and #3 into the same group.
>
> I am not sure if this is sound if we want the no-alias assumption (see also
> below) to hold for #2 but not for #3.
>
> This can be done one of two different ways, but both ways use the same
> “declaration side” reference, which has a !range metadata attached to it.
> The three approaches I see are:
>
> a) Introduce a new GlobalConstant definition, whose value is the concrete
>    address that the linker should resolve.
> b) Use an alias as the definition, whose body is a ptrtoint constant of the
>    same value.
> c) Use a zero size globalvariable with a range metadata specifying the
>    exact address decided.
>
> I’m not very knowledgeable about why approach b won’t work, but if it
> could, it seems preferable because it fits in with our current model.
>
> b would work in that it would give us the right bits in the object file,
> but it would be a little odd to use a different type for declarations as
> for definitions. That said, I don't have a strong objection to it.

I can understand what you’re saying here, but this is already the case for
aliases. You can never have a “declaration side” for an alias that is an
alias (you have to use an external global variable or a function with no
body).

From the discussion over the last day it sounds to me that “b” is the best
approach, except for the (significant) annoyance that these things can
possibly be aliased. However, I don’t understand how this works in practice
today for aliases. By their very name, they are *all about* introducing
aliases, so how is AA allowed to assume that two external global variable
references are unaliased anyway? One may be resolved as an alias to the
other after all, completely independent of your proposal.

-Chris
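The alias question raised above can be made concrete with a small model. The sketch below is an illustration only and is not LLVM code: `resolve`, `assumed_no_alias`, the symbol names and the addresses are all hypothetical. It treats link-time resolution as a table lookup and shows two distinct external names that a name-based no-alias assumption would keep apart, even though the definition side resolves both to the same address.

```python
def resolve(definitions, name):
    """Follow alias definitions until a concrete address is reached."""
    seen = set()
    while isinstance(definitions[name], str):  # a str entry means "alias of"
        if name in seen:
            raise ValueError(f"alias cycle through {name!r}")
        seen.add(name)
        name = definitions[name]
    return definitions[name]


def assumed_no_alias(a, b):
    """Name-based assumption: two distinct external globals never overlap."""
    return a != b


# Hypothetical definition side, visible only at link time.
definitions = {
    "buf": 0x1000,       # ordinary global at some address
    "buf_alias": "buf",  # alias whose target is buf
    "other": 0x2000,
}

# The declaration side sees two unrelated external names...
print(assumed_no_alias("buf", "buf_alias"))  # True
# ...but both resolve to the same address once aliases are followed.
print(resolve(definitions, "buf") == resolve(definitions, "buf_alias"))  # True
```

Both lines print `True`, which is exactly the tension in the question: the assumption is made from the declaration side alone, while the aliasing is only introduced on the definition side.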
Peter Collingbourne via llvm-dev
2016-Oct-27 05:45 UTC
[llvm-dev] RFC: Absolute or "fixed address" symbols as immediate operands
On Wed, Oct 26, 2016 at 9:48 PM, Chris Lattner <clattner at apple.com> wrote:
> From the discussion over the last day it sounds to me that “b” is the best
> approach, except for the (significant) annoyance that these things can be
> possibly aliased. However, I don’t understand how this works in practice
> today for aliases. By their very name, they are *all about* introducing
> aliases, so how is AA allowed to assume that two external global variable
> references are unaliased anyway? One may be resolved as an alias to the
> other after all, completely independent of your proposal.

I suppose that one way to think about it is that by using aliases you are
stepping outside of the bounds of the language, i.e. no valid C/C++
declaration can be used to take the address of an alias without using
reserved names or language extensions (clang uses aliases to implement some
standard language features but they all have reserved names as far as I'm
aware).

Maybe this hasn't come up simply because language implementations (and users
of language extensions) happen to never use aliases in a way that could
expose the AA assumption.

Thanks,
--
Peter
Peter Collingbourne via llvm-dev
2016-Nov-04 21:15 UTC
[llvm-dev] RFC: Absolute or "fixed address" symbols as immediate operands
On Wed, Oct 26, 2016 at 10:45 PM, Peter Collingbourne <peter at pcc.me.uk> wrote:
> I suppose that one way to think about it is that by using aliases you are
> stepping outside of the bounds of the language, i.e. no valid C/C++
> declaration can be used to take the address of an alias without using
> reserved names or language extensions (clang uses aliases to implement some
> standard language features but they all have reserved names as far as I'm
> aware).
>
> Maybe this hasn't come up simply because language implementations (and
> users of language extensions) happen to never use aliases in a way that
> could expose the AA assumption.

Further to this, I think there are three things in play here:

- "absolute": this primarily controls code generation, i.e. we need to know
  whether to emit absolute or relative relocations in PIC mode.
- "range", i.e. the range of the "address": this also controls code
  generation (used for selecting the narrowest possible relocation type) but
  could also affect midend and backend optimizers (e.g. computeKnownBits).
  This is !range metadata as in D25878 but also in principle could be
  modelled as a value with an integer type of a specific width.
- "mayalias", i.e. whether the address may be an alias on the definition
  side. That could include an alias of a real global object or an absolute
  symbol formed with inttoptr. This is fundamentally a midend attribute that
  could be used by AA for example. In practice a global with mayalias should
  be treated by the midend like a pointer obtained by calling an external
  readnone function.

As a preliminary step I think we can merge "absolute" and "range", i.e. we
can't possibly know the range unless we also know that we can use absolute
relocations. This is as implemented in D25878.

So let's look at !range and mayalias and see how they interact:

- !range alone: this is the "linker script" scenario where something
  external to the object provides some absolute memory mapping.
- mayalias alone: this could be used by language frontends that allow
  aliases at the language level.
- mayalias + !range: there are a couple of use cases: a) the combination of
  the above two cases, or b) the sort of absolute constant references I'd
  like to have for CFI.

Looking more closely at the third case, for part a pointers are a more
accurate modelling and in part b integers are. Given that our model permits
either pointers or integers in this specific case, I would be prepared to
accept that pointer modelling would be sufficient for b in order to avoid
needing to model parts a and b differently.

To be clear, what I think we should do at this point is to extend
GlobalVariable with a mayalias attribute. This would overcome the modeling
issue with D25878 and at that point we would be able to move forward with
it. As Chris proposed, aliases would be used to model definitions.

Thanks,
--
Peter
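The interaction between the range and mayalias properties described above can be sketched as a small model. This is an illustration under stated assumptions, not LLVM's implementation: `GlobalRef`, `narrowest_reloc_bits`, `known_zero_high_bits` and `alias_result` are hypothetical names, the range is taken to be half-open `[lo, hi)` with non-negative bounds, the candidate relocation widths are assumed to be 8, 16, 32 and 64 bits, and the pointer width is assumed to be 64 bits.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class GlobalRef:
    """Declaration-side view of a global (hypothetical model, not an LLVM class)."""
    name: str
    addr_range: Optional[Tuple[int, int]] = None  # half-open [lo, hi), assumed
    mayalias: bool = False


def narrowest_reloc_bits(ref, widths=(8, 16, 32, 64)):
    """Pick the smallest unsigned field width that holds every address in the range."""
    if ref.addr_range is None:
        return None  # no range known: treat as an ordinary address
    lo, hi = ref.addr_range
    if not 0 <= lo < hi:
        raise ValueError("range must be non-empty and non-negative")
    for bits in widths:
        if hi <= (1 << bits):  # largest value hi - 1 fits in `bits` bits
            return bits
    raise ValueError("range does not fit any candidate width")


def known_zero_high_bits(ref, pointer_bits=64):
    """Count high bits that are zero for every address below the upper bound."""
    if ref.addr_range is None:
        return 0
    _, hi = ref.addr_range
    return pointer_bits - (hi - 1).bit_length()


def alias_result(a, b):
    """Name-based answer that becomes conservative once either side is mayalias."""
    if a.name == b.name:
        return "MustAlias"
    if a.mayalias or b.mayalias:
        return "MayAlias"
    return "NoAlias"


# The "mayalias + !range" case: a CFI-style constant known to fit in 8 bits.
cfi_const = GlobalRef("cfi_const", addr_range=(0, 256), mayalias=True)
plain = GlobalRef("plain")

print(narrowest_reloc_bits(cfi_const))  # 8
print(known_zero_high_bits(cfi_const))  # 56
print(alias_result(cfi_const, plain))   # MayAlias
```

In this model the range only ever narrows code generation and known-bits facts, while mayalias only ever weakens the alias answer, which is one way to read the claim that the two properties are independent and can be combined freely.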