David Chisnall via llvm-dev
2015-Sep-07 10:26 UTC
[llvm-dev] RFC: alloca -- specify address space for allocation
On 2 Sep 2015, at 02:54, Joseph Tremoulet via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Reading further, I see both that addrspacecast "can be a no-op cast or a complex value modification"[2] and that bitcast "may only be [used on pointers] with the same address space"[4].
>
> So I'm getting the impression that it's ok to have a model with semantically meaningful aliasing between address spaces, but also that anywhere we want to reference a local's address with an addrspace(1) pointer (which is everywhere our source language takes its address), as things stand now we need either to use an addrspace cast which will be assumed to possibly have side-effects, or to round-trip through ptrtoint/inttoptr which I presume will obscure the aliasing information. It certainly gives us a correct place to start from, but (unless I'm misunderstanding and the "complex value modification" type of addrspacecast isn't assumed to have side-effects) I wouldn't be surprised if we come back to this wanting a way to represent a cast across address spaces that's as transparent as a bitcast.

To give a bit of background on that:

The use case for introducing AS casts as distinct from bitcasts (and not going via inttoptr / ptrtoint) is architectures that have different pointer representations. For example, some microcontrollers have a 16-bit PC and 32-bit address registers, allowing code pointers to be smaller than data pointers. Some GPUs (used to?) use different sized pointers for the various different places in the memory hierarchy.

In our architecture, this is even more complicated, because we support two different pointer representations:

- 256-bit (or 128-bit, on newer revisions) memory capabilities, that both identify and grant access to a region of memory and have unforgeability guaranteed by the hardware. In LLVM, we represent these as pointers with AS 0.
- 64-bit legacy-compatible pointers that are implicitly relative to a global capability (and so are only dereferenceable within a restricted range of the process’ virtual address space). In LLVM, we represent these as pointers with AS 0.

For us, an AS cast between AS 0 and AS 200 will succeed if and only if the address is within the current range of the global capability.

Any address in AS 0 may alias any address in AS 200 (except in some trivial cases, it’s impossible to determine statically that they don’t), but one value is an integer interpreted as an address, whereas the other is a fat pointer with bounds and permissions enforced in hardware.

David
David Chisnall via llvm-dev
2015-Sep-07 13:03 UTC
[llvm-dev] RFC: alloca -- specify address space for allocation
On 7 Sep 2015, at 11:26, David Chisnall via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> - 256-bit (or 128-bit, on newer revisions) memory capabilities, that both identify and grant access to a region of memory and have unforgeability guaranteed by the hardware. In LLVM, we represent these as pointers with AS 0.

Sorry, this should have been AS 200.

David
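
The cast rule described in this exchange (an AS cast between AS 0 and AS 200 succeeds if and only if the address is within the current range of the global capability) can be sketched as a small C++ model. This is only an illustration under stated assumptions: the `Capability` struct, its fields, and all function names are hypothetical; AS 0 pointers are modelled as absolute 64-bit addresses; a failed cast is modelled as an empty `std::optional`; and a capability derived from an AS 0 pointer simply inherits the global capability's bounds. None of this reflects the real hardware encoding.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical model of an AS 200 pointer: an address plus the bounds it may access.
struct Capability {
    std::uint64_t base;     // first address covered by the capability
    std::uint64_t length;   // number of bytes covered
    std::uint64_t address;  // address the capability currently points at
};

// True if addr lies inside [global.base, global.base + global.length).
bool in_range(const Capability& global, std::uint64_t addr) {
    return addr >= global.base && addr - global.base < global.length;
}

// AS 0 -> AS 200: succeeds only if the integer address is within the global capability.
// Assumption: the result inherits the global capability's bounds.
std::optional<Capability> cast_as0_to_as200(const Capability& global, std::uint64_t addr) {
    if (!in_range(global, addr)) return std::nullopt;
    return Capability{global.base, global.length, addr};
}

// AS 200 -> AS 0: succeeds only if the capability's address is within the global capability.
std::optional<std::uint64_t> cast_as200_to_as0(const Capability& global, const Capability& cap) {
    if (!in_range(global, cap.address)) return std::nullopt;
    return cap.address;
}

// Small usage example with a global capability covering [0x1000, 0x2000).
void capability_cast_example() {
    const Capability global{0x1000, 0x1000, 0x1000};
    assert(cast_as0_to_as200(global, 0x1800).has_value());   // inside the range
    assert(!cast_as0_to_as200(global, 0x2000).has_value());  // one past the end
}
```

In this model both casts preserve the address, which matches the point that the two address spaces may alias: only the representation (plain integer versus fat pointer) changes.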
Joseph Tremoulet via llvm-dev
2015-Sep-08 15:05 UTC
[llvm-dev] RFC: alloca -- specify address space for allocation
Thanks, having that context is very helpful.

I actually think our use case is somewhat similar in spirit, as one of the key points of our system is treating object identity as unforgeable/unforged (guaranteed by type rules in verified safe code, assumed but unverified in trusted unsafe code).

So the main question left for me is how much of a hindrance the pervasive addrspacecasts we may have will be for optimization. Things like:

- Can two addrspace casts be reordered past each other?
- Can an addrspace cast be reordered across memory dereferences?
- Can two addrspacecasts with the same input value be CSE'd?
- Is an address presumed to be escaped when it is addrspacecasted?
- If we load from a pointer `%p` and from a pointer `%q` which is an addrspacecast of `%p`, will those loads be seen as redundant?
- Can a store to `%p` feeding a load from `%q`, or vice-versa, be replaced with an SSA value?
- Can an addrspacecast be hoisted to where it will be speculatively executed?

Popping up a level: when I have these sorts of questions about an opcode, and I don't see them spelled out in its entry in the LangRef, where should I look? Is there some central place describing such properties, or would I just need to read/test the relevant optimizations?

Thanks
-Joseph
Chandler Carruth via llvm-dev
2015-Sep-08 16:02 UTC
[llvm-dev] RFC: alloca -- specify address space for allocation
On Tue, Sep 8, 2015 at 8:05 AM Joseph Tremoulet via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Thanks, having that context is very helpful.
>
> I actually think our use case is somewhat similar in spirit, as one of the key points of our system is treating object identity as unforgeable/unforged (guaranteed by type rules in verified safe code, assumed but unverified in trusted unsafe code).
>
> So the main question left for me is how much of a hindrance the pervasive addrspacecasts we may have will be for optimization. Things like:
> - Can two addrspace casts be reordered past each other?

Yes.

> - Can an addrspace cast be reordered across memory dereferences?

Yes.

> - Can two addrspacecasts with the same input value be CSE'd?

Yes.

> - Is an address presumed to be escaped when it is addrspacecasted?

No.

> - If we load from a pointer `%p` and from a pointer `%q` which is an addrspacecast of `%p`, will those loads be seen as redundant?

According to the spec, yes. I'm not saying we implement this, but the spec is quite clear (to my surprise): "Note that if the address space conversion is legal then both result and operand refer to the same memory location."

> - Can a store to `%p` feeding a load from `%q`, or vice-versa, be replaced with an SSA value?

As above, the spec is quite clear: yes.

> - Can an addrspacecast be hoisted to where it will be speculatively executed?

The spec is actually unclear about this, and that is true for almost every such instruction. This is one of the most egregious bugs in our spec (see below) but you can find the current answer the optimizer believes by consulting the implementation of isSafeToSpeculativelyExecute, and it seems to return "true" for addrspacecast.

None of this is to say that these things couldn't be changed, or that the optimizer even currently leverages them, but that is what the spec currently says about addrspacecast.

> Popping up a level: when I have these sorts of questions about an opcode, and I don't see them spelled out in its entry in the LangRef, where should I look? Is there some central place describing such properties, or would I just need to read/test the relevant optimizations?

It is useful to think of the LangRef as the *spec* for the IR. As a consequence, the LangRef *is* the only source of truth for these things. The only thing that may be more accurate is the implementation, but any such discrepancy is always a bug; it just may be a bug in the LangRef.

We do have lots of bugs though. For example, we don't specify very clearly whether instructions are speculatable. We also don't specify some other properties. In this case though, most of your questions are answered there.

One way in which the LangRef is perhaps odd is that everything is generally assumed to be a pure operation on its arguments unless specified as something else. That may have led to the confusion, and is why I see simple answers to the above questions.

-Chandler
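
The answer that two addrspacecasts with the same input can be CSE'd follows from treating the cast as a pure operation on its arguments. A minimal C++ sketch of that reasoning is below. It is a toy local value-numbering pass over a hypothetical `Inst` record, not LLVM's data structures or any real LLVM pass; `Inst`, `is_pure`, and `cse_pure` are made-up names, and only `addrspacecast` is treated as pure here as a simplifying assumption.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical, simplified instruction: result = opcode operand (to address space dst_as).
struct Inst {
    std::string result;   // e.g. "%q1"
    std::string opcode;   // e.g. "addrspacecast"
    std::string operand;  // e.g. "%p"
    unsigned dst_as;      // destination address space (ignored for non-casts)
};

// Assumption for this sketch: only addrspacecast is modelled as a pure operation.
bool is_pure(const std::string& opcode) { return opcode == "addrspacecast"; }

// Returns a map from each redundant result to the earlier equivalent value.
std::map<std::string, std::string> cse_pure(const std::vector<Inst>& insts) {
    std::map<std::tuple<std::string, std::string, unsigned>, std::string> seen;
    std::map<std::string, std::string> replacement;
    for (const Inst& inst : insts) {
        if (!is_pure(inst.opcode)) continue;  // leave everything else untouched
        // Canonicalize the operand so chains of redundant casts also collapse.
        std::string operand = inst.operand;
        if (auto r = replacement.find(operand); r != replacement.end()) operand = r->second;
        // Same opcode, same operand, same target address space: same value.
        auto [it, inserted] =
            seen.emplace(std::make_tuple(inst.opcode, operand, inst.dst_as), inst.result);
        if (!inserted) replacement[inst.result] = it->second;
    }
    return replacement;
}

// Usage example: %q2 duplicates %q1; %q3 targets a different address space.
void cse_example() {
    const std::vector<Inst> insts = {
        {"%q1", "addrspacecast", "%p", 1},
        {"%q2", "addrspacecast", "%p", 1},
        {"%q3", "addrspacecast", "%p", 2},
        {"%v", "load", "%q2", 0},
    };
    const auto repl = cse_pure(insts);
    assert(repl.size() == 1);
    assert(repl.at("%q2") == "%q1");
}
```

The sketch deliberately skips loads and stores: whether a load through `%q` is redundant with a load through `%p` depends on reasoning about memory, which this toy pass does not model.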