Jake Ehrlich via llvm-dev
2019-Sep-30 23:00 UTC
[llvm-dev] Proposal for llvm.experimental.gc intrinsics for inttoptr and ptrtoint
Hi All,

I'm working on a project converting Dart to LLVM. Dart uses a relocating GC but additionally uses pointer tagging. The first bit of a pointer is a tag: a '0' bit is used for integers and a '1' bit for pointers to objects. V8 apparently uses a similar technique. Generated code may need to check which bit is set when this information isn't statically known. Additionally, a function might have a parameter of a dynamic type, so a caller might pass either an object or an integer for the same parameter, meaning that the parameter has to have a single type in the LLVM IR.

I'd like to make use of the existing llvm.experimental.gc.statepoint intrinsics, but they strictly use non-integral types. This is required to stop certain optimizations from making transformations that conflict with finding base pointers.

After speaking about this (primarily with Sanjoy Das) and gathering the set of issues involved, it seems it might be possible to resolve this by adding two new intrinsics that mirror inttoptr and ptrtoint: llvm.experimental.gc.inttoptr and llvm.experimental.gc.ptrtoint. These will be opaque to all existing abstractions. An additional pass would be added as well that would lower these versions of inttoptr and ptrtoint to their standard forms. When this pass is run after other optimizations it should in theory be safe. It might be possible to run some safe optimizations after this point, but it isn't clear which optimizations would actually be both useful and safe there. The user of such a pass is responsible for not applying it before any optimizations that might alter the representation of a pointer in an invalid manner.

So specifically the proposal is just the following:

1) Add llvm.experimental.gc.inttoptr and llvm.experimental.gc.ptrtoint as opaque "semanticless" intrinsic calls. They will be defined as `IntrNoMem` operations since they won't ever be lowered to anything that may perform any memory operations.

2) Add a pass LowerOpaqueIntergalPointerOps to perform the specified lowering in order to allow these intrinsics to be compiled to code. Use of these intrinsics without this lowering step will fail in code generation, since the intrinsics themselves will not participate in code generation.

Does this seem like a sound approach? Does this seem like an acceptable way forward to the community? What tweaks or alterations would people prefer?
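For readers unfamiliar with the tagging scheme described in this message, here is a minimal sketch in plain C++ of what the tag check and the integer/pointer conversions could look like at the machine-word level. It assumes the tag is the least-significant bit of a word-sized value and that integers are stored shifted left by one; all helper names (`tagInteger`, `untagObject`, and so on) are hypothetical and are not taken from Dart, V8, or LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: the tag is the least-significant bit of a word-sized value.
// 0 => small integer stored in the upper bits, 1 => pointer to a heap object.
constexpr std::uintptr_t kTagMask = 1;
constexpr std::uintptr_t kObjectTag = 1;

// Hypothetical helpers for illustration only.
inline bool isTaggedInteger(std::uintptr_t word) {
  return (word & kTagMask) == 0;
}

inline bool isTaggedObject(std::uintptr_t word) {
  return (word & kTagMask) == kObjectTag;
}

// Encode a small integer by shifting it left one bit, leaving the tag bit 0.
// The value must fit in one bit less than the word size.
inline std::uintptr_t tagInteger(std::intptr_t value) {
  return static_cast<std::uintptr_t>(value) << 1;
}

// Decode a tagged integer. The signed right shift is arithmetic on common
// targets (and guaranteed to be since C++20), which restores the sign.
inline std::intptr_t untagInteger(std::uintptr_t word) {
  assert(isTaggedInteger(word));
  return static_cast<std::intptr_t>(word) >> 1;
}

// Encode an object pointer by setting the low bit. This is the
// "ptrtoint, then modify" step that conflicts with non-integral pointers.
inline std::uintptr_t tagObject(const void *object) {
  auto bits = reinterpret_cast<std::uintptr_t>(object);
  assert((bits & kTagMask) == 0 && "object must be at least 2-byte aligned");
  return bits | kObjectTag;
}

// Decode a tagged object pointer by clearing the low bit ("inttoptr").
inline void *untagObject(std::uintptr_t word) {
  assert(isTaggedObject(word));
  return reinterpret_cast<void *>(word & ~kTagMask);
}
```

A dynamically typed parameter would be passed as a single `std::uintptr_t`-like word in this model, and the callee would branch on `isTaggedInteger` before choosing how to decode it.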
Sanjoy Das via llvm-dev
2019-Oct-01 00:35 UTC
[llvm-dev] Proposal for llvm.experimental.gc intrinsics for inttoptr and ptrtoint
Adding some folks from Azul.
Jameson Nash via llvm-dev
2019-Oct-01 19:21 UTC
[llvm-dev] Proposal for llvm.experimental.gc intrinsics for inttoptr and ptrtoint
For a datapoint, Julia uses the following function description to implement approximately the capability of those functions. We then also verify that there's no direct inttoptr/ptrtoint into our gc-tracked AddressSpace with a custom verifier pass (among other sanity checks). I can provide additional details and pointers to our gc-root tracking algorithm implementation if desired (I also plan to be at the llvm-devmtg). It'd be great to know if there are opportunities for collaboration, or at least sharing insights and experiences!

llvm.experimental.gc.ptrtoint:

    dropgcroot_type = FunctionType::get(PtrIntTy, makeArrayRef(PointerType::get(AddressSpace::Derived)), false);
    dropgcroot_func = Function::Create(dropgcroot_type, Function::ExternalLinkage, "julia.pointer_from_objref");
    dropgcroot_func->addFnAttr(Attribute::ReadNone);
    dropgcroot_func->addFnAttr(Attribute::NoUnwind);

    declare void* @"julia.pointer_from_objref"(void addrspace(2)*) readnone nounwind

(AddressSpace::Derived in the signature means it doesn't need to be valid as a root itself, but needs to be traced back to locate the base object.)

llvm.experimental.gc.inttoptr:

This didn't need a custom function, since doing "untracked -> inttoptr -> addrspacecast -> tracked" is considered a legal transform in Julia. We later have an optimization pass that may see this and decide to weaken a tracked object back into an untracked one (the root scanning pass can similarly also find that the base object is not tracked and ignore it). Non-moving GC means we can do this for many values, including those loaded from constants and arguments. In your case, this could also apply to integers that needed to get cast to a pointer for the calling convention.

Note that the validity of introducing and allowing this can be pretty subtle, since it implies that it may be impossible to "take back" a value into the GC once it has released its gc root. This is true for several reasons, since we already can't guarantee that the object lifetime is appropriate after the object got hidden from the analysis passes (via the ptrtoint) as a means of allowing stronger optimizations (stack promotion, early freeing, memory reuse, etc). But it also may be true because of the IntrNoMem annotation suggested: this states that the instruction has no side-effects, but if you expect the value to resume being tracked by the gc, that would imply these instructions do have some sort of observable side effects on memory (possibly ReadOnly, as well as perhaps the absence of nosync and nofree).
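To make the verifier rule mentioned above concrete (no direct inttoptr/ptrtoint into a gc-tracked address space, while "inttoptr, then addrspacecast" remains legal), here is a toy model in plain C++. It is not Julia's verifier and does not use the LLVM API: the `Cast` struct, `CastKind` enum, and the set of tracked address-space numbers are all hypothetical stand-ins chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Toy stand-in for a cast instruction; not an LLVM class.
enum class CastKind { IntToPtr, PtrToInt, AddrSpaceCast };

struct Cast {
  CastKind kind;
  unsigned srcAddrSpace;  // meaningful when the operand is a pointer
  unsigned dstAddrSpace;  // meaningful when the result is a pointer
};

// Returns the indices of casts that cross directly between integers and a
// GC-tracked address space. Which address spaces count as tracked is an
// assumption supplied by the caller.
inline std::vector<std::size_t>
findIllegalCasts(const std::vector<Cast> &casts,
                 const std::set<unsigned> &trackedSpaces) {
  std::vector<std::size_t> bad;
  for (std::size_t i = 0; i < casts.size(); ++i) {
    const Cast &c = casts[i];
    // inttoptr whose result pointer is already in a tracked space.
    bool intoTracked = c.kind == CastKind::IntToPtr &&
                       trackedSpaces.count(c.dstAddrSpace) != 0;
    // ptrtoint whose operand pointer is in a tracked space.
    bool outOfTracked = c.kind == CastKind::PtrToInt &&
                        trackedSpaces.count(c.srcAddrSpace) != 0;
    if (intoTracked || outOfTracked)
      bad.push_back(i);
  }
  return bad;
}

// Mirrors how a verifier pass might abort when a violation is present.
inline void verifyCasts(const std::vector<Cast> &casts,
                        const std::set<unsigned> &trackedSpaces) {
  assert(findIllegalCasts(casts, trackedSpaces).empty() &&
         "direct inttoptr/ptrtoint on a GC-tracked address space");
}
```

In this model an addrspacecast is never flagged, so the sequence "inttoptr into an untracked space, then addrspacecast into a tracked one" passes, while a ptrtoint applied directly to a tracked pointer is reported and would have to go through a dedicated function such as the one declared above.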