Tim Northover via llvm-dev
2019-Feb-01 20:00 UTC
[llvm-dev] [RFC] arm64_32: upstreaming ILP32 support for AArch64
On Fri, 1 Feb 2019 at 19:25, Eli Friedman <efriedma at quicinc.com> wrote:
> > Alternate address-spaces still have just one pointer size per space as
> > far as I'm aware. If that's 64-bits we get efficient CodeGen but
> > loading or storing a pointer clobbers more data than it should, if
> > that's 32-bits then we get poor CodeGen.
>
> I was thinking of a model something like this: 32-bit pointers are addrspace 0, 64-bit pointers are addrspace 1. ISD::LOAD/STORE in addrspace 0 are not legal: they're custom-lowered to operations in addrspace 1. (An addrspacecast from 0 to 1 is just zero-extension.) At that point, since the cast from 32 bits to 64 bits is explicitly represented, we can optimize it in the DAG or IR. For example, we can transform a load of an inbounds gep in addrspace 0 into a load of an inbounds gep in addrspace 1.

That would have to be an IR-level pass I think; otherwise the default MVT for any J. Random Pointer Value is still i32, leading to the same efficiency issues when you eventually use that on a load/store.

With a pass, within a function you ought to be able to promote all uses of addrspace(0) to addrspace(1), leaving (as you say) addrspacecasts at opaque sources and sinks (loads, stores, args, return, ...). Structs containing pointers would be (very?) messy. And you'd probably want it earlyish to recombine things.

I do like LLVM passes as a solution for most problems, and it ought to give a big head start to GlobalISel implementation too. I'll definitely give it a go as an alternative next week.

Cheers.

Tim.
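[Editorial sketch, not part of the original message.] The transformation Eli describes relies on the addrspacecast being a zero-extension, so that address arithmetic which does not wrap at 32 bits can be moved to the 64-bit side of the cast. The Python model below illustrates that arithmetic only. The helper names are hypothetical, and `is_inbounds` is a simplified stand-in (no 32-bit wraparound) for LLVM's real inbounds rules, which are defined in terms of the underlying allocated object.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def addrspacecast_0_to_1(p32):
    """Model the cast from addrspace 0 to addrspace 1 as zero-extension."""
    return p32 & MASK32


def gep32(base32, offset):
    """Address arithmetic at 32-bit pointer width: wraps modulo 2**32."""
    return (base32 + offset) & MASK32


def gep64(base64, offset):
    """Address arithmetic at 64-bit pointer width: wraps modulo 2**64."""
    return (base64 + offset) & MASK64


def is_inbounds(base32, offset):
    """Simplified assumption: 'inbounds' means no 32-bit wraparound."""
    return 0 <= base32 + offset <= MASK32


def promotion_is_safe(base32, offset):
    """True if cast(gep32(p, o)) equals gep64(cast(p), o)."""
    narrow = addrspacecast_0_to_1(gep32(base32, offset))
    wide = gep64(addrspacecast_0_to_1(base32), offset)
    return narrow == wide


# Non-wrapping arithmetic: the cast can be hoisted above the GEP.
print(is_inbounds(0x1000, 0x20), promotion_is_safe(0x1000, 0x20))
# Wrapping arithmetic: the 32-bit and 64-bit results differ.
print(is_inbounds(0xFFFFFFF0, 0x20), promotion_is_safe(0xFFFFFFF0, 0x20))
```

In this model the two forms agree exactly when the 32-bit addition does not wrap, which is the property the inbounds flag is being used to supply.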
Matt Arsenault via llvm-dev
2019-Feb-01 20:08 UTC
[llvm-dev] [RFC] arm64_32: upstreaming ILP32 support for AArch64
> On Feb 1, 2019, at 3:00 PM, Tim Northover via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> On Fri, 1 Feb 2019 at 19:25, Eli Friedman <efriedma at quicinc.com> wrote:
>>> Alternate address-spaces still have just one pointer size per space as
>>> far as I'm aware. If that's 64-bits we get efficient CodeGen but
>>> loading or storing a pointer clobbers more data than it should, if
>>> that's 32-bits then we get poor CodeGen.
>>
>> I was thinking of a model something like this: 32-bit pointers are addrspace 0, 64-bit pointers are addrspace 1. ISD::LOAD/STORE in addrspace 0 are not legal: they're custom-lowered to operations in addrspace 1. (An addrspacecast from 0 to 1 is just zero-extension.) At that point, since the cast from 32 bits to 64 bits is explicitly represented, we can optimize it in the DAG or IR. For example, we can transform a load of an inbounds gep in addrspace 0 into a load of an inbounds gep in addrspace 1.
>
> That would have to be an IR-level pass I think; otherwise the default
> MVT for any J. Random Pointer Value is still i32, leading to the same
> efficiency issues when you eventually use that on a load/store.

I don’t see why this would need to be an IR pass. There aren’t all that many places left using the default argument to the various pointer function that can mostly be fixed. iPTR is hopelessly broken on the tablegen side, but you wouldn’t get to that point with this.

> With a pass, within a function you ought to be able to promote all
> uses of addrspace(0) to addrspace(1), leaving (as you say)
> addrspacecasts at opaque sources and sinks (loads, stores, args,
> return, ...). Structs containing pointers would be (very?) messy. And
> you'd probably want it earlyish to recombine things.

You can specify the ABI alignment to 8-bytes in the data layout for the 32-bit pointer for struct layout

-Matt
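[Editorial sketch, not part of the original message.] To make the struct-layout suggestion concrete, the Python model below computes field offsets using the usual rule of rounding each offset up to the field's ABI alignment. The `layout` helper and the example struct are hypothetical; in an LLVM data layout string the pointer entry has the form `p[n]:<size>:<abi>`, so the suggestion would presumably correspond to something like `p:32:64`, but that exact string is an assumption and does not appear in the thread.

```python
def layout(fields):
    """Compute offsets for a non-packed struct.

    fields: list of (name, size_in_bytes, abi_align_in_bytes).
    Returns (offsets_by_name, total_size), with the total size rounded
    up to the largest field alignment.
    """
    offset = 0
    max_align = 1
    offsets = {}
    for name, size, align in fields:
        # Round the running offset up to this field's alignment.
        offset = (offset + align - 1) // align * align
        offsets[name] = offset
        offset += size
        max_align = max(max_align, align)
    total = (offset + max_align - 1) // max_align * max_align
    return offsets, total


# Hypothetical struct { i32 tag; ptr p; i32 n } with a 4-byte pointer.
natural = layout([("tag", 4, 4), ("p", 4, 4), ("n", 4, 4)])
aligned8 = layout([("tag", 4, 4), ("p", 4, 8), ("n", 4, 4)])

print(natural)   # pointer aligned to 4 bytes
print(aligned8)  # pointer aligned to 8 bytes
```

In this model, raising the pointer's ABI alignment moves `p` from offset 4 to offset 8 and grows the struct from 12 to 16 bytes, so the in-memory layout of any struct containing a pointer changes.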
Tim Northover via llvm-dev
2019-Feb-01 20:24 UTC
[llvm-dev] [RFC] arm64_32: upstreaming ILP32 support for AArch64
On Fri, 1 Feb 2019 at 20:08, Matt Arsenault <arsenm2 at gmail.com> wrote:
> I don’t see why this would need to be an IR pass. There aren’t all that many places left using the default argument to the various pointer function that can mostly be fixed. iPTR is hopelessly broken on the tablegen side, but you wouldn’t get to that point with this.

The difficulty I'm seeing is that we need GEP to be lowered to i64 arithmetic, but that happens in SelectionDAGBuilder before the target has any real opportunity to override anything. Once the GEP has been converted to DAG, the critical information is already gone and we just have i32 ADD/MUL trees.

The two options I see for making that happen favourably are an IR pass or deep surgery on Clang, which seems even less appealing.

> > With a pass, within a function you ought to be able to promote all
> > uses of addrspace(0) to addrspace(1), leaving (as you say)
> > addrspacecasts at opaque sources and sinks (loads, stores, args,
> > return, ...). Structs containing pointers would be (very?) messy. And
> > you'd probably want it earlyish to recombine things.
>
> You can specify the ABI alignment to 8-bytes in the data layout for the 32-bit pointer for struct layout

I was more thinking in terms of the pass converting all value representations of pointers to addrspace(1). That means that when a struct gets loaded or stored directly it needs to be repacked. Completely tractable, but not pretty.

Also, we couldn't do that anyway because the ABI is now very much set in stone (actually has been in that regard since the very first watch came out -- we translate bitcode for armv7k to arm64_32 which is hopelessly doomed if the DataLayouts don't match).

And thanks for the pointers on AMD; I'll take a look at those properly and see what we can learn.

Cheers.

Tim.
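[Editorial sketch, not part of the original message.] The repacking Tim mentions can be pictured as follows: memory keeps the fixed ABI layout with 4-byte pointer fields, while the value form used inside a function carries the pointer widened to 64 bits. The Python model below uses the standard `struct` module; the two-field struct, the format strings, and the helper names are all hypothetical illustrations.

```python
import struct

# Assumed in-memory ABI layout: { i32 tag; 32-bit pointer p }, little-endian.
MEM_FMT = "<II"
# A naive layout that writes the pointer as a full 64-bit value instead.
WIDE_FMT = "<IQ"


def load_struct(buf, offset=0):
    """Load the struct, returning the pointer zero-extended to 64 bits."""
    tag, p32 = struct.unpack_from(MEM_FMT, buf, offset)
    p64 = p32  # zero-extension: the upper 32 bits are simply zero
    return tag, p64


def store_struct(buf, offset, tag, p64):
    """Store the struct, narrowing the 64-bit pointer back to 32 bits."""
    if p64 >> 32:
        raise ValueError("pointer does not fit the 32-bit in-memory slot")
    struct.pack_into(MEM_FMT, buf, offset, tag, p64)


# 8 bytes of struct followed by 4 bytes belonging to a neighbouring object.
buf = bytearray(b"\xAA" * 12)
store_struct(buf, 0, 7, 0x1000)
print(load_struct(buf))      # round-trips through the 4-byte pointer slot
print(bytes(buf[8:]).hex())  # neighbouring bytes are left untouched

# The naive wide form would need more bytes than the ABI struct occupies.
print(struct.calcsize(MEM_FMT), struct.calcsize(WIDE_FMT))
```

The narrowing store writes exactly the 8 bytes the ABI layout allots, whereas writing the widened form directly would spill into whatever follows the struct in memory.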