thr3ads.net - llvm dev - [llvm-dev] [RFC] arm64_32: upstreaming ILP32 support for AArch64 [Feb 2019]


Tim Northover via llvm-dev

2019-Feb-01 20:00 UTC

[llvm-dev] [RFC] arm64_32: upstreaming ILP32 support for AArch64

On Fri, 1 Feb 2019 at 19:25, Eli Friedman <efriedma at quicinc.com> wrote:
> > Alternate address-spaces still have just one pointer size per space as
> > far as I'm aware. If that's 64-bits we get efficient CodeGen but
> > loading or storing a pointer clobbers more data than it should, if
> > that's 32-bits then we get poor CodeGen.
>
> I was thinking of a model something like this: 32-bit pointers are
> addrspace 0, 64-bit pointers are addrspace 1.  ISD::LOAD/STORE in addrspace 0
> are not legal: they're custom-lowered to operations in addrspace 1.  (An
> addrspacecast from 0 to 1 is just zero-extension.)  At that point, since the
> cast from 32 bits to 64 bits is explicitly represented, we can optimize it in
> the DAG or IR. For example, we can transform a load of an inbounds gep in
> addrspace 0 into a load of an inbounds gep in addrspace 1.

That would have to be an IR-level pass I think; otherwise the default
MVT for any J. Random Pointer Value is still i32, leading to the same
efficiency issues when you eventually use that on a load/store.

With a pass, within a function you ought to be able to promote all
uses of addrspace(0) to addrspace(1), leaving (as you say)
addrspacecasts at opaque sources and sinks (loads, stores, args,
return, ...). Structs containing pointers would be (very?) messy. And
you'd probably want it earlyish to recombine things.

I do like LLVM passes as a solution for most problems, and it ought
to give a big head start to GlobalISel implementation too. I'll
definitely give it a go as an alternative next week.

Cheers.

Tim.

Matt Arsenault via llvm-dev

2019-Feb-01 20:08 UTC

> On Feb 1, 2019, at 3:00 PM, Tim Northover via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> On Fri, 1 Feb 2019 at 19:25, Eli Friedman <efriedma at quicinc.com> wrote:
>>> Alternate address-spaces still have just one pointer size per space as
>>> far as I'm aware. If that's 64-bits we get efficient CodeGen but
>>> loading or storing a pointer clobbers more data than it should, if
>>> that's 32-bits then we get poor CodeGen.
>>
>> I was thinking of a model something like this: 32-bit pointers are
>> addrspace 0, 64-bit pointers are addrspace 1.  ISD::LOAD/STORE in addrspace 0
>> are not legal: they're custom-lowered to operations in addrspace 1.  (An
>> addrspacecast from 0 to 1 is just zero-extension.)  At that point, since the
>> cast from 32 bits to 64 bits is explicitly represented, we can optimize it in
>> the DAG or IR. For example, we can transform a load of an inbounds gep in
>> addrspace 0 into a load of an inbounds gep in addrspace 1.
> 
> That would have to be an IR-level pass I think; otherwise the default
> MVT for any J. Random Pointer Value is still i32, leading to the same
> efficiency issues when you eventually use that on a load/store.

I don’t see why this would need to be an IR pass. There aren’t all that many
places left using the default argument to the various pointer functions, and
those can mostly be fixed. iPTR is hopelessly broken on the tablegen side, but
you wouldn’t get to that point with this.

> With a pass, within a function you ought to be able to promote all
> uses of addrspace(0) to addrspace(1), leaving (as you say)
> addrspacecasts at opaque sources and sinks (loads, stores, args,
> return, ...). Structs containing pointers would be (very?) messy. And
> you'd probably want it earlyish to recombine things.

You can set the ABI alignment of the 32-bit pointer to 8 bytes in the data
layout, which takes care of struct layout.
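
To make the struct-layout suggestion concrete, here is a toy Python model (not LLVM code). It assumes a layout rule where each field is placed at the next multiple of its ABI alignment and occupies its size rounded up to that alignment; that rule, the helper names `align_to` and `struct_layout`, and the example struct are assumptions made for illustration. In data layout terms, a pointer entry has the form `p[n]:<size>:<abi>`, so the middle case below corresponds to something like `p:32:64`.

```python
def align_to(value, align):
    """Round value up to the next multiple of align."""
    return (value + align - 1) // align * align


def struct_layout(fields):
    """Lay out fields in declaration order.

    fields: list of (store_size_bytes, abi_align_bytes) pairs.
    Returns (field_offsets, total_struct_size).
    """
    offsets = []
    size = 0
    max_align = 1
    for store_size, abi_align in fields:
        # Place the field at the next suitably aligned offset.
        size = align_to(size, abi_align)
        offsets.append(size)
        # Assumption: a field occupies its size rounded up to its alignment.
        size += align_to(store_size, abi_align)
        max_align = max(max_align, abi_align)
    # Pad the tail so arrays of the struct keep every element aligned.
    return offsets, align_to(size, max_align)


# (size, ABI alignment) in bytes
PTR32_ALIGN4 = (4, 4)  # 32-bit pointer, 4-byte ABI alignment
PTR32_ALIGN8 = (4, 8)  # 32-bit pointer, ABI alignment raised to 8 bytes
PTR64 = (8, 8)         # 64-bit pointer
I32 = (4, 4)

# Model of: struct { void *a; int32_t x; void *b; }
print(struct_layout([PTR32_ALIGN4, I32, PTR32_ALIGN4]))  # ([0, 4, 8], 12)
print(struct_layout([PTR32_ALIGN8, I32, PTR32_ALIGN8]))  # ([0, 8, 16], 24)
print(struct_layout([PTR64, I32, PTR64]))                # ([0, 8, 16], 24)
```

Under this model, raising the alignment gives every pointer its own 8-byte slot, so the field offsets match the 64-bit-pointer layout. The price is that the struct no longer has the 12-byte layout of the 4-byte-aligned case.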

-Matt

Tim Northover via llvm-dev

2019-Feb-01 20:24 UTC

On Fri, 1 Feb 2019 at 20:08, Matt Arsenault <arsenm2 at gmail.com> wrote:
> I don’t see why this would need to be an IR pass. There aren’t all that
> many places left using the default argument to the various pointer functions,
> and those can mostly be fixed. iPTR is hopelessly broken on the tablegen side,
> but you wouldn’t get to that point with this.

The difficulty I'm seeing is that we need GEP to be lowered to i64
arithmetic, but that happens in SelectionDAGBuilder before the target
has any real opportunity to override anything. Once the GEP has been
converted to DAG, the critical information is already gone and we just
have i32 ADD/MUL trees.
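
A toy Python model of why the width of the address arithmetic matters (not LLVM code; the function names are hypothetical and Python integers stand in for machine values). It compares computing `base + index * scale` with 32-bit wrapping arithmetic and then zero-extending, against zero-extending the base first and doing the arithmetic in 64 bits. The two agree whenever the 32-bit computation does not wrap, which is the property an inbounds GEP is assumed to supply in this sketch, and they differ when it does.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def gep_i32_then_zext(base, index, scale):
    """Address computed with 32-bit wrapping ADD/MUL, then zero-extended."""
    return (base + index * scale) & MASK32


def zext_then_gep_i64(base, index, scale):
    """Base zero-extended to 64 bits first, then 64-bit ADD/MUL."""
    return ((base & MASK32) + index * scale) & MASK64


# In-range accesses: both forms produce the same address.
print(hex(gep_i32_then_zext(0x1000, 4, 8)), hex(zext_then_gep_i64(0x1000, 4, 8)))
print(hex(gep_i32_then_zext(0x1000, -1, 8)), hex(zext_then_gep_i64(0x1000, -1, 8)))

# An access that wraps past 2**32: the two forms disagree, so the 64-bit
# form is only a valid substitute when wrapping is known not to happen.
print(hex(gep_i32_then_zext(0xFFFFFFF8, 2, 8)), hex(zext_then_gep_i64(0xFFFFFFF8, 2, 8)))
```

Once only the 32-bit form is left, nothing records whether wrapping was possible, so the 64-bit form cannot be substituted safely.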

The two options I see for making that happen favourably are an IR pass
or deep surgery on Clang, which seems even less appealing.

> > With a pass, within a function you ought to be able to promote all
> > uses of addrspace(0) to addrspace(1), leaving (as you say)
> > addrspacecasts at opaque sources and sinks (loads, stores, args,
> > return, ...). Structs containing pointers would be (very?) messy. And
> > you'd probably want it earlyish to recombine things.
>
> You can specify the ABI alignment to 8-bytes in the data layout for the
> 32-bit pointer for struct layout

I was more thinking in terms of the pass converting all value
representations of pointers to addrspace(1). That means that when a
struct gets loaded or stored directly it needs to be repacked.
Completely tractable, but not pretty.

Also, we couldn't do that anyway because the ABI is now very much set
in stone (actually has been in that regard since the very first watch
came out -- we translate bitcode for armv7k to arm64_32 which is
hopelessly doomed if the DataLayouts don't match).

And thanks for the pointers on AMD; I'll take a look at those properly
and see what we can learn.

Cheers.

Tim.
