thr3ads.net - llvm dev - [llvm-dev] Design issues in LLVM IR [Jun 2021]

David Blaikie via llvm-dev

2021-Jun-10 17:18 UTC

[llvm-dev] Design issues in LLVM IR

Re: Opaque pointers - yeah, sorry I've left that lingering for years.
+Arthur Eubanks <aeubanks at google.com> has picked that up recently (&
credit to a few others too - +James Y Knight <jyknight at google.com>,
+Tim Northover <t.p.northover at gmail.com>, +Matt Arsenault
<arsenm2 at gmail.com> etc along the way) & seems to be making good
progress.

(& agreed - it's crossed my mind that gep starts to look "strange" once
pointers are typeless - but I wouldn't want to get ahead of ourselves and
start removing gep in favor of more raw pointer arithmetic while we still
haven't fully transitioned to opaque pointers)
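
To make the gep point concrete, here is a minimal Python sketch (not LLVM
code) of how a type-based gep index list reduces to the single byte offset
that raw pointer arithmetic would use. The type classes, the helper names,
and the natural-alignment struct layout are all assumptions made for
illustration; real layouts come from the target's data layout.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple, Union


@dataclass(frozen=True)
class Int:
    bits: int  # assumed to be a multiple of 8


@dataclass(frozen=True)
class Array:
    elem: "Type"
    count: int


@dataclass(frozen=True)
class Struct:
    fields: Tuple["Type", ...]  # assumed non-empty


Type = Union[Int, Array, Struct]


def round_up(n: int, align: int) -> int:
    return (n + align - 1) // align * align


def align_of(ty: Type) -> int:
    # Assumption: every type is naturally aligned.
    if isinstance(ty, Int):
        return ty.bits // 8
    if isinstance(ty, Array):
        return align_of(ty.elem)
    return max(align_of(f) for f in ty.fields)


def size_of(ty: Type) -> int:
    if isinstance(ty, Int):
        return ty.bits // 8
    if isinstance(ty, Array):
        return size_of(ty.elem) * ty.count
    end = 0
    for f in ty.fields:
        end = round_up(end, align_of(f)) + size_of(f)
    return round_up(end, align_of(ty))  # tail padding


def field_offset(st: Struct, index: int) -> int:
    offset = 0
    for i, f in enumerate(st.fields):
        offset = round_up(offset, align_of(f))
        if i == index:
            return offset
        offset += size_of(f)
    raise IndexError(index)


def gep_offset(source_ty: Type, indices: Sequence[int]) -> int:
    """Byte offset selected by a gep-style index list over source_ty."""
    # The first index steps over whole objects of the source type.
    offset = indices[0] * size_of(source_ty)
    ty = source_ty
    for idx in indices[1:]:
        if isinstance(ty, Array):
            ty = ty.elem
            offset += idx * size_of(ty)
        elif isinstance(ty, Struct):
            offset += field_offset(ty, idx)
            ty = ty.fields[idx]
        else:
            raise TypeError("cannot index into a scalar type")
    return offset
```

For a [10 x [20 x i32]] array, the indices (0, i, j) and the byte offset
(i*20 + j)*4 name the same address; the typed form keeps i and j visible as
separate subscripts, while the byte form has already multiplied them
together.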

On Wed, Jun 9, 2021 at 9:19 AM Chris Lattner via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Nikita Popov wrote a great blog post last week: “Design issues in LLVM IR
> <https://www.npopov.com/2021/06/02/Design-issues-in-LLVM-IR.html>” that I
> just found.  It is well framed and nicely written, and it seems like a
> good idea to discuss it on llvm-dev.  :-)
>
> Here are my 2c for what it is worth:
>
> a) I completely agree we should continue to invest in fixing the core of
> LLVM.  There are long standing issues that we should fix, and not doing so
> slows things down, leads to worse quality of results, etc.
>
> b) I completely agree with his framing on canonicalization and its value.
> I think that LLVM has historically taken this a bit too far (e.g. loop
> transformations, the old IndVar/LSR dichotomy among others) but many of
> those have already been walked back.
>
> c) I completely agree we need to continue to march towards opaque
> pointers, I’m a fan of this work.
>
> d) I’m less enthused about eliminating type based GEP.  The post is right
> that indexing computations are expensive, but that is largely due to the
> algorithms used, not the IR structure.  If this was the thing to fix, then
> we should fix other aspects of the design.  The thing that I’m particularly
> concerned about is array indexes: I think we need to preserve the ability
> to do simple dependence analysis and other array subscript indexing
> analyses in the middle end.  I think the sweet spot is to drop types from
> pointers, but keep them on GEPs.  Alternatively, finish the typeless
> pointer migration and then evaluate what to do with GEPs only when that
> completes.
>
> e) Constant Expressions are a disaster.  In addition to the problem
> identified, there are also many annoying cases to deal with, e.g. when
> constexprs exist in phi nodes, trapping constexprs, etc.  In my opinion,
> the fix is to eliminate them entirely, in a few steps:
>
>     1) Introduce a new “RelocatableConstant” object which is *not* a
> mirror of all the IR operations in LLVM, but is instead designed to be used
> in global variables and allows the standard “globalpointer+offset” pattern
> that object files support.  We should also add a new
> MachoRelocatableConstant class to represent the “(gv1-gv2+offset)”
> relocations that Mach-O supports.  The presence of these would make codegen
> and frontends easier to write, and get rid of all the fiddly pattern
> matching stuff.  I think we need to talk about whether “offset” is a byte
> offset, or whether it is a series of (constant integer) field indexes in a
> GEP-like operation.  I would argue for the latter to make interprocedural
> optimizations easier to write, but it is debatable.
>
>     2) Move the general constant folding API off of ConstantExpr to
> somewhere else, it never should have been there for reasons pointed out in
> the blog.
>
>     3) Eliminate ConstantExpr: after #1, we don’t need a mirror of the LLVM
> IR in constant nodes.  Constant folding should be a failable operation and
> would return the primitive nodes like ConstantInt.  The asmparser / byte
> code parser could auto-upgrade general unfolded constexprs to instructions
> when in a function, and to [Macho]RelocatableConstant otherwise.
>
> In any case, I’d love to see progress on any of these.  I’d personally
> love to see the typeless pointers land because we’re in an unfortunate
> in-between state, and we should close off partial transitions.
>
> -Chris
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
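
The constant-expression steps proposed above can be sketched roughly as
follows. This is a Python model under stated assumptions, not an LLVM API:
the two class names come from the proposal, but their fields, the try_fold
helper, the string opcodes, and the use of unbounded Python integers in
place of fixed-width ConstantInt values are hypothetical. "offset" is
modelled as a byte offset, which is only one of the two options the
proposal leaves open.

```python
from dataclasses import dataclass, replace
from typing import Optional, Union


@dataclass(frozen=True)
class RelocatableConstant:
    """The "globalpointer+offset" pattern."""
    symbol: str
    offset: int = 0  # assumption: byte offset, not a field-index path


@dataclass(frozen=True)
class MachoRelocatableConstant:
    """The "(gv1-gv2+offset)" pattern."""
    minuend: str
    subtrahend: str
    offset: int = 0


Constant = Union[int, RelocatableConstant, MachoRelocatableConstant]


def try_fold(op: str, lhs: Constant, rhs: Constant) -> Optional[Constant]:
    """Failable folding: return a primitive constant, or None if the
    operation cannot be represented without an expression tree."""
    if isinstance(lhs, int) and isinstance(rhs, int):
        if op == "add":
            return lhs + rhs
        if op == "sub":
            return lhs - rhs
        if op == "udiv":
            # Division by zero would trap, so folding fails instead.
            return lhs // rhs if lhs >= 0 and rhs > 0 else None
        return None
    relocs = (RelocatableConstant, MachoRelocatableConstant)
    if isinstance(lhs, relocs) and isinstance(rhs, int):
        if op == "add":
            return replace(lhs, offset=lhs.offset + rhs)
        if op == "sub":
            return replace(lhs, offset=lhs.offset - rhs)
        return None
    if (op == "sub" and isinstance(lhs, RelocatableConstant)
            and isinstance(rhs, RelocatableConstant)):
        if lhs.symbol == rhs.symbol:
            return lhs.offset - rhs.offset  # same base: plain integer
        return MachoRelocatableConstant(
            lhs.symbol, rhs.symbol, lhs.offset - rhs.offset)
    return None
```

A caller that gets None back would emit an instruction inside a function,
or reject the initializer in a global, instead of building a constant
expression node.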

Madhur Amilkanthwar via llvm-dev

2021-Jun-10 17:25 UTC

[llvm-dev] Design issues in LLVM IR

Speaking of which, I think it would be useful to document the progress of
the migration to opaque pointers somewhere. Going one step further, if we
identified fine-grained work items (rather than the high-level items on the
opaque pointers page), it would be easy for people to pick them up.

