thr3ads.net - llvm dev - [llvm-dev] GEP with a null pointer base [Jul 2017]

Marcin Słowik via llvm-dev

2017-Jul-09 20:10 UTC

[llvm-dev] GEP with a null pointer base

Can we go back a little?

> 1) Add a new transformation to InstCombine that will replace
> 'getelementptr i8, i8* null, <ty> %n' with 'inttoptr <ty> %n to i8*'
> when <ty> has the same size as a pointer for the target architecture.

What's the actual problem with this approach? I personally find it the most
compelling - it is well-defined (well, somewhat), front-end agnostic (and I
assume some front ends may find this kind of pointer arithmetic to be
well-defined) and predictable.
I would even extend it to allow offsets of different types to be used, with
additional zero-extension when applicable.

Cheers,
Marcin
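
A rough C model of the rewrite under discussion, not code from the thread: integer-to-pointer conversion stands in for `inttoptr`, and the two narrower-offset helpers show why the choice of extension matters once the offset type is smaller than a pointer. All function names here are hypothetical, and the result of converting an integer to a pointer is implementation-defined in C.

```c
#include <stdint.h>

/* Hypothetical helper: models 'inttoptr <ty> %n to i8*' when <ty> is
   pointer-sized. The conversion result is implementation-defined in C. */
char *ptr_from_offset(uintptr_t n) {
    return (char *)n;
}

/* Hypothetical helper: a 16-bit offset widened by zero-extension, as the
   message suggests for offset types narrower than a pointer. */
char *ptr_from_offset_zext16(uint16_t n) {
    return ptr_from_offset((uintptr_t)n);
}

/* For contrast: the same 16 bits widened by sign-extension instead. */
char *ptr_from_offset_sext16(int16_t n) {
    return ptr_from_offset((uintptr_t)(intptr_t)n);
}
```

On a typical flat-address target, `ptr_from_offset_zext16(0xFFFF)` yields address 0xFFFF while `ptr_from_offset_sext16(-1)` yields the all-ones address, so the two widenings disagree whenever the top bit of the offset is set.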

David Majnemer via llvm-dev

2017-Jul-10 01:23 UTC

[llvm-dev] GEP with a null pointer base

On Sun, Jul 9, 2017 at 1:10 PM, Marcin Słowik via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Can we go back a little?
>
>> 1) Add a new transformation to InstCombine that will replace
>> 'getelementptr i8, i8* null, <ty> %n' with 'inttoptr <ty> %n to i8*'
>> when <ty> has the same size as a pointer for the target architecture.
>
>
> What's the actual problem with this approach? I personally find it the
> most compelling - it is well-defined (well, somewhat), front-end agnostic
> (and I assume some front ends may find this kind of pointer arithmetic to be
> well-defined) and predictable.
> I would even extend it to allow offsets of different types to be used,
> with additional zero-extension when applicable.
>
This would make correctness of a program dependent on running a particular
optimization pass, something which is not sound from a semantics point of
view (what if another pass sees the gep of null before InstCombine does,
etc.).

LLVM IR has semantics, properties which we are supposed to use to reason
about what a particular piece of IR does. This proposed transformation,
while legal, is not mandatory. Making it mandatory means more than adding
one particular change to InstCombine: it means a change to the semantics of
LLVM IR. This way we require that all passes, analyses, etc. treat gep null
in an appropriate way.

There are many reasons why such a semantic shift would be undesirable:
- It opens up a Pandora's box with regard to the semantics of
transformations on GEPs when commuted and combined with other GEPs.
- It results in less expressivity: frontends should emit the IR that matches
the semantics of their source language. Constraining GEP semantics would
constrain it for frontends which do not want or need this semantic shift.

If the goal is to make (char*)0 + n work in clang, clang should be the
bearer of that burden. It is not difficult to implement this in AST->IR
lowering and has several benefits:
- No change to GEP semantics, which means that existing optimizations are
sound.
- Easy to explain why, when and how clang's behavior shifts with regard to
particular source expressions and their lowering.
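
To make the frontend-side option concrete, here is a toy C sketch of the lowering decision. It is not clang's actual code: the `ptr_add` structure and `lower_ptr_add` function are hypothetical, and the `i64` offset type assumes a 64-bit target. The point is only that the special case lives in the frontend, so the IR never contains a GEP on null for this idiom.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Toy stand-in for a frontend's view of 'base + offset' on a char pointer.
   This is a hypothetical structure, not clang's real AST. */
struct ptr_add {
    bool base_is_null_constant; /* true for source like (char*)0 + n */
    const char *base;           /* IR name of the base pointer, e.g. "%p" */
    const char *offset;         /* IR name of the offset, e.g. "%n" */
};

/* Writes the IR text this toy frontend would emit into 'out'.
   Assumes a 64-bit target, so the offset is shown as i64.
   Returns the value returned by snprintf. */
int lower_ptr_add(const struct ptr_add *e, char *out, size_t cap) {
    if (e->base_is_null_constant) {
        /* Special case handled in the frontend: no GEP on null is emitted. */
        return snprintf(out, cap, "inttoptr i64 %s to i8*", e->offset);
    }
    /* Ordinary pointer arithmetic keeps using GEP with unchanged semantics. */
    return snprintf(out, cap, "getelementptr i8, i8* %s, i64 %s",
                    e->base, e->offset);
}
```

Because the choice is made once, at lowering time, no optimization pass has to run (or run first) for the program to mean what the source language says it means.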


Chandler Carruth via llvm-dev

2017-Jul-10 05:45 UTC

[llvm-dev] GEP with a null pointer base

On Sun, Jul 9, 2017 at 9:24 PM David Majnemer via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> On Sun, Jul 9, 2017 at 1:10 PM, Marcin Słowik via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> Can we go back a little?
>>
>>> 1) Add a new transformation to InstCombine that will replace
>>> 'getelementptr i8, i8* null, <ty> %n' with 'inttoptr <ty> %n to i8*'
>>> when <ty> has the same size as a pointer for the target architecture.
>>
>>
>> What's the actual problem with this approach? I personally find it the
>> most compelling - it is well-defined (well, somewhat), front-end agnostic
>> (and I assume some front ends may find this kind of pointer arithmetic to be
>> well-defined) and predictable.
>> I would even extend it to allow offsets of different types to be used,
>> with additional zero-extension when applicable.
>>
>
> This would make correctness of a program dependent on running a particular
> optimization pass, something which is not sound from a semantics point of
> view (what if another pass sees the gep of null before InstCombine does,
> etc.).
>
> LLVM IR has semantics, properties which we are supposed to use to reason
> about what a particular piece of IR does. This proposed transformation,
> while legal, is not mandatory. Making it mandatory means more than adding
> one particular change to InstCombine: it means a change to the semantics of
> LLVM IR. This way we require that all passes, analyses, etc. treat gep null
> in an appropriate way.
>
> There are many reasons why such a semantic shift would be undesirable:
> - It opens up a Pandora's box with regard to the semantics of
> transformations on GEPs when commuted and combined with other GEPs.
> - It results in less expressivity: frontends should emit the IR that matches
> the semantics of their source language. Constraining GEP semantics would
> constrain it for frontends which do not want or need this semantic shift.
>
> If the goal is to make (char*)0 + n work in clang, clang should be the
> bearer of that burden. It is not difficult to implement this in AST->IR
> lowering and has several benefits:
> - No change to GEP semantics, which means that existing optimizations are
> sound.
> - Easy to explain why, when and how clang's behavior shifts with regard to
> particular source expressions and their lowering.
>
Just wanted to say that I emphatically agree with all of this.

(And with making the above craziness work in Clang as a pragmatic way to
support real code in the wild even if it is undesirable code in the wild.)