thr3ads.net - llvm dev - [llvm-dev] [RFC] Introducing a byte type to LLVM [Jun 2021]


John McCall via llvm-dev

2021-Jun-14 05:33 UTC

[llvm-dev] [RFC] Introducing a byte type to LLVM

On 13 Jun 2021, at 11:26, Ralf Jung wrote:
> Hi Johannes,
>
>>> I think Joshua gave a very nice motivation already.
>>
>> I don't dispute that but I am still not understanding the need for 
>> bytes. None of the examples I have seen so far
>> clearly made the point that it is the byte types that provide a 
>> substantial benefit. The AA example below does neither.
>
> I hope
> <https://lists.llvm.org/pipermail/llvm-dev/2021-June/151110.html>
> makes a convincing case that under the current semantics, when one
> does an "i64" load of a value that was stored at pointer type, we have
> to say that this load returns poison. In particular, saying that this
> implicitly performs a "ptrtoint" is inconsistent with optimizations
> that are probably too important to be changed to accommodate this
> implicit "ptrtoint".

I think it is in fact rather obvious that, if this optimization as
currently written is indeed in conflict with the current semantics, it
is the optimization that will have to give.  If the optimization is too
important for performance to give up entirely, we will simply have to
find some more restricted pattern that we can still soundly optimize.

Perhaps the clearest reason is that, if we did declare that integer 
types cannot carry pointers and so introduced byte types that could, C 
frontends would have to switch to byte types for their integer types, 
and so we would immediately lose this supposedly important optimization 
for C-like languages, and so, since optimizing C is very important, we 
would immediately need to find some restricted pattern under which we 
could soundly apply this optimization to byte types.  That’s assuming 
that this optimization is actually significant, of course.

John.

Nicolai Hähnle via llvm-dev

2021-Jun-14 06:22 UTC

[llvm-dev] [RFC] Introducing a byte type to LLVM

On Mon, Jun 14, 2021 at 7:34 AM John McCall via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
>
>
> On 13 Jun 2021, at 11:26, Ralf Jung wrote:
>
> > Hi Johannes,
> >
> >>> I think Joshua gave a very nice motivation already.
> >>
> >> I don't dispute that but I am still not understanding the need for
> >> bytes. None of the examples I have seen so far
> >> clearly made the point that it is the byte types that provide a
> >> substantial benefit. The AA example below does neither.
> >
> > I hope
> > <https://lists.llvm.org/pipermail/llvm-dev/2021-June/151110.html>
> > makes a convincing case that under the current semantics, when one
> > does an "i64" load of a value that was stored at pointer type, we have
> > to say that this load returns poison. In particular, saying that this
> > implicitly performs a "ptrtoint" is inconsistent with optimizations
> > that are probably too important to be changed to accommodate this
> > implicit "ptrtoint".
>
> I think it is in fact rather obvious that, if this optimization as
> currently written is indeed in conflict with the current semantics, it
> is the optimization that will have to give.  If the optimization is too
> important for performance to give up entirely, we will simply have to
> find some more restricted pattern that we can still soundly optimize.
>
I tend to agree. I don't think Ralf's example alone is convincing evidence
that pointer-load of integer-store must be poison, i.e. memory must be
typed.

FWIW, the least important optimization in that example's chain, and the one
that is most obviously incorrect in an untyped memory world, is eliminating
a store of a previously loaded value. How much would we actually lose if we
disable this particular optimization? Note that this is only a small
special case of dead store elimination. The more common case where there
are two stores to memory in a row, and the first one is eliminated, is
still correct.
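
To make the distinction concrete, here is a minimal C sketch of the two source-level shapes being contrasted. The function names are hypothetical and chosen only for illustration; the sketch shows the patterns themselves, not LLVM IR and not what any particular pass does. With plain integers both functions behave as the comments say; the question in this thread is only whether removing the store in the first pattern stays valid when the loaded bytes might hold a pointer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */

/* Pattern A: store back a value that was just loaded from the same slot.
   At the source level the store looks redundant. */
void store_back_loaded(uint64_t *slot) {
    uint64_t v = *slot; /* load */
    *slot = v;          /* store of the previously loaded value */
}

/* Pattern B: two stores in a row; the first one is overwritten before
   anything can read it. */
void two_stores(uint64_t *slot, uint64_t first, uint64_t second) {
    *slot = first;  /* overwritten by the next store */
    *slot = second;
}

/* Small self-check of the observable integer behaviour. */
void check_patterns(void) {
    uint64_t slot = 7;
    store_back_loaded(&slot);
    assert(slot == 7);
    two_stores(&slot, 1, 2);
    assert(slot == 2);
}
```

Only the store in pattern A is at issue here; the first store in pattern B can be dropped regardless of how the load question is settled.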

Cheers,
Nicolai


>
> Perhaps the clearest reason is that, if we did declare that integer
> types cannot carry pointers and so introduced byte types that could, C
> frontends would have to switch to byte types for their integer types,
> and so we would immediately lose this supposedly important optimization
> for C-like languages, and so, since optimizing C is very important, we
> would immediately need to find some restricted pattern under which we
> could soundly apply this optimization to byte types.  That’s assuming
> that this optimization is actually significant, of course.
>
> John.

-- 
Learn how the world really is,
but never forget how it should be.

Ralf Jung via llvm-dev

2021-Jun-14 11:04 UTC

[llvm-dev] [RFC] Introducing a byte type to LLVM

Hi,
>>> I don't dispute that but I am still not understanding the need for bytes.
>>> None of the examples I have seen so far
>>> clearly made the point that it is the byte types that provide a substantial
>>> benefit. The AA example below does neither.
>>
>> I hope <https://lists.llvm.org/pipermail/llvm-dev/2021-June/151110.html> makes
>> a convincing case that under the current semantics, when one does an "i64"
>> load of a value that was stored at pointer type, we have to say that this load
>> returns poison. In particular, saying that this implicitly performs a
>> "ptrtoint" is inconsistent with optimizations that are probably too important
>> to be changed to accommodate this implicit "ptrtoint".
> 
> I think it is in fact rather obvious that, if this optimization as currently
> written is indeed in conflict with the current semantics, it is the optimization
> that will have to give.  If the optimization is too important for performance to
> give up entirely, we will simply have to find some more restricted pattern that
> we can still soundly optimize.

That is certainly a reasonable approach.
However, judging from how reluctant LLVM is to remove optimizations that are 
much more convincingly wrong [1], my impression was that it is easier to 
complicate the semantics than to remove an optimization that LLVM already
performs.

[1]: https://bugs.llvm.org/show_bug.cgi?id=34548,
      https://bugs.llvm.org/show_bug.cgi?id=35229;
      see https://www.ralfj.de/blog/2020/12/14/provenance.html for a
      more detailed explanation

> Perhaps the clearest reason is that, if we did declare that integer types cannot
> carry pointers and so introduced byte types that could, C frontends would have
> to switch to byte types for their integer types, and so we would immediately
> lose this supposedly important optimization for C-like languages, and so, since
> optimizing C is very important, we would immediately need to find some
> restricted pattern under which we could soundly apply this optimization to byte
> types.  That’s assuming that this optimization is actually significant, of course.

At least C with strict aliasing enabled (i.e., standard C) only needs to use the
byte type for "(un)signed char". The other integer types remain unaffected.
There is no arithmetic on these types ("char + char" is subject to integer
promotion), so the IR overhead would consist in a few "bytecast" instructions
next to / replacing the existing sign extensions that convert "char" to "int"
before performing the arithmetic.
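
As a small illustration of the integer-promotion point, here is a minimal C sketch. The helper names are made up for this example, and it assumes the usual 8-bit "char"; it only shows the C-level behaviour, not the IR a frontend would emit.

```c
#include <assert.h>

/* Hypothetical helper names, for illustration only. Assumes CHAR_BIT == 8. */

/* Both operands are promoted to int before the addition, so the sum is
   computed in int and does not wrap at the width of unsigned char. */
int add_promoted(unsigned char a, unsigned char b) {
    return a + b;
}

/* Converting the int result back to unsigned char is a separate step. */
unsigned char add_truncated(unsigned char a, unsigned char b) {
    return (unsigned char)(a + b);
}

void check_promotion(void) {
    assert(add_promoted(200, 100) == 300);  /* computed in int */
    assert(add_truncated(200, 100) == 44);  /* 300 mod 256 */
    /* The type of "char + char" is int, not char. */
    assert(sizeof((unsigned char)1 + (unsigned char)1) == sizeof(int));
}
```

The arithmetic itself never happens at "char" width, so only the conversions into and out of "int" would touch the byte type.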

Kind regards,
Ralf
