thr3ads.net - llvm dev - [llvm-dev] [RFC] Introducing a byte type to LLVM [Jun 2021]

Juneyoung Lee via llvm-dev

2021-Jun-23 08:36 UTC

[llvm-dev] [RFC] Introducing a byte type to LLVM

On Wed, Jun 23, 2021 at 2:27 PM Nicolai Hähnle <nhaehnle at gmail.com> wrote:
> On Tue, Jun 22, 2021 at 11:59 AM Ralf Jung via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>> I don't think it makes sense for LLVM to adopt an explicit "exposed" flag
>> in its semantics. Reasoning based on non-determinism works fine, and has
>> the advantage of keeping ptr-to-int casts a pure, side-effect-free
>> operation. This is the model we explored in
>> <https://people.mpi-sws.org/~jung/twinsem/twinsem.pdf>, and we were able
>> to show quite a few of LLVM's standard optimizations correct formally.
>> Some changes are still needed as you noted, but those changes will be
>> required anyway even if LLVM were to adopt PNVI-ae:
>> - No removal of ptr-int-ptr roundtrips.
>>   (https://bugs.llvm.org/show_bug.cgi?id=34548)
>> - No GVN replacement of pointer-typed values.
>>   (https://bugs.llvm.org/show_bug.cgi?id=35229)
>>
>
> I've read this paper now, and it makes good sense to me as something to
> adopt in LLVM.
>
> I do have one question about a point that doesn't seem sufficiently
> justified, though. In the semantics of the paper,
> store-pointer-then-load-as-integer results in poison. This seems to be the
> root cause for being forced to introduce a "byte" type for correctness, but
> it is only really justified by an optimization that eliminates a store that
> writes back a previously loaded value. That optimization doesn't seem all
> that important (but feel free to point out why it is...), while introducing
> a "byte" type is a massive change. On the face of it, that doesn't seem
> like a good trade-off to me.
>
> Has the alternative of allowing type punning through memory at the cost of
> removing that optimization been studied sufficiently elsewhere?
>
The transformation is analogous to removing memcpy-like code with the same
dst and src.
Such code might not be written by humans frequently, but I believe C++'s
template instantiation or optimizations like inlining can expose such a
case.
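
A minimal C sketch of how such a case might arise; the helper names `copy_bytes` and `refresh_slot` are hypothetical and only serve the illustration. After inlining, the copy loop has the same source and destination, so each iteration stores back a value it has just loaded, and the bytes involved happen to belong to a pointer.

```c
#include <stddef.h>

/* Hypothetical generic helper: copies n bytes one at a time, like a
   hand-written memcpy. There is no 'restrict', so dst and src may alias. */
void copy_bytes(unsigned char *dst, const unsigned char *src, size_t n) {
    for (size_t i = 0; i < n; i++) {
        unsigned char b = src[i];  /* load */
        dst[i] = b;                /* store of the value just loaded */
    }
}

/* Hypothetical caller: once copy_bytes is inlined here, dst == src, so
   every iteration writes back a previously loaded value. The bytes being
   copied are the representation of a pointer. */
void refresh_slot(int **slot) {
    copy_bytes((unsigned char *)slot, (const unsigned char *)slot,
               sizeof *slot);
}
```

An optimizer would like to delete the whole loop in `refresh_slot`, which is only sound if loading those bytes and storing them back leaves memory unchanged, provenance included.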

Juneyoung

> Cheers,
> Nicolai
>
> --
> Lerne, wie die Welt wirklich ist,
> aber vergiss niemals, wie sie sein sollte.
>

-- 

Juneyoung Lee
Software Foundation Lab, Seoul National University

Ralf Jung via llvm-dev

2021-Jun-23 12:52 UTC

[llvm-dev] [RFC] Introducing a byte type to LLVM

Hi Nicolai,
>     I've read this paper now, and it makes good sense to me as something to
>     adopt in LLVM.
:)
>     I do have one question about a point that doesn't seem sufficiently
>     justified, though. In the semantics of the paper,
>     store-pointer-then-load-as-integer results in poison. This seems to be
>     the root cause for being forced to introduce a "byte" type for
>     correctness, but it is only really justified by an optimization that
>     eliminates a store that writes back a previously loaded value. That
>     optimization doesn't seem all that important (but feel free to point out
>     why it is...), while introducing a "byte" type is a massive change. On
>     the face of it, that doesn't seem like a good trade-off to me.
>
>     Has the alternative of allowing type punning through memory at the cost
>     of removing that optimization been studied sufficiently elsewhere?
>
>
> The transformation is analogous to removing memcpy-like code with the same
> dst and src.
> Such code might not be written by humans frequently, but I believe C++'s
> template instantiation or optimizations like inlining can expose such a
> case.
To add to what Juneyoung said:
I don't think that experiment has been done. From what I can see, the
alternative you propose leads to an internally consistent model -- one "just"
has to account for the fact that a "load i64" might do some transformation on
the data to actually obtain an integer result (namely, it might do a
ptrtoint).
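
At the C level, the pattern in question looks roughly like the sketch below. The function names are hypothetical, and the sketch assumes a platform where pointers and `uintptr_t` have the same size and representation. Under the alternative discussed here, the integer-typed read of the pointer's bytes would behave like the explicit cast; under the semantics of the paper, the corresponding LLVM-level integer load would yield poison instead.

```c
#include <stdint.h>
#include <string.h>

/* Assumption of this sketch: pointers and uintptr_t have the same size. */
_Static_assert(sizeof(uintptr_t) == sizeof(int *),
               "flat pointer layout assumed");

/* Store a pointer to memory, then read the same bytes back as an integer.
   This is the "store-pointer-then-load-as-integer" pattern. */
uintptr_t load_as_integer(int *p) {
    int *slot = p;                      /* pointer-typed store */
    uintptr_t bits;
    memcpy(&bits, &slot, sizeof bits);  /* integer-typed read of those bytes */
    return bits;
}

/* The explicit conversion that the integer load would implicitly perform. */
uintptr_t explicit_cast(int *p) {
    return (uintptr_t)p;
}
```

On mainstream flat-address targets the two functions are expected to return the same value; the C standard itself leaves the result of the cast implementation-defined.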

However, I am a bit worried about what happens when we eventually add proper
support for 'restrict'/'noalias': the only models I know for that one actually
make 'ptrtoint' have side-effects on the memory state (similar to setting the
'exposed' flag in the C provenance TS). I can't (currently) demonstrate that
this is *required*, but I also don't know an alternative. So if this remains
the case, and if we say "load i64" performs a ptrtoint when needed, then that
would mean we could not do dead load elimination any more as that would remove
the ptrtoint side-effect.
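
The concern can be made concrete with a deliberately simplified toy model in C. It is not LLVM's semantics and not the C provenance TS; `ToyAlloc`, `toy_ptrtoint`, `toy_inttoptr_ok` and `later_inttoptr_succeeds` are hypothetical names invented for this sketch. The only thing modelled is that a pointer-to-integer conversion marks an allocation as exposed, and that a later integer-to-pointer conversion is usable only for exposed allocations.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy allocation record; fields are illustrative only. */
typedef struct {
    uintptr_t base;     /* start address */
    uintptr_t size;     /* size in bytes */
    bool      exposed;  /* set once the address has leaked as an integer */
} ToyAlloc;

/* Toy ptrtoint: returns the address and, as a side effect, marks the
   allocation as exposed. */
uintptr_t toy_ptrtoint(ToyAlloc *a, uintptr_t offset) {
    a->exposed = true;
    return a->base + offset;
}

/* Toy inttoptr: the resulting pointer is usable only if the address lies
   inside an allocation that has been exposed. */
bool toy_inttoptr_ok(const ToyAlloc *a, uintptr_t addr) {
    return a->exposed && addr >= a->base && addr - a->base < a->size;
}

/* A program whose integer load of a pointer (an implicit ptrtoint in this
   toy model) has an unused result. keep_dead_load == false models dead
   load elimination having removed it. */
bool later_inttoptr_succeeds(bool keep_dead_load) {
    ToyAlloc a = { 0x1000, 8, false };
    if (keep_dead_load) {
        (void)toy_ptrtoint(&a, 0);  /* result unused, side effect remains */
    }
    return toy_inttoptr_ok(&a, 0x1000);
}
```

In this toy model, `later_inttoptr_succeeds(true)` and `later_inttoptr_succeeds(false)` differ, which is exactly why removing the "dead" load would no longer be a valid transformation.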

There also is the somewhat conceptual concern that LLVM ought to have a type
that can losslessly hold all kinds of data that exist in LLVM. Currently, that
is not the case -- 'iN' cannot hold data with provenance.

Kind regards,
Ralf
> 
> Juneyoung
> 
> 
>     Cheers,
>     Nicolai
> 
>     -- 
>     Lerne, wie die Welt wirklich ist,
>     aber vergiss niemals, wie sie sein sollte.
> 
> 
> 
> -- 
> 
> Juneyoung Lee
> Software Foundation Lab, Seoul National University
-- 
Website: https://people.mpi-sws.org/~jung/
