Ralf Jung via llvm-dev
2021-Jun-23 12:52 UTC
[llvm-dev] [RFC] Introducing a byte type to LLVM
Hi Nicolai,

> I've read this paper now, and it makes good sense to me as something to
> adopt in LLVM.

:)

> I do have one question about a point that doesn't seem sufficiently
> justified, though. In the semantics of the paper,
> store-pointer-then-load-as-integer results in poison. This seems to be the
> root cause for being forced to introduce a "byte" type for correctness, but
> it is only really justified by an optimization that eliminates a store that
> writes back a previously loaded value. That optimization doesn't seem all
> that important (but feel free to point out why it is...), while introducing
> a "byte" type is a massive change. On the face of it, that doesn't seem like
> a good trade-off to me.
>
> Has the alternative of allowing type punning through memory at the cost of
> removing that optimization been studied sufficiently elsewhere?
>
> The transformation is analogous to removing memcpy-like code with the same
> dst and src.
> Such code might not be written by humans frequently, but I believe C++'s
> template instantiation or optimizations like inlining can expose such a case.

To add to what Juneyoung said:
I don't think that experiment has been made. From what I can see, the alternative you propose leads to an internally consistent model -- one "just" has to account for the fact that a "load i64" might do some transformation on the data to actually obtain an integer result (namely, it might do a ptrtoint).

However, I am a bit worried about what happens when we eventually add proper support for 'restrict'/'noalias': the only models I know for that one actually make 'ptrtoint' have side-effects on the memory state (similar to setting the 'exposed' flag in the C provenance TS). I can't (currently) demonstrate that this is *required*, but I also don't know an alternative. So if this remains the case, and if we say "load i64" performs a ptrtoint when needed, then that would mean we could not do dead load elimination any more, as that would remove the ptrtoint side-effect.

There is also the somewhat conceptual concern that LLVM ought to have a type that can losslessly hold all kinds of data that exist in LLVM. Currently, that is not the case -- 'iN' cannot hold data with provenance.

Kind regards,
Ralf

--
Website: https://people.mpi-sws.org/~jung/
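For readers who want a source-level picture of the two patterns under discussion, here is a minimal C++ sketch. It is an illustration only: the function names are made up for this note, the sketch assumes `std::uintptr_t` has the same size as a pointer, and whether a compiler lowers these `memcpy` calls to integer loads and stores in LLVM IR depends on the compiler and optimization level.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Assumption of this sketch: an integer type exists that is exactly pointer-sized.
static_assert(sizeof(std::uintptr_t) == sizeof(int*),
              "sketch assumes a pointer-sized uintptr_t");

// Hypothetical helper: store a pointer to memory, then read the same bytes
// back as an integer ("store-pointer-then-load-as-integer").
std::uintptr_t load_pointer_bytes_as_integer(int* p) {
    unsigned char slot[sizeof(int*)];
    std::memcpy(slot, &p, sizeof p);        // the pointer is stored to memory
    std::uintptr_t bits;
    std::memcpy(&bits, slot, sizeof bits);  // the same bytes are loaded as an integer
    return bits;
}

// Hypothetical helper: load a slot as an integer and write the value straight
// back. This is the "store that writes back a previously loaded value".
void write_back_as_integer(int** slot) {
    std::uintptr_t loaded;
    std::memcpy(&loaded, slot, sizeof loaded);
    std::memcpy(slot, &loaded, sizeof loaded);
}
```

If the integer load in `write_back_as_integer` implicitly performed a ptrtoint, the value written back would be a plain integer without provenance, so the store would no longer be a no-op at the IR level and could not simply be deleted. That is the trade-off the quoted question is about.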
Jeroen Dobbelaere via llvm-dev
2021-Jun-23 14:17 UTC
[llvm-dev] [RFC] Introducing a byte type to LLVM
Hi Ralf,

[..]

> To add to what Juneyoung said:
> I don't think that experiment has been made. From what I can see, the
> alternative you propose leads to an internally consistent model -- one "just"
> has to account for the fact that a "load i64" might do some transformation on
> the data to actually obtain an integer result (namely, it might do a ptrtoint).
>
> However, I am a bit worried about what happens when we eventually add proper
> support for 'restrict'/'noalias': the only models I know for that one actually
> make 'ptrtoint' have side-effects on the memory state (similar to setting the
> 'exposed' flag in the C provenance TS). I can't (currently) demonstrate that

For the C standard, it is undefined behavior to convert a restrict pointer to an integer and back to a pointer type.
(At least, that is my interpretation of n2573 6.7.3.1 para 3: note that "based" is defined only for expressions with pointer types.)

For the full restrict patches, we do not track restrict provenance across a ptr2int, except for the 'int2ptr(ptr2int %P)' pattern (which we do track, as LLVM sometimes introduced these pairs; not sure if this is still valid).

Greetings,

Jeroen Dobbelaere

> this is *required*, but I also don't know an alternative. So if this remains
> the case, and if we say "load i64" performs a ptrtoint when needed, then that
> would mean we could not do dead load elimination any more as that would
> remove the ptrtoint side-effect.
>
> There also is the somewhat conceptual concern that LLVM ought to have a type
> that can losslessly hold all kinds of data that exist in LLVM. Currently,
> that is not the case -- 'iN' cannot hold data with provenance.
>
> Kind regards,
> Ralf
On Wed, Jun 23, 2021 at 8:52 AM Ralf Jung via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> However, I am a bit worried about what happens when we eventually add proper
> support for 'restrict'/'noalias': the only models I know for that one actually
> make 'ptrtoint' have side-effects on the memory state (similar to setting the
> 'exposed' flag in the C provenance TS). I can't (currently) demonstrate that
> this is *required*, but I also don't know an alternative. So if this remains the
> case, and if we say "load i64" performs a ptrtoint when needed, then that would
> mean we could not do dead load elimination any more as that would remove the
> ptrtoint side-effect.

Though, of course, dead loads could be replaced with some hypothetical new instruction that has *only* the ptrtoint side-effect (and doesn't produce any assembly). And such an instruction could be subject to further transformations.
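The point about dead loads can be made concrete with a toy model. The C++ sketch below is not LLVM's semantics or any proposed one: `ToyState`, `ToyPointer`, `toy_load_as_integer` and `toy_expose` are all names invented for this note, and the "exposed" set is only a loose stand-in for the flag mentioned in the quoted text.

```cpp
#include <cassert>
#include <set>

// Toy memory state (hypothetical): ids of allocations marked "exposed".
struct ToyState {
    std::set<int> exposed;
};

// Toy pointer value (hypothetical): an address plus the id of its allocation.
struct ToyPointer {
    long address;
    int allocation;
};

// Toy "load i64" from a cell that holds a pointer: it performs an implicit
// ptrtoint, which in this model marks the allocation as exposed.
long toy_load_as_integer(ToyState& state, const ToyPointer& cell) {
    state.exposed.insert(cell.allocation);
    return cell.address;
}

// Toy version of the hypothetical instruction that has only the side effect.
void toy_expose(ToyState& state, const ToyPointer& cell) {
    state.exposed.insert(cell.allocation);
}

// Original program: the loaded integer is never used (a dead load).
bool exposed_with_dead_load() {
    ToyState state;
    ToyPointer cell{4096, 7};
    (void)toy_load_as_integer(state, cell);
    return state.exposed.count(7) == 1;
}

// Dead load deleted outright: the side effect is lost.
bool exposed_after_deleting_load() {
    ToyState state;
    return state.exposed.count(7) == 1;
}

// Dead load replaced by the side-effect-only operation: the state is preserved.
bool exposed_after_replacing_load() {
    ToyState state;
    ToyPointer cell{4096, 7};
    toy_expose(state, cell);
    return state.exposed.count(7) == 1;
}
```

In this model, deleting the load changes the final state while replacing it does not, which is the distinction the reply above draws.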
Arthur Eubanks via llvm-dev
2021-Sep-13 17:13 UTC
[llvm-dev] [RFC] Introducing a byte type to LLVM
I just wanted to say that removing CSE'ing pointers based on an icmp would simplify the -fstrict-vtable-pointers model.

We try to CSE loads and stores from the same pointer (modulo zero GEPs and bitcasts) that have the !invariant.group metadata. But we may have two different pointers with different provenances that compare equal (e.g. the pointer before and after a C++ placement new). In those cases we don't want to CSE the two different pointers, so we add the llvm.strip.invariant.group <https://llvm.org/docs/LangRef.html#llvm-strip-invariant-group-intrinsic> intrinsic before an icmp to prevent GVN from CSE'ing the pointers given a true icmp. If we prevented CSE'ing based on icmp, we wouldn't need these strip intrinsics.

Perhaps, to recover some lost optimizations (e.g. I've seen (a == b ? a : b => b) missed because of the intrinsics added for -fstrict-vtable-pointers), we could do a late pass (right before the codegen pipeline?) where we allow CSE'ing pointers based on icmp and ptrtoint/inttoptr simplification, but disallow any pointer provenance optimizations and make sure no later optimizations take advantage of those. The codegen pipeline probably doesn't have any optimizations based on pointer provenance?

On Sun, Jun 27, 2021 at 9:15 PM comex via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> On Wed, Jun 23, 2021 at 8:52 AM Ralf Jung via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> > However, I am a bit worried about what happens when we eventually add
> > proper support for 'restrict'/'noalias': the only models I know for that
> > one actually make 'ptrtoint' have side-effects on the memory state
> > (similar to setting the 'exposed' flag in the C provenance TS). I can't
> > (currently) demonstrate that this is *required*, but I also don't know an
> > alternative. So if this remains the case, and if we say "load i64"
> > performs a ptrtoint when needed, then that would mean we could not do
> > dead load elimination any more as that would remove the ptrtoint
> > side-effect.
>
> Though, of course, dead loads could be replaced with some hypothetical
> new instruction that has *only* the ptrtoint side-effect (and doesn't
> produce any assembly). And such an instruction could be subject to
> further transformations.
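As a source-level illustration of "the pointer before and after a C++ placement new", here is a small C++ sketch. The types `Shape` and `Square` and the function `reuse_storage` are invented for this note; the sketch only shows that the two pointers carry the same address while referring to objects with different dynamic types. It does not show how Clang emits !invariant.group metadata.

```cpp
#include <cassert>
#include <new>

struct Shape {
    virtual int sides() const { return 0; }
    virtual ~Shape() = default;
};

struct Square : Shape {
    int sides() const override { return 4; }
};

struct Result {
    bool same_address;          // did both objects live at the same address?
    int sides_via_new_pointer;  // virtual call through the pointer from the second new
};

Result reuse_storage() {
    alignas(Square) unsigned char storage[sizeof(Square)];

    // First object: a Shape constructed in the buffer.
    Shape* before = new (storage) Shape;
    void* before_address = before;
    before->~Shape();

    // Second object: a Square constructed in the very same storage.
    Square* after = new (storage) Square;

    // Same address, different object. The call goes through `after`, the
    // pointer returned by the second placement new, never through `before`.
    Result result{before_address == static_cast<void*>(after), after->sides()};
    after->~Square();
    return result;
}
```

Here `before` and `after` compare equal as addresses, yet a vtable load through `before` must not be reused for `after`. That is why substituting one pointer for the other on the strength of a true icmp is a problem for this model.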