David Chisnall via llvm-dev
2021-May-11 09:19 UTC
[llvm-dev] [RFC] Introducing the opaque pointer type
On 11/05/2021 07:59, pawel k. via llvm-dev wrote:
> I am very much a beginner in opaque pointers but I am also a minimalist
> in the sense that entities shouldn't be multiplied but rather divided
> where applicable.
>
> Can someone point me to article(s) describing what problems opaque
> pointers solve that can't be solved with forward declarations and typed
> pointers etc?
>
> My first gut feeling, when learning about the idea of opaque pointers,
> was that they're not much more than void* with all its issues from a
> static analysis, compiler design, code readability, code quality and
> code security perspective. Can someone correct a newbie? Very open to
> changing my mind.

There are a few problems with the current representation and they largely mirror the old problem with signed vs unsigned integers in the IR 15 years ago.

In early versions of LLVM, integers were explicitly signed. This meant that the IR was cluttered with bitcasts from signed to unsigned integers, which slowed down analysis and didn't convey any useful semantics. Worse, there were a bunch of things conflated, for example does unsigned imply wrapping? Some time in the 2.x series (2.0? My memory is fuzzy here), LLVM moved to just i{size} types for integers and moved all of the semantics to the operations. It's now explicit whether an operation is signed or unsigned, whether overflow wraps or has undefined behaviour, and so on.

Pointers have a similar set of problems. Pointers carry a type, but that type doesn't actually carry any semantics. There are a lot of things that don't care about the type of the pointer, but they have no way of specifying this and generally use i8*. This means that the IR is full of bitcasts from {something}* to i8* and then back again.

This is particularly important for code that wants to use non-zero address spaces, because a lot of code does casts via i8* and forgets to change this to i8*-in-another-address-space.

The fact that a pointer is a pointer to some struct type currently doesn't imply anything about the pointed-to data, and it's completely valid to bitcast a pointer to a random type and back again in an optimisation. The real type info (where applicable) is carried by TBAA metadata, dereferenceability info by attributes, and so on.

TL;DR: The pointee type has no (or worse, misleading) semantics and forces a load of bitcasts. Opaque pointers remove this.

David
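
To make the bitcast clutter described above concrete, here is a toy Python model of lowering a "zero this memory" call under typed and opaque pointers. It is only a sketch and not LLVM's API: `PtrType`, `lower_memset` and the simplified `@memset` callee are hypothetical names made up for this example, and the emitted strings only imitate IR syntax.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PtrType:
    """Toy pointer type; pointee=None stands for an opaque `ptr`."""
    pointee: Optional[str]
    addrspace: int = 0

    def __str__(self) -> str:
        space = f" addrspace({self.addrspace})" if self.addrspace else ""
        if self.pointee is None:
            return f"ptr{space}"
        return f"{self.pointee}{space}*"


def lower_memset(arg: str, ty: PtrType, size: int) -> List[str]:
    """Return toy IR lines that zero `size` bytes at `arg`.

    `@memset` is a hypothetical, simplified stand-in for a callee that
    does not care what its pointer argument points to.
    """
    lines: List[str] = []
    if ty.pointee is None or ty.pointee == "i8":
        # Opaque pointer (or already i8*): pass the value straight through.
        dest, dest_ty = arg, ty
    else:
        # Typed pointer: a cast to i8* is required first, and it has to
        # keep the address space of the source pointer.
        dest, dest_ty = "%raw", PtrType("i8", ty.addrspace)
        lines.append(f"{dest} = bitcast {ty} {arg} to {dest_ty}")
    lines.append(f"call void @memset({dest_ty} {dest}, i8 0, i64 {size})")
    return lines


typed = lower_memset("%s", PtrType("%struct.S"), 8)
opaque = lower_memset("%s", PtrType(None), 8)
print("\n".join(typed))   # two lines: a bitcast, then the call
print("\n".join(opaque))  # one line: just the call
```

In this model the typed version needs an extra cast that says nothing about what the memory holds, and a lowering that hard-coded `i8*` instead of carrying `ty.addrspace` through would be wrong for any pointer outside address space 0. With the opaque `ptr` there is no pointee type to cast away.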
pawel k. via llvm-dev
2021-May-11 13:23 UTC
[llvm-dev] [RFC] Introducing the opaque pointer type
OK, cool. I'm starting to understand now. Thank you.

-Pawel