pawel k. via llvm-dev
2021-May-11 06:59 UTC
[llvm-dev] [RFC] Introducing the opaque pointer type
I am very much a beginner with opaque pointers, but I am also a minimalist, in the sense that entities shouldn't be multiplied but rather divided where applicable.

Can someone point me to article(s) describing what problems opaque pointers solve that can't be solved with forward declarations, typed pointers, etc.?

My first gut feeling when learning about the idea of opaque pointers was that they're not much more than void*, with all its issues from a static analysis, compiler design, code readability, code quality, and code security perspective. Can someone correct a newbie? Very open to changing my mind.

-Pawel

On Tue, 11 May 2021 at 02:35, Duncan P. N. Exon Smith via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> I agree. I think it would be a mistake to add an unnecessary difference
> vs. typed pointers along some other axis (address space, or
> otherwise). Opaque pointers have enough of their own challenges to solve.
>
> On 2021 May 10, at 15:28, Arthur Eubanks via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> If there's a larger effort to make address spaces then I'd be happy to
> change the representation since mass updating tests once is better than
> twice, but I'm worried that this may start becoming intertwined with more
> address space work, and the opaque pointers project has gone on long enough
> (like many other LLVM projects).
>
> And of course, there's always time before we do mass test updates to
> easily change the textual representation.
>
> On Fri, May 7, 2021 at 11:27 AM David Blaikie <dblaikie at gmail.com> wrote:
>
>> On Fri, May 7, 2021 at 11:20 AM Arthur Eubanks via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> >
>> > On Fri, May 7, 2021 at 8:40 AM David Chisnall via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> >>
>> >> On 04/05/2021 19:32, Tom Stellard via llvm-dev wrote:
>> >> > I think requiring an address space would be too confusing for a
>> >> > majority of use cases. Would it help if instead of defaulting to 0,
>> >> > the default address space was target dependent?
>> >>
>> >> For CHERI targets, the default address space is ABI dependent: AS0 is a
>> >> 64-bit integer that's relative to the default data capability, AS200 is
>> >> a 128-bit capability (on 64-bit platforms). It can also differ between
>> >> code, heap, and stack.
>> >>
>> >> If this is purely a syntactic thing in the text serialisation, would it
>> >> be possible to put something in the DataLayout that is ignored by
>> >> everything except the pretty-printer / parser?
>> >
>> > Could you give an example?
>> >
>> > Also, perhaps we should separate the opaque pointer types transition
>> > from any changes to address spaces. Currently the proposal is basically
>> > unchanged from the current status quo in terms of pointer address spaces.
>> > We definitely should have a "default" pointer type in some shape or form
>> > which is represented by "ptr", or else writing IR tests is too cumbersome.
>> > Currently that means AS0, but we can change that in the future if we want
>> > independently of opaque pointers.
>>
>> +1 to this - pointers already carry their address space with explicit
>> syntax and I think it's OK to do that for this transition. Though I
>> wouldn't be opposed to a change in the future to roll it into the
>> pointer type name if that seems suitable.
>>
>> - Dave
David Blaikie via llvm-dev
2021-May-11 07:20 UTC
[llvm-dev] [RFC] Introducing the opaque pointer type
On Mon, May 10, 2021 at 11:59 PM pawel k. via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> I am very much a beginner with opaque pointers, but I am also a minimalist,
> in the sense that entities shouldn't be multiplied but rather divided where
> applicable.
>
> Can someone point me to article(s) describing what problems opaque
> pointers solve that can't be solved with forward declarations, typed
> pointers, etc.?
>
> My first gut feeling when learning about the idea of opaque pointers was
> that they're not much more than void*

Yep, that's basically what they are. Though this is only relative to the IR design, not source language design.

> with all its issues from a static analysis, compiler design, code
> readability, code quality, and code security perspective. Can someone
> correct a newbie? Very open to changing my mind.

LLVM doesn't provide any guarantees about pointer types. C++, by contrast, has type-based aliasing guarantees about pointers: if you have an int*, you know it can't hold the same value as a float*. That property isn't true in LLVM IR. The information can be carried separately in type-based alias analysis (TBAA) metadata, but it's not inherent in the LLVM IR pointers themselves.

So the pointee type provides limited value (it is somewhat useful for frontends generating IR, which can carry some intended type information around in the IR as it's being constructed), and it inhibits optimizations: converting between pointer types involves instructions (geps or bitcasts) that optimizations have to know to skip over or look through.

So instead, we're moving to a model where pointers don't have a type (since it's not informative to optimizations anyway) and operations carry the type information: instead of "load from this int pointer" it'll be "load an integer from this opaque pointer".

If you look at the LLVM IR today, you'll see these explicit types on operations. For example, the load instruction has an explicit type parameter, which currently looks redundant with the type of the pointer parameter that's passed to it. In the future that pointer parameter won't carry any pointee type information, and the load will rely entirely on its explicit type parameter.

- Dave
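The "load an integer from this opaque pointer" model can be loosely illustrated outside the IR. The following is only a C++ analogy, not LLVM code: `load_as` and `demo_load_as` are hypothetical names made up for this sketch. The address is untyped (`const void*`, standing in for `ptr`), and each read names the type it wants, roughly as `load i32, ptr %p` does where the typed form was `load i32, i32* %p`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>
#include <type_traits>

// Hypothetical helper, not an LLVM API: reads a T from an untyped address.
// The pointer carries no pointee type; the operation names the type.
template <typename T>
T load_as(const void* address) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "load_as only copies raw bytes");
    T value;
    std::memcpy(&value, address, sizeof(T));  // byte copy, no aliasing issue
    return value;
}

// Reads the same address twice, each time with a different explicit type.
inline void demo_load_as() {
    static_assert(std::numeric_limits<float>::is_iec559 &&
                      sizeof(float) == sizeof(std::uint32_t),
                  "the bit pattern below assumes IEEE-754 binary32 float");
    float f = 1.0f;
    const void* p = &f;  // untyped address, standing in for `ptr`

    assert(load_as<float>(p) == 1.0f);                 // "load a float"
    assert(load_as<std::uint32_t>(p) == 0x3f800000u);  // "load an i32"
}
```

No cast between pointer types appears anywhere in the sketch. The only place a type shows up is on the read itself, which is the property the message above attributes to the opaque pointer model.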
David Chisnall via llvm-dev
2021-May-11 09:19 UTC
[llvm-dev] [RFC] Introducing the opaque pointer type
On 11/05/2021 07:59, pawel k. via llvm-dev wrote:

> I am very much a beginner with opaque pointers, but I am also a minimalist,
> in the sense that entities shouldn't be multiplied but rather divided where
> applicable.
>
> Can someone point me to article(s) describing what problems opaque
> pointers solve that can't be solved with forward declarations, typed
> pointers, etc.?
>
> My first gut feeling when learning about the idea of opaque pointers was
> that they're not much more than void*, with all its issues from a static
> analysis, compiler design, code readability, code quality, and code
> security perspective. Can someone correct a newbie? Very open to changing
> my mind.

There are a few problems with the current representation, and they largely mirror the old problem with signed vs unsigned integers in the IR 15 years ago.

In early versions of LLVM, integers were explicitly signed. This meant that the IR was cluttered with bitcasts from signed to unsigned integers, which slowed down analysis and didn't convey any useful semantics. Worse, there were a bunch of things conflated: for example, does unsigned imply wrapping? Some time in the 2.x series (2.0? My memory is fuzzy here), LLVM moved to just i{size} types for integers and moved all of the semantics to the operations. It's now explicit whether an operation is signed or unsigned, whether overflow wraps or has undefined behaviour, and so on.

Pointers have a similar set of problems. Pointers carry a type, but that type doesn't actually carry any semantics. There are a lot of things that don't care about the type of the pointer, but they have no way of specifying this and generally use i8*. This means that the IR is full of bitcasts from {something}* to i8* and then back again.

This is particularly important for code that wants to use non-zero address spaces, because a lot of code does casts via i8* and forgets to change this to i8*-in-another-address-space.
The fact that a pointer is a pointer to some struct type currently doesn't imply anything about the pointed-to data, and it's completely valid to bitcast a pointer to a random type and back again in an optimisation. The real type info (where applicable) is carried by TBAA metadata, dereferenceability info by attributes, and so on.

TL;DR: The pointee type has no (or worse, misleading) semantics and forces a load of bitcasts. Opaque pointers remove this.

David
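The integer precedent described above (one i{size} type, with signedness moved onto the operations) can be sketched in C++ as an analogy. This is not LLVM code: `Bits32`, `udiv`, `sdiv`, and `demo_signless` are hypothetical names chosen to echo the IR's `i32`, `udiv`, and `sdiv`. The value is a plain 32-bit pattern, and each operation decides how to interpret it.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for IR's i32: 32 bits with no inherent signedness.
using Bits32 = std::uint32_t;

// Unsigned division: interprets both bit patterns as unsigned.
// Like the IR operation, this is undefined for b == 0.
inline Bits32 udiv(Bits32 a, Bits32 b) {
    return a / b;
}

// Signed division: interprets the same bit patterns as two's complement.
// Undefined for b == 0 and for INT32_MIN / -1, so callers must avoid both.
inline Bits32 sdiv(Bits32 a, Bits32 b) {
    std::int32_t sa, sb;
    std::memcpy(&sa, &a, sizeof sa);  // reinterpret bits, no value conversion
    std::memcpy(&sb, &b, sizeof sb);
    const std::int32_t quotient = sa / sb;
    Bits32 out;
    std::memcpy(&out, &quotient, sizeof out);
    return out;
}

// One bit pattern, two results, depending only on the operation chosen.
inline void demo_signless() {
    const Bits32 x = 0xFFFFFFF8u;  // 4294967288 if unsigned, -8 if signed
    assert(udiv(x, 2u) == 0x7FFFFFFCu);  // 2147483644
    assert(sdiv(x, 2u) == 0xFFFFFFFCu);  // -4 as a bit pattern
}
```

The value `x` never changes type and is never cast, yet the two divisions disagree. The analogous step for pointers is that `ptr` stays a plain address, and loads, stores, and geps state the type they operate on.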