thr3ads.net - llvm dev - [llvm-dev] [RFC] Introducing the opaque pointer type [May 2021]

David Chisnall via llvm-dev

2021-May-11 09:19 UTC

[llvm-dev] [RFC] Introducing the opaque pointer type

On 11/05/2021 07:59, pawel k. via llvm-dev wrote:
> I am very much a beginner with opaque pointers, but I am also a minimalist
> in the sense that entities shouldn't be multiplied but rather divided where
> applicable.
>
> Can someone point me to article(s) describing what problems opaque
> pointers solve that can't be solved with forward declarations and typed
> pointers etc.?
>
> My first gut feeling when learning about opaque pointers was that they're
> not much more than void*, with all its issues from the static analysis,
> compiler design, code readability, code quality, and code security
> perspectives. Can someone correct a newbie? Very open to changing my mind.

There are a few problems with the current representation, and they
largely mirror the old problem with signed vs unsigned integers in the
IR 15 years ago.  In early versions of LLVM, integer types were explicitly
signed or unsigned.  This meant that the IR was cluttered with bitcasts
from signed to unsigned integers, which slowed down analysis and didn't
convey any useful semantics.  Worse, a bunch of things were conflated: for
example, does unsigned imply wrapping?  Some time in the 2.x series (2.0?
My memory is fuzzy here), LLVM moved to just i{size} types for integers
and moved all of the semantics to the operations.  It's now explicit
whether an operation is signed or unsigned, whether overflow wraps or
has undefined behaviour, and so on.
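
As a rough illustration of "semantics on the operations", here is a
minimal IR sketch.  The function names are made up for this example; the
instructions and the nsw flag are standard LLVM IR.  Every value is a
plain i32, and only the instruction says how the bits are interpreted.

```llvm
; Same signless i32 operands; the opcode selects the interpretation.
define i32 @div_signed(i32 %a, i32 %b) {
  %q = sdiv i32 %a, %b        ; signed division
  ret i32 %q
}

define i32 @div_unsigned(i32 %a, i32 %b) {
  %q = udiv i32 %a, %b        ; unsigned division
  ret i32 %q
}

; Overflow behaviour is a flag on the operation, not part of the type.
define i32 @add_wrapping(i32 %a, i32 %b) {
  %s = add i32 %a, %b         ; wraps on overflow
  ret i32 %s
}

define i32 @add_no_signed_wrap(i32 %a, i32 %b) {
  %s = add nsw i32 %a, %b     ; result is poison on signed overflow
  ret i32 %s
}
```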

Pointers have a similar set of problems.  Pointers carry a type, but 
that type doesn't actually carry any semantics.  There are a lot of 
things that don't care about the type of the pointer, but they have no 
way of specifying this and generally use i8*.  This means that the IR is 
full of bitcasts from {something}* to i8* and then back again.
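
To make the bitcast noise concrete, here is a small sketch of zeroing a
struct, first with typed pointers and then with an opaque pointer.  The
struct %struct.S and the function @clear are hypothetical names for
illustration, and the second form assumes the `ptr` spelling for the
opaque pointer type.  The two forms would live in separate modules.

```llvm
; Typed pointers: memset wants i8*, so the struct pointer is cast first.
%struct.S = type { i32, i32 }

declare void @llvm.memset.p0i8.i64(i8*, i8, i64, i1)

define void @clear(%struct.S* %s) {
  %raw = bitcast %struct.S* %s to i8*
  call void @llvm.memset.p0i8.i64(i8* %raw, i8 0, i64 8, i1 false)
  ret void
}
```

```llvm
; Opaque pointers: there is no pointee type to convert, so no cast.
declare void @llvm.memset.p0.i64(ptr, i8, i64, i1)

define void @clear(ptr %s) {
  call void @llvm.memset.p0.i64(ptr %s, i8 0, i64 8, i1 false)
  ret void
}
```

The bitcast in the first form tells the optimiser nothing; it exists only
to satisfy the type checker.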

This is particularly important for code that wants to use non-zero 
address spaces, because a lot of code does casts via i8* and forgets to 
change this to i8*-in-another-address-space.
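
A sketch of the address-space case, using address space 1 and made-up
function names.  With typed pointers, the intermediate byte pointer has to
repeat the address space; a bitcast to plain i8* would be rejected,
because bitcast cannot change the address space (that takes an
addrspacecast).

```llvm
; Typed pointers: the i8 pointer must also be in addrspace(1).
define void @poke_typed(i32 addrspace(1)* %p) {
  %raw = bitcast i32 addrspace(1)* %p to i8 addrspace(1)*
  store i8 0, i8 addrspace(1)* %raw
  ret void
}
```

```llvm
; Opaque pointers: the address space is the only thing the type carries,
; so there is no intermediate type to get wrong.
define void @poke_opaque(ptr addrspace(1) %p) {
  store i8 0, ptr addrspace(1) %p
  ret void
}
```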

The fact that a pointer is a pointer to some struct type currently
doesn't imply anything about the pointed-to data, and it's
completely valid to bitcast a pointer to a random type and back again in
an optimisation.  The real type info (where applicable) is carried by
TBAA metadata, dereferenceability info by attributes, and so on.
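
For a rough picture of where that information lives, here is a sketch
with a hypothetical function @get.  The metadata node numbering is
arbitrary, and the node shapes are modelled on what Clang typically emits
for a C int access, so treat the details as an assumption.

```llvm
; The parameter attributes, not the pointer type, say that 4 bytes are
; dereferenceable and 4-byte aligned.
define i32 @get(ptr dereferenceable(4) align 4 %p) {
  ; The load names its own value type; !tbaa carries the language-level
  ; type used for alias analysis.
  %v = load i32, ptr %p, align 4, !tbaa !0
  ret i32 %v
}

!0 = !{!1, !1, i64 0}                       ; access tag: an int accessed as int
!1 = !{!"int", !2, i64 0}                   ; scalar type node
!2 = !{!"omnipotent char", !3, i64 0}       ; char may alias anything
!3 = !{!"Simple C/C++ TBAA"}                ; TBAA root
```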

TL;DR: The pointee type has no (or worse, misleading) semantics and 
forces a load of bitcasts.  Opaque pointers remove this.

David

pawel k. via llvm-dev

2021-May-11 13:23 UTC

[llvm-dev] [RFC] Introducing the opaque pointer type

OK, cool. I'm starting to understand now. Thank you.

-Pawel
