thr3ads.net - llvm dev - [llvm-dev] [cfe-dev] [RFC] Introducing a byte type to LLVM [Jun 2021]

Hal Finkel via llvm-dev

2021-Jun-09 23:14 UTC

[llvm-dev] [cfe-dev] [RFC] Introducing a byte type to LLVM

On 6/9/21 12:03, Chris Lattner wrote:
> On Jun 6, 2021, at 8:52 AM, Hal Finkel <hal.finkel.llvm at gmail.com> wrote:
>> I'll take this opportunity to point out that, at least historically,
>> the reason why a desire to optimize around ptrtoint keeps resurfacing 
>> is because:
>>
>>  1. Common optimizations introduce them into code that did not 
>> otherwise have them (SROA, for example, see convertValue in SROA.cpp).
>>
>>  2. They're generated by some of the ABI code for argument passing 
>> (see clang/lib/CodeGen/TargetInfo.cpp).
>>
>>  3. They're present in certain performance-sensitive code idioms 
>> (see, for example, ADT/PointerIntPair.h).
>>
>> It seems to me that, if there's design work to do in this area, one
>> should consider addressing these now-long-standing issues where we 
>> introduce ptrtoint by replacing this mechanism with some other one.
>>
> I completely agree.  These all have different solutions, I’d prefer to 
> tackle them one by one.
>
> -Chris
>
I agree: these three problems have different solutions. Also, 
let me add that I see three quasi-separable discussions here (accounting 
for past discussions on the same topic):

  1. Do we have a consistency problem with how we treat pointers and 
their provenance information? The answer here is yes (see, e.g., the GVN 
examples from this thread).

  2. Do we need to do more than be as conservative as possible around 
ptrtoint/inttoptr usages? This is relevant because trying to be clever 
here is often where inconsistencies around our pointer semantics are 
exposed, although it's not always the case that problems involve 
inttoptr. Addressing the points I raised above will lessen the 
motivation to be more aggressive here (although, in itself, that will 
not fix the semantic inconsistencies around pointers).

  3. Does introducing a byte type help resolve the semantic issues 
around pointers? I don't yet understand why this might help.
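
As an illustration of the third source of ptrtoint in the quoted list (the ADT/PointerIntPair.h idiom), here is a minimal sketch of pointer tagging. `TaggedPtr` is a hypothetical, simplified stand-in and not the actual LLVM class; the comments about `ptrtoint`/`inttoptr` describe how Clang typically lowers these casts and are an assumption, not something stated in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for the idea behind llvm::PointerIntPair:
// pack a small integer into the unused low bits of an aligned pointer.
template <typename T, unsigned Bits>
class TaggedPtr {
  static_assert(Bits > 0 && (1u << Bits) <= alignof(T),
                "T is not aligned enough to spare this many low bits");
  static constexpr std::uintptr_t Mask = (std::uintptr_t(1) << Bits) - 1;
  std::uintptr_t Value = 0;

public:
  TaggedPtr(T *P, unsigned Tag) {
    // Pointer-to-integer cast (typically a ptrtoint in the generated IR).
    auto Raw = reinterpret_cast<std::uintptr_t>(P);
    assert((Raw & Mask) == 0 && "pointer has non-zero low bits");
    assert(Tag <= Mask && "tag does not fit in the spare bits");
    Value = Raw | Tag;
  }

  // Integer-to-pointer cast (typically an inttoptr in the generated IR).
  T *pointer() const { return reinterpret_cast<T *>(Value & ~Mask); }

  unsigned tag() const { return static_cast<unsigned>(Value & Mask); }
};
```

Every access through `pointer()` goes via an integer, so an optimizer that is conservative around inttoptr loses information on exactly the hot paths this idiom is meant to speed up.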

Thanks again,

Hal

Juneyoung Lee via llvm-dev

2021-Jun-10 09:05 UTC

I created https://reviews.llvm.org/D104013 - please feel free to leave
comments.

Thanks,
Juneyoung

-- 

Juneyoung Lee
Software Foundation Lab, Seoul National University

Nicolai Hähnle via llvm-dev

2021-Jun-11 05:47 UTC

I have written a longer article, a byproduct of thinking through the
problem space of this proposal:
https://nhaehnle.blogspot.com/2021/06/can-memcpy-be-implemented-in-llvm-ir.html

What happened is that I ended up questioning some really fundamental
things, like, can we even implement memcpy? :) The answer is a qualified
Yes, but I found it to be a good framework for thinking about the
fundamentals of what is discussed here, so I published this in the hope
that others find it useful.
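
To make the memcpy question concrete, here is a minimal source-level sketch. `bytewise_copy`, `Holder`, and `read_through_copy` are hypothetical names, and the comment about i8 loads and stores is an assumption about typical lowering rather than a claim from the article.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical user-level memcpy: copies n bytes, one unsigned char at a time.
void *bytewise_copy(void *dst, const void *src, std::size_t n) {
  auto *d = static_cast<unsigned char *>(dst);
  const auto *s = static_cast<const unsigned char *>(src);
  for (std::size_t i = 0; i < n; ++i)
    d[i] = s[i]; // typically an i8 load followed by an i8 store in IR
  return dst;
}

struct Holder {
  int *p;
};

// Copies a struct that contains a pointer, then reads through the copy.
int read_through_copy(int *target) {
  Holder a{target};
  Holder b{nullptr};
  bytewise_copy(&b, &a, sizeof(Holder));
  assert(b.p == target); // same address after the copy
  return *b.p;           // the copy must still be usable to access *target
}
```

The pointer in `a` travels to `b` as a sequence of integer-typed bytes. Whether the reassembled pointer keeps the provenance of the original is precisely what the IR semantics have to pin down.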

tl;dr: This discussion is ultimately all about pointer provenance. There is
a gap in the expressiveness of LLVM IR when it comes to that, with
surprising consequences for memcpy (and similar operations). From an
aesthetics point of view, filling this gap has a lot of appeal, and the
"byte" proposal points in that direction. However, I have some issues with
the details of the proposal, and it is so intrusive that it needs to be
justified by more than just aesthetics.

The correctness issues in the problem space can be solved by much less
intrusive means. The justification for the more intrusive means would be
better alias analysis, but I don't think this case has been built well
enough so far. We should also consider alternatives (though I don't think
there are any that are truly simple).

Apart from that, we need to be much more precise in our documentation of
pointer provenance in LangRef (e.g.: what does llvm.memcpy do, exactly --
the mentioned bug 37469 could technically be a bug in the loop idiom
recognizer!), and I like the idea of an `unrestrict(p)` instruction as a
simpler and more evocative spelling of `inttoptr(ptrtoint(p))`.
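
At the source level, the `inttoptr(ptrtoint(p))` round trip might look like the sketch below. The helper borrows the name `unrestrict` from the suggestion above but is purely hypothetical; no such instruction or function exists in LLVM, and the lowering noted in the comments is an assumption about what Clang typically emits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: round-trips a pointer through an integer.
template <typename T>
T *unrestrict(T *p) {
  auto bits = reinterpret_cast<std::uintptr_t>(p); // typically ptrtoint
  return reinterpret_cast<T *>(bits);              // typically inttoptr
}

// Usage: the result compares equal to the input and can be dereferenced.
int unrestrict_demo() {
  int x = 5;
  int *q = unrestrict(&x);
  assert(q == &x);
  return *q;
}
```

The address is unchanged; the only effect is on what the optimizer may assume about the result, which is why a dedicated spelling would be more evocative than the cast pair.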

I would also like to better understand how this interacts with the C99
"restrict" work that Jeroen pointed out. Overall, this is an important
discussion to have but I feel we're only at the very beginning.

tl;dr of the tl;dr: It's complicated :)

Cheers,
Nicolai

-- 
Learn how the world really is,
but never forget how it ought to be.