Hal Finkel via llvm-dev
2021-Jun-09 23:14 UTC
[llvm-dev] [cfe-dev] [RFC] Introducing a byte type to LLVM
On 6/9/21 12:03, Chris Lattner wrote:
> On Jun 6, 2021, at 8:52 AM, Hal Finkel <hal.finkel.llvm at gmail.com> wrote:
>> I'll take this opportunity to point out that, at least historically,
>> the reason why a desire to optimize around ptrtoint keeps resurfacing
>> is because:
>>
>> 1. Common optimizations introduce them into code that did not
>> otherwise have them (SROA, for example, see convertValue in SROA.cpp).
>>
>> 2. They're generated by some of the ABI code for argument passing
>> (see clang/lib/CodeGen/TargetInfo.cpp).
>>
>> 3. They're present in certain performance-sensitive code idioms
>> (see, for example, ADT/PointerIntPair.h).
>>
>> It seems to me that, if there's design work to do in this area, one
>> should consider addressing these now-long-standing issues where we
>> introduce ptrtoint by replacing this mechanism with some other one.
>>
> I completely agree. These all have different solutions, I’d prefer to
> tackle them one by one.
>
> -Chris

I agree, these different problems have three different solutions.

Also, let me add that I see three quasi-separable discussions here (accounting for past discussions on the same topic):

1. Do we have a consistency problem with how we treat pointers and their provenance information? The answer here is yes (see, e.g., the GVN examples from this thread).

2. Do we need to do more than be as conservative as possible around ptrtoint/inttoptr usages? This is relevant because trying to be clever here is often where inconsistencies around our pointer semantics are exposed, although it's not always the case that problems involve inttoptr. Addressing the points I raised above will lessen the motivation to be more aggressive here (although, in itself, that will not fix the semantic inconsistencies around pointers).

3. Does introducing a byte type help resolve the semantic issues around pointers? I don't yet understand why this might help.

Thanks again,
Hal
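For readers unfamiliar with the third item in the quoted list, here is a minimal sketch of the pointer-plus-tag idiom that ADT/PointerIntPair.h is an instance of. It is not LLVM's implementation: `TaggedPtr`, `Node`, and the two-bit tag width are hypothetical names and choices made for illustration. The point is only that the idiom needs a pointer-to-integer cast on the way in and an integer-to-pointer cast on the way out, which Clang would typically lower to `ptrtoint` and `inttoptr`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical example type; alignas(8) guarantees the low bits of any
// Node* are zero, so they are free to hold a small tag.
struct alignas(8) Node {
  int value;
};

// Hypothetical, simplified stand-in for the idiom (not LLVM's PointerIntPair).
template <typename T>
class TaggedPtr {
  static_assert(alignof(T) >= 4, "need two free low bits in the pointer");
  static constexpr std::uintptr_t TagMask = 3;
  std::uintptr_t bits_ = 0;

public:
  TaggedPtr(T *p, unsigned tag) {
    // Pointer -> integer: the cast that typically becomes ptrtoint.
    auto raw = reinterpret_cast<std::uintptr_t>(p);
    assert((raw & TagMask) == 0 && "pointer is not sufficiently aligned");
    assert(tag <= TagMask && "tag does not fit in the free bits");
    bits_ = raw | tag;
  }

  // Integer -> pointer: the cast that typically becomes inttoptr.
  T *pointer() const { return reinterpret_cast<T *>(bits_ & ~TagMask); }

  // The small integer stored in the low bits.
  unsigned tag() const { return static_cast<unsigned>(bits_ & TagMask); }
};
```

Constructing `TaggedPtr<Node> tp(&n, 2)` for some `Node n` and then calling `tp.pointer()` hands back `&n` after a trip through an integer, which is the kind of round trip the three discussion points above are concerned with.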
Juneyoung Lee via llvm-dev
2021-Jun-10 09:05 UTC
[llvm-dev] [cfe-dev] [RFC] Introducing a byte type to LLVM
I created https://reviews.llvm.org/D104013 - please feel free to leave comments.

Thanks,
Juneyoung

--
Juneyoung Lee
Software Foundation Lab, Seoul National University
Nicolai Hähnle via llvm-dev
2021-Jun-11 05:47 UTC
[llvm-dev] [cfe-dev] [RFC] Introducing a byte type to LLVM
I have written a longer article that resulted as a byproduct of thinking through the problem space of this proposal: https://nhaehnle.blogspot.com/2021/06/can-memcpy-be-implemented-in-llvm-ir.html

What happened is that I ended up questioning some really fundamental things, like, can we even implement memcpy? :) The answer is a qualified Yes, but I found it to be a good framework for thinking about the fundamentals of what is discussed here, so I published this in the hope that others find it useful.

tl;dr: This discussion is ultimately all about pointer provenance. There is a gap in the expressiveness of LLVM IR when it comes to that, with surprising consequences for memcpy (and similar operations).

From an aesthetics point of view, filling this gap has a lot of appeal, and the "byte" proposal points in that direction. However, I have some issues with the details of the proposal, and it is so intrusive that it needs to be justified by more than just aesthetics. The correctness issues in the problem space can be solved by much less intrusive means. The justification for the more intrusive means would be better alias analysis, but I don't think this case has been built well enough so far. We should also consider alternatives (though I don't think there are any that are truly simple).

Apart from that, we need to be much more precise in our documentation of pointer provenance in LangRef (e.g.: what does llvm.memcpy do, exactly -- the mentioned bug 37469 could technically be a bug in the loop idiom recognizer!), and I like the idea of an `unrestrict(p)` instruction as a simpler and more evocative spelling of `inttoptr(ptrtoint(p))`. I would also like to better understand how this interacts with the C99 "restrict" work that Jeroen pointed out.

Overall, this is an important discussion to have but I feel we're only at the very beginning.
tl;dr of the tl;dr: It's complicated :)

Cheers,
Nicolai

--
Learn how the world really is, but never forget how it should be.
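To make the memcpy question in the message above concrete, here is a minimal C++ sketch of the two source-level shapes involved. Both function names are hypothetical and chosen for illustration: `copy_bytes` is a hand-written byte loop of the kind a loop idiom recognizer might turn into `llvm.memcpy`, and `round_trip` is the C++ spelling of `inttoptr(ptrtoint(p))`. The sketch only shows what the code looks like; how provenance should be modelled when pointer bytes travel through such code is the open question of the thread, and nothing here settles it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical byte-wise copy loop, the shape a hand-written memcpy takes.
inline void copy_bytes(void *dst, const void *src, std::size_t n) {
  auto *d = static_cast<unsigned char *>(dst);
  const auto *s = static_cast<const unsigned char *>(src);
  for (std::size_t i = 0; i < n; ++i)
    d[i] = s[i]; // each byte may be one fragment of a stored pointer
}

// Hypothetical helper: a pointer sent through an integer and back,
// i.e. the source-level form of inttoptr(ptrtoint(p)).
template <typename T>
T *round_trip(T *p) {
  return reinterpret_cast<T *>(reinterpret_cast<std::uintptr_t>(p));
}
```

Copying the object representation of an `int *` with `copy_bytes(&dst, &src, sizeof(src))` yields a `dst` that compares equal to `src` and can be dereferenced. The loop never sees a value of pointer type, only individual bytes, which is the situation the proposed byte type is meant to describe.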