On Thu, Dec 8, 2016 at 5:37 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:> So IIUC basically the *only* reason for this IR change is that we don’t > want to pattern match in debug build? > I don't understand right now why we wouldn’t want to do this? >If we need to pattern match such a basic construct, it suggests to me that we have the wrong representation, and we should instead make our representation more accurately model reality. To me, it feels like this representation allows several good things to just "fall out" without any additional work, and that suggests it's a good representation. On Thu, Dec 8, 2016 at 5:47 PM, Friedman, Eli <efriedma at codeaurora.org> wrote:> > One other questionable side benefit of doing this is that it would make it >> possible to implement va_start by taking the address of a parameter and >> doing pointer arithmetic. While that code is fairly invalid, it’s exactly >> what the MSVC STL headers do for 32-bit x86. If we make this work in Clang, >> we can remove our stdarg.h and vadefs.h wrapper headers. Users often pass >> flags that cause clang to skip these wrapper headers, and then they file >> bugs complaining that va lists don't work. >> > > This ignores the other reason why this doesn't work at the moment: reading > off the end of an alloca returns undef, whether or not the actual pointer > value points at the data you want. I mean, it's not impossible to make it > work, but it involves a bunch of unrelated changes which I don't think we > want to pursue.I guess I'm more willing to pursue those changes at some vague unspecified point in the future. It doesn't seem that crazy to add a special case to allow OOB GEPs of off the last argument alloca of a variadic function. Other OOB GEPs can continue to be UB as they are today. We have other loopholes for these kinds of OOB GEP optimizations around zero-sized globals so that people can use __start_section/__stop_section symbols to implement .init_array style registration systems. This doesn't seem that different. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20161209/2dd39d81/attachment.html>
On 12/9/2016 8:45 AM, Reid Kleckner wrote:> On Thu, Dec 8, 2016 at 5:37 PM, Mehdi Amini <mehdi.amini at apple.com > <mailto:mehdi.amini at apple.com>> wrote: > > So IIUC basically the *only* reason for this IR change is that we > don’t want to pattern match in debug build? > I don't understand right now why we wouldn’t want to do this? > > > If we need to pattern match such a basic construct, it suggests to me > that we have the wrong representation, and we should instead make our > representation more accurately model reality. To me, it feels like > this representation allows several good things to just "fall out" > without any additional work, and that suggests it's a good representation.Mmm... maybe. The part I really don't like is the implied store: there are a lot of transformations which specifically reason about store instructions, and they would all need to be fixed to specifically deal with "alloca with an argument" so it doesn't block optimizations. Every feature we add to the IR makes it more difficult to write IR transformations, and this really doesn't seem to carry its own weight. A couple of other thoughts I had: 1. We could use metadata, e.g. "%px = alloca i32, align 4 !argument !i32 %x", followed by a store. There's still a little pattern-matching involved because the metadata is only a hint, but the intent in the IR is more explicit. 2. We could change the way clang generates function definitions: instead of "define void @f(i32 %x)", generate "define void @f(i32* byval %px)" if the value is going to end up on the stack. This should work without any changes to the IR. This is a bad idea if we're trying to run optimizations because it obscures data flow, but it seems reasonable at -O0. -Eli -- Employee of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20161209/1f98407d/attachment.html>
On Fri, Dec 9, 2016 at 10:30 AM, Friedman, Eli <efriedma at codeaurora.org> wrote:> Mmm... maybe. The part I really don't like is the implied store: there > are a lot of transformations which specifically reason about store > instructions, and they would all need to be fixed to specifically deal with > "alloca with an argument" so it doesn't block optimizations. Every feature > we add to the IR makes it more difficult to write IR transformations, and > this really doesn't seem to carry its own weight. >I don't feel like it complicates our model that much, though. Passes already have to know that uninitialized allocas contain undef, and with this change they can contain a value. That doesn't seem very surprising. It is a fair point that we'd have to add new cases to code like llvm::FindAvailableLoadedValue, which looks for StoreInst but not AllocaInst.> A couple of other thoughts I had: > 1. We could use metadata, e.g. "%px = alloca i32, align 4 !argument !i32 > %x", followed by a store. There's still a little pattern-matching involved > because the metadata is only a hint, but the intent in the IR is more > explicit. >This works, but it doesn't feel like it keeps simple things simple. I'll definitely implement this if people don't like my proposal.> 2. We could change the way clang generates function definitions: instead > of "define void @f(i32 %x)", generate "define void @f(i32* byval %px)" if > the value is going to end up on the stack. This should work without any > changes to the IR. This is a bad idea if we're trying to run optimizations > because it obscures data flow, but it seems reasonable at -O0. >This is problematic because it would break out "bitcode ABI" between optimized and debug builds. We've had problems in the past when the caller and callee disagree on whether an argument is a byval pointer or not, and instcombine will come along and try to simplify the casts: https://llvm.org/bugs/show_bug.cgi?id=21573 -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20161209/2e28d7b8/attachment.html>
On 12/09/2016 08:45 AM, Reid Kleckner via llvm-dev wrote:> On Thu, Dec 8, 2016 at 5:37 PM, Mehdi Amini <mehdi.amini at apple.com > <mailto:mehdi.amini at apple.com>> wrote: > > So IIUC basically the *only* reason for this IR change is that we > don’t want to pattern match in debug build? > I don't understand right now why we wouldn’t want to do this? > > > If we need to pattern match such a basic construct, it suggests to me > that we have the wrong representation, and we should instead make our > representation more accurately model reality. To me, it feels like > this representation allows several good things to just "fall out" > without any additional work, and that suggests it's a good representation.I'm concerned by this response on multiple levels. I agree Mehdi that the proposed IR change seems to be solving a (relatively minor) optimization problem with an IR change. We generally expect our IR changes to be well justified and this doesn't even come close to me. More than that, I'm really concerned about the assumption that the IR should be a close fit *for a particular frontend* in a *particular mode of operation*. We have tried very hard to keep the IR generic enough to be useful by many language frontends and this would seem to give up on that goal. That is deeply concerning to me. As it stands right now, based on what I've seen to date in discussion, I would be strongly opposed to this proposal. Philip -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20161209/088c24ef/attachment.html>
On Fri, Dec 9, 2016 at 11:56 AM, Philip Reames <listmail at philipreames.com> wrote:> On 12/09/2016 08:45 AM, Reid Kleckner via llvm-dev wrote: > > On Thu, Dec 8, 2016 at 5:37 PM, Mehdi Amini <mehdi.amini at apple.com> wrote: > >> So IIUC basically the *only* reason for this IR change is that we don’t >> want to pattern match in debug build? >> I don't understand right now why we wouldn’t want to do this? >> > > If we need to pattern match such a basic construct, it suggests to me that > we have the wrong representation, and we should instead make our > representation more accurately model reality. To me, it feels like this > representation allows several good things to just "fall out" without any > additional work, and that suggests it's a good representation. > > I'm concerned by this response on multiple levels. I agree Mehdi that the > proposed IR change seems to be solving a (relatively minor) optimization > problem with an IR change. We generally expect our IR changes to be well > justified and this doesn't even come close to me. More than that, I'm > really concerned about the assumption that the IR should be a close fit > *for a particular frontend* in a *particular mode of operation*. We have > tried very hard to keep the IR generic enough to be useful by many language > frontends and this would seem to give up on that goal. That is deeply > concerning to me. >I wouldn't be doing this if I thought it was a minor optimization, but in fairness, I haven't measured how much of a win this is yet. The fact that this change solves two separate but related problems in optimized code and unoptimized code is what makes me think this is a good idea. I don't see how my proposal favors a particular frontend. The pattern of creating an alloca and storing to it is the recommended way to create mutable local variables ( http://llvm.org/docs/Frontend/PerformanceTips.html#use-of-allocas). We can document the new more efficient idiom if we decide to implement it. Frontends that generate SSA directly are interesting, but I think it's fair for us to optimize for frontends using allocas. This should also be a generic improvement across all targets, even if it is more of a win on targets with fewer register parameters. If you have enough arguments, eventually some of them will be passed in memory, unless you have a virtual ISA with an infinite register file.> As it stands right now, based on what I've seen to date in discussion, I > would be strongly opposed to this proposal. >Darn. :) -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20161209/4017aded/attachment.html>
On Fri, Dec 9, 2016 at 1:30 PM, Friedman, Eli via llvm-dev < llvm-dev at lists.llvm.org> wrote:> On 12/9/2016 8:45 AM, Reid Kleckner wrote: > > On Thu, Dec 8, 2016 at 5:37 PM, Mehdi Amini <mehdi.amini at apple.com> wrote: > >> So IIUC basically the *only* reason for this IR change is that we don’t >> want to pattern match in debug build? >> I don't understand right now why we wouldn’t want to do this? >> > > If we need to pattern match such a basic construct, it suggests to me that > we have the wrong representation, and we should instead make our > representation more accurately model reality. To me, it feels like this > representation allows several good things to just "fall out" without any > additional work, and that suggests it's a good representation. > > > Mmm... maybe. The part I really don't like is the implied store: there > are a lot of transformations which specifically reason about store > instructions, and they would all need to be fixed to specifically deal with > "alloca with an argument" so it doesn't block optimizations. Every feature > we add to the IR makes it more difficult to write IR transformations, and > this really doesn't seem to carry its own weight. > > A couple of other thoughts I had: > 1. We could use metadata, e.g. "%px = alloca i32, align 4 !argument !i32 > %x", followed by a store. There's still a little pattern-matching involved > because the metadata is only a hint, but the intent in the IR is more > explicit. > 2. We could change the way clang generates function definitions: instead > of "define void @f(i32 %x)", generate "define void @f(i32* byval %px)" if > the value is going to end up on the stack. This should work without any > changes to the IR. This is a bad idea if we're trying to run optimizations > because it obscures data flow, but it seems reasonable at -O0. >IMO, the LLVM function definitions should be a straightforward transformation from the C function signatures, and clang should stop mangling the function signatures with its own intimate knowledge of the calling convention rules. Instead, clang could emit (still ABI-specific!) annotations on the arguments that communicates the *properties* of the source-language types which affect the ABI. LLVM, then, would use that information to implement the complete calling convention. No counting of numbers of registers in Clang to put "inreg" markers in the right places, or byval, or any of that sort of stuff it has to do today. For example, on x86-64 SysV ABI, a complex classification algorithm is run over the fields in an aggregate, to classify each eightbyte quantity into one of several classes (MEMORY, INTEGER, SSE, etc). That must run on the clang side, using source types, but it could send the direct result of the classification to llvm, rather than using it to mangle the prototype of the function in a secret undocumented handshake using the few seemingly-but-not-really-generic attributes that are available now. For that reason, I really don't like the second option you mention above, as it requires clang to know more about exactly what the calling convention decisions made by llvm are going to be. Actually, I think byval should be deleted entirely. If you want to pass by value...then, pass by value... -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20161209/83a4b15a/attachment.html>
----- Original Message -----> From: "Eli via llvm-dev Friedman" <llvm-dev at lists.llvm.org> > To: "Reid Kleckner" <rnk at google.com> > Cc: "llvm-dev" <llvm-dev at lists.llvm.org> > Sent: Friday, December 9, 2016 12:30:34 PM > Subject: Re: [llvm-dev] RFC: Adding argument allocas> On 12/9/2016 8:45 AM, Reid Kleckner wrote:> > On Thu, Dec 8, 2016 at 5:37 PM, Mehdi Amini < mehdi.amini at apple.com > > > > > wrote: >> > > So IIUC basically the *only* reason for this IR change is that we > > > don’t want to pattern match in debug build? > > > > > > I don't understand right now why we wouldn’t want to do this? > > > > > If we need to pattern match such a basic construct, it suggests to > > me > > that we have the wrong representation, and we should instead make > > our representation more accurately model reality. To me, it feels > > like this representation allows several good things to just "fall > > out" without any additional work, and that suggests it's a good > > representation. > > Mmm... maybe. The part I really don't like is the implied store: > there are a lot of transformations which specifically reason about > store instructions, and they would all need to be fixed to > specifically deal with "alloca with an argument" so it doesn't block > optimizations.I'm not sure this is true. The instruction does write to memory, at least formally, but by definition the address of the alloca cannot yet have been used by anything else, and so no other memory access can alias with it. The write also can't trap. The alloca, as a result, does not even need to be tagged as writing to memory. -Hal> Every feature we add to the IR makes it more difficult to write IR > transformations, and this really doesn't seem to carry its own > weight.> A couple of other thoughts I had: > 1. We could use metadata, e.g. "%px = alloca i32, align 4 !argument > !i32 %x", followed by a store. There's still a little > pattern-matching involved because the metadata is only a hint, but > the intent in the IR is more explicit. > 2. We could change the way clang generates function definitions: > instead of "define void @f(i32 %x)", generate "define void @f(i32* > byval %px)" if the value is going to end up on the stack. This > should work without any changes to the IR. This is a bad idea if > we're trying to run optimizations because it obscures data flow, but > it seems reasonable at -O0.> -Eli > -- > Employee of Qualcomm Innovation Center, Inc. > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a > Linux Foundation Collaborative Project > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev-- Hal Finkel Lead, Compiler Technology and Programming Languages Leadership Computing Facility Argonne National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20161210/53c5436d/attachment.html>