John McCall via llvm-dev
2020-Jun-04 21:45 UTC
[llvm-dev] [cfe-dev] Clang/LLVM function ABI lowering (was: Re: [RFC] Refactor Clang: move frontend/driver/diagnostics code to LLVM)
On 4 Jun 2020, at 0:54, James Y Knight via llvm-dev wrote:

> While MLIR may be one part of the solution, I think it's also the case that the function-ABI interface between Clang and LLVM is just wrong and should be fixed -- independently of whether Clang might use MLIR in the future.
>
> I've mentioned this idea before, I think, but never got around to writing up a real proposal. And I still haven't. Maybe this email could inspire someone else to work on that.
>
> Essentially, I'd like to see the code in Clang responsible for function parameter-type mangling as part of its ABI lowering deleted. Currently, there is a secret "LLVM IR" ABI used between Clang and LLVM, which involves expanding some arguments into multiple arguments, adding a smattering of "inreg" or "byval" attributes, and converting some types into other types. All in a completely target-dependent, complex, and undocumented manner.
>
> So, while the IR function syntax appears at first glance to be generic and target-independent, that's not at all true. Sadly, in some cases, clang must even know how many registers different calling conventions use, and count the number of available registers left, in order to choose the right set of those "generic" attributes to put on a parameter.
>
> So: not only does a frontend need to understand the C ABI rules, it also needs to understand that complex dance for how to convert that into LLVM IR -- and that's both completely undocumented, and a huge mess.
>
> Instead, I believe clang should always pass function parameters in a "naive" fashion. E.g. if a parameter type is "struct X", the function should be lowered to LLVM IR with a function parameter of type %struct.X. The decision on whether to then pass that in a register (or multiple registers), on the stack, padded and then passed on the stack, etc, should be the responsibility of LLVM. Only in the case of C++ types which *must* be passed indirectly for correctness, independent of calling convention ABI, should clang be explicitly making the decision to pass indirectly.
>
> Of course, the tricky part is that LLVM doesn't -- and shouldn't -- have the full C type system available to it, and the full C type system typically is required to evaluate the ABI rules (e.g., distinguishing a "_Complex float" from a struct containing two floats).
>
> Therefore, in order to communicate the correct ABI information to LLVM, I'd like clang to also emit *explicitly-ABI-specific* data (metadata?), reflecting the extra information that the ABI rules require the backend to know about the type. E.g., for X86_64, clang needs to inform LLVM of the classification for each parameter's type into MEMORY, INTEGER, SSE, SSEUP, X87, X87UP, COMPLEX_X87. Or, for PPC64 elfv2, Clang needs to inform LLVM when a structure should be treated as a "homogeneous aggregate" of floating-point or vector type. (In both cases, that information cannot correctly be extracted from the LLVM IR struct type, only from the C type system.)

These attributes would have to spell out the exact expected treatment by the backend in essentially every aggregate case, and the frontend would have to carefully select that treatment, and for many ABIs that would still require counting registers and so on.

I do actually like this approach in many ways, because it provides a path to a world where the backend stops permissively compiling everything the frontend throws at it and instead emits an error if the frontend asks for something that can’t be done, but it’s not going to make things more abstract.

Having worked in this space for years, I am convinced that there are two meaningful points for ABI lowering: (1) the high-level source-language information and (2) the low-level register and stack conventions. (1), for C interop, is always going to be duplicative of Clang. You can introduce an intermediate library and make Clang copy all relevant information out of its AST into that library’s type system, but fundamentally “all relevant information” is going to just keep expanding and expanding, and Clang is still going to have a ton of target-specific ABI lowering code to do that propagation.

John.
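The `_Complex float` point in the quoted proposal can be made concrete with a small sketch. This is illustrative Python, not Clang code: the `AbiParam` record and the tag strings `"complex"` and `"aggregate"` are hypothetical names invented here. It relies only on the observation that both C types have the same two-float shape once expressed as an IR struct type, so the IR type alone cannot tell them apart and some separate ABI tag would have to travel with it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AbiParam:
    ir_type: str  # naive IR-level type, as the backend would see it
    abi_tag: str  # hypothetical ABI-specific tag supplied by the frontend


def lower_param(c_type: str) -> AbiParam:
    """Toy frontend lowering for two C types with the same IR shape."""
    table = {
        # Both entries have the structural shape { float, float } ...
        "_Complex float": AbiParam("{ float, float }", "complex"),
        # ... so only the (made-up) tag records which C type it came from.
        "struct { float re, im; }": AbiParam("{ float, float }", "aggregate"),
    }
    return table[c_type]


a = lower_param("_Complex float")
b = lower_param("struct { float re, im; }")
```

Here `a.ir_type == b.ir_type` while `a.abi_tag != b.abi_tag`, which is the information gap the proposal wants the frontend to fill. Whether a given ABI actually treats the two types differently depends on that ABI.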
James Y Knight via llvm-dev
2020-Jun-05 02:45 UTC
[llvm-dev] [cfe-dev] Clang/LLVM function ABI lowering (was: Re: [RFC] Refactor Clang: move frontend/driver/diagnostics code to LLVM)
On Thu, Jun 4, 2020 at 5:45 PM John McCall <rjmccall at apple.com> wrote:

> These attributes would have to spell out the exact expected treatment by the backend in essentially every aggregate case, and the frontend would have to carefully select that treatment, and for many ABIs that would still require counting registers and so on.

I don't have all the ABIs memorized, but I don't think the frontend would need to count registers for any of the ABIs I know of.

I see this as consisting of two independent pieces:
1. Examining the parameter types, and distilling the important information about each type, *for a given ABI*, into a blob of ABI-specific data.
2. Actually choosing whether to pass a given parameter in a register, or on the stack, or split up the parameter into multiple registers, etc.

Step 1 should be done within Clang. The amount of data generated from this step, for the ABIs I'm familiar with, is small, and can be derived based only on the frontend type (not location in parameter list, etc).
Step 2 should be done within LLVM, based on the data passed down in the IR. This of course does need to count registers, among other things.

So, taking an example from the RISC-V ABI. Given an argument of type:
    struct X { short s; double d; };
Or, similarly,
    struct __attribute__((packed)) X { struct { short i; } s[1]; double d; };
    struct X { short s; double __attribute__((aligned(256))) d; };

Clang would need to encode metadata saying that this type may be able to be passed via "INT+FLOAT" register-passing, having the INT of size 2 at offset 0, and the FLOAT of size 8 at offset 8/2/256 respectively, for the 3 types above. (Or maybe the metadata should store a GEP path, rather than size+offset?)

Then, LLVM, seeing an argument with the INT+FLOAT ABI rule, would allocate it to registers/stack as follows:
1. If you're using hardware float, and FLEN >= 8, and XLEN >= 2, and if there is at least one floating point and one integer register available, then: Copy the data at the provided offsets into one floating point register and one integer register (with bits beyond the integer size undefined).
2. Otherwise, fall back to the common aggregate handling rules:
  a. If size is <= XLEN,
    i. and if there's 1 integer register available: Pass the struct (as laid out in memory) in an integer register.
    ii. otherwise: Pass on stack, with alignment min(stack_alignment, max(type_alignment, XLEN))
  b. If size is <= XLEN*2,
    i. and there are 2 integer registers available: Pass the struct (as laid out in memory) in two integer registers.
    ii. and there is 1 integer register available: Pass one XLEN-sized half of the struct in a register, and the other XLEN-sized half on the stack.
    iii. otherwise: Pass the aggregate on stack, with alignment as before.
  c. Otherwise, "pass by reference" -- make a copy on the stack outside the parameter-passing area, aligned appropriately for its type, and then pass a pointer to that memory in the usual way for passing a scalar.
(leaving out the varargs rules for simplicity).

There are a lot of rules there, but the frontend shouldn't need to know about almost any of them -- the frontend only needs to evaluate whether the struct type matches the specification for INT+FLOAT (and so on, for the other categories of special handling), and encode that categorization into the IR.

Unfortunately, today, Clang *does* know all those rules I listed above -- and LLVM *also* has to know most of them! This is not a good situation.

> I do actually like this approach in many ways, because it provides a path to a world where the backend stop permissively compiling everything the frontend throws at it and instead emits an error if the frontend asks for something that can’t be done, but it’s not going to make things more abstract.

It doesn't make things more abstract, no. There's still going to be ABI-specific code in the frontend. But, it separates the concerns better, and can make the IR required from a frontend more clearly derived from the ABI.

> Having worked in this space for years, I am convinced that there are two meaningful points for ABI lowering: (1) the high-level source-language information and (2) the low-level register and stack conventions. (1), for C interop, is always going to be duplicative of Clang. You can introduce an intermediate library and make Clang copy all relevant information out of its AST into that library’s type system, but fundamentally “all relevant information” is going to just keep expanding and expanding, and Clang is still going to have a ton of target-specific ABI lowering code to do that propagation.

I definitely think it's infeasible to provide all possibly-relevant information about the frontend language type to LLVM in an ABI-independent manner. But, providing ABI-specific metadata makes the problem feasible, because for any particular ABI, the set of parameters derived from the frontend type system will be small.
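The two-step split described in this message can be sketched in a few lines of Python. This is a rough model of the rules as listed in the message, not LLVM or Clang code and not a restatement of the RISC-V psABI. All names (`IntFloat`, `Arg`, `Regs`, `allocate`) are hypothetical, sizes are in bytes, soft float is modelled as `flen=0`, and the size bounds are taken as inclusive on the assumption that an aggregate exactly one (or two) registers wide still fits. Varargs are left out, as in the message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IntFloat:
    """Step 1 output: hypothetical descriptor computed by the frontend."""
    int_size: int
    int_offset: int
    float_size: int
    float_offset: int


@dataclass
class Arg:
    size: int                  # total size of the aggregate, in bytes
    align: int                 # type alignment, in bytes
    rule: Optional[IntFloat]   # None means "no special rule applies"


@dataclass
class Regs:
    """Step 2 state: only the backend tracks what is still free."""
    xlen: int         # integer register width in bytes
    flen: int         # FP register width in bytes (0 models soft float)
    int_free: int
    fp_free: int
    stack_align: int


def stack_slot(arg: Arg, regs: Regs) -> str:
    # Alignment rule from the message:
    # min(stack_alignment, max(type_alignment, XLEN)).
    align = min(regs.stack_align, max(arg.align, regs.xlen))
    return f"stack, align {align}"


def allocate(arg: Arg, regs: Regs) -> str:
    r = arg.rule
    # Rule 1: INT+FLOAT goes in one integer and one FP register if possible.
    if (r is not None and regs.flen >= r.float_size
            and regs.xlen >= r.int_size
            and regs.int_free >= 1 and regs.fp_free >= 1):
        regs.int_free -= 1
        regs.fp_free -= 1
        return "int register + fp register"
    # Rule 2a: fits in one integer register.
    if arg.size <= regs.xlen:
        if regs.int_free >= 1:
            regs.int_free -= 1
            return "int register"
        return stack_slot(arg, regs)
    # Rule 2b: fits in two integer registers.
    if arg.size <= 2 * regs.xlen:
        if regs.int_free >= 2:
            regs.int_free -= 2
            return "two int registers"
        if regs.int_free == 1:
            regs.int_free -= 1
            return "int register + stack"
        return stack_slot(arg, regs)
    # Rule 2c: by reference; the pointer is then passed like a scalar.
    if regs.int_free >= 1:
        regs.int_free -= 1
        return "by reference, pointer in int register"
    return "by reference, pointer on stack"
```

For `struct X { short s; double d; }` the frontend side would produce `Arg(size=16, align=8, rule=IntFloat(2, 0, 8, 8))`, and nothing in that value depends on the argument's position. All register counting happens inside `allocate`, which is the division of labour the message argues for.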
John McCall via llvm-dev
2020-Jun-11 22:03 UTC
[llvm-dev] [cfe-dev] Clang/LLVM function ABI lowering (was: Re: [RFC] Refactor Clang: move frontend/driver/diagnostics code to LLVM)
On 4 Jun 2020, at 22:45, James Y Knight wrote:

> On Thu, Jun 4, 2020 at 5:45 PM John McCall <rjmccall at apple.com> wrote:
>> These attributes would have to spell out the exact expected treatment by the backend in essentially every aggregate case, and the frontend would have to carefully select that treatment, and for many ABIs that would still require counting registers and so on.
>
> I don't have all the ABIs memorized, but I don't think it would be the case that the frontend would need to count registers for any of the ABIs I know of.

Well, worst case, I suppose that either such targets would have to do something special in the frontend like they do now, or they’d need to use more fine-grained attributes than maybe the ABI suggests. We’d need some of the latter anyway; I think there’s some weird situation on x86-64 where Clang passes some aggregates in both integer and FP registers due to an early bug (or possibly an ambiguity in the ABI?).

I agree that ABIs in practice lower types in a position-invariant way and then check if they’ve run out of registers.

Obviously the frontend would need to continue handling mandatory-indirect cases like non-trivial C++ types. Would the frontend handle other indirect cases on targets like ARM64 that use indirect parameters instead of the stack argument area for large aggregates, or would the frontend just mark the argument as `abi("indirect")` and let it be handled by the backend?

What would this actually look like in IR? Something like this?

```
%tmp = alloca %MyType, align 8
call void @MakeMyType(sret %MyType* %tmp)
%arg = load %MyType, %MyType* %tmp, align 8
call void @UseMyType(abi("sse") align 8 %MyType %arg)
```

Or would we stop using `sret` as well, and this would just be:

```
%tmp = alloca %MyType, align 8
%ret = call abi("sse") align 8 %MyType @MakeMyType()
store %MyType %ret, %MyType* %tmp, align 8
%arg = load %MyType, %MyType* %tmp, align 8
call void @UseMyType(abi("sse") align 8 %MyType %arg)
```

John.