John McCall via llvm-dev
2016-Mar-02 20:03 UTC
[llvm-dev] RFC: Implementing the Swift calling convention in LLVM and Clang
> On Mar 2, 2016, at 11:33 AM, Renato Golin <renato.golin at linaro.org> wrote:
> On 2 March 2016 at 18:48, John McCall <rjmccall at apple.com> wrote:
>> The frontend will not tell the backend explicitly which parameters will be
>> in registers; it will just pass a bunch of independent scalar values, and
>> the backend will assign them to registers or the stack as appropriate.
>
> I'm assuming you already have code in the back-end that does that in
> the way you want, as you said earlier you may want to use a variable
> number of registers for the PCS.
>
>> Our intent is to completely bypass all of the passing-structures-in-registers
>> code in the backend by simply not exposing the backend to any parameters
>> of aggregate type. The frontend will turn a struct into (say) an i32, a float,
>> and an i8; if the first two get passed in registers and the last gets passed
>> on the stack, so be it.
>
> How do you differentiate the @foo's below?
>
>   struct A { i32, float };
>   struct B { float, i32 };
>
>   define @foo (A, i32) -> @foo(i32, float, i32);
>
> and
>
>   define @foo (i32, B) -> @foo(i32, float, i32);

We don’t need to. We don’t use the intermediary convention’s rules for
aggregates. The Swift rule for aggregate arguments is literally “if it’s too
complex according to <foo>, pass it indirectly; otherwise, expand it into a
sequence of scalar values and pass them separately”. If that means it’s
partially passed in registers and partially on the stack, that’s okay; we
might need to re-assemble it in the callee, but the first part of the rule
limits how expensive that can ever get.

>> The only difficulty with this plan is that, when we have multiple results, we
>> don’t have a choice but to return a struct type. To the extent that backends
>> try to infer that the function actually needs to be sret, instead of just trying
>> to find a way to return all the components of the struct type in appropriate
>> registers, that will be sub-optimal for us. If that’s a pervasive problem, then
>> we probably just need to introduce a swift calling convention in LLVM.
>
> Oh, yeah, some back-ends will fiddle with struct return. Not all
> languages have single-value-return restrictions, but I think that ship
> has sailed already for IR.
>
> That's another reason to try to pass everything by pointer at the end of
> the parameter list, instead of receiving it as an argument and returning it.

That’s pretty sub-optimal compared to just returning in registers. Also, most
backends do have the ability to return small structs in multiple registers
already.

>> A direct result is something that’s returned in registers. An indirect
>> result is something that’s returned by storing it in an implicit out-parameter.
>
> Oh, I see. In that case, any assumption on the variable would have to
> be invalidated, maybe using global volatile variables or special
> built-ins, so that no optimisation tries to get away with it. But that
> would mess up your optimal code, especially if they have to get passed
> in registers.

I don’t understand what you mean here. The out-parameter is still explicit in
LLVM IR. Nothing about this is novel, except that C frontends generally won’t
combine indirect results with direct results. Worst case, if pervasive LLVM
assumptions prevent us from combining the sret attribute with a direct result,
we just won’t use the sret attribute.

>> Oh, sorry, I forgot to talk about that. Yes, the frontend already rearranges
>> these arguments to the end, which means the optimizer’s default behavior
>> of silently dropping extra call arguments ends up doing the right thing.
>
> Excellent!
>
>> I’m reluctant to say that the convention always requires these arguments.
>> If we have to do that, we can, but I’d rather not; it would involve generating
>> a lot of unnecessary IR and would probably create unnecessary
>> code-generation differences, and I don’t think it would be sufficient for
>> error results anyway.
>
> This should be ok for internal functions, but maybe not for global /
> public interfaces. The ARM ABI has specific behaviour guarantees for
> public interfaces (like large alignment) that would be prohibitively
> bad for all functions, but ok for public ones.
>
> If all hell breaks loose, you could enforce that for public interfaces only.
>
>> We don’t want checking or setting the error result to actually involve memory
>> access.
>
> And even though most of those accesses could be optimised away, there's
> no guarantee.

Right. The backend isn’t great about removing memory operations that survive
to it.

> Another option would be to have a special built-in to recognise
> context/error variables, and plug in a late IR pass to clean up
> everything. But I'd only recommend that if we can't find another way
> around.
>
>> The ability to call a non-throwing function as a throwing function means
>> we’d have to provide this extra explicit result on every single function with
>> the Swift convention, because the optimizer is definitely not going to
>> gracefully handle result-type mismatches; so even a function as simple as
>>   func foo() -> Int32
>> would have to be lowered into IR as
>>   define { i32, i8* } @foo(i8*)
>
> Indeed, very messy.
>
> I'm going on a tangent here, and it may be all rubbish, but...
>
> C++ handles exception handling with the exception being thrown
> allocated in library code, not the program. If, like C++, Swift can
> only handle one exception at a time, why can't the error variable be a
> global?
>
> The ARM back-end accepts the -rreserve-r9 option, and others seem to
> have similar options, so you could use that to force your global
> variable to live in the platform register.
>
> That way, all your error-handling built-ins deal with that global
> variable, which the back-end knows is in registers. You will need a
> special DAG node, but I'm assuming you already have/want one. You also
> drop any problem with arguments and PCS, at least for the error part.

Swift does not run in an independent environment; it has to interact with
existing C code. That existing code does not reserve any registers globally
for this use. Even if that were feasible, we don’t actually want to steal a
register globally from all the C code on the system that probably never
interacts with Swift.

John.
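For concreteness, a minimal LLVM IR sketch of the lowering described above: the
frontend has already exploded the aggregate into scalars, and the two results
come back as a literal struct rather than via sret. The function name, the
field types, and the use of the swiftcc name proposed in this thread are
illustrative assumptions, not output from the actual Swift frontend.

  ; Sketch only: a Swift function logically taking ({ i32, float, i8 }, i32)
  ; and returning (i32, float); the backend only ever sees scalars.
  define swiftcc { i32, float } @takes_pair(i32 %a, float %b, i8 %c, i32 %d) {
  entry:
    %sum = add i32 %a, %d
    %r0 = insertvalue { i32, float } undef, i32 %sum, 0
    %r1 = insertvalue { i32, float } %r0, float %b, 1
    ret { i32, float } %r1
  }

How the scalar parameters and the two result fields map onto registers or the
stack is then a per-target decision inside the swiftcc lowering.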
Renato Golin via llvm-dev
2016-Mar-03 10:00 UTC
[llvm-dev] RFC: Implementing the Swift calling convention in LLVM and Clang
On 2 March 2016 at 20:03, John McCall <rjmccall at apple.com> wrote:
> We don’t need to. We don’t use the intermediary convention’s rules for aggregates.
> The Swift rule for aggregate arguments is literally “if it’s too complex according to
> <foo>, pass it indirectly; otherwise, expand it into a sequence of scalar values and
> pass them separately”. If that means it’s partially passed in registers and partially
> on the stack, that’s okay; we might need to re-assemble it in the callee, but the
> first part of the rule limits how expensive that can ever get.

Right. My worry is, then, how this plays out with ARM's AAPCS.

As you said below, you *have* to interoperate with C code, so you will
*have* to interoperate with AAPCS on ARM.

AAPCS's rules on aggregates are not simple, but they also allow part
of an aggregate in registers and part on the stack. I'm guessing you won't
have the same exact rules, but similar ones, which may prove harder to
implement than the existing AAPCS handling.

> That’s pretty sub-optimal compared to just returning in registers. Also, most
> backends do have the ability to return small structs in multiple registers already.

Yes, but not all of them can return more than two, which may constrain
you if you have both error and context values in a function call, in
addition to the return value.

> I don’t understand what you mean here. The out-parameter is still explicit in
> LLVM IR. Nothing about this is novel, except that C frontends generally won’t
> combine indirect results with direct results.

Sorry, I had understood this, but your reply (for some reason) made me
think it was a hidden contract, not an explicit argument. Ignore me,
then. :)

> Right. The backend isn’t great about removing memory operations that survive to it.

Precisely!

> Swift does not run in an independent environment; it has to interact with
> existing C code. That existing code does not reserve any registers globally
> for this use. Even if that were feasible, we don’t actually want to steal a
> register globally from all the C code on the system that probably never
> interacts with Swift.

So, as Reid said, usage of built-ins might help you here.

Relying on LLVM's ability to not mess up your fiddling with variable
arguments seems unstable. Adding specific attributes to functions or
arguments seems too invasive. So a solution would be to add a built-in
at the beginning of the function to mark those arguments as special.

Instead of alloca %a + load -> store + return, you could have
llvm.swift.error.load(%a) -> llvm.swift.error.return(%a), which
survives most middle-end passes intact, and a late pass then changes
the function to return a composite type, either a structure or a
larger type, that will be lowered into more than one register.

This makes sure error propagation won't be optimised away, and that
you can receive the error in any register (or even on the stack), but will
always return it in the same registers (e.g. on ARM, R1 for i32, R2+R3
for i64, etc.).

I understand this might be far off from what you guys did, and I'm not
trying to re-write history, just brainstorming a bit.

IMO, both David and Richard are right. This is likely not a huge deal
for the CC code, but we'd be silly not to take this opportunity to
make it less fragile overall.

cheers,
--renato
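For illustration, a minimal sketch of the end state such a late pass could
produce, under the assumption that it rewrites the function signature rather
than keeping the built-ins around. The llvm.swift.error.* names above are
hypothetical and do not appear here, and the function and value names are made
up. The composite return is what would let a target like ARM place the result
in r0 and the error code in r1.

  ; Sketch only: after the hypothetical rewrite, a function that logically
  ; returns an i32 result plus an i32 error code returns both by value.
  define { i32, i32 } @compute_or_fail(i32 %x) {
  entry:
    %fail = icmp slt i32 %x, 0
    br i1 %fail, label %err, label %ok

  ok:
    %v = add i32 %x, 1
    %r0 = insertvalue { i32, i32 } undef, i32 %v, 0
    %r1 = insertvalue { i32, i32 } %r0, i32 0, 1    ; second field = 0: no error
    ret { i32, i32 } %r1

  err:
    %e = insertvalue { i32, i32 } zeroinitializer, i32 1, 1  ; second field = 1: error
    ret { i32, i32 } %e
  }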
John McCall via llvm-dev
2016-Mar-03 17:36 UTC
[llvm-dev] RFC: Implementing the Swift calling convention in LLVM and Clang
> On Mar 3, 2016, at 2:00 AM, Renato Golin <renato.golin at linaro.org> wrote:
> On 2 March 2016 at 20:03, John McCall <rjmccall at apple.com> wrote:
>> We don’t need to. We don’t use the intermediary convention’s rules for aggregates.
>> The Swift rule for aggregate arguments is literally “if it’s too complex according to
>> <foo>, pass it indirectly; otherwise, expand it into a sequence of scalar values and
>> pass them separately”. If that means it’s partially passed in registers and partially
>> on the stack, that’s okay; we might need to re-assemble it in the callee, but the
>> first part of the rule limits how expensive that can ever get.
>
> Right. My worry is, then, how this plays out with ARM's AAPCS.
>
> As you said below, you *have* to interoperate with C code, so you will
> *have* to interoperate with AAPCS on ARM.

I’m not sure of your point here. We don’t use the Swift CC to call C functions.
It does not matter, at all, whether the frontend lowering of an aggregate under
the Swift CC resembles the frontend lowering of the same aggregate under AAPCS.

I brought up interoperation with C code as a counterpoint to the idea of
globally reserving a register.

> AAPCS's rules on aggregates are not simple, but they also allow part
> of an aggregate in registers and part on the stack. I'm guessing you won't
> have the same exact rules, but similar ones, which may prove harder to
> implement than the existing AAPCS handling.
>
>> That’s pretty sub-optimal compared to just returning in registers. Also, most
>> backends do have the ability to return small structs in multiple registers already.
>
> Yes, but not all of them can return more than two, which may constrain
> you if you have both error and context values in a function call, in
> addition to the return value.

We do actually use a different swiftcc calling convention in IR. I don’t see
any serious interop problems here. The “intermediary” convention is just the
original basis of swiftcc on the target.

>> I don’t understand what you mean here. The out-parameter is still explicit in
>> LLVM IR. Nothing about this is novel, except that C frontends generally won’t
>> combine indirect results with direct results.
>
> Sorry, I had understood this, but your reply (for some reason) made me
> think it was a hidden contract, not an explicit argument. Ignore me,
> then. :)
>
>> Right. The backend isn’t great about removing memory operations that survive to it.
>
> Precisely!
>
>> Swift does not run in an independent environment; it has to interact with
>> existing C code. That existing code does not reserve any registers globally
>> for this use. Even if that were feasible, we don’t actually want to steal a
>> register globally from all the C code on the system that probably never
>> interacts with Swift.
>
> So, as Reid said, usage of built-ins might help you here.
>
> Relying on LLVM's ability to not mess up your fiddling with variable
> arguments seems unstable. Adding specific attributes to functions or
> arguments seems too invasive.

I’m not sure why you say that. We already do have parameter ABI override
attributes with target-specific behavior in LLVM IR: sret and inreg.

I can understand being uneasy with adding new swiftcc-specific attributes,
though. It would be reasonable to make this more general. Attributes can be
parameterized; maybe we could just say something like abi(“context”), and
leave it to the CC to interpret that?

Having that sort of ability might make some special cases easier for C
lowering, too, come to think of it. Imagine an x86 ABI that, based on type
information otherwise erased by the conversion to LLVM IR, sometimes returns
a float in an SSE register and sometimes on the x87 stack. It would be very
awkward to express that today, but some sort of abi(“x87”) attribute would
make it easy.

> So a solution would be to add a built-in
> at the beginning of the function to mark those arguments as special.
>
> Instead of alloca %a + load -> store + return, you could have
> llvm.swift.error.load(%a) -> llvm.swift.error.return(%a), which
> survives most middle-end passes intact, and a late pass then changes
> the function to return a composite type, either a structure or a
> larger type, that will be lowered into more than one register.
>
> This makes sure error propagation won't be optimised away, and that
> you can receive the error in any register (or even on the stack), but will
> always return it in the same registers (e.g. on ARM, R1 for i32, R2+R3
> for i64, etc.).
>
> I understand this might be far off from what you guys did, and I'm not
> trying to re-write history, just brainstorming a bit.
>
> IMO, both David and Richard are right. This is likely not a huge deal
> for the CC code, but we'd be silly not to take this opportunity to
> make it less fragile overall.

The lowering required for this would be very similar to the lowering that
Manman’s patch does for swift-error: the backend basically does special value
propagation. The main difference is that it’s completely opaque to the
middle-end by default instead of looking like a load or store that ordinary
memory optimizations can handle. That seems like a loss, since those
optimizations would actually do the right thing.

John.
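For comparison, a minimal sketch of the load/store-shaped modeling being
referred to here, written with the swifterror parameter/alloca spelling that
later landed in LLVM; treating that spelling as representative of Manman's
patch is an assumption, and the function names are made up. Because the error
slot looks like an ordinary alloca, mid-level memory optimizations can still
reason about it, while the backend pins the value to the convention's error
register.

  ; Sketch only (typed-pointer syntax of the period).
  declare swiftcc float @may_fail(i8** swifterror %err)

  define swiftcc float @caller() {
  entry:
    %err = alloca swifterror i8*, align 8
    store i8* null, i8** %err
    %r = call swiftcc float @may_fail(i8** swifterror %err)
    %e = load i8*, i8** %err            ; checking the error is just a load
    %ok = icmp eq i8* %e, null
    %res = select i1 %ok, float %r, float 0.0
    ret float %res
  }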