Adrien Guinet via llvm-dev
2020-Oct-08 07:28 UTC
[llvm-dev] __attribute__((apple_abi)): targeting Apple/ARM64 ABI from Linux (and others)
Hello everyone,

I made a quick patch to clang/llvm to introduce an "apple_abi" function attribute (https://github.com/aguinet/llvm-project/commit/c4905ded3afb3182435df30e527955031cb0d098), to be able to compile functions for the Apple ARM64 ABI when targeting other ARM64 OSes (e.g. Linux). This can be seen as the Apple version of the already existing "ms_abi" attribute.

In this mail, I will describe why we would want to do such a thing, the current implementation and some remaining questions I have about this (like "isn't this a terrible idea").

Motivation
==========

The motivation comes from a bit far away; I'll try to make it quick.

We have various libraries that target what I call the "infernal combo", that is Android/iOS/OSX/Windows/Linux, for every major architecture supported by these OSes. For iOS, we thus need to support armv7/arm64 and all their flavors.

As we like to test things, we have to run binaries under iOS/{armv7,arm64} at some point, which is not really easy to do given the official Apple hardware.

In one of our attempts to make all this mess easier to handle, we adapted the https://github.com/shinh/maloader project (that will be open source if all of this works) to load ARM64 MachO under Linux and run the final binary using qemu-user. This can be seen as a very light version of wine [1] for iOS.

Where troubles come in
----------------------

All of this could have "just worked", but Apple has a different ABI than the "official" ARM64 one. All of this is explained here: https://developer.apple.com/documentation/xcode/writing_arm64_code_for_apple_platforms .

In an attempt to fix this problem, the idea is to follow the "wine [1] spirit" and write "ABI wrappers" around the libSystem functions that cause trouble, and have the custom MachO loader use these functions instead of the linux/libc ones.

The next problem to solve is how to write these wrappers. One idea was thus to implement the counterpart of "ms_abi" for Apple, hence "apple_abi".

The current implementation & questions
======================================

The current implementation introduces the CC_AArch64_Apple calling convention, to enforce the usage of Apple's CC when necessary. This has mainly been inspired by how CC_Win64 works.

There are, I think, at least these limitations:

* this supposes that the original targeted CC is Apple ARM64 AAPCS. In its current form, there is no way to support for instance vector calls (see for instance https://github.com/aguinet/llvm-project/commit/c4905ded3afb3182435df30e527955031cb0d098#diff-f124368bac3e5d7be20450aa83b166daR218)
* I haven't tested for a binary that targets Windows/ARM64, so chances are there are bugs in this configuration
* obviously the commit misses proper tests

My questions would be:

* the fact that we can't target Apple's vector calls ABI shows that having one CC_AArch64Apple (as CC_Win64 exists) calling convention might not be the right implementation of this "apple_abi" attribute. Does anyone have better suggestions?
* For variadic functions (which are among the functions that have different ABIs), GCC and Clang have __builtin_ms_va_list. My understanding is that we should have the Apple equivalent, but I'm not sure I completely understand what's at stake here. Said differently, is this builtin used to make sure we use the va_list type of the Apple ABI, should the need arise to forward it to another function that uses the Apple ABI?

Example with printf
===================

For now, we manage to compile this simple example for iOS/arm64:

#include <stdio.h>

int main(int argc, char** argv)
{
  printf("number of args: %d, argv: %s, %s, %s\n", argc, argv[0], argv[1], argv[2]);
  return 0;
}

and run it under the combo maloader/qemu-user under Linux/x64, using this wrapper for printf:

__attribute__((apple_abi)) int darwin_aarch64_printf(const char* format, ...)
{
  va_list args;
  va_start(args, format);
  const int ret = vprintf(format, args);
  va_end(args);
  return ret;
}

The fact that va_start/va_end works by using the Linux ABI from a function whose arguments use the Apple ABI seems completely magical to me, so if someone knows why this works I would also be interested!

Is this a terrible idea?
========================

Building these "ABI wrappers" using an "apple_abi" attribute seemed like a good idea at the beginning, but this already raises some concerns (see above), and I'd be willing to hear any arguments that show that this is actually a bad idea.

Thanks everyone!

P.S.: about the original motivation, the darlinghq project [2] can be seen as the real wine [1] for OSX [3]. Unfortunately, as far as I know, it still doesn't have official support for iOS/ARM64 binaries (and I'm not sure they will aim at full emulation, only support on native arm64 hardware).

[1] https://www.winehq.org/
[2] https://www.darlinghq.org/
[3] What I say here isn't entirely true, as darlinghq moved away from this "wine" model (which can be seen very basically as: make a loader for the targeted architecture, create wrappers for system libraries and run all of this in userland). For those interested in more information, I recommend reading the article at http://blog.darlinghq.org/2017/02/the-mach-o-transition-darling-in-past-5.html
Martin Storsjö via llvm-dev
2020-Oct-08 09:13 UTC
[llvm-dev] __attribute__((apple_abi)): targeting Apple/ARM64 ABI from Linux (and others)
Hi,

For the record, I've spent a nontrivial amount of time on the ARM64 version of Wine, and back in the day started out by implementing the ms_abi attribute for aarch64 just to get the handling of printf-like functions right - dealing with (to some extent) most of the same issues you're dealing with here.

(Also, as a side comment: the existing names "win64cc", CC_Win64 or "IsWin64", used in a number of places, are a bit misnamed in the current scope. For the original, x86-only context (where 32 and 64 bit code generation is mostly shared), where the C calling convention is similar on x86_32 and differences only arose on x86_64, naming it "Win64" probably is quite neat, but within AArch64 it's a bit redundant - and if a similar distinction would be needed on ARM (e.g. if an explicit windows calling convention would be needed), reusing the existing "win64cc" is even more out of place...)

On Thu, 8 Oct 2020, Adrien Guinet via llvm-dev wrote:

> In one of other attempts to make all this mess easier to handle, we
> adapted the https://github.com/shinh/maloader project (that will be open
> source if all of this works) to load ARM64 MachO under Linux and run the
> final binary using qemu-user. This can be seen as a very light version
> of wine [1] for iOS.

> [3] What I say here isn't entirely true, as darlinghq moved away from
> this "wine" model (which can be seen very basically as make a loader for
> the targeted architecture, create wrappers for system libraries and run
> all of this in userland). For those interested in more information, I
> recommend reading the article in
> http://blog.darlinghq.org/2017/02/the-mach-o-transition-darling-in-past-5.html

I would say this isn't entirely accurate regarding how wine works - maybe it was the case for other thinner win32 binary loaders that have existed though.

Wine never (at least not in the last 20 years afaik) just translated calls between the windows and host environment. Wine consists of a mostly full reimplementation of all the supported Windows APIs, and these only occasionally call down to the host libc and host's native APIs. It's true that Wine used to build its modules as native ELF (or MachO) binaries - but they weren't just plain ELF .so's; internally they contain most of the PE DLL data structures as well, so that they run and interact with other modules using the normal DLL import/export mechanisms.

But lately this has been taken even further, and now most modules can be built as real DLLs as well - linking against wine's msvcrt/ucrt instead of the host libc, etc. For higher level components that only interact with other DLLs, this is mostly straightforward, but for lower level components that actually do need to call the native host environment, they have been split into a native ELF/MachO component (which links against whatever system libraries it needs to use), and the bulk of the code as either a real DLL or as a DLL wrapped in ELF/MachO. This requires having a suitable cross compiler available (but with clang being multi-targeting, that should be trivially available).

So that sounds very much like the same approach that Darling is taking, except that Darling doesn't maintain support for building the emulated components as ELF, only as native MachO. And Darling has the benefit of being able to build Apple's open sourced code, instead of having to reimplement it all based on the public interfaces.

In any case - even if the bulk of the code is built as the emulated platform's native binaries (DLL or MachO), I guess there's a need for interaction at some layer (even if the interface might be quite thin), so having support for something like this sounds sensible to me.

And being able to interact with code built for a different ABI on a per-function level also sounds very sensible to me. So I don't think this is a bad idea.

BTW, for running Windows code on Linux, one constant stumbling block has been the use of the x18 register. On Linux, this register is normally free to use by any function, but on Windows, it is supposed to remain constant (pointing at a thread specific data structure), with various workarounds being used to retain it.

For the Darwin case, x18 is reserved (so compiler generated code doesn't use it, similar to windows), but AFAIK nothing really uses it. Earlier, the Darwin kernel used to overwrite the x18 register to 1 on context switch, just to make sure that no code kept relying on it retaining its value, but this doesn't seem to be the case any longer. As no code actually uses it, it shouldn't be any problem for your usecase.

> The current implementation & questions
> =====================================
>
> The current implementation introduces the CC_AArch64_Apple calling
> convention, to enforce the usage of Apple's CC when necessary. This has
> mainly been inspired by how CC_Win64 works.
>
> There are I think at least these limitations:
>
> * this supposes that the original targeted CC is Apple ARM64 AAPCS. In its current form,
> there is no way to support for instance vector calls (see for instance
> https://github.com/aguinet/llvm-project/commit/c4905ded3afb3182435df30e527955031cb0d098#diff-f124368bac3e5d7be20450aa83b166daR218)

I'm not familiar with the vector calling convention here - but if that's used, the function (on the C level) already has a suitable attribute specifying the non-standard calling convention? Wouldn't that end up lowered into the right thing here as well?

Or is it a case where there's a generic "vector" calling convention which turns into different things depending on whether targeting linux or darwin? In that case, you'd probably need to add separate attributes and calling conventions, like apple_vector and sysv_vector (or whatever to call the default), to allow specifying the intent more exactly.

For windows on i386, there's actually at least 4 different calling conventions being used: cdecl (the default for C code), stdcall, fastcall and vectorcall. As those names aren't associated with anything else on other platforms, you can use e.g. __attribute__((fastcall)) on any platform.

> My questions would be:
> * the fact that we can't target Apple's vector calls ABI shows that having one
> CC_AArch64Apple (as CC_Win64 exists) calling convention might not be the right
> implementation of this "apple_abi" attribute. Has someone better suggestions?

It doesn't sound too bad to me, but as naming things is one of the hardest things, one could also think of other, less generic names (as the attribute "apple_abi" or whatever it is, doesn't per se imply any specific ABI, but just is the apple default C calling convention) - but "apple_c_default" also is ugly.

> * For variadic functions (which are among the functions that have
> different ABIs), GCC and Clang have __builtin_ms_va_list. My
> understanding is that we should have the Apple equivalent, but I'm not
> sure to completely understand what's at stake here. Said differently, is
> this builtin used to make sure we use the va_list type of the Apple ABI,
> should the need arise to forward it to another function that uses the
> Apple ABI?

Exactly. In your example, you're implementing printf, so you're receiving variadic arguments on the stack, boiling them down to a (linux native) va_list and passing them to a linux native vprintf. If you'd be implementing and wrapping the darwin vprintf on the other hand, you'd need to declare it to be receiving a __builtin_apple_va_list.

> Example with printf
> ==================
>
> For now, we manage to compile this simple example for iOS/arm64:
>
> #include <stdio.h>
>
> int main(int argc, char** argv)
> {
>   printf("number of args: %d, argv: %s, %s, %s\n", argc, argv[0], argv[1], argv[2]);
>   return 0;
> }
>
> and run it under the combo maloader/qemu-user under Linux/x64, using this wrapper for printf:
>
> __attribute__((apple_abi)) int darwin_aarch64_printf(const char* format, ...)
> {
>   va_list args;
>   va_start(args, format);
>   const int ret = vprintf(format, args);
>   va_end(args);
>   return ret;
> }
>
> The fact that va_start/va_end works by using the Linux ABI from a
> function whose arguments use the Apple ABI seems completely magical to
> me, so if someone knows why this work I would also be interested!

I think this might be a borderline case that I wasn't entirely sure would work right, but apparently does. (Or maybe the code really is flexible enough to systematically handle such mixed cases?)

The calling convention attribute indicates how and where the variadic arguments are laid out on the stack, but these are then collected into a linux native va_list, which is passed to the linux native vprintf function that interprets them accordingly.

FWIW, if you want to experiment with how variadic functions and va_list behave on different platforms, you can try e.g. this test snippet:

void vararg(int a, ...);
void call_vararg(void) {
  vararg(7, 8, 9, 10.0, 11, 12.0, 13);
}

void other(__builtin_va_list ap);
void receive_vararg(int a, ...) {
  __builtin_va_list ap;
  __builtin_va_start(ap, a);
  other(ap);
  __builtin_va_end(ap);
}

int use_vararg(__builtin_va_list *ap) {
  return __builtin_va_arg(*ap, int);
}

Compiling this with e.g. "clang -target {aarch64-windows,aarch64-linux-gnu,arm64-apple-darwin} -S -O2 -o - test.c" lets you have a look at what they end up like. E.g. use_vararg is identical between darwin and windows, while call_vararg is kind of similar between linux and windows (except windows passes all variadic args in GPRs), and receive_vararg is pretty different between all of them.

> Is this a terrible idea?
> =======================
>
> Building these "ABI wrappers" using an "apple_abi" attribute seemed a
> good idea at the beginning, but this already raises some concerns (see
> above), and I'd be willing to hear any arguments that show that this is
> actually a bad idea.

It's certainly more sustainable and durable to provide full, proper implementations of the target, like Darling and Wine do, but even then, being able to build a function taking arguments with a foreign calling convention does sound sensible and useful to me.

Depending on exactly where you draw the line between "emulated"/foreign executables and native host system, you might not have any variadic functions in the border interface layer, and then you might get away without such support in the compiler, but to me, it sounds like a useful thing to have in any case.

// Martin
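
For comparison, the existing ms_abi machinery that this thread uses as its model can be exercised as in the sketch below. It is only an illustration under stated assumptions: it targets x86-64 with GCC or Clang, where __attribute__((ms_abi)), __builtin_ms_va_list, __builtin_ms_va_start and __builtin_ms_va_end are available, and on any other target the macros fall back to the native va_list so that the code still compiles. The names FOREIGN_ABI, foreign_va_list, sum_foreign_v and sum_foreign are hypothetical. The Apple counterparts discussed above (apple_abi, __builtin_apple_va_list) are proposals from this thread and are deliberately not used.

```c
#include <assert.h>

/* On x86-64, GCC and Clang provide the ms_abi attribute and a matching
 * va_list type. Elsewhere, fall back to the native convention so the
 * sketch still compiles (it then shows nothing ABI-specific). */
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#define FOREIGN_ABI __attribute__((ms_abi))
typedef __builtin_ms_va_list foreign_va_list;
#define foreign_va_start(ap, last) __builtin_ms_va_start(ap, last)
#define foreign_va_end(ap) __builtin_ms_va_end(ap)
#else
#define FOREIGN_ABI
typedef __builtin_va_list foreign_va_list;
#define foreign_va_start(ap, last) __builtin_va_start(ap, last)
#define foreign_va_end(ap) __builtin_va_end(ap)
#endif

/* Plays the role of "vprintf": receives a va_list in the foreign layout. */
FOREIGN_ABI int sum_foreign_v(int count, foreign_va_list ap)
{
  int total = 0;
  assert(count >= 0);
  for (int i = 0; i < count; ++i)
    total += __builtin_va_arg(ap, int);
  return total;
}

/* Plays the role of "printf": receives variadic arguments passed with the
 * foreign convention and forwards them as a foreign va_list. */
FOREIGN_ABI int sum_foreign(int count, ...)
{
  foreign_va_list ap;
  foreign_va_start(ap, count);
  const int total = sum_foreign_v(count, ap);
  foreign_va_end(ap);
  return total;
}
```

Calling sum_foreign(3, 1, 2, 3) from ordinary host code lets the compiler emit the foreign-convention call sequence. Both functions here stay inside one ABI, which is the case the dedicated va_list type is meant for, as opposed to the cross-ABI printf wrapper above.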
Tim Northover via llvm-dev
2020-Oct-08 09:15 UTC
[llvm-dev] __attribute__((apple_abi)): targeting Apple/ARM64 ABI from Linux (and others)
Hi Adrien,

On Thu, 8 Oct 2020 at 08:28, Adrien Guinet via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> * this supposes that the original targeted CC is Apple ARM64 AAPCS. In its current form,
> there is no way to support for instance vector calls (see for instance
> https://github.com/aguinet/llvm-project/commit/c4905ded3afb3182435df30e527955031cb0d098#diff-f124368bac3e5d7be20450aa83b166daR218)

I'm afraid I don't understand this point.

> * the fact that we can't target Apple's vector calls ABI shows that having one
> CC_AArch64Apple (as CC_Win64 exists) calling convention might not be the right
> implementation of this "apple_abi" attribute. Has someone better suggestions?

Needing two calling conventions seems really odd to me, unless it's for genuinely different ABI slices (arm64 vs arm64e or arm64_32 for example), and even there I'm not sure.

> The fact that va_start/va_end works by using the Linux ABI from a function whose arguments
> use the Apple ABI seems completely magical to me, so if someone knows why this work I
> would also be interested!

It's a series of coincidences conspiring together, I think. Linux's varargs ABI doesn't change from the normal one, so functions have to store all GPRs and vector registers that might contain arguments (as well as where stack args start), and va_list describes where they were stored:

typedef struct {
  void *stack;
  void *gr_top;
  void *vr_top;
  int gr_offs;
  int vr_offs;
} va_list;

This is what you're getting with your "va_list" declaration. While the Darwin one is just a double pointer, but conceptually

typedef struct {
  void *stack;
} va_list;

because all anonymous args go on the stack there on Darwin.

That means when you call (Darwin's) va_start in your vprintf function it "correctly" initializes the first field of that struct, leaving the rest garbage. The gr_offs and vr_offs fields decide whether to use gr_top/vr_top or stack to actually get the argument, and in this case if gr_offs happens to be >= 0 it'll "correctly" use the stack to retrieve everything. I'm guessing that happens to be the case for simple programs (quite possibly the stack is still zero-initialized if this is a trivial test-case).

You're also getting very lucky in that a Darwin varargs function changes how much of the stack each argument uses, bringing it in line with the normal AAPCS (otherwise the entire forwarding enterprise would be doomed and you'd have to implement significant chunks of vprintf to repack the arguments).

So, at a high level what you'll *want* to do to correctly forward from Darwin to Linux is make sure that always happens: initialize gr_offs and vr_offs to 0 to begin with so only the stack is available (I'd also set the *_top fields to NULL for good measure).

Take the time to be grateful you're not trying to go the other way, too!

Now, back to your previous question...

> * For variadic functions (which are among the functions that have different ABIs), GCC and
> Clang have __builtin_ms_va_list. My understanding is that we should have the Apple
> equivalent, but I'm not sure to completely understand what's at stake here. Said
> differently, is this builtin used to make sure we use the va_list type of the Apple ABI,
> should the need arise to forward it to another function that uses the Apple ABI?

That, together with __builtin_ms_va_arg and __builtin_ms_va_start, are for if you have a Linux-side function that wants to make use of a va_list or anonymous args coming from Darwin code in a relatively agnostic way. I think what you're doing (here at least) is so intimately tied to bridging the two ABIs that using it would just be a fig-leaf.

Cheers.

Tim.
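
The "stack only" initialization suggested above can be modelled in portable C. The sketch below is an illustration, not the real va_list: model_va_list mirrors the five-field struct from this message, and model_va_start_stack_only and model_va_arg_int are hypothetical helper names. The argument fetch follows the AAPCS64 rule for an integer argument (use the saved register area while gr_offs is negative, otherwise read the stack). As a simplification it reads a whole 8-byte slot and truncates to int, which sidesteps endianness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Model of the AArch64 Linux va_list, with the field names used above.
 * This is an illustration only, not the compiler's real va_list type. */
typedef struct {
  void *stack;   /* next argument passed on the stack */
  void *gr_top;  /* end of the saved general-purpose register area */
  void *vr_top;  /* end of the saved vector register area */
  int gr_offs;   /* negative: bytes of saved GPR args left; >= 0: none */
  int vr_offs;   /* same, for the vector register area */
} model_va_list;

/* Build a va_list from which only stack arguments can be read. */
void model_va_start_stack_only(model_va_list *ap, void *stack_args)
{
  assert(ap != NULL);
  ap->stack = stack_args;
  ap->gr_top = NULL;
  ap->vr_top = NULL;
  ap->gr_offs = 0;
  ap->vr_offs = 0;
}

/* Fetch the next int, following the AAPCS64 va_arg rule for integers. */
int model_va_arg_int(model_va_list *ap)
{
  int64_t slot;
  const int offs = ap->gr_offs;
  if (offs < 0) {
    ap->gr_offs = offs + 8;  /* consume one 8-byte register slot */
    if (ap->gr_offs <= 0) {
      memcpy(&slot, (char *)ap->gr_top + offs, sizeof slot);
      return (int)slot;
    }
  }
  /* No saved register left: read one 8-byte stack slot and advance. */
  memcpy(&slot, ap->stack, sizeof slot);
  ap->stack = (char *)ap->stack + 8;
  return (int)slot;
}
```

With gr_offs set to 0 every fetch goes to the stack, which matches how Darwin passes anonymous arguments. Leaving gr_offs uninitialized means a negative garbage value would send the first fetches to gr_top instead.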
Adrien Guinet via llvm-dev
2020-Oct-09 04:49 UTC
[llvm-dev] __attribute__((apple_abi)): targeting Apple/ARM64 ABI from Linux (and others)
Hello Martin,

Thanks for your very detailed answer. Comments below.

On 10/8/20 11:13 AM, Martin Storsjö wrote:
> Hi,
>
> For the record, I've spent a nontrivial amount of time on the ARM64
> version of Wine, and back in the day started out by implementing the
> ms_abi attribute for aarch64 just to get the handling of printf like
> functions right - dealing with (to some extent) most of the same issues
> you're dealing with here.

Interesting, and thanks for your work on the Wine/ARM64 port!

> On Thu, 8 Oct 2020, Adrien Guinet via llvm-dev wrote:
>> [3] What I say here isn't entirely true, as darlinghq moved away from
>> this "wine" model (which can be seen very basically as make a loader
>> for the targeted architecture, create wrappers for system libraries
>> and run all of this in userland). For those interested in more
>> information, I recommend reading the article in
>> http://blog.darlinghq.org/2017/02/the-mach-o-transition-darling-in-past-5.html
>>
>
> I would say this isn't entirely accurate regarding how wine works -
> maybe it was the case for other thinner win32 binary loaders that have
> existed though.
>
> Wine never (at least not in the last 20 years afaik) just translated
> calls between the windows and host environment. Wine consists of a
> mostly full reimplementation of all the supported Windows APIs, and
> these only occasionally call down to the host libc and host's native
> APIs. It's true that Wine used to build its modules as native ELF (or
> MachO) binaries - but they weren't just plain ELF .so's; internally they
> contain most of the PE DLL data structures as well, so that run and
> interact with other modules using the normal DLL import/export mechanisms.

I do agree on this, and my comment was a (failed) attempt at summarizing wine in one sentence...

> So that sounds very much like the same approach that Darling is taking,
> except that Darling doesn't maintain support for building the emulated
> components as ELF, only as native MachO. And Darling has the benefit of
> being able to build Apple's open sourced code, instead of having to
> reimplement it all based on the public interfaces.
>
> In any case - even if the bulk of the code is built as the emulated
> platform's native binaries (DLL or MachO), I guess there's a need for
> interaction at some layer (even if the interface might be quite thin),
> so having support for something like this sounds sensible to me.
>
> And being able to interact with code built for a different ABI on a
> per-function level also sounds very sensible to me. So I don't think
> this is a bad idea.

Okay, so I guess I will continue down this rabbit hole a little bit more :)

> BTW, for running Windows code on Linux, one constant stumbling block has
> been the use of the x18 register. On Linux, this register is normally
> free to use by any function, but on Windows, it is supposed to remain
> constant (pointing at a thread specific data structure), with various
> workarounds being used to retain it.
>
> For the Darwin case, x18 is reserved (so compiler generated code doesn't
> use it, similar to windows), but AFAIK nothing really uses it. Earlier,
> the Darwin kernel used to overwrite the x18 register to 1 on context
> switch, just to make sure that no code kept relying on it retaining its
> value, but this doesn't seem to be the case any longer. As no code
> actually uses it, it shouldn't be any problem for your usecase.

Interesting to know indeed. And TBH I'm glad I don't have to deal with that problem in this usecase...

>> The current implementation & questions
>> =====================================
>>
>> The current implementation introduces the CC_AArch64_Apple calling
>> convention, to enforce the usage of Apple's CC when necessary. This
>> has mainly been inspired by how CC_Win64 works.
>>
>> There are I think at least these limitations:
>>
>> * this supposes that the original targeted CC is Apple ARM64 AAPCS. In
>> its current form,
>> there is no way to support for instance vector calls (see for instance
>> https://github.com/aguinet/llvm-project/commit/c4905ded3afb3182435df30e527955031cb0d098#diff-f124368bac3e5d7be20450aa83b166daR218)
>>
>
> I'm not familiar with the vector calling convention here - but if that's
> used, the function (on the C level) already has a suitable attribute
> specifying the non-standard calling convention? Wouldn't that end up
> lowered into the right thing here as well?

Let's say a user wants to target the Apple "aapcs-vfp" calling convention from a Linux/ARM64 binary. They would for instance want to use this combination:

__attribute__((apple_abi)) __attribute__((pcs("aapcs-vfp"))) void foo(...)

In our current implementation, that would not work because we would try to set up two different LLVM calling conventions on the same function.

> Or is it a case where there's a generic "vector" calling convention
> which turns into different things depending on whether targetin linux or
> darwin?

That's my understanding from reading for instance https://llvm.org/doxygen/AArch64RegisterInfo_8cpp_source.html#l00149

> In that case, you'd probably need add a separate attribute and
> calling conventions, like apple_vector and sysv_vector (or whatever to
> call the default), to allow specifying the intent more exactly.

On the LLVM level I guess yes, but maybe we could keep this simple on the clang level by allowing the combination above?

> For windows on i386, there's actually at least 4 different calling
> conventions being used; cdecl (the default for C code), stdcall,
> fastcall and vectorcall. As those names aren't associated with anything
> else on other platforms, you can use e.g. __attribute__((fastcall)) on
> any platform.

Okay, that works in this case indeed.

>> My questions would be:
>> * the fact that we can't target Apple's vector calls ABI shows that
>> having one
>> CC_AArch64Apple (as CC_Win64 exists) calling convention might not be
>> the right
>> implementation of this "apple_abi" attribute. Has someone better
>> suggestions?
>
> It doesn't sound too bad to me, but as naming things is one of the
> hardest things, one could also think of other, less generic names (as
> the attribute "apple_abi" or whatever it is, doesn't per se imply any
> specific ABI, but just is the apple default C calling convention) - but
> "apple_c_default" also is ugly.

Cf. above, allowing the combination of attributes might be a viable solution.

>> * For variadic functions (which are among the functions that have
>> different ABIs), GCC and Clang have __builtin_ms_va_list. My
>> understanding is that we should have the Apple equivalent, but I'm not
>> sure to completely understand what's at stake here. Said differently,
>> is this builtin used to make sure we use the va_list type of the Apple
>> ABI, should the need arise to forward it to another function that uses
>> the Apple ABI?
>
> Exactly. In your example, you're implementing printf, so you're
> receiving variadic arguments on the stack, boiling them down to a (linux
> native) va_list and passing them to a linux native vprintf. If you'd be
> implementing and wrapping the darwin vprintf on the other hand, you'd
> need to declare it to be receiving a __builtin_apple_va_list.

Okay thanks! So I'll add this to the todo list.

>> The fact that va_start/va_end works by using the Linux ABI from a
>> function whose arguments use the Apple ABI seems completely magical to
>> me, so if someone knows why this work I would also be interested!
>
> I think this might be a borderline case that I wasn't entirely sure
> would work right, but apparently does. (Or maybe the code really is
> flexible enough to systematically handle such mixed cases?)

Tim Northover described what seems to happen in another answer, and so it looks like it works mostly by luck.

> The calling convention attribute indicates how and where the variadic
> arguments are laid out on the stack, but these are then collected into a
> linux native va_list, which is passed to the linux native vprintf
> function that interprets them accordingly.
>
> FWIW, if you want to experiment with how variadic functions and va_list
> behaves on different platforms, you can try e.g. this test snippet:
>
> void vararg(int a, ...);
> void call_vararg(void) {
>   vararg(7, 8, 9, 10.0, 11, 12.0, 13);
> }
>
> void other(__builtin_va_list ap);
> void receive_vararg(int a, ...) {
>   __builtin_va_list ap;
>   __builtin_va_start(ap, a);
>   other(ap);
>   __builtin_va_end(ap);
> }
>
> int use_vararg(__builtin_va_list *ap) {
>   return __builtin_va_arg(*ap, int);
> }
>
> Compiling this with e.g. "clang -target
> {aarch64-windows,aarch64-linux-gnu,arm64-apple-darwin} -S -O2 -o -
> test.c" lets you have a look at what they end up like. E.g. use_vararg
> is identical between darwin and windows, while call_vararg is kind of
> similar between linux and windows (except windows passes all variadic
> args in GPRs), and receive_vararg is pretty different between all of them.

Thanks a lot for this tip. I will have a closer look at it.

>> Is this a terrible idea?
>> =======================
>>
>> Building these "ABI wrappers" using an "apple_abi" attribute seemed a
>> good idea at the beginning, but this already raises some concerns (see
>> above), and I'd be willing to hear any arguments that show that this
>> is actually a bad idea.
>
> It's certainly more sustainable and durable to provide full, proper
> implementations of the target, like Darling and Wine do, but even then,
> being able to build a function taking arguments with a foreign calling
> convention does sound sensible and useful to me.

Okay, fair enough :)

> Depending on exactly where you draw the line between "emulated"/foreign
> executables and native host system, you might not have any variadic
> functions in the border interface layer, and then you might get away
> without such support in the compiler, but to me, it sounds like a useful
> thing to have in any case.

Our test cases use very few libc/libSystem functions, but some of them are indeed from the "printf" family, used to output interesting information, so I think it's worth the effort to support them. The goal indeed isn't to go through a full implementation of the targeted system.
Adrien Guinet via llvm-dev
2020-Oct-09 13:02 UTC
[llvm-dev] __attribute__((apple_abi)): targeting Apple/ARM64 ABI from Linux (and others)
Hello Tim,

Thanks for the details you provided! Answers & comments below.

On 10/8/20 11:15 AM, Tim Northover wrote:
> On Thu, 8 Oct 2020 at 08:28, Adrien Guinet via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>> * this supposes that the original targeted CC is Apple ARM64 AAPCS. In its current form,
>> there is no way to support for instance vector calls (see for instance
>> https://github.com/aguinet/llvm-project/commit/c4905ded3afb3182435df30e527955031cb0d098#diff-f124368bac3e5d7be20450aa83b166daR218)
>
> I'm afraid I don't understand this point.

ARM64 defines two calling conventions: aapcs and aapcs-vfp (https://developer.arm.com/documentation/dui0491/i/Compiler-specific-Features/--attribute----pcs--calling-convention-----function-attribute). Using __attribute__((apple_abi)) in its current form would only allow targeting aapcs from a foreign OS, not aapcs-vfp.

>> * the fact that we can't target Apple's vector calls ABI shows that having one
>> CC_AArch64Apple (as CC_Win64 exists) calling convention might not be the right
>> implementation of this "apple_abi" attribute. Has someone better suggestions?
>
> Needing two calling conventions seems really odd to me, unless it's
> for genuinely different ABI slices (arm64 vs arm64e or arm64_32 for
> example), and even there I'm not sure.

See above. The idea would be to also have, for instance, CC_AArch64Apple_VFP (even if I'm really not fond of this). It could also end up as a non-supported case.

>> The fact that va_start/va_end works by using the Linux ABI from a function whose arguments
>> use the Apple ABI seems completely magical to me, so if someone knows why this work I
>> would also be interested!
>
> It's a series of coincidences conspiring together, I think. Linux's
> varargs ABI doesn't change from the normal one, so functions have to
> store all GPRs and vector registers that might contain arguments (as
> well as where stack args start), and va_list describes where they were
> stored:
>
> typedef struct {
>   void *stack;
>   void *gr_top;
>   void *vr_top;
>   int gr_offs;
>   int vr_offs;
> } va_list;
>
> This is what you're getting with your "va_list" declaration. While the
> Darwin one is just a double pointer, but conceptually
>
> typedef struct {
>   void *stack;
> } va_list;
>
> because all anonymous args go on the stack there on Darwin.
>
> That means when you call (Darwin's) va_start in your vprintf function

It's Linux's va_start, no?

> it "correctly" initializes the first field of that struct, leaving the
> rest garbage. The gr_offs and vr_offs fields decide whether to use
> gr_top/vr_top or stack to actually get the argument, and in this case
> if gr_offs happens to be >= 0 it'll "correctly" use the stack to
> retrieve everything. I'm guessing that happens to be the case for
> simple programs (quite possibly the stack is still zero-initialized if
> this is a trivial test-case).

Okay, got it. So the right way to lower this va_start would be to set the rest of the structure to zero if va_start is called from a function which has the Apple ABI (while targeting Linux)? (actually answered below)

> You're also getting very lucky in that a Darwin varargs function
> changes how much of the stack each argument uses, bringing it in line
> with the normal AAPCS (otherwise the entire forwarding enterprise
> would be doomed and you'd have to implement significant chunks of
> vprintf to repack the arguments).

Indeed!

> So, at a high level what you'll *want* to do to correctly forward from
> Darwin to Linux is make sure that always happens: initialize gr_offs
> and vr_offs to 0 to begin with so only the stack is available (I'd
> also set the *_top fields to NULL for good measure).

Okay, that seems to answer my question just above.

> Take the time to be grateful you're not trying to go the other way, too!

Yes :)

>> * For variadic functions (which are among the functions that have different ABIs), GCC and
>> Clang have __builtin_ms_va_list. My understanding is that we should have the Apple
>> equivalent, but I'm not sure to completely understand what's at stake here. Said
>> differently, is this builtin used to make sure we use the va_list type of the Apple ABI,
>> should the need arise to forward it to another function that uses the Apple ABI?
>
> That, together with __builtin_ms_va_arg and __builtin_ms_va_start, are
> for if you have a Linux-side function that wants to make use of a
> va_list or anonymous args coming from Darwin code in a relatively
> agnostic way. I think what you're doing (here at least) is so
> intimately tied to bridging the two ABIs that using it would just be a
> fig-leaf.

Okay, thanks for the confirmation.

Regards
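
The remark about how much stack each argument uses can be made concrete with a little arithmetic. The sketch below rests on one assumption, taken from the Apple documentation linked in the first message: fixed stack arguments on Apple arm64 are packed at their natural size and alignment, while the standard AAPCS64, and Darwin's anonymous variadic arguments, give each argument at least an 8-byte slot. The helper names round_up and layout_stack_args are hypothetical, and only scalar arguments whose size equals their alignment are modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to a multiple of align (align must be a power of two). */
size_t round_up(size_t n, size_t align)
{
  assert(align != 0 && (align & (align - 1)) == 0);
  return (n + align - 1) & ~(align - 1);
}

/* Compute the stack offset of each scalar argument.
 * min_slot = 8 models AAPCS64 and Darwin anonymous variadic arguments,
 * min_slot = 1 models Darwin fixed (non-variadic) stack arguments.
 * Returns the number of bytes used before any final padding. */
size_t layout_stack_args(const size_t *sizes, size_t count,
                         size_t min_slot, size_t *offsets)
{
  size_t next = 0;
  for (size_t i = 0; i < count; ++i) {
    /* Each argument occupies at least min_slot bytes, aligned likewise. */
    const size_t slot = sizes[i] > min_slot ? sizes[i] : min_slot;
    next = round_up(next, slot);
    offsets[i] = next;
    next += slot;
  }
  return next;
}
```

For two char arguments on the stack, the packed rule places them at offsets 0 and 1, while the 8-byte-slot rule places them at 0 and 8. Since Darwin's variadic arguments follow the 8-byte-slot rule, a Linux-side va_arg walking the stack in 8-byte steps sees them where it expects.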