mats petersson via llvm-dev
2016-May-23 12:00 UTC
[llvm-dev] A "Cross-Platform Runtime Library API" in LLVM IR
At least for Linux/Unix, there's very little you can actually achieve without at least some of libc linked into your program, unless you write your own system call interface functions. Those depend on the processor architecture (using `int X` [or `syscall`] on x86, perhaps `trap Y` on 68K, `swi` on ARM, etc). The values for X may vary depending on the OS too, and on the ABI (different registers/calling conventions may exist for the same architecture), and of course it will be different yet again on Windows. Using standard library functions in the C library will give a reasonable semblance of working on most platforms for which there is a C compiler.

Sure, you can reduce the size of the application itself by a fair bit [when statically linked - my typical Pascal test executables are in the tens of kilobytes, because they dynamically link to libc], but I very much doubt you'll gain much overall time from using something other than libc, and you'll end up doing a lot of work.

Do, by all means, investigate this, and if you find some significant savings, report back. But in general, I'd say, you'll probably end up doing the same thing that libc does already. And most languages need some start-up code, so calling `_start` at that point isn't such a terrible thing. I use the C compiler as a linker, so I just call my language's startup `main` and am done with it. But you'll probably find yourself implementing something like this:
https://github.com/Leporacanthicus/lacsap/blob/master/runtime/main.c
either way, even if it's not precisely called `main`, and isn't written in C.
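As a rough illustration of the architecture and OS dependence described at the start of this message, here is a minimal sketch of a single "write some bytes" primitive in C. The helper name `raw_write` is hypothetical. The register assignments and syscall numbers shown are those of Linux on x86-64 and AArch64 only, and every other platform simply falls back to libc's `write`, which is rather the point.

```c
#include <stddef.h>
#include <unistd.h>   /* write(), used only by the portable fallback */

/* Hypothetical helper: write `len` bytes from `buf` to descriptor `fd`. */
static long raw_write(int fd, const void *buf, size_t len)
{
#if defined(__linux__) && defined(__x86_64__)
    /* Linux x86-64: syscall number in rax (write = 1), arguments in
       rdi, rsi, rdx. The `syscall` instruction clobbers rcx and r11. */
    long ret;
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"(1L), "D"((long)fd), "S"(buf), "d"(len)
                      : "rcx", "r11", "memory");
    return ret;
#elif defined(__linux__) && defined(__aarch64__)
    /* Linux AArch64: syscall number in x8 (write = 64), arguments in
       x0..x2, result comes back in x0. */
    register long x8 __asm__("x8") = 64;
    register long x0 __asm__("x0") = fd;
    register const void *x1 __asm__("x1") = buf;
    register size_t x2 __asm__("x2") = len;
    __asm__ volatile ("svc #0"
                      : "+r"(x0)
                      : "r"(x8), "r"(x1), "r"(x2)
                      : "memory");
    return x0;
#else
    /* Any other OS or architecture: let libc deal with it. */
    return (long)write(fd, buf, len);
#endif
}
```

Calling `raw_write(1, "ok\n", 3)` writes three bytes to standard output. Note that the two raw paths return a negative errno value on failure, while the libc fallback returns -1 and sets `errno`; that is one more detail a libc-free runtime would have to handle per platform.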
Yes, it's necessary to call `_start` or equivalent if you want `stdin` and `stdout` to be initialized, but sooner or later you'll end up wanting to buffer I/O a little beyond calling the OS read/write operations (`write` is pretty rubbish for implementing `printf` or Pascal's `writeln`, because you get far too many user-kernel-user transitions, making it slow), and most likely you don't want to call `sbrk` and `mmap` directly to allocate memory either. So the runtime for language X will need some library code that sets things up according to how its file and memory handling works.

The key point here is "how much do you gain, and how much effort is it". If it's a very small gain, and a large amount of effort, is there something else that makes it meaningful to do? So far, I see no such case.

By all means, if you want to implement your own language, and write your own runtime to go with that, LLVM will allow that. But you will need to implement some reasonably efficient handling of file I/O and memory allocation for each OS you want to support. And a reason why you won't want ONE library that does this for many languages is that different languages have different semantics for the DETAILS of how you do certain things (error handling could be throwing exceptions, returning error codes, aborting the execution - or some combination thereof based on compiler or language pragma, etc, etc).

--
Mats

On 23 May 2016 at 12:35, Lorenzo Laneve <lore97drk at icloud.com> wrote:
> You guys are saying that the library which defines the runtime library is written in C for many languages.
> The problem is that such functions are in the libc and so the object files have to be linked against the **entire** libc.
> Sorry if I'm wrong but isn't it a little inefficient or hard to handle?
> With "hard to handle" I mean the entry point:
> if I use C's I/O operations, ain't I forced to use the C _start() implementation that calls its initializers (such as stdin and stdout initialization) before calling main() ?
>
> I was thinking about maybe trying to make the languages runtimes independent from the C runtime. Wouldn't it make runtimes faster?
>
> > On May 23, 2016, at 1:19 PM, David Chisnall <David.Chisnall at cl.cam.ac.uk> wrote:
> >
> >> On 23 May 2016, at 12:16, Lorenzo Laneve <lore97drk at icloud.com> wrote:
> >>
> >> I'm not talking about a new library instead of the libc, I'm talking about letting people create a library optimized for a specific frontend, regardless of the target.
> >
> > It sounded as if you were talking about a library that sits underneath such a thing. Lots of languages have their own runtime libraries (I maintain two of them), including Go, C++, Swift, and Objective-C. These are generally intended to be portable across front ends, defining the low-level binary interfaces that compilers target.
> >
> > David
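The buffering argument in the message above (one `write` per `writeln` means one user-kernel-user transition per line) can be sketched in a few lines of C. This is only an illustration under stated assumptions: the names `out_buf`, `out_put` and `out_flush` and the 4096-byte buffer size are hypothetical, it does not retry on `EINTR`, and it is not how any particular runtime does it.

```c
#include <stddef.h>
#include <string.h>
#include <unistd.h>

enum { OUT_BUF_SIZE = 4096 };   /* assumed buffer size, chosen arbitrarily */

struct out_buf {
    int fd;                     /* destination file descriptor */
    size_t used;                /* bytes currently held in data[] */
    unsigned long flushes;      /* number of write() calls made so far */
    char data[OUT_BUF_SIZE];
};

/* Push everything held in the buffer to the OS. Returns 0 on success,
   -1 if write() fails (the unwritten bytes then stay in the buffer). */
static int out_flush(struct out_buf *b)
{
    size_t off = 0;
    while (off < b->used) {
        ssize_t n = write(b->fd, b->data + off, b->used - off);
        if (n < 0)
            return -1;
        b->flushes++;
        off += (size_t)n;
    }
    b->used = 0;
    return 0;
}

/* Append len bytes; the kernel is only entered when the buffer is full. */
static int out_put(struct out_buf *b, const char *s, size_t len)
{
    while (len > 0) {
        if (b->used == OUT_BUF_SIZE && out_flush(b) != 0)
            return -1;
        size_t room = OUT_BUF_SIZE - b->used;
        size_t n = len < room ? len : room;
        memcpy(b->data + b->used, s, n);
        b->used += n;
        s += n;
        len -= n;
    }
    return 0;
}
```

With a zero-initialized `struct out_buf` whose `fd` is 1, a hundred two-byte `out_put` calls make no system call at all; a single `out_flush` at the end then hands all 200 bytes to the kernel. This is roughly what libc's `stdio` already provides, which is the point being made.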
Lorenzo Laneve via llvm-dev
2016-May-23 12:35 UTC
[llvm-dev] A "Cross-Platform Runtime Library API" in LLVM IR
Actually my idea was not to provide the same library for all languages, but it was to provide an API for creating a library that better fits the needs of a specific language. If you tell me that relying on the C runtime is the best way to make a runtime library and there are really few disadvantages, then ok. I also have to say that creating a runtime independent from the C runtime might make the language unable to interoperate with C itself, so effectively this would be a great disadvantage.

The main disadvantages of the libc integration are:
- The executables might end up a little larger, containing some useless C entities.
- The entry point might be "heavier than needed", calling useless initializations.
- If a language wants to add its own global initializers, it has to use the main() function called by _start(), and I think that by convention all programs should start from main().
David Chisnall via llvm-dev
2016-May-23 12:43 UTC
[llvm-dev] A "Cross-Platform Runtime Library API" in LLVM IR
On 23 May 2016, at 13:35, Lorenzo Laneve <lore97drk at icloud.com> wrote:
>
> The main disadvantages of the libc integration are:
> - The executables might end up a little larger, containing some useless C entities.

Dynamically linking libc adds very little to the binary size.

> - The entry point might be "heavier than needed", calling useless initializations.
> - If a language wants to add its own global initializers, it has to use the main() function called by _start(), and I think that by convention all programs should start from main().

This has very little to do with libc, but rather to do with the C runtime parts (csu / crt*). These are misnamed, as they’re not really specific to C. If you want things like LLVM’s `llvm.global_ctors` to work, then you must either use these or use something equivalent. If you want to start things before main() then you can use exactly the same mechanism as libc.

David
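To make the "start things before main()" mechanism concrete, here is a minimal sketch using the GCC/Clang `constructor` attribute, which Clang records in the `llvm.global_ctors` array in IR. The names `lang_runtime_init` and `lang_runtime_is_ready` are hypothetical, and the attribute is a compiler extension rather than standard C. Whether the function actually runs still depends on the startup code (crt* or an equivalent) walking the initializer list, as David describes.

```c
/* Set by the initializer below; read by the rest of the (hypothetical)
   language runtime. */
static int runtime_ready = 0;

/* GCC/Clang extension: ask the toolchain to run this function before
   main(). The platform's startup code is what actually calls it. */
__attribute__((constructor))
static void lang_runtime_init(void)
{
    runtime_ready = 1;
}

/* Hypothetical accessor so other translation units can check the state. */
int lang_runtime_is_ready(void)
{
    return runtime_ready;
}
```

By the time `main()` begins executing, `lang_runtime_is_ready()` returns 1, without the language's entry point having to be anything other than an ordinary `main`.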
Yichao Yu via llvm-dev
2016-May-23 13:05 UTC
[llvm-dev] A "Cross-Platform Runtime Library API" in LLVM IR
On Mon, May 23, 2016 at 8:35 AM, Lorenzo Laneve via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Actually my idea was not to provide the same library for all languages, but
> it was to provide an API for creating a library that better fits the needs
> of a specific language.

OT: I don't think duplicating the libc API under a different name is very useful, but what I do think would be useful is some way to tell LLVM that certain runtime functions behave similarly to malloc/free. For example, it would be great if LLVM could remove the call to our runtime allocation function (which might be GC'd, making it different from malloc of course...) if it can prove that the result never escapes (much like what is done with `malloc`/`free` pairs). There are certain attributes (noalias for example) that help, but IIRC the removal of malloc/free isn't available for other functions by attaching metadata?
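For readers unfamiliar with the `noalias` hint mentioned above, this is roughly how a C-implemented runtime allocator would request it. It is a sketch under assumptions: `lang_alloc` is a hypothetical name, `__attribute__((malloc))` is a GCC/Clang extension, and to my knowledge Clang expresses it as `noalias` on the return value. As the message says, that hint alone is not claimed here to let the optimizer delete an unused allocation call.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical runtime allocator. The `malloc` attribute promises that
   the returned pointer does not alias any other live object. */
__attribute__((malloc))
void *lang_alloc(size_t size);

void *lang_alloc(size_t size)
{
    /* Zero-filled storage; a size of 0 is bumped to 1 so that a
       successful call always returns a usable, non-NULL pointer. */
    return calloc(1, size ? size : 1);
}
```

A caller uses it like `malloc`: `int *p = (int *)lang_alloc(sizeof(int));` yields a zeroed `int`, released here with `free(p)` because this sketch is backed by `calloc`. A GC'd runtime would have no such release call, which is part of why it differs from a plain `malloc`/`free` pair.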