Rafael Espíndola via llvm-dev
2016-Jan-20 15:30 UTC
[llvm-dev] lld: ELF/COFF main() interface
Sorry for being late on this thread.

I just wanted to say I am strongly on Rui's side on this one.

The current design is for lld to *not* be a library and I think
that is important. That has saved us a tremendous amount of work on
library-like code and a lot of design for library interfaces.
The comparison of old and new ELF code is night and day as far as
productivity and performance are concerned. Designing right now would
be premature because it is not clear what commonalities there will be
or how to refactor them.

For example, both MCJIT and lld apply relocations, but there are
tremendously different options on how to factor this:

* Have MC produce position dependent code and MCJIT would be a bit
  more like other jits and not need relocations.
* Move relocation processing to LLVM somewhere and have lld and MCJIT use it.
* Have MC produce shared objects directly, saving MCJIT the
  complication of using relocatable objects.
* Have MCJIT use lld as a trivial library that implements "ld foo.o -o
  foo.so -shared".

The situation is even less clear for the other parts we are missing in
llvm: objcopy, readelf, etc.

We have to discuss and prototype these before we can make a decision.
Committing now would be premature design and stall the progress on the one
thing we are sure we need: a high quality, BSD-licensed linker. Let's
get that implemented. While that happens, MCJIT will move along and we will be
in a position to productively discuss what can be shared and at what
cost (complexity and performance).

Last but not least, anything that is not needed in two different areas
should remain application code. The only point of paying the
complexity of writing a library is if it is used.

Cheers,
Rafael
Chandler Carruth via llvm-dev
2016-Jan-21 03:15 UTC
[llvm-dev] lld: ELF/COFF main() interface
I strongly disagree about some of this, but agree about other aspects.
I feel like there are two issues conflated here:

1) Having a fundamentally library-oriented structure of code and design philosophy.

2) Having general APIs for a library of code that allows it to be reused in different ways by different clients.

For #1, let me indicate the kinds of things I'm thinking about here:
- Cannot rely on global state
- Cannot directly call "exit" (but can call "abort" for *programmer* errors like asserts)
- Cannot leak memory

There are probably others, but this is the gist of it. Now, you could still design everything with the simplest imaginable API, that is incredibly narrow and specialized for a *single* user. But there are still fundamentals of the style of code that are absolutely necessary to build a library. And the only way to make sure we get this right is to have the single user of the code use it as a library and keep all the business logic inside the library.

This pattern is fundamental to literally every part of LLVM, including Clang, LLDB, and thus far LLD. I think it is a core principle of the project as a whole. I think that unless LLD continues to follow this principle, it doesn't really fit in the LLVM project at all.

But for #2, I actually completely agree with you. We will never guess the *right* general purpose API for different users to share logic until we actually have those different users. I very much like lazy design of APIs as users for those APIs arrive. It's one of the reasons I'm so strongly in favor of the lack of API stability in LLVM -- it *allows* us to figure these APIs out as the actual use cases emerge and we learn what they need to do.

One of the nice things about changing APIs though is that there tends to be a clear incremental path to evolve the API. But if your code doesn't use basic memory management techniques, or if even reportable errors (as opposed to asserted programmer errors) are inherently fatal, fixing that can be incredibly hard and present a huge barrier to adoption of the library.
So, I encourage LLD to keep its interfaces highly specialized for the users it actually has -- and indeed today that may be exactly one user, the command line linker.

But when a new user for the libraries arrives, it needs to adapt to support an API that they can use, provided the use case is reasonable for the LLD code to support.

And most importantly, it needs to be engineered as at least a fundamentally library oriented body of code.

Finally, I will directly state that we (Google) have a specific interest in both linking LLD libraries into the Clang executable rather than having separate binaries, and in invoking LLD to link many different executables from a single process. So there is at least one concrete user here today. Now, the API we would need for both of these is *exactly* the API that the existing command line linker would need. But the functionality would have to be reasonable to access via a library call.

-Chandler
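To make the three constraints listed for #1 concrete, here is a minimal sketch of what a "library-oriented but narrow" entry point could look like. It is not LLD's code: the `sketch` namespace, `LinkContext`, `LinkResult`, and this `link` function are all hypothetical names invented for illustration, and the body only parses arguments rather than linking anything. The point is the shape: per-call state instead of globals, user-facing errors returned to the caller instead of calling `exit`, `assert` reserved for programmer errors, and all memory owned by objects that are destroyed when the call returns.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Every name in this namespace is hypothetical; none of it is LLD's API.
namespace sketch {

// Outcome of one link. Reportable errors travel back to the caller.
struct LinkResult {
  bool Ok = true;
  std::vector<std::string> Errors;
};

// All state for one link lives here rather than in global variables, so
// several links can run in one process without interfering.
class LinkContext {
public:
  void addInput(std::string Path) { Inputs.push_back(std::move(Path)); }
  void setOutput(std::string Path) { Output = std::move(Path); }
  void error(std::string Msg) {
    Result.Ok = false;
    Result.Errors.push_back(std::move(Msg));
  }
  bool hasInputs() const { return !Inputs.empty(); }
  LinkResult takeResult() { return std::move(Result); }

private:
  std::vector<std::string> Inputs; // owned here, freed with the context
  std::string Output = "a.out";
  LinkResult Result;
};

// Narrow API with the same shape as a command line. It never calls exit().
inline LinkResult link(const std::vector<std::string> &Args) {
  // A programmer error (not a user error), so aborting via assert is fine.
  assert(!Args.empty() && "Args[0] must be the program name");

  LinkContext Ctx;
  for (std::size_t I = 1; I < Args.size(); ++I) {
    if (Args[I] != "-o") {
      Ctx.addInput(Args[I]);
      continue;
    }
    if (I + 1 == Args.size()) {
      Ctx.error("missing argument to -o"); // reported, not fatal
      break;
    }
    Ctx.setOutput(Args[++I]);
  }
  if (!Ctx.hasInputs())
    Ctx.error("no input files");
  return Ctx.takeResult(); // Ctx and everything it owns is destroyed here
}

} // namespace sketch
```

A command line driver would call `sketch::link` once and turn `Ok` into an exit code, while a host process could call it repeatedly; a failed call leaves nothing behind that affects the next one.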
On Wed, Jan 20, 2016 at 7:15 PM, Chandler Carruth <chandlerc at gmail.com> wrote:
> Finally, I will directly state that we (Google) have a specific interest
> in both linking LLD libraries into the Clang executable rather than having
> separate binaries, and in invoking LLD to link many different executables
> from a single process. So there is at least one concrete user here today.
> Now, the API we would need for both of these is *exactly* the API that the
> existing command line linker would need. But the functionality would have
> to be reasonable to access via a library call.

I haven't heard of that until now. :) What is the point of doing that?
Rafael Espíndola via llvm-dev
2016-Jan-21 18:49 UTC
[llvm-dev] lld: ELF/COFF main() interface
> There are probably others, but this is the gist of it. Now, you could still
> design everything with the simplest imaginable API, that is incredibly
> narrow and specialized for a *single* user. But there are still fundamentals
> of the style of code that are absolutely necessary to build a library. And
> the only way to make sure we get this right, is to have the single user of
> the code use it as a library and keep all the business logic inside the
> library.
>
> This pattern is fundamental to literally every part of LLVM, including
> Clang, LLDB, and thus far LLD. I think it is a core principle of the project
> as a whole. I think that unless LLD continues to follow this principle, it
> doesn't really fit in the LLVM project at all.

The single user so far is the one the people actually coding the project care for. It seems odd to say that it doesn't fit in the LLVM project when it has attracted a lot of contributors and hit some important milestones.

> So, I encourage LLD to keep its interfaces highly specialized for the users
> it actually has -- and indeed today that may be exactly one user, the
> command line linker.

We have a highly specialized API consisting of one function: elf2::link(ArrayRef<const char *> Args). That fits 100% of the uses we have. If there is ever another use we can evaluate the cost of supporting it, but first we need to actually write the linker.

Note that this is history replaying itself on a bigger scale. We used to have a fancy library to handle archives and llvm-ar was written on top of it. It was the worst ar implementation by far. It had horrible error handling, incompatible options and produced ar files with indexes that no linker could use.

I nuked the library and wrote llvm-ar as the trivial program that it is. To the best of my knowledge it was then the fastest ar in existence, actually useful (linkers can use its .a files) and far easier to maintain.
When the effort to support Windows came up, there was a need to create archives from within lld since link.exe can run lib.exe. The maintainable code was easy to refactor into one library function, llvm::writeArchive. If another use ever shows up, we evaluate it. If not, we keep the very narrow interface.

> Finally, I will directly state that we (Google) have a specific interest in
> both linking LLD libraries into the Clang executable rather than having
> separate binaries, and in invoking LLD to link many different executables
> from a single process. So there is at least one concrete user here today.
> Now, the API we would need for both of these is *exactly* the API that the
> existing command line linker would need. But the functionality would have to
> be reasonable to access via a library call.

Given that clang can fork, I assume that this new clang+lld can fork. If so, you might actually already be able to do it: just call elf2::link(ArrayRef<const char *> Args) in a new process. It is guaranteed to not crash your program or leak resources (short of a kernel bug).

Cheers,
Rafael
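The fork suggestion can be sketched roughly as follows, assuming a POSIX system. `linkOrExit` is a hypothetical stand-in for an application-style entry point such as the elf2::link mentioned above (it is not that function, and it does no linking), and `linkInChild` is likewise an invented helper name. The sketch only shows how a fatal `exit` inside the linker code turns into a status the parent can inspect.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <vector>

#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical stand-in for an application-style linker entry point: on a
// user error it simply exits the process. Not lld's real function.
inline void linkOrExit(const std::vector<const char *> &Args) {
  assert(!Args.empty() && "Args[0] must be the program name");
  if (Args.size() < 2)
    std::exit(1); // "no input files": fatal, application style
  // ... a real linker would do its work here ...
}

// Runs linkOrExit in a child process. Returns the child's exit status, or
// -1 if fork/waitpid failed or the child was killed by a signal.
inline int linkInChild(const std::vector<const char *> &Args) {
  pid_t Pid = fork();
  if (Pid < 0)
    return -1;

  if (Pid == 0) {
    // Child: global state, leaks and exit() calls stay in this process.
    linkOrExit(Args);
    _exit(0); // success; skip atexit handlers inherited from the parent
  }

  // Parent: wait for the child, retrying if interrupted by a signal.
  int Status = 0;
  pid_t Waited;
  do {
    Waited = waitpid(Pid, &Status, 0);
  } while (Waited < 0 && errno == EINTR);
  if (Waited != Pid)
    return -1;
  return WIFEXITED(Status) ? WEXITSTATUS(Status) : -1;
}
```

This isolates the host from crashes and leaks in the linker code, at the cost of a process per link and of only getting an exit status back; whether that is acceptable for the use cases raised in this thread is exactly what is being debated.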
> On Jan 20, 2016, at 7:15 PM, Chandler Carruth via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> This pattern is fundamental to literally every part of LLVM, including Clang, LLDB, and thus far LLD. I think it is a core principle of the project as a whole. I think that unless LLD continues to follow this principle, it doesn't really fit in the LLVM project at all.

FWIW I totally agree with all of Chandler’s points.

-- Mehdi