thr3ads.net - llvm dev - [llvm-dev] lld: ELF/COFF main() interface [Jan 2016]

Rafael Espíndola via llvm-dev

2016-Jan-20 15:30 UTC

[llvm-dev] lld: ELF/COFF main() interface

Sorry for being late on this thread.

I just wanted to say I am strongly on Rui's side on this one.

The current design is for lld *not* to be a library and I think
that is important. That has saved us a tremendous amount of work
writing library-like code and a lot of design work on library interfaces.
The comparison of old and new ELF code is night and day as far as
productivity and performance are concerned. Designing library interfaces
right now would be premature because it is not clear what the
commonalities will be or how to factor them out.

For example, both MCJIT and lld apply relocations, but there are
tremendously different options on how to factor this:

* Have MC produce position-dependent code, so that MCJIT would be a bit
more like other JITs and not need relocations.
* Move relocation processing to LLVM somewhere and have lld and MCJIT use it.
* Have MC produce shared objects directly, saving MCJIT the
complication of using relocatable objects.
* Have MCJIT use lld as a trivial library that implements "ld foo.o -o
foo.so -shared".
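
To make the second option concrete, here is a minimal sketch of what a relocation routine shared by two clients might look like. Every name in it (RelocKind, Reloc, applyReloc) is hypothetical and not taken from LLVM, lld, or MCJIT. It handles only two generic relocation shapes, an absolute 64-bit value (S + A) and a 32-bit PC-relative value (S + A - P), and writes in host byte order for brevity.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical types for illustration; not taken from LLVM, lld, or MCJIT.
enum class RelocKind { Abs64, PCRel32 };

struct Reloc {
  RelocKind kind;
  std::uint64_t offset; // where in the section to patch
  std::int64_t addend;  // constant added to the symbol address
};

// Patches one relocation into `section`, which is assumed to be loaded at
// `sectionAddr`. Values are written in host byte order to keep this short.
void applyReloc(std::vector<std::uint8_t> &section, std::uint64_t sectionAddr,
                const Reloc &r, std::uint64_t symbolAddr) {
  std::uint64_t target = symbolAddr + static_cast<std::uint64_t>(r.addend);
  if (r.kind == RelocKind::Abs64) {
    assert(r.offset + 8 <= section.size());
    std::memcpy(&section[r.offset], &target, 8); // S + A
  } else {
    assert(r.offset + 4 <= section.size());
    std::uint64_t place = sectionAddr + r.offset;
    // S + A - P, truncated to 32 bits.
    std::uint32_t rel = static_cast<std::uint32_t>(target - place);
    std::memcpy(&section[r.offset], &rel, 4);
  }
}
```

A linker would call such a routine with final output addresses, while a JIT would call it with the addresses at which the code was actually mapped. Whether that sharing is worth its cost is exactly the open question here.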

The situation is even less clear for the other parts we are missing in
llvm: objcopy, readelf, etc.

We have to discuss and prototype these before we can make a decision.
Committing now would be premature design and would stall progress on the
one thing we are sure we need: a high-quality, BSD-licensed linker. Let's
get that implemented. While that happens, MCJIT will move along and we
will be in a position to productively discuss what can be shared and at
what cost (complexity and performance).

Last but not least, anything that is not needed in two different areas
should remain application code. The only point of paying the
complexity of writing a library is if it is used.

Cheers,
Rafael

Chandler Carruth via llvm-dev

2016-Jan-21 03:15 UTC

I strongly disagree about some of this, but agree about other aspects. I
feel like there are two issues conflated here:

1) Having a fundamentally library-oriented structure of code and design
philosophy.

2) Having general APIs for a library of code that allows it to be reused in
different ways by different clients.

For #1, let me indicate the kinds of things I'm thinking about here:
- Cannot rely on global state
- Cannot directly call "exit" (but can call "abort" for *programmer* errors
like asserts)
- Cannot leak memory

There are probably others, but this is the gist of it. Now, you could still
design everything with the simplest imaginable API, that is incredibly
narrow and specialized for a *single* user. But there are still
fundamentals of the style of code that are absolutely necessary to build a
library. And the only way to make sure we get this right, is to have the
single user of the code use it as a library and keep all the business logic
inside the library.
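
As an illustration of the three constraints listed above, here is a minimal sketch with a deliberately narrow, single-user API. The names LinkContext and linkFiles are hypothetical and are not lld's interface. The assumptions are that per-link state lives in a caller-owned object, that reportable errors are returned instead of ending the process, and that temporary storage is released by RAII.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical names throughout; none of this is lld's actual API.

// All per-link state lives in an object owned by the caller, not in globals,
// so two links in one process cannot interfere with each other.
struct LinkContext {
  std::vector<std::string> inputs;
  std::vector<std::string> errors; // reportable errors are collected here
};

// Returns true on success. User-facing errors are recorded and returned to
// the caller instead of terminating the process with exit().
bool linkFiles(LinkContext &ctx, const std::string &output) {
  // A programmer error (violated precondition) may still assert/abort.
  assert(!output.empty() && "caller must supply an output name");

  if (ctx.inputs.empty()) {
    ctx.errors.push_back("no input files");
    return false;
  }
  // Temporary storage is owned by a smart pointer, so it is freed on every
  // return path and nothing leaks across repeated calls.
  auto buffer = std::make_unique<std::vector<char>>();
  for (const std::string &in : ctx.inputs)
    buffer->insert(buffer->end(), in.begin(), in.end());
  return true;
}
```

A command line driver would then be a thin wrapper: build a LinkContext from argv, call linkFiles, print ctx.errors, and pick the process exit code itself.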

This pattern is fundamental to literally every part of LLVM, including
Clang, LLDB, and thus far LLD. I think it is a core principle of the
project as a whole. I think that unless LLD continues to follow this
principle, it doesn't really fit in the LLVM project at all.

But for #2, I actually completely agree with you. We will never guess the
*right* general purpose API for different users to share logic until we
actually have those different users. I very much like lazy design of APIs
as users for those APIs arrive. It's one of the reasons I'm so strongly in
favor of the lack of API stability in LLVM -- it *allows* us to figure
these APIs out as the actual use cases emerge and we learn what they need
to do.

One of the nice things about changing APIs though is that there tends to be
a clear incremental path to evolve the API. But if your code doesn't use
basic memory management techniques, or if even reportable errors (as
opposed to asserted programmer errors) are inherently fatal, fixing that
can be incredibly hard and present a huge barrier to adoption of the
library.

So, I encourage LLD to keep its interfaces highly specialized for the users
it actually has -- and indeed today that may be exactly one user, the
command line linker.

But when a new user for the libraries arrives, it needs to adapt to support
an API that they can use, provided the use case is reasonable for the LLD
code to support.

And most importantly, it needs to be engineered as, at the very least, a
fundamentally library-oriented body of code.

Finally, I will directly state that we (Google) have a specific interest in
both linking LLD libraries into the Clang executable rather than having
separate binaries, and in invoking LLD to link many different executables
from a single process. So there is at least one concrete user here today.
Now, the API we would need for both of these is *exactly* the API that the
existing command line linker would need. But the functionality would have
to be reasonable to access via a library call.

-Chandler


Rui Ueyama via llvm-dev

2016-Jan-21 03:41 UTC

On Wed, Jan 20, 2016 at 7:15 PM, Chandler Carruth <chandlerc at gmail.com>
wrote:
> Finally, I will directly state that we (Google) have a specific interest
> in both linking LLD libraries into the Clang executable rather than having
> separate binaries, and in invoking LLD to link many different executables
> from a single process. So there is at least one concrete user here today.
> Now, the API we would need for both of these is *exactly* the API that the
> existing command line linker would need. But the functionality would have
> to be reasonable to access via a library call.
>
I haven't heard of that until now. :) What is the point of doing that?

Rafael Espíndola via llvm-dev

2016-Jan-21 18:49 UTC

> There are probably others, but this is the gist of it. Now, you could still
> design everything with the simplest imaginable API, that is incredibly
> narrow and specialized for a *single* user. But there are still fundamentals
> of the style of code that are absolutely necessary to build a library. And
> the only way to make sure we get this right, is to have the single user of
> the code use it as a library and keep all the business logic inside the
> library.
>
> This pattern is fundamental to literally every part of LLVM, including
> Clang, LLDB, and thus far LLD. I think it is a core principle of the project
> as a whole. I think that unless LLD continues to follow this principle, it
> doesn't really fit in the LLVM project at all.

The single user so far is the one the people actually coding the
project care about. It seems odd to say that it doesn't fit in the LLVM
project when it has attracted a lot of contributors and hit some
important milestones.

> So, I encourage LLD to keep its interfaces highly specialized for the users
> it actually has -- and indeed today that may be exactly one user, the
> command line linker.

We have a highly specialized API consisting of one function:
elf2::link(ArrayRef<const char *> Args). That fits 100% of the uses we
have. If there is ever another use we can evaluate the cost of
supporting it, but first we need to actually write the linker.
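
For readers unfamiliar with this shape of interface, the following is a rough sketch of a one-function, argv-style entry point. It is a stand-in, not the elf2::link implementation: the namespace sketch, the Config struct, and parseArgs are hypothetical, std::vector replaces llvm::ArrayRef so the example builds without LLVM, and only "-o <file>" plus input names are understood.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical result of option parsing; not an lld data structure.
struct Config {
  std::string output = "a.out";
  std::vector<std::string> inputs;
  bool ok = true;
};

// Parses a command line where args[0] is the program name.
Config parseArgs(const std::vector<const char *> &args) {
  Config cfg;
  for (std::size_t i = 1; i < args.size(); ++i) {
    if (std::strcmp(args[i], "-o") == 0) {
      if (i + 1 == args.size()) { // "-o" with no file name after it
        cfg.ok = false;
        break;
      }
      cfg.output = args[++i];
    } else {
      cfg.inputs.push_back(args[i]);
    }
  }
  return cfg;
}

// The whole public surface: one function taking the command line.
// A real linker would go on to read the inputs and write cfg.output.
bool link(const std::vector<const char *> &args) {
  Config cfg = parseArgs(args);
  return cfg.ok && !cfg.inputs.empty();
}

} // namespace sketch
```

With an interface like this, a driver's main() only has to copy argv into the argument list and turn the boolean result into an exit code.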

Note that this is history replaying itself on a bigger scale. We used
to have a fancy library to handle archives, and llvm-ar was written on
top of it. It was the worst ar implementation by far. It had horrible
error handling, incompatible options, and produced ar files with
indexes that no linker could use.

I nuked the library and wrote llvm-ar as the trivial program that it
is. To the best of my knowledge it was then the fastest ar in
existence, actually useful (linkers can use its .a files) and far
easier to maintain.

When the effort to support Windows came up, there was a need to create
archives from within lld since link.exe can run lib.exe. The
maintainable code was easy to refactor into one library function,
llvm::writeArchive. If another use ever shows up, we will evaluate it. If
not, we keep the very narrow interface.

> Finally, I will directly state that we (Google) have a specific interest in
> both linking LLD libraries into the Clang executable rather than having
> separate binaries, and in invoking LLD to link many different executables
> from a single process. So there is at least one concrete user here today.
> Now, the API we would need for both of these is *exactly* the API that the
> existing command line linker would need. But the functionality would have to
> be reasonable to access via a library call.

Given that clang can fork, I assume that this new clang+lld can fork.
If so, you might actually already be able to do it: just call
elf2::link(ArrayRef<const char *> Args) in a new process. It is
guaranteed not to crash your program or leak resources (short of a
kernel bug).
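
A minimal sketch of that approach on a POSIX system follows. The helper name runIsolated is hypothetical, and the job passed to it stands in for a call to the linker entry point. The sketch assumes the job reports failure through its integer return value. It uses only fork, waitpid, and _exit.

```cpp
#include <cerrno>
#include <functional>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Runs `job` in a forked child and returns the child's exit status.
// Returns -1 if fork or waitpid fails, or if the child died from a signal.
int runIsolated(const std::function<int()> &job) {
  pid_t pid = fork();
  if (pid < 0)
    return -1;
  if (pid == 0) {
    // Child: if the job crashes, leaks, or terminates the process, only
    // this child is affected. _exit avoids re-flushing stdio buffers that
    // were inherited from the parent.
    _exit(job());
  }
  // Parent: wait for the child, retrying if interrupted by a signal.
  int status = 0;
  pid_t waited;
  do {
    waited = waitpid(pid, &status, 0);
  } while (waited < 0 && errno == EINTR);
  if (waited != pid)
    return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The trade-off is the cost of a fork per link, and anything the job writes into memory stays in the child, so results have to come back through the exit status or through files.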

Cheers,
Rafael

Mehdi Amini via llvm-dev

2016-Jan-21 19:03 UTC

> On Jan 20, 2016, at 7:15 PM, Chandler Carruth via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> I strongly disagree about some of this, but agree about other aspects. I
> feel like there are two issues conflated here:
>
> 1) Having a fundamentally library-oriented structure of code and design
> philosophy.
>
> 2) Having general APIs for a library of code that allows it to be reused in
> different ways by different clients.
>
> For #1, let me indicate the kinds of things I'm thinking about here:
> - Cannot rely on global state
> - Cannot directly call "exit" (but can call "abort" for *programmer* errors
> like asserts)
> - Cannot leak memory
>
> There are probably others, but this is the gist of it. Now, you could still
> design everything with the simplest imaginable API, that is incredibly
> narrow and specialized for a *single* user. But there are still
> fundamentals of the style of code that are absolutely necessary to build a
> library. And the only way to make sure we get this right, is to have the
> single user of the code use it as a library and keep all the business logic
> inside the library.
>
> This pattern is fundamental to literally every part of LLVM, including
> Clang, LLDB, and thus far LLD. I think it is a core principle of the
> project as a whole. I think that unless LLD continues to follow this
> principle, it doesn't really fit in the LLVM project at all.

FWIW I totally agree with all of Chandler’s points.

--
Mehdi


