thr3ads.net - llvm dev - [LLVMdev] LLD improvement plan [May 2015]

Chandler Carruth

2015-May-02 02:06 UTC

[LLVMdev] LLD improvement plan

On Fri, May 1, 2015 at 6:46 PM Nick Kledzik <kledzik at apple.com> wrote:
>
> On May 1, 2015, at 12:31 PM, Rui Ueyama <ruiu at google.com> wrote:
>
> *The atom model is not the best model for some architectures*
>
>
> The atom model is a good fit for the llvm compiler model for all
> architectures.  There is a one-to-one mapping between llvm::GlobalObject
> (e.g. function or global variable) and lld::DefinedAtom.
>
I'm not sure how that's really relevant.

On some architectures, the unit at which linking is defined to occur isn't
a global object. A classic example is an architecture with a hard semantic
reliance on grouping two symbols together and linking either both or
neither of them.
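
To make the "both or neither" constraint concrete, here is a minimal sketch of reachability over all-or-nothing link units. The names `LinkUnit`, `liveUnits`, and `liveSymbols` are hypothetical and are not lld types; the sketch only assumes that a unit may define several symbols and is kept or dropped as a whole.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model, not an lld type: a unit that is linked all-or-nothing.
struct LinkUnit {
    std::vector<std::string> defines;     // symbols defined by this unit
    std::vector<std::string> references;  // symbols this unit refers to
};

// Returns the indices of all units reachable from the root symbols.
std::set<std::size_t> liveUnits(const std::vector<LinkUnit>& units,
                                const std::vector<std::string>& roots) {
    // Map every defined symbol to the unit that owns it.
    std::map<std::string, std::size_t> owner;
    for (std::size_t i = 0; i < units.size(); ++i)
        for (const std::string& sym : units[i].defines)
            owner[sym] = i;

    std::set<std::size_t> live;
    std::vector<std::string> worklist(roots);
    while (!worklist.empty()) {
        std::string sym = worklist.back();
        worklist.pop_back();
        auto it = owner.find(sym);
        if (it == owner.end())
            continue;  // undefined symbol: ignored in this sketch
        if (!live.insert(it->second).second)
            continue;  // unit already marked live
        for (const std::string& ref : units[it->second].references)
            worklist.push_back(ref);
    }
    return live;
}

// Collects every symbol that ends up in the output, referenced or not.
std::set<std::string> liveSymbols(const std::vector<LinkUnit>& units,
                                  const std::set<std::size_t>& live) {
    std::set<std::string> out;
    for (std::size_t i : live) {
        assert(i < units.size());  // indices must come from liveUnits()
        out.insert(units[i].defines.begin(), units[i].defines.end());
    }
    return out;
}
```

If a unit defines both `f` and `f_helper` and only `f` is referenced, `f_helper` is still kept, because the unit is the granularity of the decision. A model with exactly one symbol per unit cannot express that directly.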

> The problem is the ELF/PECOFF file format.   (Actually mach-o is also
> section based, but we have refrained from adding complex section-centric
> features to it, so mapping it to atoms is not too hard).
>
> I’d rather see our effort put to moving ahead to an llvm based object file
> format (aka “native” format) which bypasses the impedance mismatch of going
> through ELF/COFF.
>
We still have to be able to (efficiently) link existing ELF and COFF
objects though? While I'm actually pretty interested in some better object
file format, I also want a better linker for the world we live in today...

> *One symbol resolution model doesn’t fit all*
>
> Yes, the Resolver was meant to call out to the LinkingContext object to
> direct it on how to link.  Somehow that got morphed into “there should be a
> universal data model such that, when the Resolver processes the input data,
> the right platform-specific linking behavior falls out”.
>
>
> -Nick
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Alex Rosenberg

2015-May-03 20:50 UTC

[LLVMdev] LLD improvement plan

On May 1, 2015, at 7:06 PM, Chandler Carruth <chandlerc at google.com> wrote:
> We still have to be able to (efficiently) link existing ELF and COFF
> objects though? While I'm actually pretty interested in some better object
> file format, I also want a better linker for the world we live in today...

For us, this is secondary. A major part of the reason we started lld was to
embrace the atom model, that is to bring the linker closer to the compiler. We
have a lot of long-term goals that involve altering the traditional
compiler/linker flow, with a goal toward actual improvements in developer
workflow. Just iterating again on the exact same design we've had since the
'70s is not good enough.

The same is true of other legacy we're inheriting like linker scripts. While
we want them to work and work efficiently, we should consider them part of
necessary legacy to support and not make them fundamental to our internal
design. This planning will allow us latitude to make fundamental improvements.
We make similar decisions across LLVM all the time; take our attitude toward
__builtin_constant_p() or nested functions, for example.

We've been at this for several years. We had goals and deadlines that
we're not meeting. We've abandoned several significant design points so
far because Rui is making progress on PECOFF and jettisons things and we let it
slide because of his rapid pace.

Core command line? GONE.
Round-trip testing? GONE.
Native file format? GONE.
And now we're against the Atom model?

I don't want a new linker that just happens to be so mired in the same
legacy that we've ended up with nothing but a gratuitous rewrite with a
better license.

We want:

* A new clean command line that obviates the need for linker scripts and their
incestuous design requirements.
* lld is thoroughly tested, including the efficient new native object file
format it provides.
* lld is like the rest of LLVM and can be remixed such that it's built-in to
the Clang driver, should we choose to.
* We can have the linker drive compilation such that objects don't leave the
disk cache before being consumed by the linker.

Perhaps we should schedule an in-person lld meeting. Almost everybody is in the
Bay Area. I'm happy to host if we think this will help.

Alex

Rui Ueyama

2015-May-03 22:13 UTC

[LLVMdev] LLD improvement plan

Creating a linker that works very well with existing file formats is not a
secondary goal for me. I'm trying to create a practical tool with a clean
codebase, using the LLVM libraries. So I can't agree with you that it's
secondary.

I also don't share the view that we are trying to solve a problem that was
solved in the '70s. How fast we can link an executable of several hundred
megabytes is, for example, a pretty modern problem that we have today.

I'm not opposed to the idea of creating a new file format that you think is
better than the existing ones. We could come up with a better design,
implement it in the compiler, lay the foundation, and create a linker
based on that. However, what's actually happening is that we came up with a
new idea that is not necessarily the best way to represent existing file
formats, laid the foundation based on that idea, and let LLD developers
create a linker for the existing formats on that foundation (which, again,
is not suitable for those formats). And no effort has been made in the last
few years on "the new format". I have to say that something is not right
here.


Davide Italiano

2015-May-03 22:51 UTC

[LLVMdev] LLD improvement plan

On Sun, May 3, 2015 at 1:50 PM, Alex Rosenberg <alexr at leftfield.org> wrote:
> For us, this is secondary. A major part of the reason we started lld was to
> embrace the atom model, that is to bring the linker closer to the compiler.
> [...]

There are projects (like FreeBSD) that need a new linker.
In the FreeBSD case the need is mainly motivated by licensing, but I
don't think that means the linker has to be slower or harder to hack on
because we want to treat as a first-class citizen a format that has been
largely unmaintained for at least the last 6 months, and treat widespread
formats like ELF as second-class citizens. I'm personally excited about
the idea of a new format and I would like to spend some time thinking
about it, although I always try to be pragmatic.
I will be happy to discuss this further in person.

-- 
Davide

"There are no solved problems; there are only problems that are more
or less solved" -- Henri Poincare

James Y Knight

2015-May-04 14:45 UTC

[LLVMdev] LLD improvement plan

> And now we're against the Atom model?

I'm quite new to the llvm community, and basically unfamiliar with LLD, so
maybe I'm simply uninformed. If so, I will now proceed to demonstrate that
to an entire list of people. :)

I've read the doc on http://lld.llvm.org/design.html, but the features it
says you get with LLD/Atoms and don't get with the "old generation" of
linkers that use "sections" are all things that ELF linkers already do using
sections, and do not require anything finer grained than sections. Sections in
ELF objects can actually be as fine-grained as you want them to be -- just like
an "Atom". The doc also says, "An atom is an indivisible chunk of code or
data." -- which is also what a section is for ELF.

AFAICT, atoms in LLD are simply a restricted form of ELF sections: restricted to
having a single symbol associated with them. It doesn't appear that
they're actually enabling any new features that no other linker can do.

I'm not very familiar with Mach-O, but it sounds like, contrary to ELF,
Mach-O files cannot be generated with one section per global object, but that
Mach-O sections (at least as used by OSX) *are* expected to be
subdivided/rearranged/etc, and are not atomic. Given that set of properties for
the input file format, of course it makes sense that you'd want to subdivide
Mach-O "sections" within the linker into smaller atomic pieces to work
on them.

But for ELF, the compiler can/will output separate sections for each
function/global variable, and the contents of a section should never be mangled.
It can also emit multiple symbols into a single section. That an ELF section
*may* contain multiple functions/globals which need to stay together is not a
problem with the file format -- it's an advantage -- an additional
flexibility of representation.

I gather the current model in LLD doesn't support an atomic unit with
multiple symbols cleanly. And that that's the main issue that would be good
to fix here.

But, rather than talking about "eliminating the atom model" -- which
seems to be contentious -- maybe it would be more peaceful to just say that the
desired change is to "allow atoms to have multiple global symbols
associated, and have more metadata"? It appears to me that it amounts to
essentially the same thing, but may not be as contentious if described that way.

If that change was made, you'd just need to know that LLD has slightly
unique terminology; "ELF section" == "LLD Atom". (but
"Mach-O section" turns into multiple "LLD Atom"s).
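
The two mappings can be sketched as follows. This is only an illustration under stated assumptions: `Symbol`, `Section`, `Atom`, `atomFromWholeSection`, and `splitAtSymbols` are hypothetical names, not lld APIs, and the split rule (cut at every symbol offset) is a simplification of what a Mach-O reader would do.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical types for illustration only; none of these are lld classes.
struct Symbol { std::string name; std::size_t offset; };
struct Section { std::string name; std::size_t size; std::vector<Symbol> symbols; };
struct Atom { std::size_t offset; std::size_t size; std::vector<std::string> symbols; };

// ELF-style mapping: the section is indivisible, so it becomes one atom
// that carries every symbol defined inside it.
std::vector<Atom> atomFromWholeSection(const Section& sec) {
    Atom atom{0, sec.size, {}};
    for (const Symbol& s : sec.symbols)
        atom.symbols.push_back(s.name);
    return {atom};
}

// Mach-O-style mapping (simplified): cut the section at every symbol offset.
// Symbols sharing an offset end up as aliases on the same atom; bytes before
// the first symbol become an unnamed atom.
std::vector<Atom> splitAtSymbols(const Section& sec) {
    std::vector<Symbol> syms = sec.symbols;
    std::sort(syms.begin(), syms.end(),
              [](const Symbol& a, const Symbol& b) { return a.offset < b.offset; });

    std::vector<Atom> atoms;
    std::size_t start = 0;
    std::vector<std::string> names;
    for (const Symbol& s : syms) {
        assert(s.offset <= sec.size);  // symbols must lie inside the section
        if (s.offset != start) {
            atoms.push_back({start, s.offset - start, names});
            names.clear();
            start = s.offset;
        }
        names.push_back(s.name);
    }
    if (sec.size > start || !names.empty())
        atoms.push_back({start, sec.size - start, names});
    return atoms;
}
```

For a 48-byte section with symbols at offsets 0, 16, and 32, the first function yields one atom with three symbols and the second yields three 16-byte atoms with one symbol each. Only the first result needs an atom that can carry more than one symbol.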

Am I wrong?

James
