thr3ads.net - llvm dev - [LLVMdev] Using thin archives when building llvm [Jul 2015]

Rafael Espíndola

2015-Jul-23 00:32 UTC

[LLVMdev] Using thin archives when building llvm

> Cool!
>
> So the thin archive is a divergence from the standard ar file (although
> it's compatible with GNU). Is there any room to push it further? Last time
> I ran the linker with profiling enabled, it spends a good amount of time
> just to find the terminating nul character in the archive file symbol
> table. If we store string length for each symbol, the linker can read
> archive files faster.

We can probably do it, yes.

Take a look at the BSD format (used on OS X, I just implemented it in llvm).

It is a bit better in that the symbol table is organized as a series
of offset pairs. One to the member, one to the string table.
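
To make the layout concrete, here is a minimal sketch of reading such a table. It assumes the usual BSD `__.SYMDEF` arrangement (a 32-bit byte count, then pairs of string-table offset and member offset, then a 32-bit string table size and the strings) and little-endian fields; the function name `parse_bsd_symtab` is made up for illustration and is not llvm's API.

```python
import struct

def parse_bsd_symtab(data: bytes):
    """Split a BSD-style symbol table into its offset pairs and string table.

    Assumed layout (little-endian): u32 size in bytes of the pair array,
    the (string offset, member offset) pairs, u32 string table size, strings.
    """
    (pairs_size,) = struct.unpack_from("<I", data, 0)
    # Each entry is two u32 values, 8 bytes in total.
    pairs = [struct.unpack_from("<II", data, 4 + i)
             for i in range(0, pairs_size, 8)]
    (strtab_size,) = struct.unpack_from("<I", data, 4 + pairs_size)
    start = 8 + pairs_size
    strtab = data[start:start + strtab_size]
    return pairs, strtab
```

Note that nothing here has to look at the name bytes: the pairs are fixed-size records, so the entries can be walked without touching the string table.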

This already improves handling on the traditional unix linker model
where we scan each member to see if it should be included on the link.
Once we find out it is to be included, it is really fast to scan past
the member without looking for nulls as one has to do in the GNU
format.
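
For comparison, a rough sketch of walking a GNU-style symbol table, assuming the common layout of a big-endian 32-bit symbol count, that many big-endian member offsets, and then the names back to back, each terminated by a NUL. The function name `parse_gnu_symtab` is hypothetical.

```python
import struct

def parse_gnu_symtab(data: bytes):
    """Return (name, member offset) pairs from a GNU-style symbol table.

    Assumed layout: big-endian u32 count, `count` big-endian u32 member
    offsets, then `count` NUL-terminated names stored one after another.
    """
    (count,) = struct.unpack_from(">I", data, 0)
    offsets = struct.unpack_from(f">{count}I", data, 4)
    pos = 4 + 4 * count
    names = []
    for _ in range(count):
        # The only way to find where a name ends is to search for its NUL.
        end = data.index(b"\0", pos)
        names.append(data[pos:end])
        pos = end + 1
    return list(zip(names, offsets))
```

Reaching the name of symbol N means searching past the terminators of all the names before it, which is the scan that shows up in the profile.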

That doesn't help with COFF, where we do a single pass anyway, but there
is more that we could benefit from in the BSD format. I think that in
practice the string table is in order, so that we can compute the
string size by looking at the next member. I will give that a try.

Another reason to come up with a thin BSD format variant :-)

Cheers,
Rafael

P.S.: While testing the thin archive format I noticed that the thin
.lib files were a lot bigger than what I was getting on linux. It
turns out it was because cl.exe was producing .obj files with a *lot*
more symbols than clang on linux. Trying clang on windows showed that
cl.exe was not dropping what we call linkonce_odr, but clang on
windows still produces more symbols than clang on linux.

Rui Ueyama

2015-Jul-23 22:52 UTC

[LLVMdev] Using thin archives when building llvm

On Wed, Jul 22, 2015 at 5:32 PM, Rafael Espíndola <rafael.espindola at gmail.com> wrote:
> > Cool!
> >
> > So the thin archive is a divergence from the standard ar file (although
> > it's compatible with GNU). Is there any room to push it further? Last time
> > I ran the linker with profiling enabled, it spends a good amount of time
> > just to find the terminating nul character in the archive file symbol
> > table. If we store string length for each symbol, the linker can read
> > archive files faster.
>
> We can probably do it, yes.
>
> Take a look at the BSD format (used on OS X, I just implemented it in
> llvm).
>
> It is a bit better in that the symbol table is organized as a series
> of offset pairs. One to the member, one to the string table.
>
> This already improves handling on the traditional unix linker model
> where we scan each member to see if it should be included on the link.
> Once we find out it is to be included, it is really fast to scan past
> the member without looking for nulls as one has to do in the GNU
> format.
>
> That doesn't help with COFF, where we do a single pass anyway, but there
> is more that we could benefit from in the BSD format. I think that in
> practice the string table is in order, so that we can compute the
> string size by looking at the next member. I will give that a try.
>

Even if the string table is in order in practice, you have to search for
NUL characters byte-by-byte unless it's really guaranteed to be in order,
no?

> Another reason to come up with a thin BSD format variant :-)
>
> Cheers,
> Rafael
>
> P.S.: While testing the thin archive format I noticed that the thin
> .lib files were a lot bigger than what I was getting on linux. It
> turns out it was because cl.exe was producing .obj files with a *lot*
> more symbols than clang on linux. Trying clang on windows showed that
> cl.exe was not dropping what we call linkonce_odr, but clang on
> windows still produces more symbols than clang on linux.

Rafael Espíndola

2015-Jul-23 23:03 UTC

[LLVMdev] Using thin archives when building llvm

> Even if the string table is in order in practice, you have to search
> for NUL characters byte-by-byte unless it's really guaranteed to be
> in order, no?

No. Let's say we are at symbol N and want to find the size of its name. Each
symbol in the BSD format is represented with a pair of offsets: one to the
member and one to the string table.

We should be able to compute the symbol name size as the difference between
the current symbol's string table offset and the next symbol's string table
offset.
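
A small sketch of that computation, under the assumption that the string table offsets are strictly increasing and the names are stored back to back with one NUL each. The helper names are made up; the in-order check and the fallback for the last symbol (which has no next offset, and may be followed by padding) are my additions, not something the format guarantees.

```python
def offsets_in_order(str_offsets):
    """True if every string table offset is greater than the previous one."""
    return all(a < b for a, b in zip(str_offsets, str_offsets[1:]))

def name_sizes(str_offsets, strtab: bytes):
    """Compute each symbol name's length from consecutive string offsets.

    Only valid when offsets_in_order(str_offsets) holds and the names are
    packed back to back, each followed by exactly one NUL.
    """
    sizes = []
    for i, start in enumerate(str_offsets):
        if i + 1 < len(str_offsets):
            # Distance to the next name, minus this name's terminating NUL.
            sizes.append(str_offsets[i + 1] - start - 1)
        else:
            # Last symbol: there is no next offset, and padding may follow,
            # so fall back to searching for the NUL.
            sizes.append(strtab.index(b"\0", start) - start)
    return sizes
```

With offsets `[0, 4]` into `b"foo\0barbaz\0"`, the first size comes purely from the subtraction (4 - 0 - 1) and only the last name needs a NUL search. If the in-order check fails, a reader would have to fall back to the NUL search for every symbol.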

Cheers, Rafael
