Rui Ueyama via llvm-dev
2017-Jul-01 02:10 UTC
[llvm-dev] [LLD] Adding WebAssembly support to lld
Hi Sam,

First, I want to know the symbol resolution semantics. I can imagine that they are not set in stone yet, and that you guys are still discussing what would be the best semantics or file format for the linkable wasm object file. I think that by knowing more about the format and semantics, we can give you guys valuable feedback, as we've been actively working on the linker for a few years now. (And we know of a lot of issues in existing object file formats, so I don't want you guys to copy those failures.)

As Sean pointed out, this looks very different from ELF or COFF in object construction. Does this mean the linker has to reconstruct everything? The ELF and COFF linkers are multi-threaded, as each thread can work on different sections simultaneously when writing to an output file. I wonder if that's still doable in wasm.

Also, I wonder if there's a way to parallelize symbol resolution. Since there are no linkable wasm programs yet, we can take a radical approach.

Have you ever considered making the file format more efficient than ELF or COFF so that object files link really fast? For example, in order to avoid a lot of (possibly very long, due to name mangling) symbols, you could store SHA hashes or something so that linkers are able to handle symbols as an array of fixed-size elements.

That is just an example. There are a lot of possible improvements we can make for a completely new file format.

On Fri, Jun 30, 2017 at 5:19 PM, Sean Silva via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> Can you link to docs about the wasm object format? (both relocatable and
> executable)
>
> Also, traditional object file linkers are primarily concerned with
> concatenating binary blobs with a small amount of patching of said binary
> blobs based on computed virtual (memory) addresses. Or perhaps to put it
> another way, what traditional object file linkers do is construct program
> images meant to be mapped directly into memory.
>
> My understanding is that wasm is pretty different from this (though
> "linker frontend" things like the symbol resolution process are presumably
> similar). Looking at Writer::run in your patch it seems like wasm is indeed
> very different. E.g. the linker is aware of things like "types" and knows
> the internal structure of functions (e.g. write_sig knows about how many
> parameters a function has).
>
> Can you elaborate on semantically what the linker is actually doing for
> wasm?
>
> -- Sean Silva
>
> On Fri, Jun 30, 2017 at 4:46 PM, Sam Clegg via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>> Hi llvmers,
>>
>> As you may know, work has been progressing on the experimental
>> WebAssembly backend in llvm. However, there is currently not a good
>> linking story. Most of the existing linking strategies (i.e. those in
>> the emscripten toolchain) involve bitcode linking and whole program
>> compilation at link time.
>>
>> To improve this situation I've been working on adding a wasm backend
>> for lld. My current work is here: https://reviews.llvm.org/D34851
>>
>> Although this port is not ready for production use (it's missing
>> several key features such as comdat support and full support for weak
>> aliases) it's already getting some testing on the wasm waterfall:
>> https://wasm-stat.us/builders/linux
>>
>> I'm hopeful that my patch may now be at an MVP stage that could be
>> considered for merging into upstream lld. Thoughts? LLD maintainers,
>> would you support the addition of a new backend?
>>
>> cheers,
>> sam
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
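For readers following the thread, the fixed-size-symbol idea in Rui's message can be sketched roughly as below. This is only an illustration, not anything from the patch or from lld: the choice of SHA-1, the 20-byte entry size, and the names `symbol_key`, `pack_symbol_table` and `lookup` are all assumptions made for the example.

```python
import hashlib

# Assumption for this sketch: SHA-1, whose digests are always 20 bytes.
DIGEST_SIZE = 20


def symbol_key(name: str) -> bytes:
    """Return a fixed-size key for a (possibly very long) mangled name."""
    return hashlib.sha1(name.encode("utf-8")).digest()


def pack_symbol_table(names):
    """Pack the keys back to back, so entry i occupies bytes
    [i * DIGEST_SIZE, (i + 1) * DIGEST_SIZE) of the table."""
    return b"".join(symbol_key(n) for n in names)


def lookup(table: bytes, name: str) -> int:
    """Return the index of `name` in the packed table, or -1 if absent.

    A linear scan keeps the sketch short; a real linker would more
    likely feed the fixed-size keys into a hash table.
    """
    key = symbol_key(name)
    for offset in range(0, len(table), DIGEST_SIZE):
        if table[offset:offset + DIGEST_SIZE] == key:
            return offset // DIGEST_SIZE
    return -1


# Hypothetical symbol names of very different lengths.
names = ["main", "_ZNSt6vectorIiSaIiEE9push_backERKi", "_Z3foov"]
table = pack_symbol_table(names)
```

Every entry has the same width regardless of how long the mangled name is, which is the property the message is after; the cost is that the compiler has to hash each name per translation unit.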
Sean Silva via llvm-dev
2017-Jul-01 04:25 UTC
[llvm-dev] [LLD] Adding WebAssembly support to lld
On Fri, Jun 30, 2017 at 7:10 PM, Rui Ueyama <ruiu at google.com> wrote:

> Hi Sam,
>
> First, I want to know the symbol resolution semantics. I can imagine that
> they are not set in stone yet, and that you guys are still discussing what
> would be the best semantics or file format for the linkable wasm object
> file. I think that by knowing more about the format and semantics, we can
> give you guys valuable feedback, as we've been actively working on the
> linker for a few years now. (And we know of a lot of issues in existing
> object file formats, so I don't want you guys to copy those failures.)
>
> As Sean pointed out, this looks very different from ELF or COFF in object
> construction. Does this mean the linker has to reconstruct everything? The
> ELF and COFF linkers are multi-threaded, as each thread can work on
> different sections simultaneously when writing to an output file. I wonder
> if that's still doable in wasm.
>
> Also, I wonder if there's a way to parallelize symbol resolution. Since
> there are no linkable wasm programs yet, we can take a radical approach.

Another question for Sam is how many "innovation tokens" the wasm folks are prepared to burn on the object format. E.g. do they not really care as long as it works, or are they willing to invest significant time in designing and optimizing it?

There are a lot of interesting directions when creating a new object format (this was actually one of the initial goals of LLD, way back at the project's inception!). There are lots of ideas, but very little has actually been explored even to the point of knowing that making X change will give Y% speedup. So most (all?) of these things are definitely "research" type work.

Also, looking at LLD's profile, there actually aren't many things that immediately stand out as major (order of magnitude) improvements that are possible. The only obvious major thing that sticks out to me would be that if the relocations don't affect "layout" (or the wasm equivalent; e.g. don't require allocating bss or GOT entries), then we could do only a single scan over relocations, which is about 30% of the current [...]

Ultimately we will be limited by disk IO (and if I remember from Rui's presentation, we're only like 4x slower than `cp`) as long as we don't go to a model that allows us to transcend writing the output file to disk in the critical path.

> Have you ever considered making the file format more efficient than ELF
> or COFF so that object files link really fast? For example, in order to
> avoid a lot of (possibly very long, due to name mangling) symbols, you
> could store SHA hashes or something so that linkers are able to handle
> symbols as an array of fixed-size elements.

For Sam's benefit, this is something that we've been thinking about for a while, but we don't really know how much speedup it will give. (And IIRC Chandler said at one point that something like this had been tried at Google, but the extra per-TU time didn't pay off in the link time, or something like that.) Also, strings are only a big bottleneck (last I checked the profile) in debug info builds, and we already know that that's just a fundamental problem due to split DWARF not being ubiquitous for ELF at this point in time.

> That is just an example. There are a lot of possible improvements we can
> make for a completely new file format.

I guess one thing that would be good to clarify is the design goal of this wasm linker (and how interested you are in changing the format at this point).

If I had to guess, I would guess that ideally the wasm linker would be a drop-in replacement for a standard native linker, so that changes to user build systems are minimal. E.g. the linker invocation would want to stay `ld main.o libfoo.a libbar.a ...` just like in the corresponding native link (although how does wasm handle ar? For LLVM bitcode in LTO, that's always a stumbling block). Sam, could you clarify?

If symbol resolution, "input section" selection, and archive semantics aren't close to those of native linkers, then it would be difficult to port existing C/C++ apps (every app has e.g. an __attribute__((weak)) in there somewhere, or a C++ inline function so comdat/linkonce is needed, etc.).

-- Sean Silva
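To make the weak-symbol point in Sean's message concrete, here is a minimal sketch of the usual native-linker rule that a strong definition overrides a weak one. It is a simplification written for this thread, not lld's SymbolTable code: the `Symbol` class, `resolve`, `link` and the file names are hypothetical, and archives, comdat groups and undefined weak references are left out.

```python
from dataclasses import dataclass


@dataclass
class Symbol:
    name: str
    file: str          # input file the symbol came from
    defined: bool      # False for an undefined reference
    weak: bool = False


class DuplicateSymbol(Exception):
    pass


def resolve(existing, new):
    """Pick the winner between the current table entry and a new symbol."""
    if existing is None:
        return new
    if not new.defined:
        return existing        # a reference never replaces an entry
    if not existing.defined:
        return new             # a definition satisfies a reference
    if existing.weak and not new.weak:
        return new             # strong overrides weak
    if new.weak:
        return existing        # a later weak definition is ignored
    raise DuplicateSymbol(
        f"{new.name} defined in both {existing.file} and {new.file}")


def link(symbols):
    """Feed symbols through `resolve` in command-line order."""
    table = {}
    for sym in symbols:
        table[sym.name] = resolve(table.get(sym.name), sym)
    return table


# Hypothetical inputs: main.o references foo, one object defines it
# weakly, a later one defines it strongly.
inputs = [
    Symbol("foo", "main.o", defined=False),
    Symbol("foo", "libfoo.o", defined=True, weak=True),
    Symbol("foo", "libbar.o", defined=True),
]
table = link(inputs)
```

With these inputs the strong definition from `libbar.o` ends up in the table, and adding a second strong definition of `foo` raises `DuplicateSymbol`, which is the behaviour existing C/C++ code tends to depend on.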
Sam Clegg via llvm-dev
2017-Jul-01 17:32 UTC
[llvm-dev] [LLD] Adding WebAssembly support to lld
On Fri, Jun 30, 2017 at 7:10 PM, Rui Ueyama <ruiu at google.com> wrote:

> Hi Sam,
>
> First, I want to know the symbol resolution semantics. I can imagine that
> they are not set in stone yet, and that you guys are still discussing what
> would be the best semantics or file format for the linkable wasm object
> file. I think that by knowing more about the format and semantics, we can
> give you guys valuable feedback, as we've been actively working on the
> linker for a few years now. (And we know of a lot of issues in existing
> object file formats, so I don't want you guys to copy those failures.)

I've been aiming to match the semantics of native linkers in order to minimize porting efforts and allow existing software and build systems to target WebAssembly. Specifically, I'm trying to match what the ELF port of lld does in terms of loading archives, objects and symbols, including the resolution of weak symbols. Indeed, you can see that I borrow a large amount of the SymbolTable code. Like the other lld ports, it does not use the left-to-right-only strategy of symbol resolution.

> As Sean pointed out, this looks very different from ELF or COFF in object
> construction. Does this mean the linker has to reconstruct everything? The
> ELF and COFF linkers are multi-threaded, as each thread can work on
> different sections simultaneously when writing to an output file. I wonder
> if that's still doable in wasm.

It should be doable for the data and code sections. For the other section types (of which there are at least 8), we will most likely be forced to reconstruct them fully, and I'm not sure if that will be parallelizable in the same way. I'm hoping that doing code, data and relocations in parallel will still be a big win.

> Also, I wonder if there's a way to parallelize symbol resolution. Since
> there are no linkable wasm programs yet, we can take a radical approach.
>
> Have you ever considered making the file format more efficient than ELF or
> COFF so that object files link really fast? For example, in order to avoid
> a lot of (possibly very long, due to name mangling) symbols, you could
> store SHA hashes or something so that linkers are able to handle symbols
> as an array of fixed-size elements.
>
> That is just an example. There are a lot of possible improvements we can
> make for a completely new file format.

At this point I've mostly been focused on producing a working linker that matches the semantics of existing native linkers. My short-term goal is to provide something that can replace the current bitcode linking solution. Perhaps I'm aiming too low :)
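The parallel writing of code and data discussed above can be sketched as follows, assuming the layout is decided first so that every input chunk has a fixed, non-overlapping offset in the output. This is only an illustration of the structure, not the patch's Writer: `layout` and `write_section` are hypothetical names, relocation patching is omitted, and Python threads show the shape of the work rather than a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def layout(chunks):
    """Assign each input chunk its offset in the output section.

    Returns the list of offsets and the total output size.
    """
    offsets = []
    position = 0
    for chunk in chunks:
        offsets.append(position)
        position += len(chunk)
    return offsets, position


def write_section(chunks, workers=4):
    """Copy every chunk into a preallocated buffer, one task per chunk.

    The tasks never touch the same bytes, because the offsets were
    fixed up front by `layout`.
    """
    offsets, total = layout(chunks)
    out = bytearray(total)

    def copy(index):
        start = offsets[index]
        out[start:start + len(chunks[index])] = chunks[index]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() forces completion and re-raises any worker exception.
        list(pool.map(copy, range(len(chunks))))
    return bytes(out)


# Hypothetical function bodies taken from three input objects.
section = write_section([b"\x01\x02", b"", b"\x03\x04\x05"])
```

Sections whose contents have to be rebuilt from scratch (the types table, for instance) do not fit this pattern as directly, which matches the concern raised in the reply.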