Ömer Sinan Ağacan via llvm-dev
2020-Aug-10 15:04 UTC
[llvm-dev] (wasm-ld) Any fundamental problems with linking a shared wasm library statically?
wasm-ld is currently unable to link a shared wasm library (generated with `wasm-ld --shared`) with .o files and produce a working executable.

I'm curious whether there's a fundamental reason for this, or whether it's simply something that wasn't needed and could be implemented if needed.

I think this could be done by:

1. Resolving "GOT.mem" and "GOT.func" imports and replacing the imports with globals holding the memory addresses/table indices of the imported symbols.

2. Applying dynamic relocations ($__wasm_apply_relocs) statically, or dynamically in a linker-generated start function.

3. (I think the dylink section is not needed.)

For (1), for a GOT.func import, we can find the function's table index (adding it to the table if it's not already there) and replace the import with a constant holding that index. For GOT.mem imports it's similar: we find the location of the symbol relative to the module's memory base, then replace the import with `memory_base + offset`.

For (2), in general it's not possible to run an arbitrary wasm function at link time, but I think relocation functions are basically just a list of statements (in a C-like language) of the form `memory_base + <offset> = <address of symbol> + <constant>`. The RHS can also be `<table base> + constant`. So I think it could be run at link time.

Alternatively, I think we could run $__wasm_call_ctors as the first thing in a linker-generated main function, after updating memory_base and table_base of the imported module, and it would apply the relocations.

Would this make sense? I'm new to wasm and not too experienced in linking (just a happy user of ld.lld and gold), so it's possible that I'm missing something and this is not going to work.

Thanks,

Ömer
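To make the two steps proposed above concrete, here is a minimal Python sketch of what a linker might do if the final layout were fixed at link time. It is only an illustration under stated assumptions: `SharedModule`, `FunctionTable`, `Reloc`, `resolve_got_import` and `apply_relocs` are hypothetical names invented for this example, not wasm-ld data structures or APIs, and the relocation model is the simplified "list of stores" described in the message.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SharedModule:
    """Hypothetical, simplified view of a shared wasm module."""
    memory_base: int                 # address chosen for the module's data
    table_base: int                  # first table slot chosen for the module
    data_symbols: dict[str, int]     # data symbol -> offset from memory_base
    func_symbols: set[str]           # functions the module defines


@dataclass
class FunctionTable:
    """Hypothetical function table; slots hold function names."""
    slots: list[str] = field(default_factory=list)

    def index_of(self, name: str) -> int:
        # Reuse an existing slot, or append the function if it is missing.
        if name not in self.slots:
            self.slots.append(name)
        return self.slots.index(name)


def resolve_got_import(module: SharedModule, table: FunctionTable,
                       kind: str, symbol: str) -> int:
    """Return the constant that could replace a GOT.func / GOT.mem import."""
    if kind == "GOT.func":
        if symbol not in module.func_symbols:
            raise KeyError(f"undefined function: {symbol}")
        return table.index_of(symbol)
    if kind == "GOT.mem":
        return module.memory_base + module.data_symbols[symbol]
    raise ValueError(f"unknown import kind: {kind}")


@dataclass
class Reloc:
    """One statement: memory_base + offset = <base value> + addend."""
    offset: int            # store location, relative to memory_base
    symbol: str | None     # data symbol, or None for a table_base-relative value
    addend: int


def apply_relocs(memory: bytearray, module: SharedModule,
                 relocs: list[Reloc]) -> None:
    """Evaluate each relocation statement and store a 32-bit result."""
    for r in relocs:
        if r.symbol is None:
            value = module.table_base + r.addend
        else:
            value = module.memory_base + module.data_symbols[r.symbol] + r.addend
        addr = module.memory_base + r.offset
        memory[addr:addr + 4] = value.to_bytes(4, "little")
```

With `memory_base = 1024` and a data symbol at offset 16, a `GOT.mem` import for that symbol would resolve to the constant 1040. The sketch deliberately ignores everything a real linker also has to handle (symbol visibility, weak symbols, 64-bit memories, and so on).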
Sam Clegg via llvm-dev
2020-Aug-10 22:08 UTC
[llvm-dev] (wasm-ld) Any fundamental problems with linking a shared wasm library statically?
Hi Omer,

Dynamic linking support in wasm-ld is still a work in progress. When you used the `-shared` flag to link your shared libraries you should have seen a warning like `wasm-ld: warning: creating shared libraries, with -shared, is not yet stable` (at least with ToT llvm).

The dynamic linking support that does exist today is mostly there to support the emscripten compiler. There is some information on the current status here: https://github.com/WebAssembly/tool-conventions/blob/master/DynamicLinking.md#implementation-status .

I am curious what your use case is. From your description of your proposed solution it sounds like you want to be able to statically link a shared library with object files to produce what is essentially a statically linked executable. Is that right? In that case why not link with the `.a` version of the library? If you want to build a static executable you can't generally do so if you have `.so` files as input, right? (Not with lld or GNU ld anyway.) I could be misunderstanding what you are asking for here...

The current ABI imports GOT.func and GOT.mem from the environment and relies on a dynamic linker being present in the embedder (in the case of emscripten this is implemented in JavaScript). The actual table and memory offsets cannot be known statically by wasm-ld, which is why those imports are generated.

I am keen to move the dynamic linking story forward for wasm in llvm, and there are plans afoot for a new stable ABI based on a new WebAssembly proposal: https://github.com/WebAssembly/module-linking/.

cheers,
sam
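The point about offsets not being known statically can be illustrated with a toy loader. This is a sketch under loud assumptions: `Module`, `ToyLoader`, `load` and `resolve_got_mem` are hypothetical names made up for this example and do not reflect how emscripten's JavaScript dynamic linker is actually implemented. The loader simply hands out memory in load order, so the address behind a `GOT.mem` import depends on what was loaded before.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Module:
    """Hypothetical shared module description."""
    name: str
    data_size: int                # bytes of data the module needs
    exports: dict[str, int]       # data symbol -> offset within module data
    got_mem_imports: list[str]    # symbols imported through GOT.mem


class ToyLoader:
    """Hypothetical embedder-side dynamic linker (illustration only)."""

    def __init__(self, first_free_address: int = 1024) -> None:
        self.next_free = first_free_address
        self.symbol_addresses: dict[str, int] = {}

    def load(self, module: Module) -> int:
        # memory_base is only decided here, at load time.
        memory_base = self.next_free
        self.next_free += module.data_size
        for symbol, offset in module.exports.items():
            self.symbol_addresses[symbol] = memory_base + offset
        return memory_base

    def resolve_got_mem(self, module: Module) -> dict[str, int]:
        # Values the embedder would supply for the module's GOT.mem imports.
        return {s: self.symbol_addresses[s] for s in module.got_mem_imports}
```

Loading the same two modules in a different order gives the imported symbol a different absolute address, which is why the shared-library output leaves it as an import. If the set of modules and their order were fixed up front, as in the static use case being discussed, the same computation would yield constants.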
Ömer Sinan Ağacan via llvm-dev
2020-Aug-17 15:04 UTC
[llvm-dev] (wasm-ld) Any fundamental problems with linking a shared wasm library statically?
Hi Sam,

Thanks for your response, and sorry for my late reply.

> From your description of your proposed solution it sounds like you want to be
> able to statically link a shared library with object files to produce what is
> essentially a statically linked executable. Is that right?

Correct.

> In that case why not linking with the `.a` version of the library?

The problem is that I have a Wasm generator that doesn't use LLVM, and generating statically-linkable .o/.a files that wasm-ld can link is a lot of work, with all the extra sections, relocations, etc. Similarly, implementing a linker that understands the wasm I generate and can link it with .o/.a files generated by LLVM is also a lot of work.

So instead I generate the code almost as if I'm generating a single-file executable .wasm, and any C and Rust code that I want to link with it is compiled to a shared library, which is much easier to link. So far it works fine, but sometimes changing the C/Rust code causes changes in the generated shared .wasm that break my linker (hence my original question regarding GOT.func/GOT.mem imports).

One obvious idea here is to use LLVM in my code generator, which would generate .o/.a files that wasm-ld can link. I think the main difficulty with that is that it's actually much easier to generate Wasm than to generate LLVM IR. It seems weird to generate a lower-level language (LLVM IR), then compile that to a higher-level language (Wasm). (I should mention I don't have a lot of experience generating LLVM IR, so I may be wrong about it being lower level than Wasm.)

> The current ABI imports GOT.func and GOT.mem from the environment and relies
> on a dynamic linker being present in the embedder (in the case of emscripten
> this is implemented JavaScript). The actual table and memory offset cannot
> be known statically by wasm-ld which is why those imports are generated.

Is this still the case if I assume no dynamic linking (dload etc., or using Wasm-specific host functions)?

My impression was that I can run functions like $__wasm_apply_relocs at runtime and it should work fine. Things would break when loading new modules at runtime, but I never do that and can assume that it won't be done. My implementation works fine currently (or at least I'm not aware of any bugs).

Thanks,

Ömer