LLVM already implements its own version of almost all of binutils. The exceptions are objcopy and strip. This is a proposal to implement an LLVM version of objcopy/strip to complete LLVM's binutils.

Several projects only use GNU binutils because of objcopy/strip. LLVM itself uses objcopy, in fact. Chromium and Fuchsia currently use objcopy as well. If you want to distribute your build tools, this is a problem due to licensing. It's also a bit of a blemish on LLVM, because LLVM could be made more self-sufficient if there were an LLVM version of objcopy. Additionally, Chromium is one of the popular benchmarks for LLVM, so it would be nice if Chromium didn't have to use binutils. Using [elftoolchain](https://sourceforge.net/p/elftoolchain/wiki/Home/) solves the licensing issue for Fuchsia, but it is ELF-specific and only solves the issue for Fuchsia. I propose implementing llvm-objcopy as a minimum viable replacement for objcopy.

I've gone through the sources of LLVM, Clang, Chromium, and Fuchsia to try to find the major use cases of objcopy. Here is a list of the use cases I have found and which projects use them. This list includes some use cases not found in these 4 projects.

1. Use Case: Stripping debug information of an executable to a file
   Who uses it: LLVM, Fuchsia, Chromium

   ```sh
   objcopy --only-keep-debug foo foo.debug
   objcopy --strip-debug foo foo
   ```

   [Example use](https://github.com/llvm-mirror/llvm/blob/cd789d8cfe12aa374e66eafc748f4fc06e149ca7/cmake/modules/AddLLVM.cmake)

   When is it useful:
   This reduces the size of the file for distribution while maintaining the debug information in a file for later use. Anyone distributing an executable in any way could benefit from this.

2. Use Case: Stripping debug information of a relocatable object to a file
   Who uses it: None of the 4 projects considered

   ```sh
   objcopy --only-keep-debug foo.o foo.debug
   objcopy --strip-debug foo.o foo.o
   ```

   When is it useful:
   When distributing an SDK in the form of an archive, it would be nice to strip this information. This allows debug information to be distributed separately.

3. Use Case: Stripping debug information of a shared library to a file
   Who uses it: None of the 4 projects

   ```sh
   objcopy --only-keep-debug foo.so foo.debug
   objcopy --strip-debug foo.so foo.so
   ```

   When is it useful:
   Same benefits as the previous case. If you want to distribute a library, this option allows you to distribute a smaller binary while maintaining the ability to debug.

4. Use Case: Stripping an executable
   Who uses it: None of the 4 projects

   ```sh
   objcopy --strip-all foo foo
   ```

   When is it useful:
   Any time an executable is being distributed and there is no reason to keep debugging information. This makes the executable smaller than simply stripping debug info and doesn't produce an extra file.

5. Use Case: "Complete stripping" an executable
   Who uses it: None of the 4 projects

   ```sh
   eu-strip --strip-sections foo
   ```

   When is it useful:
   This is an extreme form of stripping that even strips the section headers, since they are not needed for loading. It is useful in the same contexts as stripping, but some tools and dynamic linkers may be confused by it. It is possibly only valid on ELF, unlike general stripping, which is a valid option on multiple platforms.

6. Use Case: DWARF fission
   Who uses it: Clang, Fuchsia, Chromium

   ```sh
   objcopy --extract-dwo foo foo.debug
   objcopy --strip-dwo foo foo
   ```

   [Example use 1](https://github.com/llvm-mirror/clang/blob/3efd04e48004628cfaffead00ecb1c206b0b6cb2/lib/Driver/ToolChains/CommonArgs.cpp)
   [Example use 2](https://github.com/llvm-mirror/clang/blob/a0badfbffbee71c2c757d580fc852d2124dadc5a/test/Driver/split-debug.s)

   When is it useful:
   DWARF fission can be used to speed up large builds. In some cases builds can be too large to be handled, and DWARF fission makes this manageable. DWARF fission is useful in almost any project of sufficient size.

7. Use Case: Converting an executable to binary
   Who uses it: Fuchsia

   ```sh
   objcopy -O binary magenta.elf magenta.bin
   ```

   [Example use](https://fuchsia.googlesource.com/magenta/+/master/make/build.mk#20)

   When is it useful:
   For kernels and embedded applications that need just the raw segments.

8. Use Case: Adding a gdb index
   Who uses it: Chromium

   ```sh
   gdb -batch foo -ex "save gdb-index dir" -ex quit
   objcopy --add-section .gdb_index="dir/foo.gdb-index" \
           --set-section-flags .gdb_index=readonly foo foo
   ```

   [Example use](https://cs.chromium.org/chromium/src/build/gdb-add-index?type=cs&q=objcopy&l=71)

   When is it useful:
   Adding a gdb index reduces startup time for debugging an application. Any sufficiently large program with a sufficiently large amount of debug information can potentially benefit from this.

9. Use Case: Converting between formats
   Who uses it: Fuchsia (only in the Magenta GCC build)

   ```sh
   objcopy --target=pei-x86-64 magenta.elf magenta.pe
   ```

   [Example use](https://fuchsia.googlesource.com/magenta/+/master/bootloader/build.mk#97)

   When is it useful:
   This is primarily useful when you can't directly target a needed format.

10. Use Case: Removing symbols not needed for relocation
    Who uses it: Chromium

    ```sh
    objcopy --strip-unneeded foo foo
    ```

    [Example use](https://cs.chromium.org/chromium/src/third_party/libevdev/src/common.mk?type=cs&q=objcopy&l=397)

    When is it useful:
    This is useful when shipping an SDK or some relocatable binaries.

11. Use Case: Removing local symbols
    Who uses it: LLVM

    ```sh
    objcopy --discard-all foo foo
    ```

    [Example use](https://github.com/llvm-mirror/llvm/blob/cd789d8cfe12aa374e66eafc748f4fc06e149ca7/cmake/modules/AddLLVM.cmake)
    (hidden in the definition of "strip_command", which uses strip instead of objcopy and -x instead of --discard-all)

    When is it useful:
    Any time you don't need locals for debugging, this can be useful.

12. Use Case: Removing a specific unwanted section
    Who uses it: LLVM

    ```sh
    objcopy --remove-section=.debug_aranges foo foo
    ```

    [Example use](https://github.com/llvm-mirror/llvm/blob/93e6e5414ded14bcbb233baaaa5567132fee9a0c/test/DebugInfo/Inputs/fission-ranges.cc)

    When is it useful:
    This is useful when you know that you have an unwanted section that isn't removed by one of the other stripping options. It can also be used to remove an existing section so that it can be replaced by a new section.

We would like to build this up incrementally by solving specific use cases as they come up. To start with, we would like to tackle the use cases important to us. We primarily care about fully linked executables and not relocatable files. I plan to implement conversion from ELF to binary first. After that, I plan on implementing stripping for ELF executables.
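As a rough illustration of what use case 7 (ELF to binary) involves, here is a minimal Python sketch. It is not how objcopy or any LLVM tool is implemented. It assumes a little-endian ELF64 input and simply lays out the file contents of the `PT_LOAD` segments by physical address, padding gaps with a fill byte. The function name `elf_to_binary` and the `fill` parameter are made up for this example, and real tools handle many more cases.

```python
import struct

PT_LOAD = 1  # program header type for loadable segments


def elf_to_binary(elf: bytes, fill: int = 0) -> bytes:
    """Flatten the PT_LOAD segments of a little-endian ELF64 image.

    Simplified on purpose: segments are placed by physical address
    (p_paddr) relative to the lowest one, and gaps are padded with `fill`.
    """
    if elf[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if elf[4] != 2 or elf[5] != 1:  # EI_CLASS == ELFCLASS64, EI_DATA == ELFDATA2LSB
        raise ValueError("only little-endian ELF64 is handled in this sketch")

    # ELF64 header fields: e_phoff at 0x20, e_phentsize at 0x36, e_phnum at 0x38.
    (e_phoff,) = struct.unpack_from("<Q", elf, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 0x36)

    segments = []
    for i in range(e_phnum):
        # Elf64_Phdr: p_type, p_flags, p_offset, p_vaddr, p_paddr,
        #             p_filesz, p_memsz, p_align
        (p_type, _flags, p_offset, _vaddr, p_paddr,
         p_filesz, _memsz, _align) = struct.unpack_from(
            "<IIQQQQQQ", elf, e_phoff + i * e_phentsize)
        if p_type == PT_LOAD and p_filesz > 0:
            segments.append((p_paddr, elf[p_offset:p_offset + p_filesz]))

    if not segments:
        return b""

    base = min(addr for addr, _ in segments)
    end = max(addr + len(data) for addr, data in segments)
    image = bytearray([fill]) * (end - base)
    for addr, data in segments:
        start = addr - base
        image[start:start + len(data)] = data
    return bytes(image)
```

Calling `elf_to_binary(open("magenta.elf", "rb").read())` would then return the flat image as bytes; the file name is just the one from the example command above.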
I've thought about building an llvm-objcopy for a long time, and the approach you've outlined is the same one that I would have suggested (analyzing a set of critical use cases, triaging them, and then incrementally building them). In other words, this approach SGTM. I've CC'ed a couple of other people who might have some comments (but I've talked with them about objcopy before in one way or another, and I don't get the feeling that they would disagree with the overall approach).

A couple of specific suggestions about the more concrete code design.

IIRC, when I looked at GNU objcopy I saw why it was called objcopy: it basically looked like it was originally a program that copied an object file without modification. Then command line argument parsing was added, and tons of flags appeared that triggered a mess of random `if` statements that would modify the copying process. I don't think we want to have an implementation like that, especially since we don't have anything even remotely similar to the "writing" side of the BFD library (libObject's object-format-agnostic interface is only for reading).

1. It seems that (besides the format conversion operations) everything is ELF. It will dramatically simplify the implementation to make it ELF-only at first. I would even recommend against using libObject's object-format-agnostic reading implementation.

   One of the things we have learned while working on LLD is that abstracting across object formats is very difficult to get right. There are just too many subtle semantic differences that penetrate very deep into the program. As an example, LLD/ELF (which is ELF-only) and LLD/COFF (which is COFF-only) are each about 1/3 (or less) the size of the previous linker design that attempted to handle all 3 formats together (MachO is the third format). They are also much more complete than the previous design was before we switched to the new design; normalizing for the difference in features, 1/6 the size is probably more accurate.

   Unless you also have as a goal (I don't think you do) to make progress towards an LLVM-based analog of the GNU BFD library as you work on objcopy, sticking to object-format-specific code is probably preferable. It's *a lot* easier to look at format-specific implementations and see what can be shared than to make a mistake about the abstractions used across object formats and then have to untangle the incorrect abstraction.

2. I would really suggest making sure that there is a very, very clear separation between the objcopy-compatible command line parsing and the internals that actually do the work. In fact, it may be reasonable to have the separation be so profound that the tool is called `llvm-objtool` (with subcommands like `llvm-objtool formatconvert ...`) and have the objcopy-compatible command line parsing essentially dispatch into one of them (with such parsing being triggered by looking at argv[0]).

   Regardless of whether it makes sense to go that far, it's best to err on the side of having separate implementations even if it seems to require duplicating some code. For example, if you have the same for loop in two different "subcommands", it may be best to make an iterator encapsulating it (or a helper function that takes a lambda) rather than adding a bool parameter to the function containing that loop.

3. (This is just a "keep an eye out" type of thing. No specific suggestion.) As the implementation of objcopy progresses, especially if the object writing code is incrementally factored out into shared routines (as we try to avoid one huge writing routine taking 17 arguments controlling what it does), we may want to look at it together with other object file writing code in the LLVM project (LLD, llvm-dwp, MC) to see what can be unified. llvm-dwp is probably the most similar and the most likely to be able to share code.
-- Sean Silva

On Thu, Jun 1, 2017 at 5:21 PM, Jake Ehrlich via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> [...]
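The separation suggested in point 2 of the reply above can be sketched roughly as follows. This is only an illustration in Python, not a proposed design for the real tool. The `Section` type, the prefix test for debug sections, the subcommand names (`strip-debug`, `remove-section`), and the `UnsupportedOption` error are all hypothetical. The point is the shape: each operation is its own small function, the shared loop lives in one helper that takes a callback instead of growing bool parameters, and the objcopy-compatible flag parsing is a thin front end selected by looking at `argv[0]`.

```python
import os
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Section:
    name: str
    data: bytes


Action = Callable[[List[Section]], List[Section]]


def filter_sections(sections: List[Section],
                    keep: Callable[[Section], bool]) -> List[Section]:
    """Shared helper: the loop is written once, the policy is passed in."""
    return [s for s in sections if keep(s)]


# Back ends: one small function per operation, no flag-driven branching.
def strip_debug(sections: List[Section]) -> List[Section]:
    # Assumption for this sketch: debug sections are those named ".debug*".
    return filter_sections(sections, lambda s: not s.name.startswith(".debug"))


def remove_section(sections: List[Section], name: str) -> List[Section]:
    return filter_sections(sections, lambda s: s.name != name)


class UnsupportedOption(Exception):
    """Raised for anything the sketch does not implement."""


def parse_objcopy_args(args: List[str]) -> List[Action]:
    """objcopy-compatible front end: turns flags into back-end actions."""
    actions: List[Action] = []
    for arg in args:
        if arg == "--strip-debug":
            actions.append(strip_debug)
        elif arg.startswith("--remove-section="):
            name = arg.split("=", 1)[1]
            actions.append(lambda secs, n=name: remove_section(secs, n))
        else:
            raise UnsupportedOption(f"option not supported: {arg}")
    return actions


def parse_objtool_args(args: List[str]) -> List[Action]:
    """Hypothetical subcommand front end (names invented for this sketch)."""
    if args == ["strip-debug"]:
        return [strip_debug]
    if len(args) == 2 and args[0] == "remove-section":
        name = args[1]
        return [lambda secs: remove_section(secs, name)]
    raise UnsupportedOption(f"unknown subcommand: {' '.join(args)}")


def run(argv: List[str], sections: List[Section]) -> List[Section]:
    """Pick the front end from argv[0], then apply the actions in order."""
    tool = os.path.basename(argv[0])
    if tool.endswith("objcopy"):
        actions = parse_objcopy_args(argv[1:])
    else:
        actions = parse_objtool_args(argv[1:])
    for action in actions:
        sections = action(sections)
    return sections
```

In this arrangement, adding a new operation means adding one function and one parser entry per front end, and neither front end needs to know how the back ends do their work.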
I think use case #7, converting an executable to a ROMable/flashable binary image, is important to a large number of people who are fairly invisible in the LLVM community. All the more so because LLVM is the easiest compiler to get up and running for a new or custom ISA, most of which are used in embedded applications, not in general-purpose PCs.

Of course these people are not hurting too badly, because binutils exists, but it would be good not to need it.
Yeah, this is something that people toss around from time to time. Certainly, if it's useful enough to you to motivate the work/time/effort, great!

Seconding your comments and Sean's: implement features as needed, with a binutils-compatible interface/behavior, but it doesn't have to have all the features. It could start out with clear "this isn't supported" errors for all the features, and features could be filled out as they're needed by folks with the motivation to implement them.

The LLVM use of objcopy for DWARF Fission is probably a marginal one. It'd be nice to remove that dependency and produce the separate files directly, at least when using the integrated assembler, but that's a fair bit more work.

Having an LLVM objcopy would provide the opportunity to address a couple of the issues that exist already:

1) Windows support (I'm not sure what that really looks like, or whether it would actually work with COFF files, etc., but I remember David Majnemer looking at this at one point)
2) Object file size regression (the LLVM object size optimization of having the two object file string tables, strtab and shstrtab, in one section instead of two saves a bunch of file size, but binutils objcopy doesn't know that trick, so turning on fission undoes that improvement)
On 1 June 2017 at 20:21, Jake Ehrlich via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> LLVM already implements its own version of almost all of binutils. The
> exceptions to this rule are objcopy and strip. This is a proposal to
> implement an llvm version of objcopy/strip to complete llvm’s binutils.

A bit of info from the FreeBSD perspective. In FreeBSD we use ELF Tool Chain versions of most binutils (addr2line, c++filt, objcopy, nm, size, strings, strip, readelf) and bespoke versions of other tools (ar, elfdump, ranlib). The exceptions are as (for which we have no replacement), ld.bfd (we've switched to LLD for arm64 and are working on other architectures), and objdump (investigating llvm-objdump, although it still has some limitations).

That said, I would very much like to see LLVM equivalents for all of the tools so that we can compare and benchmark against other implementations, and so that we have an alternative available if it becomes necessary. I would be happy for llvm-objcopy to exist.

> 1. Use Case: Stripping debug information of an executable to a file
> 6. Use Case: DWARF fission
> 8. Use Case: Adding a gdb index

It seems like these ought to just be done by the linker. We don't yet use (6) in FreeBSD because our toolchain is still using some ancient components on most architectures but it is something I very much wish to start doing.

> 2. Use Case: Stripping debug information of a relocatable object to a file
> Who uses it: None of the 4 projects considered
> 5. Use Case: “Complete stripping” an executable
> Who uses it: None of the 4 projects
> ```sh
> eu-strip --strip-sections foo
> ```

I'd be surprised to find this being used.

> 3. Use Case: Stripping debug information of a shared library to a file
> 4. Use Case: Stripping an executable
> 7. Use Case: Converting an executable to binary
> 9. Use Case: Converting between formats [ELF->PE]
> 12. Use Case: Removing a specific unwanted section

We use these cases in FreeBSD.

One additional use case for you: converting from a binary to an ELF object file
```
objcopy -I binary -O elf64-x86-64 foo.bin foo.o
```
This is sometimes used for embedding binary files for use by drivers and such.
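For readers who have not used this mode: GNU objcopy exposes the embedded data through `_binary_*_start`, `_binary_*_end` and `_binary_*_size` symbols whose names are derived from the input file name. The sketch below only computes those names; `binary_symbol` is a hypothetical helper introduced for illustration, not part of objcopy or of the proposal.

```sh
#!/bin/sh
# Derive the symbol names GNU objcopy generates for "-I binary" input.
# objcopy builds them from the input file name as given on the command
# line, replacing every non-alphanumeric character with an underscore.
# binary_symbol is a hypothetical helper name used only for illustration.
binary_symbol() {
    file=$1    # input file name exactly as passed to objcopy
    suffix=$2  # one of: start, end, size
    mangled=$(printf '%s' "$file" | tr -c 'A-Za-z0-9' '_')
    printf '_binary_%s_%s\n' "$mangled" "$suffix"
}

binary_symbol foo.bin start   # prints _binary_foo_bin_start
binary_symbol foo.bin end     # prints _binary_foo_bin_end
```

Code that links against `foo.o` would then refer to these symbols to locate the embedded bytes.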
Having llvm-objcopy would be great!

A really small version of it was already implemented:
https://github.com/RodAtDISA/llvm-objcopy
https://github.com/tpimh/llvm-objcopy (this fork can be compiled with LLVM master)

The functions of this implementation of objcopy are very, uhm... limited.

I think this topic has already come up several times on this mailing list, so we can probably get some information from the previous discussions.

It would be great to have all the binutils in LLVM, so no external toolchain is needed.

Regards,
Dmitry
Hello Jake,

I don’t have any experience with objcopy but have some with the classic UNIX strip, especially on darwin for Mach-O. I even prototyped an llvm-strip and got it working well enough to do the default stripping on a hello world executable and get a “bit for bit” match, for a darwin Mach-O file, against the existing strip(1) tool.

As Sean points out, there is nothing currently in llvm’s libObject code that writes a binary. I do agree with Sean that, to get this correct, it is best not to try to make a unified bit of code that writes the three formats llvm currently cares about (ELF, COFF and Mach-O).

Also, my experience suggests that creating tools that write “modified binaries” from fully linked binaries is quite different from writing binaries from an assembler or linker, as you have very limited degrees of freedom when slicing and dicing a fully linked file if you still want a correctly formed file. That is, you usually can’t change any addresses, etc., and you have to update all the references to things like indexes into the symbol table, string table, etc. from other tables in the object file. So while it might be good to “keep an eye out” for what could be shared, if you push too hard on that I think your design may not turn out all that clean.

That said, I do think there is value in sharing the “object file reader” code so that all the error checking can be in one place. While I’m not a big fan of libObject, it did prove workable for reading in object files in my llvm-strip prototype. But I did as Sean suggested and went with a totally object format dependent bit of code to write a modified linked object. I did this a bit cleaner than what I did with the darwin cctools open source code I wrote many decades ago. But I feel it is best to have an object format dependent bit of code to put back together the modified parts of a linked Mach-O file, as that is easy to get wrong and a pain to debug when one does get it wrong. My thinking was to have a bit of library code that the darwin tools like install_name_tool(1), bitcode_strip(1), etc. could share and use to reconstruct their modified fully linked binaries.

My thoughts,
Kev

On Jun 1, 2017, at 6:28 PM, Sean Silva via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> I've thought about building an llvm-objcopy for a long time and the approach you've outlined is the same one that I would have suggested (analyzing a set of critical use cases, triaging them, and then incrementally building them). In other words, this approach SGTM. I've CC'ed a couple other people who might have some comments (but I've talked with them about objcopy before in one way or another and I don't get the feeling that they would disagree with the overall approach).
>
> A couple specific suggestions about the more concrete code design.
>
> IIRC, when I looked at GNU objcopy I saw why it was called objcopy: it basically looked like it was originally a program that copied an object file without modification. Then command line argument parsing was added and tons of flags appeared that triggered a mess of random `if` statements that would modify the copying process. I don't think we want to have an implementation like that, especially since we don't have anything even remotely similar to the "writing" side of the BFD library (libObject's object format agnostic interface is only for reading).
>
> 1. It seems that (besides the format conversion operations) everything is ELF. It will dramatically simplify the implementation to make it ELF-only at first. I would even recommend against using libObject's object-format agnostic reading implementation. One of the things we have learned while working on LLD is that abstracting across object formats is very difficult to get right. There are just too many subtle semantic differences that penetrate very deep into the program. As an example, LLD/ELF (which is ELF-only) and LLD/COFF (which is COFF-only) are each about 1/3 (or less) the size of the previous linker design that attempted to handle all 3 formats (MachO is the third format) together (and they are actually much more complete than the previous design was before we switched to the new design; normalizing for the difference in features, 1/6 the size is probably more accurate). Unless you also have as a goal (I don't think you do) to make progress towards an LLVM-based analog of the GNU BFD library as you work on objcopy, sticking to object-format specific code is probably preferable. It's *a lot* easier to look at format-specific implementations and see what can be shared vs making a mistake about the abstractions used across object formats and require untangling the incorrect abstraction.
>
> 2. I would really suggest making sure that there is a very, very clear separation between the objcopy-compatible command line parsing and the internals that actually do the work. In fact, it may be reasonable to have the separation be so profound that the tool is called `llvm-objtool` (with subcommands like `llvm-objtool formatconvert ...`) and have the objcopy-compatible command line parsing essentially dispatch into one of them (with such parsing being triggered by looking at argv[0]). Regardless of whether it makes sense to go that far, it's best to err on the side of having separate implementations even if it seems to require duplicating some code. For example, if you have the same for loop in two different "subcommands", it may be best to make an iterator encapsulating it (or a helper function that takes a lambda) rather than adding a bool parameter to the function containing that loop.
>
> 3. (This is just a "keep an eye out" type thing. No specific suggestion.) As the implementation of objcopy progresses, especially if the object writing code is incrementally factored out between shared routines (as we try to avoid one huge writing routine taking 17 arguments controlling what it does), we may want to look at it together with other object file writing code in the LLVM project (LLD, llvm-dwp, MC) to see what can be unified. llvm-dwp is probably the most similar and most likely to be able to share code.
>
> -- Sean Silva
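As a rough illustration of the argv[0] dispatch Sean describes in his point 2, here is a minimal shell sketch. The names `llvm-objtool` and `formatconvert` come from his message; `frontend_for`, the `llvm-strip` entry and the printed strings are assumptions made up for this example, and a real tool would do this in its C++ `main` rather than in a script.

```sh
#!/bin/sh
# Sketch of argv[0]-based dispatch: one binary, several command-line
# personalities. The function name and the echoed labels are illustrative
# only; they do not correspond to any existing LLVM tool behaviour.
frontend_for() {
    case "$(basename "$1")" in
        llvm-objcopy|objcopy) echo "objcopy-compatible parser" ;;
        llvm-strip|strip)     echo "strip-compatible parser" ;;
        *)                    echo "native subcommand parser" ;;
    esac
}

frontend_for /usr/bin/llvm-objcopy   # prints objcopy-compatible parser
frontend_for ./llvm-objtool          # prints native subcommand parser
```

In a real script the argument would be `"$0"`; each compatibility front end would translate its flags into calls on the same internal operations (for example the `formatconvert` subcommand), which keeps flag parsing out of the code that rewrites the object file.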
Hi Jake,

I'm basically going to echo Sean here and don't have a lot to add to what he said. Being able to get the writing aspect into a decent place is good. You'll want to take a look at dsymutil as an approach if not necessarily what a final approach should look like. Perhaps sharing code with lld would work too.

Anyhow, feel free to send messages with what you need and any help or support as I'm particularly interested in this.

Thanks!

-eric
On Fri, Jun 2, 2017 at 2:34 PM, Ed Maste via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> One additional use case for you: converting from a binary to an ELF object file
> ```
> objcopy -I binary -O elf64-x86-64 foo.bin foo.o
> ```
> This is sometimes used for embedding binary files for use by drivers and such.

Yea, unfortunately the command-line you actually end up needing is more like:

    objcopy -I binary -Bi386:x86-64 -Oelf64-x86-64 --rename-section .data=.rodata,alloc,load,readonly,data,contents --add-section .note.GNU-stack=/dev/null

Having to manually invoke objcopy and know what to specify for the -B and -O options, and to know you need the .note.GNU-stack section, and how to move it into rodata...it's really all quite terrible. Nobody should have to do that. :(

There's also the "-b binary" flag to GNU ld (both bfd and gold). But, you typically need to do a dedicated "link" for that. You do:

    ld -r -b binary picture.jpg -o foo.o

How does ld know what output format to use here? It's gotta just choose the default, which is kinda poor...or the user needs to know how to spell an "emulation" and output format...

You could imagine trying to use -Wl to put it with the compile command, but what do you use to switch back to the normal object format?

    gcc main.c -Wl,-b -Wl,binary -Wl,picture.jpg -Wl,-b -Wl,<<something to undo binary mode?>>

So, anyways, while this is _possible_ with objcopy, it'd sure be nice if you never needed to use it for that...

(BTW, Apple ld actually has an option "-sectcreate SEGNAME SECTNAME INPUT_FILE", and the clang driver will pass it through to the linker.)
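One way a build system might hide the long objcopy invocation above is a small wrapper. The sketch below is a dry run that only prints the command; `embed_binary_cmd` is a hypothetical name, the flags are copied from the message, and the trailing input and output file arguments are an assumed completion of that command line for x86-64 only.

```sh
#!/bin/sh
# Hypothetical helper (not part of any real tool): prints the objcopy
# invocation for embedding a raw file as read-only data in an x86-64 ELF
# object. The flags are the ones quoted in the message above; the trailing
# input/output arguments are an assumed completion of that command line.
embed_binary_cmd() {
    in=$1
    out=$2
    printf '%s ' objcopy -I binary -Bi386:x86-64 -Oelf64-x86-64 \
        --rename-section .data=.rodata,alloc,load,readonly,data,contents \
        --add-section .note.GNU-stack=/dev/null
    printf '%s %s\n' "$in" "$out"
}

# Dry run: show the command instead of executing it.
embed_binary_cmd picture.jpg picture.o
```

Other targets would need different -B and -O values, which is exactly the knowledge the message argues users should not have to carry around.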