Adve, Vikram Sadanand via llvm-dev
2016-Feb-08 16:31 UTC
[llvm-dev] [cfe-dev] [RFC] Embedding Bitcode in Object Files
Steven,

How much of the code you’re upstreaming is specific to MacOS? Is it close to being usable on Linux? Some of the comments about Mach-O make it sound MacOS-specific, but you also talked about unifying it with the .llvmbc implementation previously.

FYI, two more use cases, which we are interested in:

(1) Autotuning the generated code on a target machine, using the embedded bitcode as a starting point. We have a limited prototype that searches through combinations of Clang command-line options, similar to Milepost GCC. We assume for now that we have a linked bitcode, e.g., the output of LTO; to be usable in practice, the bitcode would need to be embedded with the binary.

(2) Dynamic optimization, using the LLVM bitcode for subsets of the program.

--Vikram

// Vikram S. Adve
// Professor, Department of Computer Science
// University of Illinois at Urbana-Champaign
// http://llvm.org
//

On 2/4/16, 6:20 PM, "llvm-dev on behalf of via llvm-dev" <llvm-dev-bounces at lists.llvm.org on behalf of llvm-dev at lists.llvm.org> wrote:

>Date: Thu, 04 Feb 2016 14:59:03 -0800
>From: Steven Wu via llvm-dev <llvm-dev at lists.llvm.org>
>To: Sergei Larin <slarin at codeaurora.org>
>Cc: llvm-dev at lists.llvm.org, cfe-dev <cfe-dev at lists.llvm.org>
>Subject: Re: [llvm-dev] [cfe-dev] [RFC] Embedding Bitcode in Object Files
>Message-ID: <1C9491F7-D36F-46C6-9F4E-8DBC4CB703D8 at apple.com>
>Content-Type: text/plain; charset=utf-8
>
>Hi Sergei and Rafael
>
>Thanks for the comment!
>
>In terms of the bitcode section, my plan is to make the "__LLVM, __bitcode" section the MachO version of the ".llvmbc" section. In the latest Darwin OS, the "__LLVM" segment will not be loaded by dyld when you try to execute a binary with embedded bitcode, which is a plus for this feature.
>
>And for the command line, Sergei has the correct idea about the motivation behind this.
>We want to have enough information to recreate the same binary from the embedded bitcode (at least when compiled with the same compiler). Here is an example:
>
>$ clang -fembed-bitcode -O0 test.c -c -###
>"clang" "-cc1" (...lots of options...) "-o" "test.bc" "-x" "c" "test.c" <--- First stage
>"clang" "-cc1" "-triple" "x86_64-apple-macosx10.11.0" "-emit-obj" "-fembed-bitcode" "-O0" "-disable-llvm-optzns" "-o" "test.o" "-x" "ir" "test.bc" <--- Second stage
>
>If we record all the options from the second stage, we can recreate the same object file using the exact same command. So, yes, they are cc1 flags. I understand they are not stable, but the second stage can only have a handful of options that should be recorded, namely those that:
>1. affect codegen.
>2. are not embedded in the bitcode.
>This list should be shrinking towards zero eventually (not sure about -O0 and other optimization options). If we have to rename them before removing them from the embedding option list, we can provide an upgrade path for them.
>
>This feature is orthogonal to LTO. For my current implementation, "-flto -fembed-bitcode" is the same as "-flto". The linker needs to have the logic to handle an LLVM bitcode file (treated as LTO) and a MachO file with embedded bitcode (treated as a normal link) differently.
>
>Thanks
>
>Steven
>
>
>>On Feb 4, 2016, at 2:18 PM, Sergei Larin <slarin at codeaurora.org> wrote:
>>
>>Steven,
>>
>>I would like to echo Rafael's comments.
>>
>>My general understanding is that given an object file with embedded IR I should be able to reproduce the same object.
>>
>>Everything else should be "supporting" that objective... which might include relevant flags and transformations leading _to_ this IR and _from_ this IR to the given object code.
>>
>>Does my understanding match your overall goal?
>>
>>Thanks.
>>
>>Sergei
>>
>>---
>>Qualcomm Innovation Center, Inc.
>>is a member of Code Aurora Forum, hosted by The Linux Foundation
>>
>>>-----Original Message-----
>>>From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Rafael Espíndola via llvm-dev
>>>Sent: Thursday, February 04, 2016 4:01 PM
>>>To: Steven Wu
>>>Cc: LLVM Developers Mailing List; cfe-dev
>>>Subject: Re: [llvm-dev] [cfe-dev] [RFC] Embedding Bitcode in Object Files
>>>
>>>On 3 February 2016 at 14:01, Steven Wu via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>>Hi Peter
>>>>
>>>>It is not currently related because we started the implementation before Thin-LTO was proposed in the community, but our "__LLVM, __bitcode" section is pretty much the same as the ".llvmbc" section. Note ".llvmbc" doesn't really follow the section naming convention for MachO objects. I am hoping to unify them during the upstreaming of the implementation.
>>>
>>>That would be my main request. Seems like a nice feature, but we should have one implementation of it :-)
>>>
>>>BTW, can you explain a bit why you need things like "-O0" recorded? In case you want to go from bitcode back to object file one file at a time (no LTO)? Is that orthogonal? That is, should the command line be included in .bc files too? Which command line is included, the -cc1 one or the driver one?
>>>
>>>There was some discussion in the past about which options get run in clang if given -flto. For example, it seems likely that a more conservative inlining pass would be a good thing, so as not to remove opportunities for link-time inlining. What would happen with "-flto -fembed-bitcode"? Would the bitcode be the same as with just -flto and the object file less optimized?
>>>
>>>Cheers,
>>>Rafael
>>>_______________________________________________
>>>LLVM Developers mailing list
>>>llvm-dev at lists.llvm.org
>>>http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
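
To illustrate the "record the second-stage options" idea from Steven's quoted message, here is a minimal Python sketch. It is not how Clang implements this. The helper names (options_to_record, replay_command) and the two option sets are hypothetical and cover only the options that appear in the quoted -### output; a real cc1 command line has many more options that take values.

```python
# Hypothetical sketch: these helpers do not exist in Clang or LLVM.

# Second-stage cc1 command, copied from the quoted -### output.
SECOND_STAGE = [
    "clang", "-cc1", "-triple", "x86_64-apple-macosx10.11.0",
    "-emit-obj", "-fembed-bitcode", "-O0", "-disable-llvm-optzns",
    "-o", "test.o", "-x", "ir", "test.bc",
]

# Assumption: options whose value names a per-invocation file or input kind.
DROP_WITH_VALUE = {"-o", "-x"}
# Assumption: options that take a value and should be recorded with it.
KEEP_WITH_VALUE = {"-triple"}


def options_to_record(cmd):
    """Return the cc1 options worth recording, without the program name,
    the output file, the input language, or the positional input file."""
    recorded = []
    i = 1  # skip the program name
    while i < len(cmd):
        arg = cmd[i]
        if arg in DROP_WITH_VALUE:
            i += 2  # skip the option and its value
        elif arg in KEEP_WITH_VALUE:
            recorded.extend(cmd[i:i + 2])  # keep the option and its value
            i += 2
        elif arg.startswith("-"):
            recorded.append(arg)  # plain flag
            i += 1
        else:
            i += 1  # positional input file
    return recorded


def replay_command(recorded, bitcode_path, object_path):
    """Rebuild a second-stage command from recorded options and file names."""
    return ["clang", *recorded, "-o", object_path, "-x", "ir", bitcode_path]


recorded = options_to_record(SECOND_STAGE)
print(recorded)
print(replay_command(recorded, "test.bc", "test.o"))
```

Under these assumptions, replaying the recorded options against "test.bc" reproduces the quoted second-stage command exactly, which is the property the thread is after. The sketch does not try to decide which options affect codegen or are already encoded in the bitcode; that filtering is the open question in the discussion.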