Stephen Hines
2014-Nov-06 23:18 UTC
[LLVMdev] Using the unused "version" field in the bitcode wrapper (redux)
On Tue, Nov 4, 2014 at 9:10 PM, Bruce Hoult <bruce at hoult.org> wrote:

> On Wed, Nov 5, 2014 at 2:30 PM, Sean Silva <chisophugis at gmail.com> wrote:
>
>>>> Does Apple support library/middleware providers shipping bitcode instead
>>>> of object code?
>>>
>>> No.
>>
>> Are there ever any plans to do so?
>> (this question also goes out to every other vendor that is shipping an
>> LTO toolchain or plans to. Chad?)
>>
>> I'm just trying to figure out how much of a Sony-specific issue this is.
>
> The new Android ART compiler compiles Dalvik bytecode in standard APKs to
> native code on the phone at install time. It also has in the source code
> a, possibly still experimental, "portable mode" that compiles to LLVM
> bitcode instead.

This is completely inaccurate. ART's portable mode used LLVM's IRBuilder to construct IR and then lower it immediately, as it is an ahead-of-time compiler that executes on the device itself. It never stores the IR out to disk. Applications continue to use dex for portable distribution.

> I assume (but don't know) this means this would happen on the application
> developer's host machine and then be distributed in the Play Store (or
> otherwise) as bitcode.
>
> Which would raise large and definite bitcode versioning problems.

On the other hand, Android RenderScript does use LLVM bitcode as its portable storage format. We have bitcode writers stretching back to 2 funky versions of LLVM 2.9 (really pre-3.0). The default writer for us is based on 3.2, and we upconvert any pre-3.x bitcode that we have to 3.2 (or something more modern if we can). There have definitely been issues in the past (attribute changes, removal/update of opcodes, etc.), but we have always found a way to adapt. We are definitely aware that any 4.x change could break our readers/writers, although we remain hopeful that we will adapt once again. All of our bitcode conversion/translation tools are available in the public Android open source project.
Note that we only need to deal with the C subset of things (so no exception handling), and we currently will drop/ignore any metadata that might break things (so debugging has been a challenge).

Steve
Rafael Espíndola
2014-Nov-07 18:42 UTC
[LLVMdev] Using the unused "version" field in the bitcode wrapper (redux)
Reading the bitcode reader while working on another issue I found that we already have a version in the bitcode itself (not the darwin wrapper) and it is used! It is stored with the bitc::MODULE_CODE_VERSION. It is used to select relative ids, which impacts the entire bitcode, and so it makes sense to be based on a version.

If we ever have a new feature that could not be otherwise detected, bumping the number is a reasonable way of making sure old versions of llvm will reject new bitcode instead of misinterpreting it.

*If* once we get to 4.0 it looks like there would be a big win by dropping support for 3.x in 4.1 *and* the stuff we want to drop support for is not easy to identify, then bumping this in 4.0 would also make sense.

Cheers,
Rafael
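The gating described above can be sketched in a few lines of Python. This is only an illustration, not LLVM's reader code: it assumes that version 0 selects absolute value ids and version 1 selects relative ids, and the names `select_relative_ids` and `UnsupportedBitcodeVersion` are hypothetical.

```python
class UnsupportedBitcodeVersion(Exception):
    """Raised when a module declares a version this reader does not know."""


# Assumption for illustration: 0 = absolute value ids, 1 = relative value ids.
KNOWN_MODULE_VERSIONS = {0: False, 1: True}


def select_relative_ids(module_version):
    """Return True if value ids are relative; reject unknown versions."""
    try:
        return KNOWN_MODULE_VERSIONS[module_version]
    except KeyError:
        # An old reader refuses new bitcode instead of misinterpreting it.
        raise UnsupportedBitcodeVersion(
            "module version %d is not supported by this reader" % module_version
        ) from None
```

The point of the sketch is that an unknown version is a hard error: bumping the number only helps if older readers already treat anything they do not recognize as unreadable.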
Robinson, Paul
2014-Nov-21 21:31 UTC
[LLVMdev] Using the unused "version" field in the bitcode wrapper (redux)
> Reading the bitcode reader while working on another issues I found
> that we already have a version in the bitcode itself (not the darwin
> wrapper) and it is used! It is stored with the
> bitc::MODULE_CODE_VERSION. It is used to select relative ids, which
> impacts the entire bitcode, and so it makes sense to be based on a
> version.
>
> If we ever have a new feature that could not be otherwise detected,
> bumping the number is a reasonable way of making sure old versions of
> llvm will reject new bitcode instead of misinterpreting it.

Right, that version number is used to resolve *ambiguities* in how to interpret some chunk of bitcode. It is not a generic bitcode version scheme, because most bitcode format changes involve things like adding new operands or opcodes, which are easily identified without needing an explicit version number.

The scenario I am most concerned about is this:

- We as a vendor publish toolchain #12 based on SVN r250000.
- During subsequent LLVM development, changes happen (!). For example, a new key letter 'g' in the Data Layout. This is not a bitcode ambiguity so MODULE_CODE_VERSION is unchanged.
- We as a vendor publish toolchain #13 based on SVN r300000.
- Some middleware provider publishes libIncrediblyUseful.bc built using spiffy new toolchain #13.
- Some hapless game developer tries to use libIncrediblyUseful.bc but is still on toolchain #12.

This causes an error during some LTO build phase, of course; the question is, what kind of error and how does Hapless Game Developer know what to do?

We as compiler developers want to see something along the lines of "unknown data layout specifier." That kind of diagnostic is seriously helpful to the LLVM community, because it describes the actual problem.

This does *nothing* for Hapless Game Developer. HGD wants to see "this bitcode file was generated by a newer version, I don't understand how to interpret it" because _that's_ the actual problem.

The "actual problem" is context dependent.
How can we account for that?

Proposed solution:

Whether to emit a bitcode wrapper becomes a target-dependent predicate. Bitcode is written by Module, which already has target info attached, so it's a matter of picking some convenient place to keep that info. Initially only Darwin would do this, but it would be a step up from the current explicit triple check.

The wrapper has a standard header, same as the current header:

- Magic
- Version
- BitcodeOffset
- BitcodeSize

The target can supply additional data to put after the header (and before the actual bitcode starts). Darwin would supply the CPUType field like it does now. This is 100% compatible with what exists today, but will be easy to extend for (ahem) other vendors who want wrappers.

Any vendor who supports bitcode as a long-lived on-disk format should specify that it wants a wrapper. It is the vendor's responsibility to provide sensible version numbers for successive toolchain releases. The LLVM project does not specify how to come up with version numbers. We default to zero (so Darwin automatically gets its historical value).

NOTE: This solution explicitly does NOT solve the "bitcode must be understandable to older toolchains" problem. What it DOES solve is the "older toolchains must provide an easily understood diagnostic when presented with newer bitcode files" problem.

Vendor toolchain release scenarios:

1) Releasing based on arbitrary trunk revisions.
   The vendor's toolchain release number, encoded into 32 bits, is likely to serve well as the bitcode wrapper version number. If you release strictly from trunk (not release branches) then the SVN revision number from the LLVM repo can also serve this purpose.

2) Releasing strictly based on LLVM releases.
   Using the LLVM version number, encoded into 32 bits, is a pretty reasonable alternative. Even if you release multiple toolchains from the same LLVM release, the bitcode formats will be the same, so the bitcode wrapper version number can also be the same.
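To make the proposal concrete, here is a minimal Python sketch of the wrapper check. It assumes the four header fields are little-endian 32-bit words and uses the existing wrapper magic 0x0B17C0DE. The 8/8/16-bit version packing and the helper names `encode_version`, `make_wrapper` and `read_wrapper` are hypothetical, chosen only for illustration; each vendor would pick its own encoding.

```python
import struct

WRAPPER_MAGIC = 0x0B17C0DE     # magic number of the existing bitcode wrapper
HEADER = struct.Struct("<4I")  # Magic, Version, BitcodeOffset, BitcodeSize


def encode_version(major, minor, patch=0):
    """Pack a release number into 32 bits (hypothetical 8/8/16 layout)."""
    if not (0 <= major < 256 and 0 <= minor < 256 and 0 <= patch < 65536):
        raise ValueError("version component out of range")
    return (major << 24) | (minor << 16) | patch


def make_wrapper(version, bitcode, target_data=b""):
    """Prepend the header and optional target data (e.g. CPUType) to bitcode."""
    offset = HEADER.size + len(target_data)
    header = HEADER.pack(WRAPPER_MAGIC, version, offset, len(bitcode))
    return header + target_data + bitcode


def read_wrapper(blob, reader_version):
    """Return the raw bitcode, or raise a user-facing error for newer files."""
    if len(blob) < HEADER.size:
        raise ValueError("file is too short to hold a bitcode wrapper")
    magic, version, offset, size = HEADER.unpack_from(blob)
    if magic != WRAPPER_MAGIC:
        raise ValueError("not a wrapped bitcode file")
    if version > reader_version:
        # The diagnostic the end user needs, issued before any bitcode parsing.
        raise ValueError(
            "this bitcode file was generated by a newer toolchain "
            "(file version %#x, this toolchain %#x)" % (version, reader_version)
        )
    if offset < HEADER.size or offset + size > len(blob):
        raise ValueError("corrupt bitcode wrapper")
    return blob[offset:offset + size]
```

Because the comparison happens on the wrapper alone, toolchain #12 can reject a file from toolchain #13 without ever reaching the construct it cannot parse. Packing the most significant component into the high bits keeps plain integer comparison consistent with release order.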
--paulr

P.S. I think the illustrative example of a new DataLayout specifier would reach an llvm_unreachable, and not emit a proper diagnostic at all. This is part of the generic diagnostics-from-LLVM problem.
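For contrast, the compiler-developer diagnostic mentioned in the thread ("unknown data layout specifier") could be produced by reporting unknown key letters instead of aborting. The Python sketch below is an assumption-laden illustration: `KNOWN_KEYS` is a partial, illustrative set of key letters, not LLVM's complete list, and `check_data_layout` is a hypothetical helper.

```python
# Illustrative subset of data layout key letters; not the complete LLVM set.
KNOWN_KEYS = set("eEpivfanSm")


def check_data_layout(layout):
    """Return one diagnostic string per specifier with an unknown key letter."""
    problems = []
    for spec in layout.split("-"):
        if not spec:
            continue  # tolerate empty components such as a trailing '-'
        if spec[0] not in KNOWN_KEYS:
            problems.append("unknown data layout specifier '%s'" % spec)
    return problems
```

With the hypothetical new key letter from the scenario, `check_data_layout("e-g:32-i64:64")` returns a single message naming `g:32`, which is useful to a compiler developer but still says nothing about toolchain versions.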