JF Bastien
2014-Dec-19 20:59 UTC
[LLVMdev] [Patches][RFC] What to do about bitcode streaming.
Hi Rafael, Would you mind waiting for Derek to come back from vacation to discuss this? We do use this code and could improve how it's used and tested within LLVM. Derek is the best person to discuss this, he'll be back in mid-January. Thanks, JF On Fri, Dec 19, 2014 at 6:41 AM, Rafael Espíndola < rafael.espindola at gmail.com> wrote:> > > I CC'ed llvmdev to put a few more eyes on the "what's the right > > direction?" question. > > > > IMO these both look like huge improvements. The streaming interface was > > the strangest part of the bitcode reader when I was trying to figure out > > how it all fit together for the use-list-order work. > > > > Personally I favour the "just delete it" path, but maybe there's > > something I'm missing, and the other patch looks great too. > > > > I didn't read carefully yet (although I noticed two quirks in the first > > patch that I've pointed out below) -- I'll have a closer look once > > you've decided on a direction. > > Thanks. Looks like the clang-format I had installed was a bit too old. > I have also rebased and removed the debug output. > > Cheers, > Rafael > > _______________________________________________ > llvm-commits mailing list > llvm-commits at cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20141219/31ad1c5c/attachment.html>
Rafael Espíndola
2014-Dec-19 21:03 UTC
[LLVMdev] [Patches][RFC] What to do about bitcode streaming.
On 19 December 2014 at 15:59, JF Bastien <jfb at chromium.org> wrote:> Hi Rafael, > > Would you mind waiting for Derek to come back from vacation to discuss this? > We do use this code and could improve how it's used and tested within LLVM. > Derek is the best person to discuss this, he'll be back in mid-January.What about the known-size patch? Can you check if that would work for you? The issue is not so much having it tested. It is having a sane and easy to understand interface. Cheers, Rafael
Jan Voung
2014-Dec-19 23:56 UTC
[LLVMdev] [Patches][RFC] What to do about bitcode streaming.
Hi Rafael, We will try out your patch and check to see how it will fit. You also talked about "It might be even possible to drop the requirement for the size to be known: Replace the call to AtEndOfStream by just trying to read more and checking if it failed, but that is a bit more than I wanted to do for this." That is to remove some calls to getSize()? Is there any expectation that getPointer() will always a pointer within a contiguous region of memory? That is, getPointer() is non-virtual and always refers to the Data field, but the overriding implementation could still dynamically grow some buffer and change Data to point to a new buffer (could that be "protected" instead?)? *) We do have a fork of the bitcode reader and bitstream reader. That said, we still use the upstream bitcode reader / bitstream reader in the browser w/ PNaCl. This is used for debugging non-guaranteed-to-be-stable temporary copies of apps in the browser ( https://developer.chrome.com/native-client/devguide/devcycle/debugging#debugging-pnacl-pexes-pepper-35-or-later ). That said, the overlapped compile/download is probably not a big deal for the debugging use case. *) It looks like gzip encoded files will have content-length set to the size of the *gzipped* body, not the size of the original bitcode: http://stackoverflow.com/questions/3819280/content-length-when-using-http-compression. So we'll need to be careful and treat that as "unknown", which is a bit unfortunate because gzip helps reduce bandwidth requirements. *) Otherwise, not all HTTP responses include a content-length, for various reasons, but I don't know how common this is, but I can try to ask chromium-dev or some other mailing list and see how common that is. *) The HTTP response can lie, or the server can be buggy, and give the wrong content-length. I think we can try to do the safe thing here, but will have to be careful. ==== Misc review comments: class DataStreamer { -public: - /// Fetch bytes [start-end) from the stream, and write them to the - /// buffer pointed to by buf. Returns the number of bytes actually written. - virtual size_t GetBytes(unsigned char *buf, size_t len) = 0; + uint64_t Size; + const uint8_t *Data; + + virtual bool advanceToPosImpl(uint64_t Pos) = 0; +public: + const uint8_t *getPointer(uint64_t Pos, unsigned Size); + DataStreamer(); virtual ~DataStreamer(); -}; -DataStreamer *getDataFileStreamer(const std::string &Filename, - std::string *Err); + uint64_t getSize() const { return Size; } + void setSize(uint64_t Val) { Size = Val; } + +protected: + // It is valid to access Data[0..MaxPos) + uint64_t MaxPos; + void init(uint64_t Size, const uint8_t *Data); +}; } Would be nice to document why there is an advanceToPosImpl, etc. Otherwise the patch is removing a bunch of code/comment that refer to the streaming use case, so if that use case is no longer known, then I imagine folks would try to simplify things further =) - // At this point, if there are any function bodies, the current bit is - // pointing to the END_BLOCK record after them. Now make sure the rest - // of the bits in the module have been read. - if (NextUnreadBit) - ParseModule(true); Hmm, didn't read too closely, but just looks different from what it used to do... + const uint8_t *OrigBufPtr = BufPtr; // If we have a wrapper header, parse it and ignore the non-bc file contents. // The magic number is 0x0B17C0DE stored in little endian. if (isBitcodeWrapper(BufPtr, BufEnd)) if (SkipBitcodeWrapperHeader(BufPtr, BufEnd, true)) return Error(BitcodeError::InvalidBitcodeWrapperHeader); //... - if (isBitcodeWrapper(buf, buf + 4)) { - const unsigned char *bitcodeStart = buf; - const unsigned char *bitcodeEnd = buf + 16; - SkipBitcodeWrapperHeader(bitcodeStart, bitcodeEnd, false); - Bytes.dropLeadingBytes(bitcodeStart - buf); - Bytes.setKnownObjectSize(bitcodeEnd - bitcodeStart); - } + uint64_t Skip = BufPtr - OrigBufPtr; + uint64_t Size = BufEnd - BufPtr + Skip; + StreamFile->setSize(Size); + Stream.JumpToBit(Skip * 8); Not quite clear to me that this is the same. Do we have a test of llvm-dis w/ files that have a bitcode wrapper + misc data surrounding the true bitcode contents? The old dropLeadingBytes would track how many bytes were skipped so that later, reading address N would mean reading N + BytesSkipped. It also subtracted the skipped bytes from BytesRead, and I haven't looked closely enough to see if there is similar state being tracked anymore. @@ -218,18 +214,13 @@ public: void freeState(); bool canSkipToPos(size_t pos) const { - // pos can be skipped to if it is a valid address or one byte past the end. - return pos == 0 || BitStream->getBitcodeBytes().isValidAddress( - static_cast<uint64_t>(pos - 1)); + size_t Size = BitStream->getSize(); + return pos < Size; } Is the "pos can be skipped ... or one byte past the end" behavior still needed, or was it some artifact of how isValidAddress worked? On Fri Dec 19 2014 at 1:03:30 PM Rafael Espíndola < rafael.espindola at gmail.com> wrote:> On 19 December 2014 at 15:59, JF Bastien <jfb at chromium.org> wrote: > > Hi Rafael, > > > > Would you mind waiting for Derek to come back from vacation to discuss > this? > > We do use this code and could improve how it's used and tested within > LLVM. > > Derek is the best person to discuss this, he'll be back in mid-January. > > What about the known-size patch? Can you check if that would work for you? > > The issue is not so much having it tested. It is having a sane and > easy to understand interface. > > Cheers, > Rafael >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20141219/f8db3e31/attachment.html>