thr3ads.net - llvm dev - [LLVMdev] MemoryBuffer/raw_ostream hybrid for linker? [May 2012]

Nick Kledzik

2012-May-04 01:10 UTC

[LLVMdev] MemoryBuffer/raw_ostream hybrid for linker?

Existing llvm code tends to use raw_ostream for writing files.  But raw_ostream
is not a good match for a linker for a couple of reasons:

1) When the linker creates an executable, the file needs the 'x' bit
set.  Currently raw_fd_ostream has no way to set that.

2) The Unix conformance suite actually has some test cases where the linker is
run and the output file exists but is not writable, or is not writable but
is in a writable directory, or with funky umask values.  The raw_fd_ostream
interface has no way to match those semantics.

3) On darwin we have found the linker performs better if it opens the output
file, truncates it to the output size, then mmaps in the file, then writes
directly into that memory buffer.  This avoids the memory copy from the private
buffer to the OS file system buffer in the write() syscall.

4) In the model we are using for lld, a streaming output interface is not
optimal.   Currently, lld copies chunks of code from the (read-only) input
files, to a temporary buffer, then applies any fixups (relocations), then
streams out that temporary buffer.  If instead we had a big output buffer, the
linker could copy the code chunks directly to the output buffer and apply the
fixups there, avoiding an extra copy.
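
To make point 4 concrete, here is a minimal sketch of the "copy directly, then fix up in place" model. The `Chunk` and `Fixup` types and the `layoutInto` function are hypothetical simplifications invented for this illustration (they are not lld's data structures), and a fixup is assumed to be a plain 32-bit little-endian store.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical, simplified fixup: store a resolved 32-bit value
// (little-endian) at an offset inside a chunk.
struct Fixup {
  size_t offsetInChunk;
  uint32_t value;
};

// Hypothetical, simplified chunk of read-only input bytes.
struct Chunk {
  const uint8_t *data;       // bytes from a (read-only) input file
  size_t size;
  size_t outputOffset;       // where the chunk lands in the output buffer
  std::vector<Fixup> fixups;
};

// Copy each chunk straight into the output buffer and patch it there.
// No per-chunk temporary buffer is needed, so each byte is copied once.
void layoutInto(uint8_t *out, size_t outSize,
                const std::vector<Chunk> &chunks) {
  for (const Chunk &c : chunks) {
    assert(c.outputOffset + c.size <= outSize && "chunk exceeds output");
    std::memcpy(out + c.outputOffset, c.data, c.size);
    for (const Fixup &f : c.fixups) {
      assert(f.offsetInChunk + 4 <= c.size && "fixup exceeds chunk");
      uint8_t *p = out + c.outputOffset + f.offsetInChunk;
      for (int i = 0; i < 4; ++i)
        p[i] = static_cast<uint8_t>((f.value >> (8 * i)) & 0xFF);
    }
  }
}
```

With a streaming interface the same work needs a scratch copy of every chunk, because the fixups must be applied before the bytes are handed to the stream.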

Is there an existing solution for these issues in llvm I've overlooked?  
I've searched the bug database and did not find any similar requests.

Should I propose a new llvm/Support/ class?

-Nick

Chris Lattner

2012-May-04 17:10 UTC

[LLVMdev] MemoryBuffer/raw_ostream hybrid for linker?

On May 3, 2012, at 6:10 PM, Nick Kledzik wrote:
> Existing llvm code tends to use raw_ostream for writing files.  But
> raw_ostream is not a good match for a linker for a couple of reasons:
>
> 1) When the linker creates an executable, the file needs the 'x'
> bit set.  Currently raw_fd_ostream has no way to set that.

If this were the only problem, I'd suggest just generalizing raw_fd_ostream
to support this use case.  It would be straightforward to do.

> 2) The Unix conformance suite actually has some test cases where the linker
> is run and the output file exists but is not writable, or is not writable
> but is in a writable directory, or with funky umask values.  The raw_fd_ostream
> interface has no way to match those semantics.

If this were the only problem :), I would suggest using a new raw_ostream
subclass, where you do custom stuff to get the file system stuff happening that
you want, but reuse all the streaming API aspect of raw_ostream.

> 3) On darwin we have found the linker performs better if it opens the
> output file, truncates it to the output size, then mmaps in the file, then
> writes directly into that memory buffer.  This avoids the memory copy from the
> private buffer to the OS file system buffer in the write() syscall.

This should also be possible with raw_ostream.

> 4) In the model we are using for lld, a streaming output interface is not
> optimal.

This is a show-stopper for raw_ostream.  I really don't want raw_ostream to
support seeking or other non-stream behavior. :)

> Is there an existing solution for these issues in llvm I've
> overlooked?   I've searched the bug database and did not find any similar
> requests.
>
> Should I propose a new llvm/Support/ class?

Yes please.  We have a variety of other places in the codebase that are using
open/close/read etc directly (grep for #include's of unistd.h).  Using these
APIs is annoying because they are really low level (and expose nonsense like
EINTR handling) and because Windows makes them annoying to use.

Having a better wrapper for doing real low-level file system stuff (including
seeking) makes perfect sense for llvm/Support!
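
As an illustration of the low-level detail mentioned above, this is roughly what every direct caller of write() ends up repeating. It is a sketch only: `writeAll` is a hypothetical helper name made up for this example, not an existing llvm/Support function.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <unistd.h>

// Hypothetical helper: write all `size` bytes to `fd`, retrying when
// write() is interrupted by a signal (EINTR) or writes only part of the
// data. Returns 0 on success, or the errno value of the failing call.
int writeAll(int fd, const void *data, size_t size) {
  assert((data != nullptr || size == 0) && "null buffer with nonzero size");
  const char *p = static_cast<const char *>(data);
  while (size > 0) {
    ssize_t n = ::write(fd, p, size);
    if (n < 0) {
      if (errno == EINTR)
        continue;            // interrupted before any byte was written: retry
      return errno;          // real error: report it to the caller
    }
    p += n;                  // short write: advance and keep going
    size -= static_cast<size_t>(n);
  }
  return 0;
}
```

A wrapper class in llvm/Support would let this retry logic live in one place instead of at every call site.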

-Chris

Nick Kledzik

2012-May-07 19:56 UTC

[LLVMdev] [RFC] llvm/include/Support/OutputBuffer.h

For the reasons listed in my 03-May-2012 email, I am proposing a new
llvm/Support class for use in writing binary files:

/// OutputBuffer - This interface provides a simple way to create an in-memory
/// buffer which, when done, will be written to a file. During the lifetime of
/// these objects, the content or existence of the specified file is undefined.
/// That is, creating an OutputBuffer for a file may immediately remove the
/// file.
/// If the OutputBuffer is committed, the target file's content will become
/// the buffer content at the time of the commit.  If the OutputBuffer is not
/// committed, the file will be deleted in the OutputBuffer destructor.
class OutputBuffer {
public:
  enum Flags {
    F_executable = 1, /// set the 'x' bit on the resulting file
  }; 

  /// Factory method to create an OutputBuffer object which manages a read/write
  /// buffer of the specified size. When committed, the buffer will be written
  /// to the file at the specified path.  
  static error_code createFile(StringRef filePath, Flags flags, size_t size, 
                               OwningPtr<OutputBuffer> &result);
  

  /// Returns a pointer to the start of the buffer.
  uint8_t *bufferStart();
  
  /// Returns a pointer to the end of the buffer.
  uint8_t *bufferEnd();
  
  /// Returns size of the buffer.
  size_t size();
    
  /// Flushes the content of the buffer to its file and deallocates the 
  /// buffer.  If commit() is not called before this object's destructor
  /// is called, the file is deleted in the destructor. The optional parameter
  /// is used if it turns out you want the file size to be smaller than
  /// initially requested.
  void commit(int64_t newSmallerSize = -1);
};


The Flags will probably need to be extended over time to handle other clients'
needs.

For Unix/Darwin, my plan is to implement this by:
1) delete the file
2) create a new file with a random name in same directory
3) truncate the file to the new size
4) mmap() in the file r/w
5) On commit, unmap the file, rename() to final name
6) In destructor, if not committed, unmap, delete the randomly named file
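
The six steps above can be sketched with plain POSIX calls. This is an illustrative approximation, not the proposed implementation: the class name `MappedOutput`, its method names, the `.tmpXXXXXX` suffix, and the way the umask is applied are all assumptions made for this example, and the optional "smaller size" argument of commit() is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <string>
#include <vector>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical stand-in for the proposed class; all names are illustrative.
class MappedOutput {
public:
  MappedOutput(const MappedOutput &) = delete;
  MappedOutput &operator=(const MappedOutput &) = delete;

  // Steps 1-4. Returns nullptr on failure.
  static MappedOutput *create(const std::string &path, bool executable,
                              size_t size) {
    ::unlink(path.c_str());                      // 1) delete any old file
    std::string tmpl = path + ".tmpXXXXXX";      // same directory as target
    std::vector<char> tmp(tmpl.begin(), tmpl.end());
    tmp.push_back('\0');
    int fd = ::mkstemp(tmp.data());              // 2) randomly named file
    if (fd < 0)
      return nullptr;
    mode_t mask = ::umask(0);                    // read the current umask...
    ::umask(mask);                               // ...and restore it
    mode_t mode = (executable ? 0777 : 0666) & ~mask;
    void *p = MAP_FAILED;
    if (::fchmod(fd, mode) == 0 &&
        ::ftruncate(fd, static_cast<off_t>(size)) == 0)   // 3) set the size
      p = ::mmap(nullptr, size, PROT_READ | PROT_WRITE,   // 4) map it r/w
                 MAP_SHARED, fd, 0);
    ::close(fd);                                 // the mapping outlives the fd
    if (p == MAP_FAILED) {
      ::unlink(tmp.data());
      return nullptr;
    }
    return new MappedOutput(path, tmp.data(), static_cast<uint8_t *>(p), size);
  }

  uint8_t *data() { return buf_; }
  size_t size() const { return size_; }

  // 5) Unmap, then rename the temporary file to its final name.
  bool commit() {
    assert(!committed_ && "commit() called twice");
    if (buf_)
      ::munmap(buf_, size_);
    buf_ = nullptr;
    committed_ = (::rename(tmpPath_.c_str(), path_.c_str()) == 0);
    return committed_;
  }

  // 6) If never committed, unmap and delete the temporary file.
  ~MappedOutput() {
    if (committed_)
      return;
    if (buf_)
      ::munmap(buf_, size_);
    ::unlink(tmpPath_.c_str());
  }

private:
  MappedOutput(const std::string &path, const std::string &tmpPath,
               uint8_t *buf, size_t size)
      : path_(path), tmpPath_(tmpPath), buf_(buf), size_(size),
        committed_(false) {}

  std::string path_, tmpPath_;
  uint8_t *buf_;
  size_t size_;
  bool committed_;
};
```

Because the temporary file is created in the same directory as the target, the rename() in step 5 stays on one file system, so readers see either no file or the complete file.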
 
I'll leave the Windows implementation empty and let someone with Windows
experience do the implementation.

Comments? Suggestions?

-Nick
