thr3ads.net - llvm dev - [LLVMdev] SmallString + raw_svector_ostream combination should be more efficient [May 2015]

If this information is useful, please help other people find it:
Share via:

Rafael Espíndola

2015-Apr-30 19:48 UTC

[LLVMdev] SmallString + raw_svector_ostream combination should be more efficient

I don't think we should make flush virtual. Why do you need to do it?
Can't you set up the base class to write to use the tail of the memory
region as the buffer?



On 24 April 2015 at 06:46, Yaron Keren <yaron.keren at gmail.com>
wrote:> Hi,
>
> Is this what you're thinking about?
> The code is not tested yet, I'd like to know if the overall direction
and
> design are correct.
>
> Yaron
>
>
> 2015-04-21 0:10 GMT+03:00 Yaron Keren <yaron.keren at gmail.com>:
>>
>> Sean, thanks for reminding this, Alp did commit a class derived from
>> raw_svector_ostream conatining an internal SmallString he called
>> small_string_ostream. The commit was reverted after a day due to a
>> disagreement about the commit approval and apparently abandoned.
>>
>>
>>
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140623/223393.html
>>
>>
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140623/223557.html
>>
>>
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140630/223986.html
>>
>> The class Alp comitted did solve the possible mismatch between the
>> SmallString and the stream by making the SmallString private to the
class.
>> This however did not treat the root problem, the duplication of the
>> information about the buffer between SmallString and the stream.
>>
>> I can make performance measurements but even if the performance
difference
>> is neglible (and looking at all the code paths and conditionals of
>> non-inlined functions at raw_ostream.cpp that's hard to believe),
the
>> current SmallString-raw_svector_ostream is simply wrong.
>>
>> Personally, after the previous "Alp" discussion I found and
fixed several
>> such issues in my out-of-tree code only to make new similar new error
>> earlier this year, which I caught only months after, when Rafael
committed
>> the pwrite which reminded me the issue. Due ot the information
duplication
>> Rafael commit also had a bug and Mehdi reports similar experience. Back
then
>> Alp reported similar problems he found in LLVM code which were
hopefully
>> fixed otherwise.
>>
>> In addition to the information duplication, the design is quite
confusing
>> to use
>>
>> - Should one use the stream.str() function or the SmallString itself?
>> - flush or str?
>> - How do you properly clear the combination for reuse (that was my
>> mistake, I forgot to resync after OS.clear())?
>>
>> It's safe to say this design pattern is very easy to get wrong one
way or
>> another, we got burned by it multiple times and it should be replaced.
>>
>> Alp suggested a derived class containing its own SmallString.
That's a
>> simple and effective approach to avoid the bugs mentioned but does not
fix
>> the memory or runtime issues. Small as they may be, why have them at a
>> fundemental data structure?
>>
>> I was thinking about going further avoiding all duplication with a
>> templated class derived with its own internal buffer storage.
>>
>> Rafael suggested a more modular approach, a derived adpater class
adapter
>> to a *simple* buffer (or nullptr for fully-dynamic operation) which
also
>> won't duplicate any information the buffer is dumb and has no
state.
>>
>> That solution sounds simple, efficient and safe to use. The
implementation
>> would be actually simpler then raw_svector_ostream since all the
>> coordination logic is not required anymore.
>>
>>
>> 2015-04-20 22:17 GMT+03:00 Sean Silva <chisophugis at gmail.com>:
>>>
>>>
>>>
>>> On Sun, Apr 19, 2015 at 7:40 AM, Yaron Keren <yaron.keren at
gmail.com>
>>> wrote:
>>>>
>>>> A very common code pattern in LLVM is
>>>>
>>>>  SmallString<128> S;
>>>>  raw_svector_ostream OS(S);
>>>>  OS<< ...
>>>>  Use OS.str()
>>>>
>>>> While raw_svector_ostream is smart to share the text buffer
itself, it's
>>>> inefficient keeping two sets of pointers to the same buffer:
>>>>
>>>>  In SmallString: void *BeginX, *EndX, *CapacityX
>>>>  In raw_ostream: char *OutBufStart, *OutBufEnd, *OutBufCur
>>>>
>>>> Moreover, at runtime the two sets of pointers need to be
coordinated
>>>> between the SmallString and raw_svector_ostream using
>>>> raw_svector_ostream::init, raw_svector_ostream::pwrite,
>>>> raw_svector_ostream::resync and
raw_svector_ostream::write_impl.
>>>> All these functions have non-inlined implementations in
raw_ostream.cpp.
>>>>
>>>> Finally, this may cause subtle bugs if S is modified without
calling
>>>> OS::resync(). This is too easy to do by mistake.
>>>>
>>>> In this frequent case usage the client does not really care
about S
>>>> being a SmallString with its many useful string helper
function. It's just
>>>> boilerplate code for raw_svector_ostream. But it does cost
three extra
>>>> pointers, some runtime performance and possible bugs.
>>>
>>>
>>> I agree the bugs are real (Alp proposed something a while back
regarding
>>> this?), but you will need to provide measurements to justify the
cost in
>>> runtime performance. One technique I have used in the past to
measure these
>>> sorts of things I call "stuffing": take the operation
that you want to
>>> measure, then essentially change the logic so that you pay the cost
2 times,
>>> 3 times, etc. You can then look at the trend in performance as N
varies and
>>> extrapolate back to the case where N = 0 (i.e. you don't pay
the cost).
>>>
>>> For example, in one situation where I used this method it was to
measure
>>> the cost of stat'ing files (sys::fs::status) across a holistic
build, using
>>> only "time" on the command line (it was on Windows and I
didn't have any
>>> tools like DTrace available that can directly measure this). In
order to do
>>> this, I changed sys::fs::status to call stat N times instead of 1,
and
>>> measured with N=1 N=2 N=3 etc. The result was that the difference
between
>>> the N and N+1 versions was about 1-2% across N=1..10 (or whatever I
>>> measured). In order to negate caching and other confounding
effects, it is
>>> important to try different distributions of stats; e.g. the extra
stats are
>>> on the same file as the "real" stat vs. the extra stats
are on nonexistent
>>> files in the same directory as the "real" file vs. parent
directories of the
>>> "real" file; if these match up fairly well (they did),
then you have some
>>> indication that the "stuffing" is measuring what you want
to measure.
>>>
>>> So e.g. if you think the cost of 3 extra pointers is significant,
then
>>> "stuff" the struct with 3, 6, 9, ... extra pointers and
measure the
>>> difference in performance (e.g. measure the time building a real
project).
>>>
>>> -- Sean Silva
>>>
>>>>
>>>>
>>>> To solve all three issues, would it make sense to have
>>>> raw_ostream-derived container with a its own SmallString like
templated-size
>>>> built-in buffer?
>>>>
>>>
>

Yaron Keren

2015-May-01 03:02 UTC

head link

[LLVMdev] SmallString + raw_svector_ostream combination should be more efficient

Yes, we should do without virtual flush. The problem was raw_char_ostream
should not be flushed - it's always current - so I overloaded flush to a
no-op. A cleaner solution is attached, adding a DirectBuffer mode to the
raw_ostream.

Also added a simple performance comparison project between raw_char_ostream
and raw_svector_ostream. On Window 7 x64 machine, raw_char_ostream was
three times faster than raw_svector_ostream when the provided buffer size
is large enough and two times as fast with one dynamic allocation resize.


2015-04-30 22:48 GMT+03:00 Rafael Espíndola <rafael.espindola at
gmail.com>:
> I don't think we should make flush virtual. Why do you need to do it?
> Can't you set up the base class to write to use the tail of the memory
> region as the buffer?
>
>
>
> On 24 April 2015 at 06:46, Yaron Keren <yaron.keren at gmail.com>
wrote:
> > Hi,
> >
> > Is this what you're thinking about?
> > The code is not tested yet, I'd like to know if the overall
direction and
> > design are correct.
> >
> > Yaron
> >
> >
> > 2015-04-21 0:10 GMT+03:00 Yaron Keren <yaron.keren at
gmail.com>:
> >>
> >> Sean, thanks for reminding this, Alp did commit a class derived
from
> >> raw_svector_ostream conatining an internal SmallString he called
> >> small_string_ostream. The commit was reverted after a day due to a
> >> disagreement about the commit approval and apparently abandoned.
> >>
> >>
> >>
>
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140623/223393.html
> >>
> >>
>
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140623/223557.html
> >>
> >>
>
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140630/223986.html
> >>
> >> The class Alp comitted did solve the possible mismatch between the
> >> SmallString and the stream by making the SmallString private to
the
> class.
> >> This however did not treat the root problem, the duplication of
the
> >> information about the buffer between SmallString and the stream.
> >>
> >> I can make performance measurements but even if the performance
> difference
> >> is neglible (and looking at all the code paths and conditionals of
> >> non-inlined functions at raw_ostream.cpp that's hard to
believe), the
> >> current SmallString-raw_svector_ostream is simply wrong.
> >>
> >> Personally, after the previous "Alp" discussion I found
and fixed
> several
> >> such issues in my out-of-tree code only to make new similar new
error
> >> earlier this year, which I caught only months after, when Rafael
> committed
> >> the pwrite which reminded me the issue. Due ot the information
> duplication
> >> Rafael commit also had a bug and Mehdi reports similar experience.
Back
> then
> >> Alp reported similar problems he found in LLVM code which were
hopefully
> >> fixed otherwise.
> >>
> >> In addition to the information duplication, the design is quite
> confusing
> >> to use
> >>
> >> - Should one use the stream.str() function or the SmallString
itself?
> >> - flush or str?
> >> - How do you properly clear the combination for reuse (that was my
> >> mistake, I forgot to resync after OS.clear())?
> >>
> >> It's safe to say this design pattern is very easy to get wrong
one way
> or
> >> another, we got burned by it multiple times and it should be
replaced.
> >>
> >> Alp suggested a derived class containing its own SmallString.
That's a
> >> simple and effective approach to avoid the bugs mentioned but does
not
> fix
> >> the memory or runtime issues. Small as they may be, why have them
at a
> >> fundemental data structure?
> >>
> >> I was thinking about going further avoiding all duplication with a
> >> templated class derived with its own internal buffer storage.
> >>
> >> Rafael suggested a more modular approach, a derived adpater class
> adapter
> >> to a *simple* buffer (or nullptr for fully-dynamic operation)
which also
> >> won't duplicate any information the buffer is dumb and has no
state.
> >>
> >> That solution sounds simple, efficient and safe to use. The
> implementation
> >> would be actually simpler then raw_svector_ostream since all the
> >> coordination logic is not required anymore.
> >>
> >>
> >> 2015-04-20 22:17 GMT+03:00 Sean Silva <chisophugis at
gmail.com>:
> >>>
> >>>
> >>>
> >>> On Sun, Apr 19, 2015 at 7:40 AM, Yaron Keren <yaron.keren
at gmail.com>
> >>> wrote:
> >>>>
> >>>> A very common code pattern in LLVM is
> >>>>
> >>>>  SmallString<128> S;
> >>>>  raw_svector_ostream OS(S);
> >>>>  OS<< ...
> >>>>  Use OS.str()
> >>>>
> >>>> While raw_svector_ostream is smart to share the text
buffer itself,
> it's
> >>>> inefficient keeping two sets of pointers to the same
buffer:
> >>>>
> >>>>  In SmallString: void *BeginX, *EndX, *CapacityX
> >>>>  In raw_ostream: char *OutBufStart, *OutBufEnd, *OutBufCur
> >>>>
> >>>> Moreover, at runtime the two sets of pointers need to be
coordinated
> >>>> between the SmallString and raw_svector_ostream using
> >>>> raw_svector_ostream::init, raw_svector_ostream::pwrite,
> >>>> raw_svector_ostream::resync and
raw_svector_ostream::write_impl.
> >>>> All these functions have non-inlined implementations in
> raw_ostream.cpp.
> >>>>
> >>>> Finally, this may cause subtle bugs if S is modified
without calling
> >>>> OS::resync(). This is too easy to do by mistake.
> >>>>
> >>>> In this frequent case usage the client does not really
care about S
> >>>> being a SmallString with its many useful string helper
function. It's
> just
> >>>> boilerplate code for raw_svector_ostream. But it does cost
three extra
> >>>> pointers, some runtime performance and possible bugs.
> >>>
> >>>
> >>> I agree the bugs are real (Alp proposed something a while back
> regarding
> >>> this?), but you will need to provide measurements to justify
the cost
> in
> >>> runtime performance. One technique I have used in the past to
measure
> these
> >>> sorts of things I call "stuffing": take the
operation that you want to
> >>> measure, then essentially change the logic so that you pay the
cost 2
> times,
> >>> 3 times, etc. You can then look at the trend in performance as
N
> varies and
> >>> extrapolate back to the case where N = 0 (i.e. you don't
pay the cost).
> >>>
> >>> For example, in one situation where I used this method it was
to
> measure
> >>> the cost of stat'ing files (sys::fs::status) across a
holistic build,
> using
> >>> only "time" on the command line (it was on Windows
and I didn't have
> any
> >>> tools like DTrace available that can directly measure this).
In order
> to do
> >>> this, I changed sys::fs::status to call stat N times instead
of 1, and
> >>> measured with N=1 N=2 N=3 etc. The result was that the
difference
> between
> >>> the N and N+1 versions was about 1-2% across N=1..10 (or
whatever I
> >>> measured). In order to negate caching and other confounding
effects,
> it is
> >>> important to try different distributions of stats; e.g. the
extra
> stats are
> >>> on the same file as the "real" stat vs. the extra
stats are on
> nonexistent
> >>> files in the same directory as the "real" file vs.
parent directories
> of the
> >>> "real" file; if these match up fairly well (they
did), then you have
> some
> >>> indication that the "stuffing" is measuring what you
want to measure.
> >>>
> >>> So e.g. if you think the cost of 3 extra pointers is
significant, then
> >>> "stuff" the struct with 3, 6, 9, ... extra pointers
and measure the
> >>> difference in performance (e.g. measure the time building a
real
> project).
> >>>
> >>> -- Sean Silva
> >>>
> >>>>
> >>>>
> >>>> To solve all three issues, would it make sense to have
> >>>> raw_ostream-derived container with a its own SmallString
like
> templated-size
> >>>> built-in buffer?
> >>>>
> >>>
> >
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150501/823c6eb2/attachment.html>
-------------- next part --------------
Index: include/llvm/Support/raw_ostream.h
==================================================================---
include/llvm/Support/raw_ostream.h	(revision 236300)
+++ include/llvm/Support/raw_ostream.h	(working copy)
@@ -52,17 +52,19 @@
   ///               OutBufEnd - OutBufStart >= 1).
   ///
   /// If buffered, then the raw_ostream owns the buffer if (BufferMode =-  ///
InternalBuffer); otherwise the buffer has been set via SetBuffer and is
-  /// managed by the subclass.
+  /// InternalBuffer or BufferMode == DirectBuffer); otherwise the buffer has
+  /// been set via SetBuffer and is managed by the subclass.
   ///
   /// If a subclass installs an external buffer using SetBuffer then it can
wait
   /// for a \see write_impl() call to handle the data which has been put into
   /// this buffer.
+protected:
   char *OutBufStart, *OutBufEnd, *OutBufCur;
 
   enum BufferKind {
     Unbuffered = 0,
     InternalBuffer,
+    DirectBuffer,
     ExternalBuffer
   } BufferMode;
 
@@ -132,7 +134,7 @@
  
//===--------------------------------------------------------------------===//
 
   void flush() {
-    if (OutBufCur != OutBufStart)
+    if (BufferMode != DirectBuffer && OutBufCur != OutBufStart)
       flush_nonempty();
   }
 
@@ -208,8 +210,8 @@
   /// satisfy std::isprint into an escape sequence.
   raw_ostream &write_escaped(StringRef Str, bool UseHexEscapes = false);
 
-  raw_ostream &write(unsigned char C);
-  raw_ostream &write(const char *Ptr, size_t Size);
+  virtual raw_ostream &write(unsigned char C);
+  virtual raw_ostream &write(const char *Ptr, size_t Size);
 
   // Formatted output, see the format() function in Support/Format.h.
   raw_ostream &operator<<(const format_object_base &Fmt);
@@ -297,13 +299,13 @@
   /// unbuffered.
   const char *getBufferStart() const { return OutBufStart; }
 
+  /// Install the given buffer and mode.
+  void SetBufferAndMode(char *BufferStart, size_t Size, BufferKind Mode);
+
  
//===--------------------------------------------------------------------===//
   // Private Interface
  
//===--------------------------------------------------------------------===//
 private:
-  /// Install the given buffer and mode.
-  void SetBufferAndMode(char *BufferStart, size_t Size, BufferKind Mode);
-
   /// Flush the current buffer, which is known to be non-empty. This outputs
the
   /// currently buffered data and resets the buffer to empty.
   void flush_nonempty();
@@ -481,6 +483,51 @@
   }
 };
 
+/// A raw_ostream that writes to a char array. This is a simple adaptor class.
+/// This class does not encounter output errors.
+class raw_char_ostream : public raw_pwrite_stream {
+  void write_impl(const char *Ptr, size_t Size) override{};
+  void pwrite_impl(const char *Ptr, size_t Size, uint64_t Offset) override {
+    memcpy(OutBufStart + Offset, Ptr, Size);
+  }
+  uint64_t current_pos() const override { return 0; }
+  void grow(size_t Size) {
+    size_t CurBytesAvailable = OutBufEnd - OutBufCur;
+    if (Size > CurBytesAvailable) {
+      size_t CurPos = OutBufCur - OutBufStart;
+      OutBufCur = OutBufStart;
+      size_t NewSize = size_t(NextPowerOf2(CurPos + Size + 2));
+      SetBufferAndMode(new char[NewSize], NewSize, DirectBuffer);
+      OutBufCur = OutBufStart + CurPos;
+    }
+  }
+
+public:
+  explicit raw_char_ostream(char *Buffer, unsigned BufferSize) {
+    if (Buffer) {
+      SetBufferAndMode(Buffer, BufferSize, ExternalBuffer);
+    } else {
+      if (!BufferSize)
+        BufferSize = preferred_buffer_size();
+      SetBufferAndMode(new char[BufferSize], BufferSize, DirectBuffer);
+    }
+  }
+  ~raw_char_ostream() override { OutBufCur = OutBufStart; }
+  raw_ostream &write(unsigned char C) override {
+    grow(1);
+    *OutBufCur++ = C;
+    return *this;
+  }
+  raw_ostream &write(const char *Ptr, size_t Size) override {
+    grow(Size);
+    memcpy(OutBufCur, Ptr, Size);
+    OutBufCur += Size;
+    return *this;
+  }
+  /// Returns the buffer.
+  StringRef str() { return StringRef(OutBufStart, GetNumBytesInBuffer()); }
+};
+
 /// A raw_ostream that writes to an SmallVector or SmallString.  This is a
 /// simple adaptor class. This class does not encounter output errors.
 class raw_svector_ostream : public raw_pwrite_stream {
Index: lib/Support/raw_ostream.cpp
==================================================================---
lib/Support/raw_ostream.cpp	(revision 236300)
+++ lib/Support/raw_ostream.cpp	(working copy)
@@ -65,7 +65,7 @@
   assert(OutBufCur == OutBufStart &&
          "raw_ostream destructor called with non-empty buffer!");
 
-  if (BufferMode == InternalBuffer)
+  if (BufferMode == InternalBuffer || BufferMode == DirectBuffer)
     delete [] OutBufStart;
 }
 
@@ -95,7 +95,7 @@
   // child buffer management logic will be in write_impl).
   assert(GetNumBytesInBuffer() == 0 && "Current buffer is
non-empty!");
 
-  if (BufferMode == InternalBuffer)
+  if (BufferMode == InternalBuffer || BufferMode == DirectBuffer)
     delete [] OutBufStart;
   OutBufStart = BufferStart;
   OutBufEnd = OutBufStart+Size;
Index: projects/raw_char_ostream_speed/CMakeLists.txt
==================================================================---
projects/raw_char_ostream_speed/CMakeLists.txt	(revision 0)
+++ projects/raw_char_ostream_speed/CMakeLists.txt	(working copy)
@@ -0,0 +1,7 @@
+set(LLVM_LINK_COMPONENTS
+  Support
+)
+
+add_llvm_tool(raw_char_ostream
+  raw_char_ostream_speed.cpp
+)
Index: projects/raw_char_ostream_speed/raw_char_ostream_speed.cpp
==================================================================---
projects/raw_char_ostream_speed/raw_char_ostream_speed.cpp	(revision 0)
+++ projects/raw_char_ostream_speed/raw_char_ostream_speed.cpp	(working copy)
@@ -0,0 +1,43 @@
+#include "llvm/ADT/SmallString.h"
+#include "llvm/Support/raw_ostream.h"
+
+#include <ctime>
+
+using namespace llvm;
+
+namespace {
+const StringRef FuncName +   
"thadddddddddsrttttttttttttttttttttttttjouhojkirjdjdsjoiuosiasddjisjsrof";
+
+void writeStream(raw_ostream &OS) {
+  OS << "_Z" << FuncName.size() << FuncName
<< 'v';
+}
+
+const unsigned BufferSize = 128;
+void test_raw_char_ostream() {
+  char Buffer[BufferSize];
+  raw_char_ostream OS(Buffer, sizeof(Buffer));
+  writeStream(OS);
+}
+
+void test_raw_svector_ostream() {
+  SmallString<BufferSize> Buffer;
+  raw_svector_ostream OS(Buffer);
+  writeStream(OS);
+}
+
+void testIt(void (*F)()) {
+  std::clock_t StartTime = std::clock();
+  for (unsigned i = 1e7; i; --i)
+    F();
+  std::clock_t EndTime = std::clock();
+  unsigned ms = unsigned(double(EndTime - StartTime) * 1000.0 /
CLOCKS_PER_SEC);
+  outs() << ms << "ms\n";
+}
+}
+
+int main() {
+  testIt(test_raw_char_ostream);
+  testIt(test_raw_svector_ostream);
+  return 0;
+}

Sean Silva

2015-May-02 01:39 UTC

head link

[LLVMdev] SmallString + raw_svector_ostream combination should be more efficient

Could you dig into why:
+  raw_ostream &write(unsigned char C) override {
+    grow(1);
+    *OutBufCur++ = C;
+    return *this;
+  }

Is 3 times as fast as raw_svector_ostream? I don't see a good reason why
that should be any faster than:

  raw_ostream &operator<<(char C) {
    if (OutBufCur >= OutBufEnd)
      return write(C);
    *OutBufCur++ = C;
    return *this;
  }


You might just be seeing the difference between having write inlined vs.
non-inlined, in which case your patch might be a complicated way to pull
the likely cases of some raw_ostream methods into the header.


-- Sean Silva

On Thu, Apr 30, 2015 at 8:02 PM, Yaron Keren <yaron.keren at gmail.com>
wrote:
> Yes, we should do without virtual flush. The problem was raw_char_ostream
> should not be flushed - it's always current - so I overloaded flush to
a
> no-op. A cleaner solution is attached, adding a DirectBuffer mode to the
> raw_ostream.
>
> Also added a simple performance comparison project
> between raw_char_ostream and raw_svector_ostream. On Window 7 x64
> machine, raw_char_ostream was three times faster than raw_svector_ostream
> when the provided buffer size is large enough and two times as fast with
> one dynamic allocation resize.
>
>
> 2015-04-30 22:48 GMT+03:00 Rafael Espíndola <rafael.espindola at
gmail.com>:
>
>> I don't think we should make flush virtual. Why do you need to do
it?
>> Can't you set up the base class to write to use the tail of the
memory
>> region as the buffer?
>>
>>
>>
>> On 24 April 2015 at 06:46, Yaron Keren <yaron.keren at gmail.com>
wrote:
>> > Hi,
>> >
>> > Is this what you're thinking about?
>> > The code is not tested yet, I'd like to know if the overall
direction
>> and
>> > design are correct.
>> >
>> > Yaron
>> >
>> >
>> > 2015-04-21 0:10 GMT+03:00 Yaron Keren <yaron.keren at
gmail.com>:
>> >>
>> >> Sean, thanks for reminding this, Alp did commit a class
derived from
>> >> raw_svector_ostream conatining an internal SmallString he
called
>> >> small_string_ostream. The commit was reverted after a day due
to a
>> >> disagreement about the commit approval and apparently
abandoned.
>> >>
>> >>
>> >>
>>
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140623/223393.html
>> >>
>> >>
>>
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140623/223557.html
>> >>
>> >>
>>
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140630/223986.html
>> >>
>> >> The class Alp comitted did solve the possible mismatch between
the
>> >> SmallString and the stream by making the SmallString private
to the
>> class.
>> >> This however did not treat the root problem, the duplication
of the
>> >> information about the buffer between SmallString and the
stream.
>> >>
>> >> I can make performance measurements but even if the
performance
>> difference
>> >> is neglible (and looking at all the code paths and
conditionals of
>> >> non-inlined functions at raw_ostream.cpp that's hard to
believe), the
>> >> current SmallString-raw_svector_ostream is simply wrong.
>> >>
>> >> Personally, after the previous "Alp" discussion I
found and fixed
>> several
>> >> such issues in my out-of-tree code only to make new similar
new error
>> >> earlier this year, which I caught only months after, when
Rafael
>> committed
>> >> the pwrite which reminded me the issue. Due ot the information
>> duplication
>> >> Rafael commit also had a bug and Mehdi reports similar
experience.
>> Back then
>> >> Alp reported similar problems he found in LLVM code which were
>> hopefully
>> >> fixed otherwise.
>> >>
>> >> In addition to the information duplication, the design is
quite
>> confusing
>> >> to use
>> >>
>> >> - Should one use the stream.str() function or the SmallString
itself?
>> >> - flush or str?
>> >> - How do you properly clear the combination for reuse (that
was my
>> >> mistake, I forgot to resync after OS.clear())?
>> >>
>> >> It's safe to say this design pattern is very easy to get
wrong one way
>> or
>> >> another, we got burned by it multiple times and it should be
replaced.
>> >>
>> >> Alp suggested a derived class containing its own SmallString.
That's a
>> >> simple and effective approach to avoid the bugs mentioned but
does not
>> fix
>> >> the memory or runtime issues. Small as they may be, why have
them at a
>> >> fundemental data structure?
>> >>
>> >> I was thinking about going further avoiding all duplication
with a
>> >> templated class derived with its own internal buffer storage.
>> >>
>> >> Rafael suggested a more modular approach, a derived adpater
class
>> adapter
>> >> to a *simple* buffer (or nullptr for fully-dynamic operation)
which
>> also
>> >> won't duplicate any information the buffer is dumb and has
no state.
>> >>
>> >> That solution sounds simple, efficient and safe to use. The
>> implementation
>> >> would be actually simpler then raw_svector_ostream since all
the
>> >> coordination logic is not required anymore.
>> >>
>> >>
>> >> 2015-04-20 22:17 GMT+03:00 Sean Silva <chisophugis at
gmail.com>:
>> >>>
>> >>>
>> >>>
>> >>> On Sun, Apr 19, 2015 at 7:40 AM, Yaron Keren
<yaron.keren at gmail.com>
>> >>> wrote:
>> >>>>
>> >>>> A very common code pattern in LLVM is
>> >>>>
>> >>>>  SmallString<128> S;
>> >>>>  raw_svector_ostream OS(S);
>> >>>>  OS<< ...
>> >>>>  Use OS.str()
>> >>>>
>> >>>> While raw_svector_ostream is smart to share the text
buffer itself,
>> it's
>> >>>> inefficient keeping two sets of pointers to the same
buffer:
>> >>>>
>> >>>>  In SmallString: void *BeginX, *EndX, *CapacityX
>> >>>>  In raw_ostream: char *OutBufStart, *OutBufEnd,
*OutBufCur
>> >>>>
>> >>>> Moreover, at runtime the two sets of pointers need to
be coordinated
>> >>>> between the SmallString and raw_svector_ostream using
>> >>>> raw_svector_ostream::init,
raw_svector_ostream::pwrite,
>> >>>> raw_svector_ostream::resync and
raw_svector_ostream::write_impl.
>> >>>> All these functions have non-inlined implementations
in
>> raw_ostream.cpp.
>> >>>>
>> >>>> Finally, this may cause subtle bugs if S is modified
without calling
>> >>>> OS::resync(). This is too easy to do by mistake.
>> >>>>
>> >>>> In this frequent case usage the client does not really
care about S
>> >>>> being a SmallString with its many useful string helper
function.
>> It's just
>> >>>> boilerplate code for raw_svector_ostream. But it does
cost three
>> extra
>> >>>> pointers, some runtime performance and possible bugs.
>> >>>
>> >>>
>> >>> I agree the bugs are real (Alp proposed something a while
back
>> regarding
>> >>> this?), but you will need to provide measurements to
justify the cost
>> in
>> >>> runtime performance. One technique I have used in the past
to measure
>> these
>> >>> sorts of things I call "stuffing": take the
operation that you want to
>> >>> measure, then essentially change the logic so that you pay
the cost 2
>> times,
>> >>> 3 times, etc. You can then look at the trend in
performance as N
>> varies and
>> >>> extrapolate back to the case where N = 0 (i.e. you
don't pay the
>> cost).
>> >>>
>> >>> For example, in one situation where I used this method it
was to
>> measure
>> >>> the cost of stat'ing files (sys::fs::status) across a
holistic build,
>> using
>> >>> only "time" on the command line (it was on
Windows and I didn't have
>> any
>> >>> tools like DTrace available that can directly measure
this). In order
>> to do
>> >>> this, I changed sys::fs::status to call stat N times
instead of 1, and
>> >>> measured with N=1 N=2 N=3 etc. The result was that the
difference
>> between
>> >>> the N and N+1 versions was about 1-2% across N=1..10 (or
whatever I
>> >>> measured). In order to negate caching and other
confounding effects,
>> it is
>> >>> important to try different distributions of stats; e.g.
the extra
>> stats are
>> >>> on the same file as the "real" stat vs. the
extra stats are on
>> nonexistent
>> >>> files in the same directory as the "real" file
vs. parent directories
>> of the
>> >>> "real" file; if these match up fairly well (they
did), then you have
>> some
>> >>> indication that the "stuffing" is measuring what
you want to measure.
>> >>>
>> >>> So e.g. if you think the cost of 3 extra pointers is
significant, then
>> >>> "stuff" the struct with 3, 6, 9, ... extra
pointers and measure the
>> >>> difference in performance (e.g. measure the time building
a real
>> project).
>> >>>
>> >>> -- Sean Silva
>> >>>
>> >>>>
>> >>>>
>> >>>> To solve all three issues, would it make sense to have
>> >>>> raw_ostream-derived container with a its own
SmallString like
>> templated-size
>> >>>> built-in buffer?
>> >>>>
>> >>>
>> >
>>
>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150501/d6e980d6/attachment.html>

llvm dev - May 2015 - [LLVMdev] SmallString + raw_svector_ostream combination should be more efficient

[LLVMdev] SmallString + raw_svector_ostream combination should be more efficient

[LLVMdev] SmallString + raw_svector_ostream combination should be more efficient

[LLVMdev] SmallString + raw_svector_ostream combination should be more efficient