thr3ads.net - llvm dev - [llvm-dev] RFC: Should SmallVectors be smaller? [Jun 2018]

If this information is useful, please help other people find it:
Share via:

Duncan P. N. Exon Smith via llvm-dev

2018-Jun-21 16:52 UTC

[llvm-dev] RFC: Should SmallVectors be smaller?

I've been curious for a while whether SmallVectors have the right
speed/memory tradeoff.  It would be straightforward to shave off a couple of
pointers (1 pointer/4B on 32-bit; 2 pointers/16B on 64-bit) if users could
afford to test for small-mode vs. large-mode.

The current scheme works out to something like this:
```
template <class T, size_t SmallCapacity>
struct SmallVector {
  T *BeginX, *EndX, *CapacityX;
  T Small[SmallCapacity];

  bool isSmall() const { return BeginX == Small; }
  T *begin() { return BeginX; }
  T *end() { return EndX; }
  size_t size() const { return EndX - BeginX; }
  size_t capacity() const { return CapacityX - BeginX; }
};
```

In the past I used something more like:
```
template <class T, size_t SmallCapacity>
struct SmallVector2 {
  unsigned Size;
  unsigned Capacity;
  union {
    T Small[SmallCapacity];
    T *Large;
  };

  bool isSmall() const { return Capacity == SmallCapacity; } // Or a bit shaved
off of Capacity.
  T *begin() { return isSmall() ? Small : Large; }
  T *end() { return begin() + Size; }
  size_t size() const { return Size; }
  size_t capacity() const { return Capacity; }
};
```

I'm curious whether this scheme would be really be slower in practice (as a
complete replacement for `SmallVector` in ADT).  I wonder, has anyone profiled
something like this before?  If so, in what context?  on what workloads?

Duncan

Chris Lattner via llvm-dev

2018-Jun-22 01:38 UTC

head link

[llvm-dev] RFC: Should SmallVectors be smaller?

> On Jun 21, 2018, at 9:52 AM, Duncan P. N. Exon Smith via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> 
> I've been curious for a while whether SmallVectors have the right
speed/memory tradeoff.  It would be straightforward to shave off a couple of
pointers (1 pointer/4B on 32-bit; 2 pointers/16B on 64-bit) if users could
afford to test for small-mode vs. large-mode.
Something like this could definitely work, but most smallvectors are on the
stack.  They are intentionally used when sizeof(smallvector) isn’t important, so
I don’t think this optimization will pay off.

Out of curiosity, what brings this up?

-Chris

> 
> The current scheme works out to something like this:
> ```
> template <class T, size_t SmallCapacity>
> struct SmallVector {
>  T *BeginX, *EndX, *CapacityX;
>  T Small[SmallCapacity];
> 
>  bool isSmall() const { return BeginX == Small; }
>  T *begin() { return BeginX; }
>  T *end() { return EndX; }
>  size_t size() const { return EndX - BeginX; }
>  size_t capacity() const { return CapacityX - BeginX; }
> };
> ```
> 
> In the past I used something more like:
> ```
> template <class T, size_t SmallCapacity>
> struct SmallVector2 {
>  unsigned Size;
>  unsigned Capacity;
>  union {
>    T Small[SmallCapacity];
>    T *Large;
>  };
> 
>  bool isSmall() const { return Capacity == SmallCapacity; } // Or a bit
shaved off of Capacity.
>  T *begin() { return isSmall() ? Small : Large; }
>  T *end() { return begin() + Size; }
>  size_t size() const { return Size; }
>  size_t capacity() const { return Capacity; }
> };
> ```
> 
> I'm curious whether this scheme would be really be slower in practice
(as a complete replacement for `SmallVector` in ADT).  I wonder, has anyone
profiled something like this before?  If so, in what context?  on what
workloads?
> 
> Duncan
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Dean Michael Berris via llvm-dev

2018-Jun-22 02:01 UTC

head link

[llvm-dev] RFC: Should SmallVectors be smaller?

> On 22 Jun 2018, at 02:52, Duncan P. N. Exon Smith via llvm-dev <llvm-dev
at lists.llvm.org> wrote:
> 
> I've been curious for a while whether SmallVectors have the right
speed/memory tradeoff.  It would be straightforward to shave off a couple of
pointers (1 pointer/4B on 32-bit; 2 pointers/16B on 64-bit) if users could
afford to test for small-mode vs. large-mode.
> 
> The current scheme works out to something like this:
> ```
> template <class T, size_t SmallCapacity>
> struct SmallVector {
>  T *BeginX, *EndX, *CapacityX;
>  T Small[SmallCapacity];
> 
>  bool isSmall() const { return BeginX == Small; }
>  T *begin() { return BeginX; }
>  T *end() { return EndX; }
>  size_t size() const { return EndX - BeginX; }
>  size_t capacity() const { return CapacityX - BeginX; }
> };
> ```
> 
> In the past I used something more like:
> ```
> template <class T, size_t SmallCapacity>
> struct SmallVector2 {
>  unsigned Size;
>  unsigned Capacity;
>  union {
>    T Small[SmallCapacity];
>    T *Large;
>  };
> 
>  bool isSmall() const { return Capacity == SmallCapacity; } // Or a bit
shaved off of Capacity.
>  T *begin() { return isSmall() ? Small : Large; }
>  T *end() { return begin() + Size; }
>  size_t size() const { return Size; }
>  size_t capacity() const { return Capacity; }
> };
> ```
> 
> I'm curious whether this scheme would be really be slower in practice
(as a complete replacement for `SmallVector` in ADT).  I wonder, has anyone
profiled something like this before?  If so, in what context?  on what
workloads?
> 
Doesn’t this scheme have a problem with undefined behaviour, since you may be
changing the active member of the union when capacity grows larger than
SmallCapacity?

-- Dean

Bekket McClane via llvm-dev

2018-Jun-22 02:30 UTC

head link

[llvm-dev] RFC: Should SmallVectors be smaller?

To Dean,

I think Duncan’s approach prohibit any usage of Small after the capacity grow
over SmallCapacity.
So when the capacity exceed SmallCapacity, one should:
1. Allocate memory on heap
2. Copy data from  Small to that chunk
3. Assign pointer of that chunk to Large

As long as you always access Large after growth, there would be no data lose.

Bekket> Dean Michael Berris via llvm-dev <llvm-dev at lists.llvm.org> 於
2018年6月21日 下午10:01 寫道：
> 
> 
> 
>> On 22 Jun 2018, at 02:52, Duncan P. N. Exon Smith via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
>> 
>> I've been curious for a while whether SmallVectors have the right
speed/memory tradeoff.  It would be straightforward to shave off a couple of
pointers (1 pointer/4B on 32-bit; 2 pointers/16B on 64-bit) if users could
afford to test for small-mode vs. large-mode.
>> 
>> The current scheme works out to something like this:
>> ```
>> template <class T, size_t SmallCapacity>
>> struct SmallVector {
>> T *BeginX, *EndX, *CapacityX;
>> T Small[SmallCapacity];
>> 
>> bool isSmall() const { return BeginX == Small; }
>> T *begin() { return BeginX; }
>> T *end() { return EndX; }
>> size_t size() const { return EndX - BeginX; }
>> size_t capacity() const { return CapacityX - BeginX; }
>> };
>> ```
>> 
>> In the past I used something more like:
>> ```
>> template <class T, size_t SmallCapacity>
>> struct SmallVector2 {
>> unsigned Size;
>> unsigned Capacity;
>> union {
>>   T Small[SmallCapacity];
>>   T *Large;
>> };
>> 
>> bool isSmall() const { return Capacity == SmallCapacity; } // Or a bit
shaved off of Capacity.
>> T *begin() { return isSmall() ? Small : Large; }
>> T *end() { return begin() + Size; }
>> size_t size() const { return Size; }
>> size_t capacity() const { return Capacity; }
>> };
>> ```
>> 
>> I'm curious whether this scheme would be really be slower in
practice (as a complete replacement for `SmallVector` in ADT).  I wonder, has
anyone profiled something like this before?  If so, in what context?  on what
workloads?
>> 
> 
> Doesn’t this scheme have a problem with undefined behaviour, since you may
be changing the active member of the union when capacity grows larger than
SmallCapacity?
> 
> -- Dean
> 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Duncan P. N. Exon Smith via llvm-dev

2018-Jun-22 04:16 UTC

head link

[llvm-dev] RFC: Should SmallVectors be smaller?

>> On Jun 21, 2018, at 18:38, Chris Lattner <clattner at nondot.org>
wrote:
>> 
>> 
>> 
>> On Jun 21, 2018, at 9:52 AM, Duncan P. N. Exon Smith via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
>> 
>> I've been curious for a while whether SmallVectors have the right
speed/memory tradeoff.  It would be straightforward to shave off a couple of
pointers (1 pointer/4B on 32-bit; 2 pointers/16B on 64-bit) if users could
afford to test for small-mode vs. large-mode.
> 
> Something like this could definitely work, but most smallvectors are on the
stack.  They are intentionally used when sizeof(smallvector) isn’t important, so
I don’t think this optimization will pay off.
For better or worse (mostly worse), there are a ton of SmallVector fields in
data structures, including some even nested inside other SmallVectors (e.g., see
the cleanup in r235229).  Often these data structures are heap-allocated.
> Out of curiosity, what brings this up?
I've noticed that Clang is using more stack recently (we're seeing more
crashes from template recursion; it seems the template recursion limit needs to
shrink), and somehow that train of thought led to this.

I share your skepticism that it will help stack usage much, but
SmallVector/SmallVectorImpl is so ubiquitous, it could help the heap a bit.  And
if it doesn’t hurt runtime performance in practice, there’s no reason to fork
the data structure.

If no one has measured before I might try it some time. 
> -Chris
> 
> 
>> 
>> The current scheme works out to something like this:
>> ```
>> template <class T, size_t SmallCapacity>
>> struct SmallVector {
>> T *BeginX, *EndX, *CapacityX;
>> T Small[SmallCapacity];
>> 
>> bool isSmall() const { return BeginX == Small; }
>> T *begin() { return BeginX; }
>> T *end() { return EndX; }
>> size_t size() const { return EndX - BeginX; }
>> size_t capacity() const { return CapacityX - BeginX; }
>> };
>> ```
>> 
>> In the past I used something more like:
>> ```
>> template <class T, size_t SmallCapacity>
>> struct SmallVector2 {
>> unsigned Size;
>> unsigned Capacity;
>> union {
>>   T Small[SmallCapacity];
>>   T *Large;
>> };
>> 
>> bool isSmall() const { return Capacity == SmallCapacity; } // Or a bit
shaved off of Capacity.
>> T *begin() { return isSmall() ? Small : Large; }
>> T *end() { return begin() + Size; }
>> size_t size() const { return Size; }
>> size_t capacity() const { return Capacity; }
>> };
>> ```
>> 
>> I'm curious whether this scheme would be really be slower in
practice (as a complete replacement for `SmallVector` in ADT).  I wonder, has
anyone profiled something like this before?  If so, in what context?  on what
workloads?
>> 
>> Duncan
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20180621/9253450e/attachment.html>

llvm dev - Jun 2018 - RFC: Should SmallVectors be smaller?

[llvm-dev] RFC: Should SmallVectors be smaller?

[llvm-dev] RFC: Should SmallVectors be smaller?

[llvm-dev] RFC: Should SmallVectors be smaller?

[llvm-dev] RFC: Should SmallVectors be smaller?

[llvm-dev] RFC: Should SmallVectors be smaller?