David Blaikie via llvm-dev
2020-Dec-03 18:26 UTC
[llvm-dev] RFC: [SmallVector] Adding SVec<T> and Vec<T> convenience wrappers.
On Wed, Dec 2, 2020 at 9:38 PM Chris Lattner <clattner at nondot.org> wrote:> On Dec 2, 2020, at 3:52 PM, David Blaikie <dblaikie at gmail.com> wrote: > >> I’m pretty skeptical of #2 as an approach, because inline and out of line >> design points are pretty different when sizeof(T) is 4 or 8, and when >> sizeof(thevector) matters, e.g. in 2D cases. >> > > To me, this is more like std::string, which is (2) - and more, to me, > about the contract of the type - std::string doesn't guarantee iterator > validity over swap/move, where std::vector does. If std::vector didn't have > this guarantee, it'd be possible to have small vector optimizations in > std::vector the same way std::string does - without the user-facing > customization, mind you. > > > Yeah I definitely prefer to use standard types if we can, I think there is > compelling upthread rationale for why using them in general causes > complexity. Maybe as future C++ standards improve we can drop them. I > hope someday that ArrayRef and StringRef can drop away in time for example. >Sorry - wasn't meaning to complain/rail with the "why not std::vector". I was meaning to highlight that I think a better conceptual abstraction than "small" (1 or more inlined objects) and "not small" (0 inlined objects) it'd be better to have an abstraction more like std::string - where the smallness isn't part of the API, as such. I'm suggesting that we should have two types: Reduced sizeof(t) and guaranteed iterator validity over swap/move (ie: must not use an inline buffer) The rest: Like std::string. It might have an inline buffer, it might have zero, depending on size, etc. But importantly it's not guaranteed to maintain iterator/pointer validity on move/swap, and sizeof is optimized for stack usage. Neither of these really make sense being called "SmallVector" I suppose. And I'd dislike calling (2) "Vector" even though it's likely the more popular one - because it'd have subtly different semantics from std::vector regarding iterator invalidation.> So I think (2) has some legs - but the problem for me is that if we name > the type llvm::Vector, which already gets a bit close to std::vector > (unqualified "Vector" seems like fairly subtle code to me - but I'm leaning > towards being OK with that aspect if other peopel are) and it has different > iterator invalidation/move semantics than std::vector, I think that could > be problematic. > > > Not sure if I want to defend this, but we have lots of precedent for this, > including llvm::sort etc. >For sure - I'm not trying to advocate for avoiding having a std::vector replacement in LLVM (excuse the several negatives there). And some differences between the C++ standard APIs and the LLVM ones is expected - llvm::sort's and many of the other alternatives are fairly "obvious" I'd say, it sorts a range instead of a pair of iterators - I don't think anyone would find that surprising. Something with subtly different iterator/pointer invalidation semantics under a near exactly matching name wouldn't be such a good fit, though.> llvm::Vector has some other nice but not compatible things, e.g. > pop_back_val() >I'm not too worried about some extra API surface area like that - but a difference in iterator invalidation (especially one that was dependent on the sizeof(T)) would be a bit subtle/concerning to me. Ultimately, what I'm saying is: I think having the thing we currently call SmallVector generalised to "the thing that does not guarantee iterator validity on swap/move and may have a time/sizeof tradeoff that leans towards larger sizeof/faster time" (& such a thing could have an inline storage of 0 sometimes, if that's the right sizeof/speed tradeoff) would be a good thing - though the name might need changing if "Small" is misleading for such a generalisation. & yeah, the straight up "llvm::Vector" name would be closer to/mostly matching std::vector's semantics with regard to iterator validity, etc. Not sure if I'm making sense/just going around in circles at this point. - Dave -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20201203/1aaca83d/attachment.html>
Chris Lattner via llvm-dev
2020-Dec-05 16:56 UTC
[llvm-dev] RFC: [SmallVector] Adding SVec<T> and Vec<T> convenience wrappers.
> On Dec 3, 2020, at 10:26 AM, David Blaikie <dblaikie at gmail.com> wrote: > I was meaning to highlight that I think a better conceptual abstraction than "small" (1 or more inlined objects) and "not small" (0 inlined objects) it'd be better to have an abstraction more like std::string - where the smallness isn't part of the API, as such.But we have SmallString as a dual to std::string precisely because we need that distinction. Speaking of, the whole default argument thing should probably be applied to SmallString, SmallDenseMap and other types as well.> I'm suggesting that we should have two types: > > Reduced sizeof(t) and guaranteed iterator validity over swap/move (ie: must not use an inline buffer) > The rest: Like std::string. It might have an inline buffer, it might have zero, depending on size, etc. But importantly it's not guaranteed to maintain iterator/pointer validity on move/swap, and sizeof is optimized for stack usage. > > Neither of these really make sense being called "SmallVector" I suppose. And I'd dislike calling (2) "Vector" even though it's likely the more popular one - because it'd have subtly different semantics from std::vector regarding iterator invalidation.I think I see what you’re going for here, and I agree that std::string is a slightly different situation than std::vector given that some implementations have a small string optimizations already. I feel like you’re prioritizing the iterator invalidation behavior as the core difference between the two types, whereas I’m prioritizing the performance/allocation-behavior aspect. I feel like this is a major difference in practice that API users should think about, and they should be prepared to deal with the iterator invalidation issues as necessary. Is the "Reduced sizeof(t) and guaranteed iterator validity over swap/move (ie: must not use an inline buffer)” use case important enough to have a “string-y” type? As you say, we don’t have a solution for this in the current code other than std::vector. I haven’t heard of this being a significant enough problem to be worth “fixing”, and I don’t think we can drop the inline vs out-of-line distinction.>> So I think (2) has some legs - but the problem for me is that if we name the type llvm::Vector, which already gets a bit close to std::vector (unqualified "Vector" seems like fairly subtle code to me - but I'm leaning towards being OK with that aspect if other peopel are) and it has different iterator invalidation/move semantics than std::vector, I think that could be problematic. > > Not sure if I want to defend this, but we have lots of precedent for this, including llvm::sort etc. > > For sure - I'm not trying to advocate for avoiding having a std::vector replacement in LLVM (excuse the several negatives there). And some differences between the C++ standard APIs and the LLVM ones is expected - llvm::sort's and many of the other alternatives are fairly "obvious" I'd say, it sorts a range instead of a pair of iterators - I don't think anyone would find that surprising. Something with subtly different iterator/pointer invalidation semantics under a near exactly matching name wouldn't be such a good fit, though.The difference I meant was that it forwards transparently to array_pod_sort / qsort which is a pretty big behavioral difference. -Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20201205/d5c7b403/attachment.html>