Mehdi AMINI via llvm-dev
2020-Nov-16 22:43 UTC
[llvm-dev] RFC: [SmallVector] Adding SVec<T> and Vec<T> convenience wrappers.
On Mon, Nov 16, 2020 at 2:12 PM David Blaikie <dblaikie at gmail.com> wrote:> On Mon, Nov 16, 2020 at 1:55 PM Mehdi AMINI <joker.eph at gmail.com> wrote: > > On Mon, Nov 16, 2020 at 12:55 PM David Blaikie <dblaikie at gmail.com> > wrote: > >> > >> I will say I'm not a huge fan of adding even more names for things in > >> this fairly core/common use case (now we'll have even more vector > >> names to pick from) - can we use default template arguments so we can > >> write SmallVector<T> instead of SmallVector<T, N> and would that > >> address some of the use cases proposed here? > > > > > > I won't claim it is perfect, but the added names are a compromise over > rounds of reviews with the folks in the revision. In particular I'm quite > concerned that a default value for N on the SmallVector does not carry the > intent the same way, and is too easy to miss in review (or while reading > code). > > I'm not quite following here - what sort of problems do you anticipate > by readers missing the default value for N? >Reading code `SmallVector<SomeType>` does not express that it is *intentional* to leave of the value for N. It can easily be just forgotten, and it could easily be implicitly `0`, and as a reviewer (or code reader later) I don't know if this is was intentional or not. This is why I am quite opposed to "loosen" the current requirement on SmallVector: a different name for a different intent is better fitting to me.> > > To me the drawbacks are outweighing the benefits too much. > > Also, `SmallVector<LargeType>` would end-up with N==0 implicitly, > without an easy way to figure it out that there is no actual inline storage > while reading the code. > > Why would it be necessary to figure that out? If we generally feel the > right default inline storage happens to be zero in that case and that > SmallVector will pick a good default (including possibly 0) that seems > like a good thing to me. (why would we call out zero specially, if we > want to avoid calling out other values explicitly) >I think this is an API that is "easy to misuse", I don't see any advantage to it while it has drawbacks: why would we do that?> > > An alternative was to reserve the default to only "small object" so that > N isn't zero, but there isn't a platform independent way of doing that and > keep the code portable I believe. So `SVec<T>` is really saying: "I am > willing to pay for some limite inline storage if possible but I don't have > a N in mind". > > That quoted statement still sounds like it could encapsulate the > possibility of 0 too, though. "limited inline storage if possible" > could be "and the answer to that constraint is zero in this case - but > if you happen to make the T smaller, it could become non-zero > organically/without the need to touch this code" (as the N could vary > organically between non-zero values as struct sizes change too) >Yes the quoted statement says exactly that! This is why I don't like having this behavior on SmallVector: this is a different semantics / intent that the programmers have and this is something that I like being able to read differently. SVec does not accept a `N`: if I read SVec<Type> there can't be a possible accidental omission of N here. It has to be a choice between `SmallVector<SomeType, 4>` and `SVec<SomeType>` but not `SmallVector<SomeType>`.> Finally the simple `llvm::Vector` case to replace `SmallVector<T, 0>` is > because it is frequently preferable to `std::vector` but still isn't > readable or immediately intuitive and so is rarely used in practice (see > https://www.llvm.org/docs/ProgrammersManual.html#llvm-adt-smallvector-h > for the documented points on N=0). > > I'm not sure llvm::Vector/Vec would necessarily make it more often > used in practice. Maybe a bunch of cleanup and more specific wording > in the style guide that would ban std::vector might help more there. > > Though I agree that it may be suitable to have a clear name for > SmallVector<T, 0> since the "Small" is slightly misleading (though not > entirely - it's important for the "SmallVectorImpl" benefits of > SmallVector - so renaming it to Vector and still using it via > SmallVectorImpl<T>& might also be confusing). > > >> Got any sense of the total value here? Major savings to be had? > >> > >> (if SmallVector<T> can do what your proposed SVec<T> does, that leaves > >> the Vec<T> - could you expound on the benefits of SmallVector<T, 0> > >> over std::vector<T>? I guess the SmallVectorImpl generic algorithm > >> opportunities? Though that's rarely needed compared to ArrayRef.) > >> If SmallVector<T> would suffice then maybe Vec<T> could be > >> ZeroSmallVector<T>? Not sure. > > > > > > ZeroSmallVector<T> does not really address your "more vector names to > pick from" concerns, > > Somewhat - if SmallVector<T> can be used instead of SVec<T> then we > have one fewer name and less convention to base the Vec<T> on (since > it won't have the SVec<T> sibling), so picking a name closer to the > existing SmallVector might be more viable/helpful. > > > and it is longer than `SmallVector<T, 0>`: shouldn't we aim for the > "default case" to be the easiest to reach / most intuitive to pick? > `llvm::Vec` looks like "just a vector". > > Somewhat - but Vec is an abbreviation that isn't really used otherwise > (if we consider SVec as well, I'd still push back on the introduction > of the abbreviation for both cases)One aspect of the naming is to avoid one not being a prefix of the other (IDE completion, etc.) or them being too close to differentiate.> and llvm::Vector would be only a > case separation away form a standard name (when considering the > unqualified name - which is likely to come up a fair bit, as we'll see > "Vector" used unqualified a lot). >Note: this is why we converged with llvm::Vec and not llvm::Vector in the revision I wouldn't entirely object to SmallVector<T> getting a smart default> buffer size, and llvm::Vector being an alias for SmallVector<T, 0> - I > don't feel /super/ great about it, but yeah, if we're going to push > the Programmer's Manual encouragement further in terms of > reducing/removing use of std::vector, then maybe llvm::Vector isn't > the worst way to do that. > > (it might be that this would benefit from being two separate > discussions, though - one on how to provide smart defaults for > SmallVector<T> and one on if/how to push std::vector deprecation in > favor of SmallVector<T, 0> (semantically speaking - no matter how it's > spelled) further along) >For historical purpose: the review was actually originally only adding a default for SmallVector and nothing else :) We only converged to the current proposal after a few rounds of reviews trying to tradeoff amongst folks in the review.> > > > > Cheers, > > > > -- > > Mehdi > > > > > >> > >> > >> On Fri, Nov 13, 2020 at 2:06 PM Sean Silva via llvm-dev > >> <llvm-dev at lists.llvm.org> wrote: > >> > > >> > We've pretty happy now with a patch that adds two wrappers around > SmallVector that make it 1) more convenient to use and 2) will tend to > mitigate misuse of SmallVector. We think it's ready for wider discussion: > https://reviews.llvm.org/D90884 > >> > > >> > SVec<T> is a convenience alias for SmallVector<T, N> with N chosen > automatically to keep its size under 64 Bytes (that heuristic is easy to > change though). The recommendation in the patch is to use this "on the > stack, where a "small" number of elements are expected". > >> > > >> > Vec<T> is a convenience alias for SmallVector<T, 0>. It lets us get > the (little-known?) benefits of SmallVector even when it has no inline > elements (see > https://llvm.org/docs/ProgrammersManual.html#llvm-adt-smallvector-h). The > recommendation in the patch is to use this when the SmallVector is on the > heap. > >> > > >> > A lot of this is boiled out from the discussion in > https://groups.google.com/g/llvm-dev/c/q1OyHZy8KVc/m/1l_AasOLBAAJ?pli=1 > >> > > >> > The goals here are twofold: > >> > > >> > 1. convenience: not having to read/write "N", or do an extra > edit/recompile cycle if you forgot it > >> > > >> > 2. avoiding pathological cases: The choice of N is usually > semi-arbitrary in our experience, and if one isn't careful, can result in > sizeof(SmallVector) becoming huge, especially in the case of nested > SmallVectors. This patch avoids pathological cases in two ways: > >> > A. SVec<T>'s heuristic keeps sizeof(SVec<T>) bounded, which > prevents pathological size amplifications like in `SmallVector<struct > {SmallVector<T, 4> a, b; }, 4>`, where the small sizes effectively multiply > together. Of course, users can always write SmallVector<T, N> explicitly to > bypass this, but the need for that seems rare. > >> > B. SmallVector<T, 0> feels "weird to write" for most folks, even > though it is frequently the right choice. Vec<T> mitigates that by "looking > natural". > >> > > >> > I'm surfacing this as an RFC to get feedback on a couple higher-level > points: > >> > - does everybody agree that SVec<T> and Vec<T> are useful to have? > >> > - get wider consensus around suggesting these as "defaults" (see my > updates to ProgrammersManual.rst in the patch) > >> > - how much we want to bulk migrate code vs let it organically grow. > Replacing SmallVector<T, 0> with Vec<T> should be completely mechanical. > Replacing SmallVector<T, N> for general N would be a lot more work. > >> > - of course: naming. SVector/Vector were floated in the patch as well > and seem ok. SmallVec was rejected as it was a prefix of SmallVector > (messes up autocomplete). > >> > > >> > Looking forward to a world with fewer guessed SmallVector sizes, > >> > > >> > -- Sean Silva > >> > _______________________________________________ > >> > LLVM Developers mailing list > >> > llvm-dev at lists.llvm.org > >> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20201116/37c6bee4/attachment-0001.html>
David Blaikie via llvm-dev
2020-Nov-17 00:10 UTC
[llvm-dev] RFC: [SmallVector] Adding SVec<T> and Vec<T> convenience wrappers.
On Mon, Nov 16, 2020 at 2:44 PM Mehdi AMINI <joker.eph at gmail.com> wrote:> > > > On Mon, Nov 16, 2020 at 2:12 PM David Blaikie <dblaikie at gmail.com> wrote: >> >> On Mon, Nov 16, 2020 at 1:55 PM Mehdi AMINI <joker.eph at gmail.com> wrote: >> > On Mon, Nov 16, 2020 at 12:55 PM David Blaikie <dblaikie at gmail.com> wrote: >> >> >> >> I will say I'm not a huge fan of adding even more names for things in >> >> this fairly core/common use case (now we'll have even more vector >> >> names to pick from) - can we use default template arguments so we can >> >> write SmallVector<T> instead of SmallVector<T, N> and would that >> >> address some of the use cases proposed here? >> > >> > >> > I won't claim it is perfect, but the added names are a compromise over rounds of reviews with the folks in the revision. In particular I'm quite concerned that a default value for N on the SmallVector does not carry the intent the same way, and is too easy to miss in review (or while reading code). >> >> I'm not quite following here - what sort of problems do you anticipate >> by readers missing the default value for N? > > > Reading code `SmallVector<SomeType>` does not express that it is *intentional* to leave of the value for N. It can easily be just forgotten, and it could easily be implicitly `0`, and as a reviewer (or code reader later) I don't know if this is was intentional or not. This is why I am quite opposed to "loosen" the current requirement on SmallVector: a different name for a different intent is better fitting to me.I don't know why it being implicitly zero is noteworthy - anymore than it being implicitly 1 or 3, etc. As for whether it's intentional: I think if we're moving in this direction that's proposed, the idea is that by default we don't want to be explicit about the N - so in general we won't be. And sometimes someone will want to be explicit and add an N, but otherwise the reasonable default is not to have an N. I think the intent is clear enough - consider other default template parameters like customized deleters on a unique_ptr: I don't wonder if someone intended to have a customized deleter on every unique_ptr, one assumes that wasn't required/intended unless it's added. It seems like the idea is for non-explicit N to be a safe/common default, and explicit N to be noteworthy/require some justification or scrutiny.>> > To me the drawbacks are outweighing the benefits too much. >> > Also, `SmallVector<LargeType>` would end-up with N==0 implicitly, without an easy way to figure it out that there is no actual inline storage while reading the code. >> >> Why would it be necessary to figure that out? If we generally feel the >> right default inline storage happens to be zero in that case and that >> SmallVector will pick a good default (including possibly 0) that seems >> like a good thing to me. (why would we call out zero specially, if we >> want to avoid calling out other values explicitly) > > > I think this is an API that is "easy to misuse", I don't see any advantage to it while it has drawbacks: why would we do that?What kind of misuse do you have in mind? To me it seems like it would be consistent with the idea that we have built rules about what good default inline sizes are - and there's no need for us to think about it on every use, we just let the default do what it's meant to do.>> > An alternative was to reserve the default to only "small object" so that N isn't zero, but there isn't a platform independent way of doing that and keep the code portable I believe. So `SVec<T>` is really saying: "I am willing to pay for some limite inline storage if possible but I don't have a N in mind". >> >> That quoted statement still sounds like it could encapsulate the >> possibility of 0 too, though. "limited inline storage if possible" >> could be "and the answer to that constraint is zero in this case - but >> if you happen to make the T smaller, it could become non-zero >> organically/without the need to touch this code" (as the N could vary >> organically between non-zero values as struct sizes change too) > > Yes the quoted statement says exactly that!Sorry, it seems we're jumbling up the two issues: Whether the type name should be different when asking for implicit default inline storage length and, separately, whether explicit zero length storage should use a different name. The discussion above, I thought, was about the latter, not the former - but your response seeems to be about the former rather than the latter. Two separate discussions/threads/patches may help keep the communication more clear.> This is why I don't like having this behavior on SmallVector: this is a different semantics / intent that the programmers have and this is something that I like being able to read differently. > SVec does not accept a `N`: if I read SVec<Type> there can't be a possible accidental omission of N here. It has to be a choice between `SmallVector<SomeType, 4>` and `SVec<SomeType>` but not `SmallVector<SomeType>`.To discuss this issue of whether accepting a default size versus having an explicit size is a different semantic - as above, I think it's worth considering/comparing this to other templates and their default template arguments. Things like std::vector's customizable allocators (you. don't have to use a different name for the container when you specify a custom allocation scheme for std::vector - but we don't wonder every time we see std::vector<T> whether the user meant to customize the allocator - we accept the default is usualy intended (as the default buffer size would be usually intended) unless it's specified - if during review we think a custom allocator (or non-default buffer size) might be merited, we could ask about it - we probably would even if the user had written SVec<T> the same way we ask about other data structures today "did you mean SmallVector<T, 3>"?), unique_ptr's custom deleters, etc.>> > Finally the simple `llvm::Vector` case to replace `SmallVector<T, 0>` is because it is frequently preferable to `std::vector` but still isn't readable or immediately intuitive and so is rarely used in practice (see https://www.llvm.org/docs/ProgrammersManual.html#llvm-adt-smallvector-h for the documented points on N=0). >> >> I'm not sure llvm::Vector/Vec would necessarily make it more often >> used in practice. Maybe a bunch of cleanup and more specific wording >> in the style guide that would ban std::vector might help more there. >> >> Though I agree that it may be suitable to have a clear name for >> SmallVector<T, 0> since the "Small" is slightly misleading (though not >> entirely - it's important for the "SmallVectorImpl" benefits of >> SmallVector - so renaming it to Vector and still using it via >> SmallVectorImpl<T>& might also be confusing). >> >> >> Got any sense of the total value here? Major savings to be had? >> >> >> >> (if SmallVector<T> can do what your proposed SVec<T> does, that leaves >> >> the Vec<T> - could you expound on the benefits of SmallVector<T, 0> >> >> over std::vector<T>? I guess the SmallVectorImpl generic algorithm >> >> opportunities? Though that's rarely needed compared to ArrayRef.) >> >> If SmallVector<T> would suffice then maybe Vec<T> could be >> >> ZeroSmallVector<T>? Not sure. >> > >> > >> > ZeroSmallVector<T> does not really address your "more vector names to pick from" concerns, >> >> Somewhat - if SmallVector<T> can be used instead of SVec<T> then we >> have one fewer name and less convention to base the Vec<T> on (since >> it won't have the SVec<T> sibling), so picking a name closer to the >> existing SmallVector might be more viable/helpful. >> >> > and it is longer than `SmallVector<T, 0>`: shouldn't we aim for the "default case" to be the easiest to reach / most intuitive to pick? `llvm::Vec` looks like "just a vector". >> >> Somewhat - but Vec is an abbreviation that isn't really used otherwise >> (if we consider SVec as well, I'd still push back on the introduction >> of the abbreviation for both cases) > > One aspect of the naming is to avoid one not being a prefix of the other (IDE completion, etc.) or them being too close to differentiate. > >> and llvm::Vector would be only a >> case separation away form a standard name (when considering the >> unqualified name - which is likely to come up a fair bit, as we'll see >> "Vector" used unqualified a lot). > > Note: this is why we converged with llvm::Vec and not llvm::Vector in the revisionYeah - certainly a tricky set of tradeoffs (autocomplete, similarity with standard names, etc). Perhaps a bit of a discussion of what other libraries do around this (no doubt there are a bunch of C++ libraries that have "here's the default vector you shuold use instead of std::vector for these reasons" situations).>> I wouldn't entirely object to SmallVector<T> getting a smart default >> buffer size, and llvm::Vector being an alias for SmallVector<T, 0> - I >> don't feel /super/ great about it, but yeah, if we're going to push >> the Programmer's Manual encouragement further in terms of >> reducing/removing use of std::vector, then maybe llvm::Vector isn't >> the worst way to do that. >> >> (it might be that this would benefit from being two separate >> discussions, though - one on how to provide smart defaults for >> SmallVector<T> and one on if/how to push std::vector deprecation in >> favor of SmallVector<T, 0> (semantically speaking - no matter how it's >> spelled) further along) > > > For historical purpose: the review was actually originally only adding a default for SmallVector and nothing else :) > We only converged to the current proposal after a few rounds of reviews trying to tradeoff amongst folks in the review.Perhaps you could summarize some of those discussions/decisions in more detail here? As the RFC stands, these are my comments - I know it's always a tradeoff of which audiences are brought in at what stages (though often folks send ADT/Support changes my way during review - I should probably just set up a Herald rule for these things). Sounds like maybe I am in agreement with the original version of the review, then. - Dave>> > >> > Cheers, >> > >> > -- >> > Mehdi >> > >> > >> >> >> >> >> >> On Fri, Nov 13, 2020 at 2:06 PM Sean Silva via llvm-dev >> >> <llvm-dev at lists.llvm.org> wrote: >> >> > >> >> > We've pretty happy now with a patch that adds two wrappers around SmallVector that make it 1) more convenient to use and 2) will tend to mitigate misuse of SmallVector. We think it's ready for wider discussion: https://reviews.llvm.org/D90884 >> >> > >> >> > SVec<T> is a convenience alias for SmallVector<T, N> with N chosen automatically to keep its size under 64 Bytes (that heuristic is easy to change though). The recommendation in the patch is to use this "on the stack, where a "small" number of elements are expected". >> >> > >> >> > Vec<T> is a convenience alias for SmallVector<T, 0>. It lets us get the (little-known?) benefits of SmallVector even when it has no inline elements (see https://llvm.org/docs/ProgrammersManual.html#llvm-adt-smallvector-h). The recommendation in the patch is to use this when the SmallVector is on the heap. >> >> > >> >> > A lot of this is boiled out from the discussion in https://groups.google.com/g/llvm-dev/c/q1OyHZy8KVc/m/1l_AasOLBAAJ?pli=1 >> >> > >> >> > The goals here are twofold: >> >> > >> >> > 1. convenience: not having to read/write "N", or do an extra edit/recompile cycle if you forgot it >> >> > >> >> > 2. avoiding pathological cases: The choice of N is usually semi-arbitrary in our experience, and if one isn't careful, can result in sizeof(SmallVector) becoming huge, especially in the case of nested SmallVectors. This patch avoids pathological cases in two ways: >> >> > A. SVec<T>'s heuristic keeps sizeof(SVec<T>) bounded, which prevents pathological size amplifications like in `SmallVector<struct {SmallVector<T, 4> a, b; }, 4>`, where the small sizes effectively multiply together. Of course, users can always write SmallVector<T, N> explicitly to bypass this, but the need for that seems rare. >> >> > B. SmallVector<T, 0> feels "weird to write" for most folks, even though it is frequently the right choice. Vec<T> mitigates that by "looking natural". >> >> > >> >> > I'm surfacing this as an RFC to get feedback on a couple higher-level points: >> >> > - does everybody agree that SVec<T> and Vec<T> are useful to have? >> >> > - get wider consensus around suggesting these as "defaults" (see my updates to ProgrammersManual.rst in the patch) >> >> > - how much we want to bulk migrate code vs let it organically grow. Replacing SmallVector<T, 0> with Vec<T> should be completely mechanical. Replacing SmallVector<T, N> for general N would be a lot more work. >> >> > - of course: naming. SVector/Vector were floated in the patch as well and seem ok. SmallVec was rejected as it was a prefix of SmallVector (messes up autocomplete). >> >> > >> >> > Looking forward to a world with fewer guessed SmallVector sizes, >> >> > >> >> > -- Sean Silva >> >> > _______________________________________________ >> >> > LLVM Developers mailing list >> >> > llvm-dev at lists.llvm.org >> >> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Mehdi AMINI via llvm-dev
2020-Nov-17 00:47 UTC
[llvm-dev] RFC: [SmallVector] Adding SVec<T> and Vec<T> convenience wrappers.
On Mon, Nov 16, 2020 at 4:10 PM David Blaikie <dblaikie at gmail.com> wrote:> On Mon, Nov 16, 2020 at 2:44 PM Mehdi AMINI <joker.eph at gmail.com> wrote: > > > > > > > > On Mon, Nov 16, 2020 at 2:12 PM David Blaikie <dblaikie at gmail.com> > wrote: > >> > >> On Mon, Nov 16, 2020 at 1:55 PM Mehdi AMINI <joker.eph at gmail.com> > wrote: > >> > On Mon, Nov 16, 2020 at 12:55 PM David Blaikie <dblaikie at gmail.com> > wrote: > >> >> > >> >> I will say I'm not a huge fan of adding even more names for things in > >> >> this fairly core/common use case (now we'll have even more vector > >> >> names to pick from) - can we use default template arguments so we can > >> >> write SmallVector<T> instead of SmallVector<T, N> and would that > >> >> address some of the use cases proposed here? > >> > > >> > > >> > I won't claim it is perfect, but the added names are a compromise > over rounds of reviews with the folks in the revision. In particular I'm > quite concerned that a default value for N on the SmallVector does not > carry the intent the same way, and is too easy to miss in review (or while > reading code). > >> > >> I'm not quite following here - what sort of problems do you anticipate > >> by readers missing the default value for N? > > > > > > Reading code `SmallVector<SomeType>` does not express that it is > *intentional* to leave of the value for N. It can easily be just forgotten, > and it could easily be implicitly `0`, and as a reviewer (or code reader > later) I don't know if this is was intentional or not. This is why I am > quite opposed to "loosen" the current requirement on SmallVector: a > different name for a different intent is better fitting to me. > > I don't know why it being implicitly zero is noteworthy - anymore than > it being implicitly 1 or 3, etc. > > As for whether it's intentional: I think if we're moving in this > direction that's proposed, the idea is that by default we don't want > to be explicit about the N - so in general we won't be. And sometimes > someone will want to be explicit and add an N, but otherwise the > reasonable default is not to have an N. I think the intent is clear > enough - consider other default template parameters like customized > deleters on a unique_ptr: I don't wonder if someone intended to have a > customized deleter on every unique_ptr, one assumes that wasn't > required/intended unless it's added. It seems like the idea is for > non-explicit N to be a safe/common default, and explicit N to be > noteworthy/require some justification or scrutiny. >I get your point, but I don't share the conclusion here: I don't believe that universally not specifying the N on non-trivial types is a good default. It only makes sense to me for small types, hence it is error prone.> > >> > To me the drawbacks are outweighing the benefits too much. > >> > Also, `SmallVector<LargeType>` would end-up with N==0 implicitly, > without an easy way to figure it out that there is no actual inline storage > while reading the code. > >> > >> Why would it be necessary to figure that out? If we generally feel the > >> right default inline storage happens to be zero in that case and that > >> SmallVector will pick a good default (including possibly 0) that seems > >> like a good thing to me. (why would we call out zero specially, if we > >> want to avoid calling out other values explicitly) > > > > > > I think this is an API that is "easy to misuse", I don't see any > advantage to it while it has drawbacks: why would we do that? > > What kind of misuse do you have in mind?It does not convey the intent, it is too easy to misuse: i.e. you expect a small storage but you don't have one. This is enough to me to outweigh the benefits of the direction and prefer the status quo.> To me it seems like it would > be consistent with the idea that we have built rules about what good > default inline sizes are - and there's no need for us to think about > it on every use, we just let the default do what it's meant to do. > > >> > An alternative was to reserve the default to only "small object" so > that N isn't zero, but there isn't a platform independent way of doing that > and keep the code portable I believe. So `SVec<T>` is really saying: "I am > willing to pay for some limite inline storage if possible but I don't have > a N in mind". > >> > >> That quoted statement still sounds like it could encapsulate the > >> possibility of 0 too, though. "limited inline storage if possible" > >> could be "and the answer to that constraint is zero in this case - but > >> if you happen to make the T smaller, it could become non-zero > >> organically/without the need to touch this code" (as the N could vary > >> organically between non-zero values as struct sizes change too) > > > > Yes the quoted statement says exactly that! > > Sorry, it seems we're jumbling up the two issues: Whether the type > name should be different when asking for implicit default inline > storage length and, separately, whether explicit zero length storage > should use a different name. The discussion above, I thought, was > about the latter, not the former - but your response seeems to be > about the former rather than the latter.You're answering my quote about SVec, which is the type that is about implicit N, Vec (no S!) is about the N=0 case. The quote is also about the inline storage implicit N, it isn't clear to me how you link this to the 0 case (which does not have inline storage)?>> Two separate discussions/threads/patches may help keep the > communication more clear. > > > This is why I don't like having this behavior on SmallVector: this is a > different semantics / intent that the programmers have and this is > something that I like being able to read differently. > > SVec does not accept a `N`: if I read SVec<Type> there can't be a > possible accidental omission of N here. It has to be a choice between > `SmallVector<SomeType, 4>` and `SVec<SomeType>` but not > `SmallVector<SomeType>`. > > To discuss this issue of whether accepting a default size versus > having an explicit size is a different semantic - as above, I think > it's worth considering/comparing this to other templates and their > default template arguments. Things like std::vector's customizable > allocators (you. don't have to use a different name for the container > when you specify a custom allocation scheme for std::vector - but we > don't wonder every time we see std::vector<T> whether the user meant > to customize the allocator - we accept the default is usualy intended > (as the default buffer size would be usually intended) unless it's > specified - if during review we think a custom allocator (or > non-default buffer size) might be merited, we could ask about it - we > probably would even if the user had written SVec<T> the same way we > ask about other data structures today "did you mean SmallVector<T, > 3>"?), unique_ptr's custom deleters, etc. >I don't necessarily connect with the custom allocator / custom delete analogy right now, they seem too different to me. I'd look instead at SmallDenseMap and DenseMap maybe, or `std::map` and `std::unordered_map` which have different class names and aren't just differentiated with a trait passed as template argument.> > >> > Finally the simple `llvm::Vector` case to replace `SmallVector<T, 0>` > is because it is frequently preferable to `std::vector` but still isn't > readable or immediately intuitive and so is rarely used in practice (see > https://www.llvm.org/docs/ProgrammersManual.html#llvm-adt-smallvector-h > for the documented points on N=0). > >> > >> I'm not sure llvm::Vector/Vec would necessarily make it more often > >> used in practice. Maybe a bunch of cleanup and more specific wording > >> in the style guide that would ban std::vector might help more there. > >> > >> Though I agree that it may be suitable to have a clear name for > >> SmallVector<T, 0> since the "Small" is slightly misleading (though not > >> entirely - it's important for the "SmallVectorImpl" benefits of > >> SmallVector - so renaming it to Vector and still using it via > >> SmallVectorImpl<T>& might also be confusing). > >> > >> >> Got any sense of the total value here? Major savings to be had? > >> >> > >> >> (if SmallVector<T> can do what your proposed SVec<T> does, that > leaves > >> >> the Vec<T> - could you expound on the benefits of SmallVector<T, 0> > >> >> over std::vector<T>? I guess the SmallVectorImpl generic algorithm > >> >> opportunities? Though that's rarely needed compared to ArrayRef.) > >> >> If SmallVector<T> would suffice then maybe Vec<T> could be > >> >> ZeroSmallVector<T>? Not sure. > >> > > >> > > >> > ZeroSmallVector<T> does not really address your "more vector names to > pick from" concerns, > >> > >> Somewhat - if SmallVector<T> can be used instead of SVec<T> then we > >> have one fewer name and less convention to base the Vec<T> on (since > >> it won't have the SVec<T> sibling), so picking a name closer to the > >> existing SmallVector might be more viable/helpful. > >> > >> > and it is longer than `SmallVector<T, 0>`: shouldn't we aim for the > "default case" to be the easiest to reach / most intuitive to pick? > `llvm::Vec` looks like "just a vector". > >> > >> Somewhat - but Vec is an abbreviation that isn't really used otherwise > >> (if we consider SVec as well, I'd still push back on the introduction > >> of the abbreviation for both cases) > > > > One aspect of the naming is to avoid one not being a prefix of the other > (IDE completion, etc.) or them being too close to differentiate. > > > >> and llvm::Vector would be only a > >> case separation away form a standard name (when considering the > >> unqualified name - which is likely to come up a fair bit, as we'll see > >> "Vector" used unqualified a lot). > > > > Note: this is why we converged with llvm::Vec and not llvm::Vector in > the revision > > Yeah - certainly a tricky set of tradeoffs (autocomplete, similarity > with standard names, etc). Perhaps a bit of a discussion of what other > libraries do around this (no doubt there are a bunch of C++ libraries > that have "here's the default vector you shuold use instead of > std::vector for these reasons" situations). >I had looked into this when the revision was sent, but I couldn't find a library with a "vector with inlined storage" and a default N value. Both Abseil and Folly have a SmallVector equivalent, Abseil does not have a default: https://github.com/abseil/abseil-cpp/blob/master/absl/container/inlined_vector.h#L69-L71 Folly has a default hard-coded to 1: https://github.com/facebook/folly/blob/master/folly/small_vector.h#L430 Folly has also another class, FBVector, which is really an alternative to std::vector with a different memory management strategy (growth factor, etc.). I don't think Abseil has anything else. Do you have other ideas of related work to look for?> > >> I wouldn't entirely object to SmallVector<T> getting a smart default > >> buffer size, and llvm::Vector being an alias for SmallVector<T, 0> - I > >> don't feel /super/ great about it, but yeah, if we're going to push > >> the Programmer's Manual encouragement further in terms of > >> reducing/removing use of std::vector, then maybe llvm::Vector isn't > >> the worst way to do that. > >> > >> (it might be that this would benefit from being two separate > >> discussions, though - one on how to provide smart defaults for > >> SmallVector<T> and one on if/how to push std::vector deprecation in > >> favor of SmallVector<T, 0> (semantically speaking - no matter how it's > >> spelled) further along) > > > > > > For historical purpose: the review was actually originally only adding a > default for SmallVector and nothing else :) > > We only converged to the current proposal after a few rounds of reviews > trying to tradeoff amongst folks in the review. > > Perhaps you could summarize some of those discussions/decisions in > more detail here? As the RFC stands, these are my comments - I know > it's always a tradeoff of which audiences are brought in at what > stages (though often folks send ADT/Support changes my way during > review - I should probably just set up a Herald rule for these > things). > > Sounds like maybe I am in agreement with the original version of the > review, then. >Likely: the review evolved because I opposed to changing the status quo in this direction I guess. Best, -- Mehdi> > - Dave > > > >> > > >> > Cheers, > >> > > >> > -- > >> > Mehdi > >> > > >> > > >> >> > >> >> > >> >> On Fri, Nov 13, 2020 at 2:06 PM Sean Silva via llvm-dev > >> >> <llvm-dev at lists.llvm.org> wrote: > >> >> > > >> >> > We've pretty happy now with a patch that adds two wrappers around > SmallVector that make it 1) more convenient to use and 2) will tend to > mitigate misuse of SmallVector. We think it's ready for wider discussion: > https://reviews.llvm.org/D90884 > >> >> > > >> >> > SVec<T> is a convenience alias for SmallVector<T, N> with N chosen > automatically to keep its size under 64 Bytes (that heuristic is easy to > change though). The recommendation in the patch is to use this "on the > stack, where a "small" number of elements are expected". > >> >> > > >> >> > Vec<T> is a convenience alias for SmallVector<T, 0>. It lets us > get the (little-known?) benefits of SmallVector even when it has no inline > elements (see > https://llvm.org/docs/ProgrammersManual.html#llvm-adt-smallvector-h). The > recommendation in the patch is to use this when the SmallVector is on the > heap. > >> >> > > >> >> > A lot of this is boiled out from the discussion in > https://groups.google.com/g/llvm-dev/c/q1OyHZy8KVc/m/1l_AasOLBAAJ?pli=1 > >> >> > > >> >> > The goals here are twofold: > >> >> > > >> >> > 1. convenience: not having to read/write "N", or do an extra > edit/recompile cycle if you forgot it > >> >> > > >> >> > 2. avoiding pathological cases: The choice of N is usually > semi-arbitrary in our experience, and if one isn't careful, can result in > sizeof(SmallVector) becoming huge, especially in the case of nested > SmallVectors. This patch avoids pathological cases in two ways: > >> >> > A. SVec<T>'s heuristic keeps sizeof(SVec<T>) bounded, which > prevents pathological size amplifications like in `SmallVector<struct > {SmallVector<T, 4> a, b; }, 4>`, where the small sizes effectively multiply > together. Of course, users can always write SmallVector<T, N> explicitly to > bypass this, but the need for that seems rare. > >> >> > B. SmallVector<T, 0> feels "weird to write" for most folks, even > though it is frequently the right choice. Vec<T> mitigates that by "looking > natural". > >> >> > > >> >> > I'm surfacing this as an RFC to get feedback on a couple > higher-level points: > >> >> > - does everybody agree that SVec<T> and Vec<T> are useful to have? > >> >> > - get wider consensus around suggesting these as "defaults" (see > my updates to ProgrammersManual.rst in the patch) > >> >> > - how much we want to bulk migrate code vs let it organically > grow. Replacing SmallVector<T, 0> with Vec<T> should be completely > mechanical. Replacing SmallVector<T, N> for general N would be a lot more > work. > >> >> > - of course: naming. SVector/Vector were floated in the patch as > well and seem ok. SmallVec was rejected as it was a prefix of SmallVector > (messes up autocomplete). > >> >> > > >> >> > Looking forward to a world with fewer guessed SmallVector sizes, > >> >> > > >> >> > -- Sean Silva > >> >> > _______________________________________________ > >> >> > LLVM Developers mailing list > >> >> > llvm-dev at lists.llvm.org > >> >> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20201116/119b711e/attachment.html>
Reasonably Related Threads
- RFC: [SmallVector] Adding SVec<T> and Vec<T> convenience wrappers.
- RFC: [SmallVector] Adding SVec<T> and Vec<T> convenience wrappers.
- RFC: [SmallVector] Adding SVec<T> and Vec<T> convenience wrappers.
- RFC: [SmallVector] Adding SVec<T> and Vec<T> convenience wrappers.
- RFC: [SmallVector] Adding SVec<T> and Vec<T> convenience wrappers.