On Jul 24, 2011, at 12:09 AM, David Blaikie wrote:>> Yes, exactly. I'm just saying that I think the additional clarity
of:
>> "foo" + Twine('x')
>>
>> is worth the inconvenience.
>
> Ok, attached a modified version of my patch with an Twine(char),
> Twine(unsigned char), and Twine(signed char). All three are explicit &
> have included test cases.
Ok, great.
> Speaking of which - is there any LLVM dev policy or preference around
> C-style casts versus C++-style casts? Twine is littered with the
Nope, use the one that is the most clear in a given case. Except in extreme
cases, the C++ style casts seem overly verbose and don't add a lot of value.
>>> As a side note on Twine's design: Is there a particular reason
it uses
>>> void*s rather than unions?
>>
>> I'm not sure what you're proposing specifically.
>
> I've attached an version of my fix that includes the union use I was
> referring to (sorry for being vague in the previous email). Though if
> we're considering a more explicitly structured solution with
> templates/subtypes/etc we could skip the whole mess of enums &
> unions/casts.
aha, that *is* much nicer! Applied with a few tweaks (remove some extraneous
'explicit' keywords and wrapped to 80 cols). Thanks!
>>> and chars rather than enums?
>>
>> char vs enum is because of visual studio compatibility and because
enums often are stored as 32-bit values instead of 8-bit values.
>
> Curious - seems VS >2k3 (
> http://msdn.microsoft.com/en-us/library/2dzy4k6e(v=VS.80).aspx ) has
> supported specifying enum backing type (so you can specify it as char:
> "enum foo : char { ... }") if that was the feature you were
wanting to
> use but avoiding.
Right, but that requires VS #ifdefs. You can also use enum bitfields, but VS
has different promotion rules from them than other compilers.
>> Yes, from a general API design perspective, I hate having to force a
choice between "convenience to implementor of an API to just take a
StringRef" vs "convenience to client of API for it to take a
Twine". It really stinks.
>
> Ok - glad we're on the same page there, wasn't quite sure whether
we
> were talking past one another in previous emails.
>
> I have a couple of ideas about how to lower the developer cost of
> using Twine ubiquitously. I'll try to describe them in some detail.
Ok!
>> I was chatting with Howard Hinnant about this and he suggested
replacing Twine with a template metaprogramming expression-template system. I
haven't thought through all the details, but perhaps this would allow us to
get the best of both worlds?
>
> I think I'd prefer a template metaprogramming solution to Twine just
> from a purist perspective compared to all the
> casting/reinterpreting/enums (even with the union) - that sort of
> stuff makes me a bit uncomfortable, but I realize it's well contained
> enough that it's not too risky (& I know some of the other tricks
LLVM
> uses like low bit twiddling in pointers, etc, so this seems relatively
> sane in the grand scheme of things).
Right. As long as the damage is well contained and isolated in the class, I
don't care much about stuff like this. One of the great features of C++ is
that it lets you do whatever insane thing you want and draw a nice interface
around it :)
> 1) the easy solution: create a StringRef subclass (or new type with a
> similarly expressive API) that wraps a buffer that it may or may not
> use. Give it a ctor that takes a Twine. The use case then goes from
> this:
>
> SmallVectorImpl<char> v;
> StringRef s = theTwine.GetStringRef(v);
>
> to this:
>
> TwineString s(theTwine);
>
> (& the ctor looks like: TwineString(const Twine& t) { *this >
t.GetStringRef(this->buffer); })
> So any Twine sink now has to pay one line of code to get a usable
> string out of a Twine. Cheap to implement, easy to fix existing use
> cases (privatize GetStringRef(SmallVectorImpl<char>) and make
> TwineString a friend - fix everything that fails to compile).
>
> This keeps things simple & seems to be "good enough" to me,
but we
> could perhaps do better (at the very least, again, if we did do
> better, we could go back & remove TwineString & again fix all the
> places that fail to compile with whatever new hotness we invent)
This definitely seems like a step forward. While it is kinda gross, a subclass
of StringRef is probably the lowest friction path to do this.
> Hmm - new thought: what would happen if my argument type was simply
> TwineString? If I knew my function needed access to the values (like
> the Host.cpp example) but didn't have any buffer of its own that it
> needed to use, that seems OK. This new thought perhaps renders
> suggestion (2) uninteresting. Functions that know they always need the
> string contents take a TwineString. Functions that might be able to
> skip realizing the string entirely - or have their own buffer (eg: a
> member std::string) to populate - take a Twine & explicitly construct
> a TwineString only in the path that needs the value. That seems
> optimal to me.
I'm dubious of this. This is exposing an implementation detail to clients
that they shouldn't have to know, and if passed by value, this could be
expensive.
> 1.1) We probably want a better way to get a Twine into a std::string -
> while the change to Triple's ctor works because returning a
> std::string from a function into a ctor can do fancy in-place
> construction - in any other case like "SetFoo(const Twine& t) {
> this->s = t; }" we'd want something better. Perhaps simply
> t.AssignTo(s);
Adding a Twine::assignTo method would make sense to me!
> 2) This one isn't quite as fleshed out, but see if the general outline
> makes sense: Introduce an extra type that's Twine-esque, but is only
> the top level (so it's implicitly constructible from a const
Twine&),
> rather than all the branches of a Twine. If a function wants to accept
> a Twine that it needs to manipulate, it can mark its argument as
> TwineString2 (for want of a better name, just using 2 to distinguish
> from the (1) example). TwineString2 has a buffer in it, but it starts
> uninitialized. Any attempt to retrieve/inspect values causes the
> string to be initialized. [downside: all string operations on a
> TwineString2 incur an "if not initialized" check]
>
> Twines could be constructed from TwineString2s if the value needed to
> be passed on to another function. (& still take advantage of any
> initialized string buffer it might have, rather than having to
> remanifest from its child nodes repeatedly)
>
> One fancy enhancement to this is that TwineString2 could partially
> manifest - eg: if you want ts[2] it could walk its children (dropping
> its references to children & updating the buffer with the contents at
> each step) until it found the 2nd character - then stop, with a
> partially expanded tree/buffer. Though there's probably nothing
> stopping us implementing such smarts in the (1) solution eventually
> either.
I think you're flying way over my head here :)
-Chris