Zachary Turner via llvm-dev
2016-Oct-12 19:35 UTC
[llvm-dev] RFC: General purpose type-safe formatting library
You get compile-time checking automatically once we can use C++14, though. If you use it with a string literal, you'll get compile-time checking; otherwise you won't.

Here's a different example, though. Suppose you're writing a tool which prints formatted output, and the field width is specified by the user. Now you NEED to build the format string at runtime; there's no other way. Off the top of my head, lldb does this already when printing disassembly and stack frames. The column widths are user settings.

On Wed, Oct 12, 2016 at 12:23 PM Mehdi Amini <mehdi.amini at apple.com> wrote:

> On Oct 12, 2016, at 12:08 PM, Zachary Turner <zturner at google.com> wrote:
>
> I thought I did. :) Passing format strings between functions is very useful. For example, imagine wanting to write a function like printRange(const char *Fmt, std::vector<int> Items);
>
> I'm not sure I understand your example?
> Do you mean you want the range to be in the format? If so, why? I would rather write something like:
>
> printRange("{per_elts_fmt}", /* separator */ ", ", begin, end);
>
> This isn't possible if your format string MUST be a string literal
>
> I haven't seen a convincing example yet to support this. I may be missing the obvious, but you haven't shown it either.
> One could find a way to *compose* formats in a compile-time-safe way, more efficiently.
>
> Equally importantly, I don't see a good reason to disallow runtime format strings.
>
> No compile time checking, bug hiding, not robust.
> (i.e. you may not "crash", but you may still not print what you want / expect in every case).
>
> --
> Mehdi
>
> On Wed, Oct 12, 2016 at 11:59 AM Mehdi Amini <mehdi.amini at apple.com> wrote:
>
> On Oct 12, 2016, at 11:38 AM, Zachary Turner <zturner at google.com> wrote:
>
> I don't object to compile time checking *as long as it doesn't severely detract from brevity*.
>
> At the same time, I do object to *preventing* runtime format strings.
>
> You haven't answered: why?
>
> --
> Mehdi
>
> When we have C++14, we can make every member of StringRef constexpr, and at that point we will get compile time checking mostly "for free" without preventing runtime format strings. For example, given a constexpr-aware implementation of StringRef, if you were to write: os.format("literal format", a, b, c) you would get all the compile time checking, such as ensuring that the number of arguments matches the highest index in the format string, and ensuring there are enough arguments for every placeholder. But if you wrote os.format(s, a, b, c) you would still get runtime checking of the format strings.
>
> As long as the runtime implementation doesn't exhibit UB when things don't match up, and it kindly asserts to warn you of the problem in the test suite, supporting runtime format strings can be very helpful. For example, it could allow you to wrap a call to format in some other function, like:
>
> template<typename... Ts>
> void wrap_format(const char *Format, Ts &&... Args) {
>   dbgs().format(Format, ConvertArg(Args)...);
> }
>
> On Wed, Oct 12, 2016 at 11:24 AM Mehdi Amini <mehdi.amini at apple.com> wrote:
>
> On Oct 12, 2016, at 7:12 AM, Zachary Turner via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Ahh, UDLs also wouldn't permit non literal format strings, which is a deal breaker imo
>
> Why?
> Somehow the goal pursued by Pavel (which you didn't object to per se) is to provide *compile* time checking.
> This implies that you cannot decouple the construction of the format and the argument list.
>
> --
> Mehdi
>
> On Wed, Oct 12, 2016 at 7:03 AM Zachary Turner <zturner at google.com> wrote:
>
> I'm not sure that would work well. The implementation relies on being able to index into the parameter pack. How would you do that if each parameter is streamed in?
>
> "{0} {1}"_fs(1, 2)
>
> Could perhaps work, but it looks a little strange to me.
>
> FWIW I agree format_string is long. Ideally it would be called format, but that's taken.
>
> Another option is os.format("{0}", 7), and have format_string("{0}", 7) return a std::string.
>
> On Wed, Oct 12, 2016 at 6:43 AM Aaron Ballman <aaron at aaronballman.com> wrote:
>
> >> 1. os << format_string("Test"); // writes "Test"
> >> 2. os << format_string("{0}", 7); // writes "7"
> >
> > The "<< format_string(..." is ... really verbose for me. It also makes me strongly feel like this produces a string rather than a streamable entity.
>
> I wonder if we could use UDLs instead?
>
> os << "Test" << "{0}"_fs << 7;
>
> ~Aaron
>
> > I'm not a huge fan of streaming, but if we want to go this route, I'd very much like to keep the syntax short and sweet. "format" is pretty great for that. If this is going to fully subsume its use cases, can we eventually get that to be the name?
> >
> > (While I don't like streaming, I'm not trying to fight that battle here...)
> >
> > Also, you should probably look at what is quickly becoming a popular C++ library in this space: https://github.com/fmtlib/fmt
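To make the runtime case concrete, here is a minimal sketch of a formatter that accepts a format string only known at runtime and checks placeholder indices with assertions, in the spirit of "doesn't exhibit UB, kindly asserts". The names `formatRuntime` and `wrapFormat` are hypothetical and are not the RFC's API; only plain `{N}` placeholders are handled.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper, not the RFC's implementation: replaces each "{N}"
// placeholder in a runtime format string with the N-th argument.
template <typename... Ts>
std::string formatRuntime(const std::string &Fmt, const Ts &... Args) {
  // Type-erase every argument into a printer so it can be indexed at runtime.
  std::vector<std::function<void(std::ostream &)>> Printers = {
      [&Args](std::ostream &OS) { OS << Args; }...};

  std::ostringstream Out;
  for (std::size_t I = 0; I < Fmt.size(); ++I) {
    if (Fmt[I] != '{') {
      Out << Fmt[I];
      continue;
    }
    std::size_t Close = Fmt.find('}', I);
    assert(Close != std::string::npos && "unterminated placeholder");
    std::size_t Index = std::stoul(Fmt.substr(I + 1, Close - I - 1));
    // Runtime check standing in for the compile-time one.
    assert(Index < Printers.size() && "placeholder index out of range");
    Printers[Index](Out);
    I = Close; // Skip past the closing brace.
  }
  return Out.str();
}

// Passing the format string through an intermediate function, which is the
// use case that requires non-literal format strings.
template <typename... Ts>
std::string wrapFormat(const char *Format, const Ts &... Args) {
  return std::string("[debug] ") + formatRuntime(Format, Args...);
}
```

A caller could then choose the format at runtime, for example `formatRuntime(AddressFirst ? "{0}: {1}" : "{1} @ {0}", Addr, Name)`, which a literal-only interface could not express directly.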
Mehdi Amini via llvm-dev
2016-Oct-12 19:40 UTC
[llvm-dev] RFC: General purpose type-safe formatting library
> On Oct 12, 2016, at 12:35 PM, Zachary Turner <zturner at google.com> wrote:
>
> You get compile time checking automatically when we can use c++14 though. If you use it with a string literal, you'll get compile time checking, otherwise you won't.

I understand that, but that doesn't really address my concerns.

> Here's a different example though. Suppose you're writing a tool which prints formatted output, and the field width is specified by the user.
> Now you NEED to build the format string at runtime, there's no other way.

Maybe the problem is using a string to format this in the first place.

For example, you could wrap the object you want to print with an adaptor in charge of padding to the right till you reach the column width.

    format("{0}", rPad(col_width, my_object));

> Off the top of my head, lldb does this already when printing disassembly and stack frames. The column widths are user settings
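The adaptor idea can be sketched as follows. This is an illustration only: `rPad` is the name used in the message, but the `RightPadded` type and the use of `std::ostream` (rather than LLVM's `raw_ostream`) are assumptions made to keep the example self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical adaptor: remembers a column width and refers to the object.
template <typename T> struct RightPadded {
  std::size_t Width;
  const T &Value;
};

// Factory so the object type is deduced at the call site.
template <typename T> RightPadded<T> rPad(std::size_t Width, const T &Value) {
  return RightPadded<T>{Width, Value};
}

// Prints the object, then pads with spaces on the right up to Width.
template <typename T>
std::ostream &operator<<(std::ostream &OS, const RightPadded<T> &P) {
  std::ostringstream Tmp;
  Tmp << P.Value; // Render first so the printed length is known.
  std::string Text = Tmp.str();
  OS << Text;
  if (Text.size() < P.Width)
    OS << std::string(P.Width - Text.size(), ' ');
  return OS;
}
```

With this, the width stays an ordinary runtime value (`OS << rPad(ColWidth, Obj)`) while the format string itself can remain a literal. Note that the sketch renders into a temporary buffer before writing, so it pays for an extra copy.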
Zachary Turner via llvm-dev
2016-Oct-12 19:46 UTC
[llvm-dev] RFC: General purpose type-safe formatting library
That's less efficient, more verbose, involves extra copies, and doesn't allow you to take full advantage of the library's mechanism for formatting user-defined types using different presentation styles.

Just to be clear, no other format libraries that exist today mandate string literal format strings. And it would be an understatement to say that I would be strongly opposed to such a requirement.

I would be fine providing UDLs for the case where you have a string literal format string and encouraging people to use it wherever possible, but I don't consider providing *only* UDL-based formatting (or any mechanism that mandates string literals) a viable option.

On Wed, Oct 12, 2016 at 12:40 PM Mehdi Amini <mehdi.amini at apple.com> wrote:

> Maybe the problem is using a string to format this in the first place.
>
> For example, you could wrap the object you want to print with an adaptor in charge of padding to the right till you reach the column width.
>
> format("{0}", rPad(col_width, my_object));
Tim Shen via llvm-dev
2016-Oct-12 20:01 UTC
[llvm-dev] RFC: General purpose type-safe formatting library
On Wed, Oct 12, 2016 at 12:35 PM Zachary Turner via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> You get compile time checking automatically when we can use c++14 though. If you use it with a string literal, you'll get compile time checking, otherwise you won't.

Even with C++14, os.format("literal format", a, b, c) cannot do the compile-time checking (I may be wrong in my understanding of C++14 constexpr). You probably need to add an overloaded version like `os.format(static_format("literal format"), a, b, c)`, or `os.format("literal format"_fmt, a, b, c)` to hold the compile-time checked version.

But anyway, the current interface os.format(const char*, ...) is forward-compatible.
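The limitation described here can be illustrated with a small C++14 sketch. The function names `requiredArgs` and `argsMatch` are hypothetical, and only single-digit `{N}` placeholders are recognized. The point is that the same constexpr function can be checked with `static_assert` when the literal is visible in a constant expression, but not from inside a function that receives the format as an ordinary parameter.

```cpp
#include <cassert>
#include <cstddef>

// Returns how many arguments a "{N}" format string needs: the highest
// placeholder index plus one, or 0 when there are no placeholders.
// Single-digit indices only, to keep the sketch short.
constexpr std::size_t requiredArgs(const char *Fmt) {
  std::size_t Required = 0;
  for (std::size_t I = 0; Fmt[I] != '\0'; ++I) {
    if (Fmt[I] == '{' && Fmt[I + 1] >= '0' && Fmt[I + 1] <= '9' &&
        Fmt[I + 2] == '}') {
      std::size_t Needed = static_cast<std::size_t>(Fmt[I + 1] - '0') + 1;
      if (Needed > Required)
        Required = Needed;
    }
  }
  return Required;
}

// Here the literal appears directly in a constant expression, so the scan
// runs at compile time.
static_assert(requiredArgs("{0} {1}") == 2, "expects two arguments");

template <typename... Ts> bool argsMatch(const char *Fmt, const Ts &...) {
  // Fmt is a function parameter, which is never a constant expression, so it
  // cannot be used in a static_assert here; only a runtime check is possible.
  return requiredArgs(Fmt) <= sizeof...(Ts);
}
```

This is why a wrapper type or a UDL, which can carry the literal into a context where it is a constant expression, comes up as the way to get the compile-time variant.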
Zachary Turner via llvm-dev
2016-Oct-12 20:07 UTC
[llvm-dev] RFC: General purpose type-safe formatting library
Couldn't you define a class FormatString like this:

    class FormatString {
    public:
      // Intended for string literals: tokenize at compile time.
      template<int N> constexpr FormatString(const char (&S)[N]) { tokenize(); }
      // Intended for non-literal strings: tokenize at runtime.
      FormatString(const char *s) {}
    };

Then define the format function as

    format(const FormatString &S, Ts &&...Args)

The implicit conversion from a string literal would go to the constexpr constructor, which could tokenize the string at compile time, while the implicit conversion from a non-literal would be tokenized at runtime.

If that doesn't work, then like you said, you could use a UDL to force the checking at compile time.
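A variant of this idea can be sketched as below. Two caveats are my understanding and were not established in the thread: for a literal argument, a non-template `const char *` constructor may be preferred by overload resolution over the array-reference template, and a constexpr constructor is only evaluated at compile time when the object is initialized in a constant-expression context. The sketch therefore uses a hypothetical named factory `fromRuntime` for the runtime path and a `constexpr` variable to force the compile-time path. Counting `{` characters stands in for real tokenization.

```cpp
#include <cassert>
#include <cstddef>

class FormatString {
public:
  // Literal path: usable in constant expressions, where the scan of the
  // string happens at compile time.
  template <std::size_t N>
  constexpr FormatString(const char (&S)[N])
      : Str(S), Placeholders(count(S)) {}

  // Runtime path: a hypothetical named factory instead of a second
  // constructor, so the two paths never compete in overload resolution.
  static FormatString fromRuntime(const char *S) {
    return FormatString(S, count(S));
  }

  constexpr std::size_t placeholders() const { return Placeholders; }
  constexpr const char *c_str() const { return Str; }

private:
  constexpr FormatString(const char *S, std::size_t Count)
      : Str(S), Placeholders(Count) {}

  // Counts '{' characters; a stand-in for a real tokenizer.
  static constexpr std::size_t count(const char *S) {
    std::size_t Count = 0;
    for (; *S != '\0'; ++S)
      if (*S == '{')
        ++Count;
    return Count;
  }

  const char *Str;
  std::size_t Placeholders;
};

// Declaring the object constexpr forces the literal to be scanned at
// compile time, so the result can feed a static_assert.
constexpr FormatString Literal("{0} and {1}");
static_assert(Literal.placeholders() == 2, "two placeholders expected");
```

Whether an interface like `format(const FormatString &, Ts &&...)` can trigger the compile-time path implicitly from a plain literal argument is exactly the open question in this exchange; the sketch only shows the pieces that work when the constant-expression context is explicit.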
Zachary Turner via llvm-dev
2016-Oct-13 03:07 UTC
[llvm-dev] RFC: General purpose type-safe formatting library
On Wed, Oct 12, 2016 at 12:40 PM Mehdi Amini <mehdi.amini at apple.com> wrote:

> Maybe the problem is using a string to format this in the first place.
>
> For example, you could wrap the object you want to print with an adaptor in charge of padding to the right till you reach the column width.
>
> format("{0}", rPad(col_width, my_object));

FWIW I do think that literal format strings will handle 90% or more of uses. I just don't see the benefit of needlessly banning the other cases, because all that's going to happen is that someone will resort to using snprintf etc., which is exactly the problem I'm trying to solve. It's literally no extra effort to support runtime format strings, and it makes the library more flexible as a result.

I'm willing to start with UDLs only because I think it will get us quite far, but as soon as I need to pass a format string through an intermediate function or something like that, I will probably check in the 3 extra lines of code to add a const char* overload of the format function.

FWIW, there's no easy way to add compile-time checking of format strings until C++14, regardless of whether we use UDLs or not. So that doesn't change either way.
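For illustration, the "UDL first, `const char*` overload later" plan might look roughly like this. The `Formatter` class and the `formatStr` function are hypothetical names, the `_fs` suffix is the one floated earlier in the thread, and only single-digit `{N}` placeholders are handled.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical formatter object holding the format string.
class Formatter {
public:
  explicit Formatter(std::string F) : Fmt(std::move(F)) {}

  // Substitutes each single-digit "{N}" with the N-th argument.
  template <typename... Ts> std::string operator()(const Ts &... Args) const {
    // Render every argument up front; the trailing empty string keeps the
    // array non-empty when no arguments are passed.
    std::string Rendered[] = {toString(Args)..., std::string()};
    std::string Out;
    for (std::size_t I = 0; I < Fmt.size(); ++I) {
      bool IsPlaceholder = Fmt[I] == '{' && I + 2 < Fmt.size() &&
                           Fmt[I + 1] >= '0' && Fmt[I + 1] <= '9' &&
                           Fmt[I + 2] == '}';
      if (!IsPlaceholder) {
        Out += Fmt[I];
        continue;
      }
      std::size_t Index = static_cast<std::size_t>(Fmt[I + 1] - '0');
      assert(Index < sizeof...(Ts) && "placeholder index out of range");
      Out += Rendered[Index];
      I += 2; // Skip the digit and the closing brace.
    }
    return Out;
  }

private:
  template <typename T> static std::string toString(const T &V) {
    std::ostringstream OS;
    OS << V;
    return OS.str();
  }

  std::string Fmt;
};

// UDL entry point for string literals: "{0} {1}"_fs(1, 2).
Formatter operator""_fs(const char *Str, std::size_t Len) {
  return Formatter(std::string(Str, Len));
}

// The small extra overload for format strings only known at runtime.
template <typename... Ts>
std::string formatStr(const char *Fmt, const Ts &... Args) {
  return Formatter(Fmt)(Args...);
}
```

Both entry points share one implementation, which matches the claim that supporting runtime strings costs only a few extra lines once the UDL path exists.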