thr3ads.net - llvm dev - [llvm-dev] RFC: General purpose type-safe formatting library [Oct 2016]

Zachary Turner via llvm-dev

2016-Oct-12 17:28 UTC

[llvm-dev] RFC: General purpose type-safe formatting library

On Wed, Oct 12, 2016 at 10:13 AM James Y Knight <jyknight at google.com>
wrote:
>
>
> I wonder what use cases you envision for this? Why does LLVM need a super
> extensible flexible formatting library? I mean -- if you were developing
> this as a standalone project, that seems like maybe a nice feature. But I
> see no rationale as to why LLVM should include it.
>

We were discussing this on IRC the other night, but I believe many
people underestimate the need for string formatting.  Here are some data
points:

1. There are currently 1,637 calls to llvm::format() across the codebase,
and this doesn't include calls to format_hex(), format_decimal(), and the
other variants.
2. LLVM consists of a large number (20+ at a minimum) of focused tools
(llc, lli, llvm-dwarfdump, llvm-objdump, etc) whose sole purpose is to
output formatted text.  Consider the use case of printing a verbose
disassembly listing which is fed into FileCheck.
3. Even the "flagship" tools such as clang have need for string formatting
when writing diagnostic messages.
4. LLDB in particular has this kind of thing *everywhere*.  I'm talking
about anywhere from 3-50+ times *per function* (and that's not an
exaggeration) for logging purposes.

That said, LLVM already includes a formatting library: llvm::format().  So
what would be the rationale *against* a better, safer, and easier version
of the same thing?

>
> That is to say: wouldn't a much-simpler printf replacement, implemented
> with variadic templates instead of C varargs (and which therefore doesn't
> require size/signedness prefixes on %d) be sufficient for LLVM?
>
> You can do that as a drop-in improvement for llvm::format, replacing the
> call to snprintf inside the implementation with a new implementation that
> actually uses the type information.
>

How would you format user-defined types using this?  I gave an example
earlier:  Consider you have a start time and an end time in std::chrono
types, and you want to print the start, end, and duration.  The code to do
this using llvm::format() or stream operators is horrible.
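
To make the complaint concrete, here is a minimal sketch of what the
stream-operator version might look like, assuming the time points come from
std::chrono::system_clock and are shown as milliseconds since the clock's
epoch rather than as calendar dates. describeInterval is a hypothetical
helper name used only for this illustration.

```cpp
#include <chrono>
#include <sstream>
#include <string>

// Hypothetical helper: renders a start/end pair and the elapsed time using
// only stream operators. Time points are printed as milliseconds since the
// clock's epoch to keep the sketch short and portable.
std::string describeInterval(std::chrono::system_clock::time_point Start,
                             std::chrono::system_clock::time_point End) {
  using std::chrono::duration_cast;
  using std::chrono::milliseconds;

  std::ostringstream OS;
  OS << "Start: "
     << duration_cast<milliseconds>(Start.time_since_epoch()).count()
     << ", End: "
     << duration_cast<milliseconds>(End.time_since_epoch()).count()
     << ", Duration: " << duration_cast<milliseconds>(End - Start).count()
     << " milliseconds";
  return OS.str();
}
```

Every value needs its own explicit duration_cast and count() call at the
call site, which is the kind of repetition a format string with per-type
style options would absorb.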

James Y Knight via llvm-dev

2016-Oct-12 18:38 UTC

[llvm-dev] RFC: General purpose type-safe formatting library

On Wed, Oct 12, 2016 at 1:28 PM, Zachary Turner <zturner at google.com>
wrote:
>
> On Wed, Oct 12, 2016 at 10:13 AM James Y Knight <jyknight at google.com>
> wrote:
>
>>
>>
>> I wonder what use cases you envision for this? Why does LLVM need a super
>> extensible flexible formatting library? I mean -- if you were developing
>> this as a standalone project, that seems like maybe a nice feature. But I
>> see no rationale as to why LLVM should include it.
>>
> We were discussing this on IRC chat the other night, but I believe many
> people underestimate the need for string formatting.  Here are some data
> points:
>
> 1. There are currently 1,637 calls to llvm::format() across the codebase,
> and this doesn't include calls to format_hex(), format_decimal(), and the
> other variants.
> 2. LLVM consists of a large number (20+ at a minimum) of focused tools
> (llc, lli, llvm-dwarfdump, llvm-objdump, etc) whose sole purpose is to
> output formatted text.  Consider the use case of printing a verbose
> disassembly listing which is fed into FileCheck.
> 3. Even the "flagship" tools such as clang have need for string formatting
> when writing diagnostic messages.
> 4. LLDB in particular has this kind of thing *everywhere*.  I'm talking
> about anywhere from 3-50+ times *per function* (and that's not an
> exaggeration) for logging purposes.
>
> That said, LLVM already includes a formatting library.  llvm::format().
> So what would be the rationale *against* a better, safer, and easier
> version of the same thing?
>
The arguments against, for me, are roughly:
1. It introduces a new formatting language that people need to learn.
2. People will still continue using printf-style formatting strings, too,
because everyone **always** does, whenever anyone's ever introduced another
formatting language anywhere.
3. The extensible formatting support is a) not obviously necessary, and b)
will be more difficult to understand for readers, versus calling a function
with normal function arguments.

>
>> That is to say: wouldn't a much-simpler printf replacement, implemented
>> with variadic templates instead of C varargs (and which therefore doesn't
>> require size/signedness prefixes on %d) be sufficient for LLVM?
>>
>> You can do that as a drop-in improvement for llvm::format, replacing the
>> call to snprintf inside the implementation with a new implementation that
>> actually uses the type information.
>>
> How would you format user-defined types using this?  I gave an example
> earlier:  Consider you have a start time and an end time in std::chrono
> types, and you want to print the start, end, and duration.  The code to do
> this using llvm::format() or stream operators is horrible.
>
I'd call a function that returns a string, and print the string.
E.g.:
format("Started at %s, ended at %s",
  format_date("%d/%m/%Y %T", start_time),
  format_date("%d/%m/%Y %T", end_time));
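
The simpler alternative described above (a printf-style call built on
variadic templates, so that the argument's static type rather than a
size/signedness prefix drives the output) could look roughly like the
sketch below. It is an illustration under stated assumptions, not
llvm::format's actual implementation: typedFormat and formatInto are
hypothetical names, width and precision flags are not handled, and every
argument is simply printed with operator<<.

```cpp
#include <sstream>
#include <string>

// Base case: no arguments left, so copy the rest of the format string,
// collapsing "%%" into a literal '%'.
inline void formatInto(std::ostringstream &OS, const char *Fmt) {
  for (; *Fmt; ++Fmt) {
    if (Fmt[0] == '%' && Fmt[1] == '%')
      ++Fmt;
    OS << *Fmt;
  }
}

// Recursive case: copy literal text up to the next conversion, print the
// first argument according to its static type, then recurse on the rest.
template <typename T, typename... Ts>
void formatInto(std::ostringstream &OS, const char *Fmt, const T &Arg,
                const Ts &...Rest) {
  for (; *Fmt; ++Fmt) {
    if (*Fmt != '%') {
      OS << *Fmt;
      continue;
    }
    if (Fmt[1] == '%') { // escaped percent sign
      OS << '%';
      ++Fmt;
      continue;
    }
    // The conversion letter (%d, %s, ...) only marks "next argument here";
    // the argument's type, not the letter, selects how it is printed.
    OS << Arg;
    if (Fmt[1]) // skip the conversion letter if there is one
      ++Fmt;
    formatInto(OS, Fmt + 1, Rest...);
    return;
  }
}

// Hypothetical entry point: returns the formatted text as a std::string.
template <typename... Ts>
std::string typedFormat(const char *Fmt, const Ts &...Args) {
  std::ostringstream OS;
  formatInto(OS, Fmt, Args...);
  return OS.str();
}
```

With this, typedFormat("%d items, %d bytes", 3, SomeUint64) needs no
PRIu64-style prefix, because overload resolution on operator<< picks the
right integer printer.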

Zachary Turner via llvm-dev

2016-Oct-12 18:48 UTC

[llvm-dev] RFC: General purpose type-safe formatting library

On Wed, Oct 12, 2016 at 11:38 AM James Y Knight <jyknight at google.com>
wrote:
> On Wed, Oct 12, 2016 at 1:28 PM, Zachary Turner <zturner at google.com>
> wrote:
>
>
> On Wed, Oct 12, 2016 at 10:13 AM James Y Knight <jyknight at google.com>
> wrote:
>
>
>
> I wonder what use cases you envision for this? Why does LLVM need a super
> extensible flexible formatting library? I mean -- if you were developing
> this as a standalone project, that seems like maybe a nice feature. But I
> see no rationale as to why LLVM should include it.
>
> We were discussing this on IRC chat the other night, but I believe many
> people underestimate the need for string formatting.  Here are some data
> points:
>
> 1. There are currently 1,637 calls to llvm::format() across the codebase,
> and this doesn't include calls to format_hex(), format_decimal(), and the
> other variants.
> 2. LLVM consists of a large number (20+ at a minimum) of focused tools
> (llc, lli, llvm-dwarfdump, llvm-objdump, etc) whose sole purpose is to
> output formatted text.  Consider the use case of printing a verbose
> disassembly listing which is fed into FileCheck.
> 3. Even the "flagship" tools such as clang have need for string formatting
> when writing diagnostic messages.
> 4. LLDB in particular has this kind of thing *everywhere*.  I'm talking
> about anywhere from 3-50+ times *per function* (and that's not an
> exaggeration) for logging purposes.
>
> That said, LLVM already includes a formatting library.  llvm::format().
> So what would be the rationale *against* a better, safer, and easier
> version of the same thing?
>
>
> The arguments against for me are roughly:
> 1. It introduces a new formatting language that people need to learn.
>

We learn new things every day.  Among the new things that people would need
to learn, I would rank this among the least difficult we can think of.  The
syntax is familiar to anyone who has ever used Python or C# (which is
probably most people here).

> 2. People will still continue using printf-style formatting strings, too,
> because everyone **always** does, whenever anyone's ever introduced another
> formatting language anywhere.
>

Not if the end-state is that we remove llvm::format().

> 3. The extensible formatting support is a) not obviously necessary, and b)
> will be more difficult to understand for readers, versus calling a function
> with normal function arguments.
>

I disagree.  I would be surprised if anyone thinks

os.format("Start: {0}, End: {1}, Duration: {2:ms} milliseconds", start,
          end, end-start);

is harder to understand than pretty much anything else you could possibly
write.
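
For readers who want to see how a call like that could be made extensible,
here is a minimal sketch of positional "{N}" and "{N:style}" replacement
with a per-type hook. Everything in it is an assumption made for
illustration: formatIndexed and formatValue are hypothetical names, not
the API proposed in the RFC, arguments are plain integers and
std::chrono durations, and a malformed index makes std::stoul throw.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Default hook: any streamable type ignores the style and uses operator<<.
template <typename T>
void formatValue(std::ostream &OS, const T &V, const std::string &) {
  OS << V;
}

// Extension point: durations understand the "ms" style.
template <typename Rep, typename Period>
void formatValue(std::ostream &OS, const std::chrono::duration<Rep, Period> &D,
                 const std::string &Style) {
  if (Style == "ms")
    OS << std::chrono::duration_cast<std::chrono::milliseconds>(D).count();
  else
    OS << D.count();
}

template <typename... Ts>
std::string formatIndexed(const std::string &Fmt, const Ts &...Args) {
  using Printer = std::function<void(std::ostream &, const std::string &)>;
  // Type-erase each argument behind a callback that knows its static type.
  std::vector<Printer> Printers = {
      [&](std::ostream &Out, const std::string &Style) {
        formatValue(Out, Args, Style);
      }...};

  std::ostringstream OS;
  for (std::size_t I = 0; I < Fmt.size(); ++I) {
    if (Fmt[I] != '{') {
      OS << Fmt[I];
      continue;
    }
    std::size_t Close = Fmt.find('}', I);
    if (Close == std::string::npos) { // unterminated field: emit verbatim
      OS << Fmt.substr(I);
      break;
    }
    std::string Field = Fmt.substr(I + 1, Close - I - 1); // e.g. "2:ms"
    std::size_t Colon = Field.find(':');
    std::string Style =
        Colon == std::string::npos ? std::string() : Field.substr(Colon + 1);
    std::size_t Index = std::stoul(Field.substr(0, Colon));
    if (Index < Printers.size()) // out-of-range indices print nothing
      Printers[Index](OS, Style);
    I = Close;
  }
  return OS.str();
}
```

Supporting a new type then means adding one formatValue overload; call
sites keep passing the objects themselves instead of pre-rendered strings.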

>
>
>
> That is to say: wouldn't a much-simpler printf replacement, implemented
> with variadic templates instead of C varargs (and which therefore doesn't
> require size/signedness prefixes on %d) be sufficient for LLVM?
>
>
> You can do that as a drop-in improvement for llvm::format, replacing the
> call to snprintf inside the implementation with a new implementation that
> actually uses the type information.
>
> How would you format user-defined types using this?  I gave an example
> earlier:  Consider you have a start time and an end time in std::chrono
> types, and you want to print the start, end, and duration.  The code to do
> this using llvm::format() or stream operators is horrible.
>
>
> I'd call a function that returns a string, and print the string.
> E.g.:
> format("Started at %s, ended at %s",
>   format_date("%d/%m/%Y %T", start_time),
>   format_date("%d/%m/%Y %T", end_time));
>

We take care to make our stream-based formatting as efficient as possible
since it is used so pervasively throughout LLVM.  There are quite a few
unnecessary copies in here, and more room for programmer error in doing the
formatting.