thr3ads.net - llvm dev - [llvm-dev] [RFC] Adding a -memeq-lib-function flag to allow the user to specify a memeq function. [Jan 2019]


Clement Courbet via llvm-dev

2019-Jan-04 10:27 UTC

[llvm-dev] [RFC] Adding a -memeq-lib-function flag to allow the user to specify a memeq function.

Thanks for the suggestions Hal,

So if I understand correctly, you're recommending we add a module flag
<https://llvm.org/docs/LangRef.html#module-flags-metadata> to LLVM,
something like:

!llvm.module.flags = !{..., !123}
!123 = !{i32 1, !"memeq_lib_function", !"user_memeq"}

I've given it a try in the following patch: https://reviews.llvm.org/D56311
If this sounds reasonable I can start working on adding a CodeGenOptions to
clang to see what this entails.

I don't think the function attribute works here because we want this to be
globally enabled instead of per-function (but maybe I misunderstood what
you were suggesting).


On Thu, Jan 3, 2019 at 6:40 PM Finkel, Hal J. <hfinkel at anl.gov> wrote:
>
> On 1/3/19 3:29 AM, Clement Courbet via llvm-dev wrote:
>
> Hi all,
>
> We'd like to suggest *adding a -memeq-lib-function* flag to allow the
> user to specify a `*memeq()*` function to improve string equality check
> performance.
>
> Hi, Clement,
>
> We really shouldn't be adding backend flags for anything at this point
> (except for debugging and the like). A function attribute should be fine,
> or global metadata if necessary. A function attribute should play better
> with LTO, and so that's generally the recommended design point.
>
>
>
> Right now, when llvm encounters a *string equality check*, e.g. `if
> (memcmp(a, b, s) == 0)`, it tries to expand it to an equality comparison if
> `s` is a small compile-time constant, and falls back on calling `memcmp()`
> otherwise.
>
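
To make the small-constant case concrete, here is a minimal sketch of what an equality-only expansion could look like for `s == 8`. The helper name `equal8` is hypothetical and only for illustration; it is not what LLVM emits verbatim, just the same idea written in C++:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: equality-only comparison of exactly 8 bytes.
// Two 8-byte loads and one integer compare replace the memcmp() call.
inline bool equal8(const void *a, const void *b) {
  std::uint64_t x, y;
  std::memcpy(&x, a, sizeof x);  // memcpy keeps unaligned access well-defined
  std::memcpy(&y, b, sizeof y);
  return x == y;                 // no ordering information is computed
}
```

For any two 8-byte buffers, `equal8(a, b)` should agree with `memcmp(a, b, 8) == 0`.
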
> This is sub-optimal because memcmp has to compute much more than equality.
>
> We propose adding a way for the user to specify a `memeq` library function
> (e.g. `-memeq-lib-function=user_memeq`) which will be called instead of
> `memcmp()` when the result of the memcmp call is only used for equality
> comparison.
>
> `memeq` can be made much more efficient than `memcmp` because equality
> comparison is trivially parallel while lexicographic ordering has a chain
> dependency.
>
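
As a rough illustration of why equality is "trivially parallel", here is a sketch of a `user_memeq`. The signature (two pointers and a length, returning `bool`) is an assumption, since the RFC does not spell one out, and this is not the optimized implementation mentioned in the thread. Each chunk is checked independently and the differences are OR-ed together, so no step needs to know where the first mismatch is:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical memeq: true iff the first n bytes of a and b are identical.
bool user_memeq(const void *a, const void *b, std::size_t n) {
  const unsigned char *pa = static_cast<const unsigned char *>(a);
  const unsigned char *pb = static_cast<const unsigned char *>(b);
  std::uint64_t diff = 0;
  std::size_t i = 0;
  // 8-byte chunks: each XOR is independent of the others, and the OR
  // reduction can be evaluated in any order.
  for (; i + 8 <= n; i += 8) {
    std::uint64_t x, y;
    std::memcpy(&x, pa + i, 8);
    std::memcpy(&y, pb + i, 8);
    diff |= x ^ y;
  }
  // Remaining tail bytes.
  for (; i < n; ++i)
    diff |= static_cast<std::uint64_t>(pa[i] ^ pb[i]);
  return diff == 0;
}
```

By contrast, `memcmp()` must locate the first differing byte to decide the sign of its result. A production version would presumably also exit early on long inputs, which this sketch omits.
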
> We measured a very large improvement with this approach on our internal
> codebase. A significant portion of this improvement comes from the STL,
> typically `std::string::operator==()`.
>
> Note that this is a *backend-only change*. Because the C family of
> languages does not have a standard `memeq()` (POSIX used to have `bcmp()`,
> but it was marked legacy in 2001 and removed in POSIX.1-2008), C/C++ code
> cannot communicate the equality comparison semantics to the compiler.
>
> We did not add an RTLIB entry for memeq because the user environment is
> not guaranteed to contain a `memeq()` function as the libc has no such
> concept.
>
> If there is interest, we could also contribute our optimized `memeq` to
> compiler-rt.
>
>
> That would be useful.
>
> Thanks again,
>
> Hal
>
>
>
> A proof-of-concept patch for this RFC can be found here:
> https://reviews.llvm.org/D56248
>
> Comments & suggestions welcome!
> Thanks,
>
> Clement
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
> --
> Hal Finkel
> Lead, Compiler Technology and Programming Languages
> Leadership Computing Facility
> Argonne National Laboratory
>

Clement Courbet via llvm-dev

2019-Jan-04 12:46 UTC


> I don't think the function attribute works here because we want this to be
> globally enabled instead of per-function (but maybe I misunderstood what
> you were suggesting).

Hm, actually I think you might have been referring to a function attribute
at call site, same as `trap-func-name` in
`CodeGenModule::ConstructDefaultFnAttrList`:

  if (AttrOnCallSite) {
    // Attributes that should go on the call site only.
    if (!CodeGenOpts.SimplifyLibCalls ||
        CodeGenOpts.isNoBuiltinFunc(Name.data()))
      FuncAttrs.addAttribute(llvm::Attribute::NoBuiltin);
    if (!CodeGenOpts.TrapFuncName.empty())
      FuncAttrs.addAttribute("trap-func-name", CodeGenOpts.TrapFuncName);
  }

The nice thing is that `trap-func-name` paves the way as it's a pretty
similar use case. Here's a v3 diff with this approach for comparison:
https://reviews.llvm.org/D56313.




Finkel, Hal J. via llvm-dev

2019-Jan-04 16:45 UTC


On 1/4/19 6:46 AM, Clement Courbet wrote:
> I don't think the function attribute works here because we want this to
> be globally enabled instead of per-function (but maybe I misunderstood what
> you were suggesting).

Our general scheme for these kinds of things is to add a per-function attribute,
and then have the frontend add the attribute to every function in the module.
The rationale for this scheme comes from how it interacts with the
separate-compilation/LTO model. Imagine I have some flag affecting code
generation, say -fenable-memeq, I use that flag when compiling some source
files and not others, and then I use LTO. I want the option to apply only to
code in those functions which came from translation units compiled with the
-fenable-memeq flag.
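
A toy model of that scheme, as a sketch only: the types below are stand-ins and not LLVM classes, and the attribute name "memeq-func-name" is hypothetical. It shows that when each function carries its own attribute, merging modules for LTO leaves every function with the setting of the translation unit it came from:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy stand-ins for IR entities; these are NOT LLVM classes.
struct ToyFunction {
  std::string name;
  std::map<std::string, std::string> attrs;  // string attributes: key -> value
};
using ToyModule = std::vector<ToyFunction>;

// Frontend step: tag every function of one translation unit if the
// (hypothetical) -fenable-memeq flag was passed for that unit.
ToyModule compileUnit(const std::vector<std::string> &names, bool enableMemeq) {
  ToyModule m;
  for (const auto &n : names) {
    ToyFunction f{n, {}};
    if (enableMemeq)
      f.attrs["memeq-func-name"] = "user_memeq";  // hypothetical attribute
    m.push_back(f);
  }
  return m;
}

// LTO step: merging modules keeps each function's own attributes intact.
ToyModule linkModules(const ToyModule &a, const ToyModule &b) {
  ToyModule out = a;
  out.insert(out.end(), b.begin(), b.end());
  return out;
}

// Backend query: which callee should an equality-only memcmp use in f?
std::string equalityCallee(const ToyFunction &f) {
  auto it = f.attrs.find("memeq-func-name");
  return it == f.attrs.end() ? "memcmp" : it->second;
}
```

Linking `compileUnit({"f"}, true)` with `compileUnit({"g"}, false)` leaves `f` using `user_memeq` and `g` using plain `memcmp`, which a single module-wide setting could not express.
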

That having been said, I see several reasons to favor the per-call-site
attribute over the function-level attribute in this case:

 1. With function-level attributes, there's always the question of what to
do with inlining. Either you block inlining upon an attribute mismatch, or you
allow it and drop the conflicting attributes in some conservative sense. With
call-site attributes, this is not an issue.

 2. The attribute will apply rarely, and so while putting the attribute on all
functions will increase the bitcode size / memory footprint in the common case,
having it only on relevant call sites will not have that overhead.

 3. While we have one function now, there could be a large number, and encoding
these all in function-level attributes on every function will become unwieldy
(because it will magnify the issues above). Having the frontend attach
information per-call-site seems like a better approach.
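
Point 1 can be sketched with another toy model. Again the types are not LLVM classes and "memeq-func-name" is a hypothetical attribute name. Because the attribute lives on each call, inlining simply moves the calls and nothing has to be reconciled between caller and callee:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy stand-ins; these are NOT LLVM classes.
struct ToyCall {
  std::string callee;
  std::map<std::string, std::string> attrs;  // call-site attributes
};
struct ToyFunction {
  std::string name;
  std::vector<ToyCall> calls;
};

// Inlining in this model: the callee's calls are copied into the caller,
// and each call keeps the attributes it was created with.
void inlineInto(ToyFunction &caller, const ToyFunction &callee) {
  caller.calls.insert(caller.calls.end(), callee.calls.begin(),
                      callee.calls.end());
}

// Backend decision for an equality-only call: use the replacement named on
// the call site if there is one, otherwise keep the original callee.
std::string lowerEqualityCall(const ToyCall &call) {
  auto it = call.attrs.find("memeq-func-name");  // hypothetical attribute
  return it == call.attrs.end() ? call.callee : it->second;
}
```

After inlining a function whose `memcmp` call is tagged into one whose call is not, the two calls still lower differently, with no attribute mismatch to resolve at the function level.
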

To bikeshed, "trap-func-name", why "trap"?

 -Hal


