thr3ads.net - llvm dev - [llvm-dev] [RFC] Adding a -memeq-lib-function flag to allow the user to specify a memeq function. [Jan 2019]

David Jones via llvm-dev

2019-Jan-05 00:17 UTC

[llvm-dev] [RFC] Adding a -memeq-lib-function flag to allow the user to specify a memeq function.

If we are considering an optimization to convert calls to memcmp into bcmp,
then does it make sense to add an intrinsic for bcmp like there is for
memcmp?  That way IR writers can express their requirements precisely:
memcmp if you care about the direction of inequality, and bcmp if you do
not.
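
To illustrate the distinction, here is a minimal C sketch. The function names `key_less` and `key_equal` are hypothetical and not part of the RFC or of LLVM. The first caller observes the sign of the result and so needs memcmp semantics; the second only observes zero versus nonzero, which is all that bcmp guarantees.

```c
#include <stddef.h>
#include <string.h>

/* Ordering use: the sign of the result is observed, so the full
 * lexicographic semantics of memcmp are required. */
int key_less(const void *a, const void *b, size_t n) {
    return memcmp(a, b, n) < 0;
}

/* Equality-only use: only "zero or nonzero" is observed, so a
 * bcmp-style function would be sufficient here. */
int key_equal(const void *a, const void *b, size_t n) {
    return memcmp(a, b, n) == 0;
}
```

Only the second form is a candidate for the memcmp-to-bcmp rewrite discussed in this thread.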


On Fri, Jan 4, 2019 at 12:34 PM James Y Knight via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> This seems a somewhat odd and overcomplicated way to go about this.
>
> Given that bcmp was required in POSIX until relatively recently, I will
> guess that almost all platforms support it already. From a quick check,
> glibc, freebsd, netbsd, newlib, and musl all seem to contain it. So,
> couldn't we just add bcmp to the runtime function list for those platforms
> which support it? And, add an optimization to translate a call to memcmp
> into bcmp if it exists?
>
> Of course, it would then also be a good idea to go back to POSIX and
> present the performance numbers to make a case for why it was actually a
> quite valuable function and should be reinstated into the standard as well.
>
> On Thu, Jan 3, 2019 at 4:30 AM Clement Courbet via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> Hi all,
>>
>> We'd like to suggest *adding a -memeq-lib-function* flag to allow the
>> user to specify a `memeq()` function to improve string equality check
>> performance.
>>
>> Right now, when LLVM encounters a *string equality check*, e.g. `if
>> (memcmp(a, b, s) == 0)`, it tries to expand it to an equality comparison
>> if `s` is a small compile-time constant, and otherwise falls back to
>> calling `memcmp()`.
>>
>> This is sub-optimal because memcmp has to compute much more than equality.
>>
>> We propose adding a way for the user to specify a `memeq` library
>> function (e.g. `-memeq-lib-function=user_memeq`) which will be called
>> instead of `memcmp()` when the result of the memcmp call is only used for
>> equality comparison.
>>
>> `memeq` can be made much more efficient than `memcmp` because equality
>> comparison is trivially parallel while lexicographic ordering has a chain
>> dependency.
>>
>> We measured a very large improvement from this approach on our internal
>> codebase. A significant portion of this improvement comes from the STL,
>> typically `std::string::operator==()`.
>>
>> Note that this is a *backend-only change*. Because the C family of
>> languages does not have a standard `memeq()` (POSIX used to have `bcmp()`
>> but it was removed in 2001), C/C++ code cannot communicate the equality
>> comparison semantics to the compiler.
>>
>> We did not add an RTLIB entry for memeq because the user environment is
>> not guaranteed to contain a `memeq()` function as the libc has no such
>> concept.
>>
>> If there is interest, we could also contribute our optimized `memeq` to
>> compiler-rt.
>>
>> A proof of concept patch for this RFC can be found here:
>> https://reviews.llvm.org/D56248
>>
>> Comments & suggestions welcome !
>> Thanks,
>>
>> Clement
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>

Clement Courbet via llvm-dev

2019-Jan-07 10:50 UTC

Hi David & James,

On Sat, Jan 5, 2019 at 1:17 AM David Jones <david.jones at metrics.ca>
wrote:
> If we are considering an optimization to convert calls to memcmp into
> bcmp, then does it make sense to add an intrinsic for bcmp like there is
> for memcmp?  That way IR writers can express their requirements precisely:
> memcmp if you care about the direction of inequality, and bcmp if you do
> not.
>
As mentioned in my answer to Hal above, I think the backend at least should
be able to do this.

Then adding an intrinsic is only for convenience, because if the
backend can do the optimization automatically, it's enough for the frontend
to emit the attribute. For example, for a language that has a runtime which
has a memeq/bcmp, the frontend could just emit the call site annotation. I
have no strong opinion on whether the convenience justifies an intrinsic.

>
>
> On Fri, Jan 4, 2019 at 12:34 PM James Y Knight via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> This seems a somewhat odd and overcomplicated way to go about this.
>>
>> Given that bcmp was required in POSIX until relatively recently, I will
>> guess that almost all platforms support it already. From a quick check,
>> glibc, freebsd, netbsd, newlib, and musl all seem to contain it. So,
>> couldn't we just add bcmp to the runtime function list for those platforms
>> which support it? And, add an optimization to translate a call to memcmp
>> into bcmp if it exists?
>>
>

That would indeed be much simpler, but this seems brittle to me. The
approach you're suggesting works for us (Google) because we fully control
our environment, but I'm afraid it will not work for others.
For example, someone might distribute Linux binaries built with LLVM to
their users, but since nothing guarantees that bcmp is available in the
user's libc, those users might not be able to run the binaries on their
systems.
Is there a precedent for that approach?

>
>> Of course, it would then also be a good idea to go back to POSIX and
>> present the performance numbers to make a case for why it was actually a
>> quite valuable function and should be reinstated into the standard as well.
>>

Indeed :)


David Jones via llvm-dev

2019-Jan-07 12:43 UTC

On Mon, Jan 7, 2019 at 5:50 AM Clement Courbet <courbet at google.com>
wrote:
> Hi David & James,
>
> On Sat, Jan 5, 2019 at 1:17 AM David Jones <david.jones at metrics.ca> wrote:
>
>> If we are considering an optimization to convert calls to memcmp into
>> bcmp, then does it make sense to add an intrinsic for bcmp like there is
>> for memcmp?  That way IR writers can express their requirements precisely:
>> memcmp if you care about the direction of inequality, and bcmp if you do
>> not.
>>
>
> As mentioned in my answer to Hal above, I think the backend at least
> should be able to do this.
>
> Then adding an intrinsic is only for convenience, because if the
> backend can do the optimization automatically, it's enough for the frontend
> to emit the attribute. For example, for a language that has a runtime which
> has a memeq/bcmp, the frontend could just emit the call site annotation. I
> have no strong opinion on whether the convenience justifies an intrinsic.
>

However, there may be cases where the optimizer cannot determine that a
call to memcmp can be reduced to bcmp, e.g. when I pass the result of memcmp()
to some externally-defined function unknown to LLVM. I would like to be
able to specify bcmp explicitly in this case.
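
A minimal C sketch of that situation, under stated assumptions: all names here (`forward_compare`, `result_sink`, and the two example sinks) are hypothetical, and the function pointer stands in for an externally-defined function the optimizer cannot see into.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type standing in for "some externally-defined
 * function unknown to LLVM". */
typedef int (*result_sink)(int);

/* The memcmp result escapes through an opaque call. Looking at this
 * function alone, the compiler cannot tell whether the callee inspects
 * the sign, so it must keep the full memcmp semantics. */
int forward_compare(const void *a, const void *b, size_t n, result_sink sink) {
    return sink(memcmp(a, b, n));
}

/* Example callee that does depend on the direction of inequality. */
int sink_uses_sign(int r) {
    return r < 0 ? -1 : (r > 0 ? 1 : 0);
}

/* Example callee that only cares about equality. */
int sink_uses_zero_only(int r) {
    return r == 0;
}
```

When the author knows that every callee behaves like `sink_uses_zero_only`, an explicit bcmp (or a bcmp intrinsic at the IR level) would let them state that directly instead of relying on the optimizer to prove it.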

James Y Knight via llvm-dev

2019-Jan-07 21:25 UTC

On Mon, Jan 7, 2019 at 5:50 AM Clement Courbet <courbet at google.com>
wrote:
> On Fri, Jan 4, 2019 at 12:34 PM James Y Knight via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> This seems a somewhat odd and overcomplicated way to go about this.
>>>
>>> Given that bcmp was required in POSIX until relatively recently, I will
>>> guess that almost all platforms support it already. From a quick check,
>>> glibc, freebsd, netbsd, newlib, and musl all seem to contain it. So,
>>> couldn't we just add bcmp to the runtime function list for those platforms
>>> which support it? And, add an optimization to translate a call to memcmp
>>> into bcmp if it exists?
>>>
>>
> That would indeed be much simpler, but this seems brittle to me. The
> approach you're suggesting works for us (google) because we fully control
> our environment, but I'm afraid it will not work for others.
> For example, someone might distribute linux binaries built with LLVM to
> their users, but since there is nothing guaranteeing that bcmp is available
> on the user's libc, they won't be able to run it on their systems.
> Is there a precedent for that approach ?
>
>
There are many library functions that are only available on some platforms
and not others. LLVM can easily be told which do include it and which do
not, and emit a call only for those which do.

For Glibc Linux, we can be sure that bcmp is available on all systems -- it
is there now, it always has been, and always will be in the future (due to
backwards compatibility guarantees). It's just defined as an alias for
memcmp, however, so there's no advantage (nor, for that matter,
disadvantage) to using it.

But of course it's not _only_ glibc which has it -- it's present in almost
every environment out there.

e.g. for Linux there's also a few other libc implementations that people
use:
musl: includes it, calls memcmp.
uclibc-ng: includes it, alias of memcmp (only if you compile
with UCLIBC_SUSV3_LEGACY, but "buildroot", the way one generally uses
uclibc, does so).
dietlibc: includes it, alias of memcmp.
Android bionic: doesn't include it.

Some other platforms:
RTEMS (newlib): includes it, calls memcmp.
NetBSD: includes it, separate optimized implementation.
FreeBSD: includes it, separate optimized implementation.
Darwin/macOS/iOS/etc: includes it, alias of memcmp.
Windows: doesn't include it.
Solaris: includes it (haven't checked impl).

I do note, sadly, that currently out of all these implementations, only
NetBSD and FreeBSD seem to actually define a separate more optimized bcmp
function. That does mean that this optimization would be effectively a
no-op, for the vast majority of people.

For Google's purposes, if a call to bcmp is emitted by the compiler, you
can of course just override it with a more optimal version. But if you can
show a similar performance win in public code, it'd be great to attempt to
push a more optimized version upstream at least to glibc. Some more precise
numbers than "very large improvement" are probably necessary to show it's
actually worth it. :)
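
For readers wondering what an equality-only routine might look like, here is a minimal C sketch. It is not the optimized `memeq` mentioned in the RFC (which is not shown in this thread); the name `sketch_memeq` and its "return 1 when equal" convention are assumptions made for illustration. Note that bcmp uses the opposite convention and returns 0 when the buffers are equal.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical equality-only comparison: returns 1 if the two buffers
 * hold the same n bytes, 0 otherwise. It never needs to locate the first
 * differing byte or determine which side is greater. */
int sketch_memeq(const void *a, const void *b, size_t n) {
    const unsigned char *pa = a;
    const unsigned char *pb = b;
    uint64_t diff = 0;
    size_t i = 0;

    /* Compare 8 bytes at a time. Each iteration only ORs into an
     * accumulator, so the word comparisons do not depend on each other.
     * memcpy is used so that unaligned buffers are handled safely. */
    for (; i + 8 <= n; i += 8) {
        uint64_t wa, wb;
        memcpy(&wa, pa + i, sizeof wa);
        memcpy(&wb, pb + i, sizeof wb);
        diff |= wa ^ wb;
    }

    /* Handle the remaining tail bytes one at a time. */
    for (; i < n; ++i) {
        diff |= (uint64_t)(pa[i] ^ pb[i]);
    }

    return diff == 0;
}
```

This version deliberately has no early exit, to keep the sketch short; a production implementation would likely add one and use wider vector loads.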