Clement Courbet via llvm-dev
2019-Jan-03 09:29 UTC
[llvm-dev] [RFC] Adding a -memeq-lib-function flag to allow the user to specify a memeq function.
Hi all,

We'd like to suggest *adding a -memeq-lib-function flag* to let the user specify a `memeq()` function that improves the performance of string equality checks.

Right now, when LLVM encounters a *string equality check*, e.g. `if (memcmp(a, b, s) == 0)`, it tries to expand it to an inline equality comparison if `s` is a small compile-time constant, and falls back on calling `memcmp()` otherwise.

This is sub-optimal because `memcmp()` has to compute much more than equality.

We propose adding a way for the user to specify a `memeq` library function (e.g. `-memeq-lib-function=user_memeq`) which will be called instead of `memcmp()` when the result of the `memcmp()` call is only used for equality comparison.

`memeq` can be made much more efficient than `memcmp` because equality comparison is trivially parallel, while lexicographic ordering has a chain dependency.

We measured a very large improvement from this approach on our internal codebase. A significant portion of this improvement comes from the STL, typically `std::string::operator==()`.

Note that this is a *backend-only change*. Because the C family of languages does not have a standard `memeq()` (POSIX used to have `bcmp()`, but it was marked legacy in 2001 and removed in 2008), C/C++ code cannot communicate equality-comparison semantics to the compiler.

We did not add an RTLIB entry for memeq because the user environment is not guaranteed to contain a `memeq()` function, as the libc has no such concept.

If there is interest, we could also contribute our optimized `memeq` to compiler-rt.

A proof-of-concept patch for this RFC can be found here: https://reviews.llvm.org/D56248

Comments and suggestions welcome!

Thanks,
Clement
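To make the "trivially parallel" argument concrete, here is a minimal sketch of what an equality-only routine could look like. It is not the optimized `memeq` mentioned in the RFC, which is not shown in this thread; the name `user_memeq` is borrowed from the flag example, and the return convention (1 for equal, 0 for different) and the 8-byte chunking are assumptions made for illustration. The point is that each chunk is XORed independently and the results are merged with OR, whereas `memcmp()` has to locate the first differing byte to decide the sign of its result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical equality-only comparison (illustrative, not LLVM's or any
 * libc's implementation). Returns 1 if the first n bytes of a and b are
 * equal, 0 otherwise. It never reports which buffer is "smaller". */
int user_memeq(const void *a, const void *b, size_t n) {
    const unsigned char *pa = a;
    const unsigned char *pb = b;
    uint64_t diff = 0;
    size_t i = 0;

    /* Process 8 bytes per step. Each XOR is independent of the others;
     * only the final OR-reduction combines them. memcpy is used for the
     * loads so that unaligned buffers are handled portably. */
    for (; i + 8 <= n; i += 8) {
        uint64_t wa, wb;
        memcpy(&wa, pa + i, 8);
        memcpy(&wb, pb + i, 8);
        diff |= wa ^ wb;
    }

    /* Tail: fewer than 8 bytes remain. */
    for (; i < n; ++i)
        diff |= (uint64_t)(pa[i] ^ pb[i]);

    return diff == 0;
}

/* Small self-check: user_memeq agrees with memcmp(...) == 0. */
void user_memeq_demo(void) {
    const char x[] = "llvm-dev mailing list";
    const char y[] = "llvm-dev mailing lisT";
    assert(user_memeq(x, x, sizeof x) == (memcmp(x, x, sizeof x) == 0));
    assert(user_memeq(x, y, sizeof x) == (memcmp(x, y, sizeof x) == 0));
}
```

This sketch always scans the whole buffer; a production routine would likely add an early exit and use wider vector loads, but neither changes the equality-only contract.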
Finkel, Hal J. via llvm-dev
2019-Jan-03 17:40 UTC
[llvm-dev] [RFC] Adding a -memeq-lib-function flag to allow the user to specify a memeq function.
On 1/3/19 3:29 AM, Clement Courbet via llvm-dev wrote:
> Hi all,
>
> We'd like to suggest adding a -memeq-lib-function flag to allow the user to specify a `memeq()` function to improve string equality check performance.

Hi, Clement,

We really shouldn't be adding backend flags for anything at this point (except for debugging and the like). A function attribute should be fine, or global metadata if necessary. A function attribute should play better with LTO, and so that's generally the recommended design point.

> [...]
> If there is interest, we could also contribute our optimized `memeq` to compiler-rt.

That would be useful.

Thanks again,
Hal

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
Clement Courbet via llvm-dev
2019-Jan-04 10:27 UTC
[llvm-dev] [RFC] Adding a -memeq-lib-function flag to allow the user to specify a memeq function.
Thanks for the suggestions Hal,

So if I understand correctly, you're recommending we add a module flag <https://llvm.org/docs/LangRef.html#module-flags-metadata> to LLVM, something like:

  !llvm.module.flags = !{..., !123}
  !123 = !{i32 1, !"memeq_lib_function", !"user_memeq"}

I've given it a try in the following patch: https://reviews.llvm.org/D56311

If this sounds reasonable I can start working on adding a CodeGenOptions to clang to see what this entails.

I don't think the function attribute works here because we want this to be globally enabled instead of per-function (but maybe I misunderstood what you were suggesting).
James Y Knight via llvm-dev
2019-Jan-04 17:33 UTC
[llvm-dev] [RFC] Adding a -memeq-lib-function flag to allow the user to specify a memeq function.
This seems a somewhat odd and overcomplicated way to go about this.

Given that bcmp was required in POSIX until relatively recently, I will guess that almost all platforms support it already. From a quick check, glibc, FreeBSD, NetBSD, newlib, and musl all seem to contain it. So, couldn't we just add bcmp to the runtime function list for those platforms which support it? And add an optimization to translate a call to memcmp into bcmp if it exists?

Of course, it would then also be a good idea to go back to POSIX and present the performance numbers to make a case for why it was actually quite a valuable function and should be reinstated into the standard as well.
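The rewrite James describes can be illustrated at the source level. The sketch below is not compiler code; it shows, under the assumption that the platform declares `bcmp()` in `<strings.h>` (true of the libcs listed above as far as the author's quick check goes), which call sites would qualify. The function names are hypothetical. A call qualifies only when the `memcmp()` result is compared against zero, because `bcmp()` guarantees nothing about the sign of a nonzero result.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h> /* bcmp(): assumed available, see the lead-in */

/* Candidate for the rewrite: the result is only tested against zero. */
int equal_via_memcmp(const void *a, const void *b, size_t n) {
    return memcmp(a, b, n) == 0;
}

/* What the same function would look like after memcmp -> bcmp. */
int equal_via_bcmp(const void *a, const void *b, size_t n) {
    return bcmp(a, b, n) == 0;
}

/* Not a candidate: the sign of the result is observed, so the ordering
 * semantics of memcmp are required and bcmp cannot be substituted. */
int less_than(const void *a, const void *b, size_t n) {
    return memcmp(a, b, n) < 0;
}

/* Small self-check that both equality forms agree. */
void bcmp_rewrite_demo(void) {
    assert(equal_via_memcmp("abc", "abc", 3) == equal_via_bcmp("abc", "abc", 3));
    assert(equal_via_memcmp("abc", "abd", 3) == equal_via_bcmp("abc", "abd", 3));
}
```

The appeal of this route over a user-named `memeq` is that no new flag or metadata is needed on platforms whose libc already ships `bcmp()`.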
David Jones via llvm-dev
2019-Jan-05 00:17 UTC
[llvm-dev] [RFC] Adding a -memeq-lib-function flag to allow the user to specify a memeq function.
If we are considering an optimization to convert calls to memcmp into bcmp, then does it make sense to add an intrinsic for bcmp like there is for memcmp? That way IR writers can express their requirements precisely: memcmp if you care about the direction of inequality, and bcmp if you do not.