At CppCon last week, I saw a talk by Bob Steagall called "Fast Conversion From UTF-8 with C++, DFAs, and SSE Intrinsics." Part of this talk included data from a half-dozen or so conversion libraries... one of which was labeled "LLVM".

The LLVM converters were invariably the slowest. On Windows, the mbtowc (or something like that) call was pretty good. Steagall's converters were of course wicked fast, even before he started playing tricks with SSE intrinsics.

I found his stuff at the following link (note CppNow, not CppCon) if anyone is interested in following up.
https://github.com/BobSteagall/CppNow2018/tree/master/FastConversionFromUTF-8
--paulr

On 10/2/2018 2:27 PM, via llvm-dev wrote:
> At CppCon last week, I saw a talk by Bob Steagall called
> "Fast Conversion From UTF-8 with C++, DFAs, and SSE Intrinsics."
> Part of this talk included data from a half-dozen or so conversion
> libraries... one of which was labeled "LLVM".
>
> The LLVM converters were invariably the slowest.

UTF conversion is not on any hot paths, as far as I know, so nobody has spent any time optimizing it. If you're interested in the history of the LLVM code, see https://reviews.llvm.org/rC68208 ; it's mostly untouched since then, except for a few bugfixes.

-Eli

--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project

It may not be in LLVM, but it is in Android. I did NEON versions of UTF functions in H2 2014 and they've been in Samsung's Android versions for several years, making single-digit percentage speedups in benchmarks.

On Wed, Oct 3, 2018 at 10:58 AM, Friedman, Eli via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> On 10/2/2018 2:27 PM, via llvm-dev wrote:
>
>> At CppCon last week, I saw a talk by Bob Steagall called
>> "Fast Conversion From UTF-8 with C++, DFAs, and SSE Intrinsics."
>> Part of this talk included data from a half-dozen or so conversion
>> libraries... one of which was labeled "LLVM".
>>
>> The LLVM converters were invariably the slowest.
>
> UTF conversion is not on any hot paths, as far as I know, so nobody has
> spent any time optimizing it. If you're interested in the history of the
> LLVM code, see https://reviews.llvm.org/rC68208 ; it's mostly untouched
> since then, except for a few bugfixes.
>
> -Eli

On 02/10/18 22:58, Friedman, Eli via llvm-dev wrote:
> On 10/2/2018 2:27 PM, via llvm-dev wrote:
>> At CppCon last week, I saw a talk by Bob Steagall called
>> "Fast Conversion From UTF-8 with C++, DFAs, and SSE Intrinsics."
>> Part of this talk included data from a half-dozen or so conversion
>> libraries... one of which was labeled "LLVM".
>>
>> The LLVM converters were invariably the slowest.
>
> UTF conversion is not on any hot paths, as far as I know, so nobody has
> spent any time optimizing it. If you're interested in the history of
> the LLVM code, see https://reviews.llvm.org/rC68208 ; it's mostly
> untouched since then, except for a few bugfixes.

Given

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=823100
https://bugs.llvm.org/show_bug.cgi?id=32962

I think moving away from it should be encouraged, assuming use of that file can be removed.

Thanks,

Stephen.