thr3ads.net - llvm dev - [llvm-dev] RFC: Improving performance of HashString [Apr 2017]

If this information is useful, please help other people find it:
Share via:

Scott Smith via llvm-dev

2017-Apr-25 00:37 UTC

[llvm-dev] RFC: Improving performance of HashString

I've been working on improving the startup performance of lldb, and ran
into an issue with llvm::HashString.  It works a character at a time, which
creates a long dependency chain in the processor.  On the other hand, the
function is very short, which probably works well for short identifiers.

I don't know how the mix of identifier length seen by lldb compares with
that seen by llvm/clang; I imagine they're pretty similar.

I have to different proposals, and wanted to gauge which would be preferred:

1. Use xxhash instead.

2. Use the Intel native crc32q instruction to process 8 bytes at a time,
then fall back to byte at a time.  Non sse 4.2 capable processors (either
early or non Intel/AMD x86) would use the existing algorithm, or possibly
#1 above.

For my test, both result in approximately the same # of cycles (within
0.5%).

#1 uses 3+% more instructions.
#2 requires (realistically) runtime detection of cpu capabilities, because
distributions would want generic x86/x86_64 compatible binaries, not
separate binaries per cpu feature.

I'm leaning toward #1 despite the instruction increase.  My only worry is
the effect on compile times for code with lots of short identifiers.  I
haven't tested that though (and I don't have a suitable benchmark suite
for
that) so for all I know I'm worrying about nothing.

FYI the improvement is approximately 11.5% reduction in cycles for my lldb
test (b main, run, quit), so IMO it's pretty significant.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170424/6d2c5e1e/attachment.html>

Vedant Kumar via llvm-dev

2017-Apr-25 19:55 UTC

head link

[llvm-dev] RFC: Improving performance of HashString

> On Apr 24, 2017, at 5:37 PM, Scott Smith via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
> 
> I've been working on improving the startup performance of lldb, and ran
into an issue with llvm::HashString.  It works a character at a time, which
creates a long dependency chain in the processor.  On the other hand, the
function is very short, which probably works well for short identifiers.
> 
> I don't know how the mix of identifier length seen by lldb compares
with that seen by llvm/clang; I imagine they're pretty similar.
> 
> I have to different proposals, and wanted to gauge which would be
preferred:
> 1. Use xxhash instead.
Seems OK to me. I suppose we'd have to fall back to the original hash
function if xxhash isn't supported.

vedant
> 
> 2. Use the Intel native crc32q instruction to process 8 bytes at a time,
then fall back to byte at a time.  Non sse 4.2 capable processors (either early
or non Intel/AMD x86) would use the existing algorithm, or possibly #1 above.
> 
> For my test, both result in approximately the same # of cycles (within
0.5%).
> 
> #1 uses 3+% more instructions.
> #2 requires (realistically) runtime detection of cpu capabilities, because
distributions would want generic x86/x86_64 compatible binaries, not separate
binaries per cpu feature.
> 
> I'm leaning toward #1 despite the instruction increase.  My only worry
is the effect on compile times for code with lots of short identifiers.  I
haven't tested that though (and I don't have a suitable benchmark suite
for that) so for all I know I'm worrying about nothing.
> 
> FYI the improvement is approximately 11.5% reduction in cycles for my lldb
test (b main, run, quit), so IMO it's pretty significant.
> 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Davide Italiano via llvm-dev

2017-Apr-25 20:11 UTC

head link

[llvm-dev] RFC: Improving performance of HashString

On Tue, Apr 25, 2017 at 12:55 PM, Vedant Kumar via llvm-dev
<llvm-dev at lists.llvm.org> wrote:>
>> On Apr 24, 2017, at 5:37 PM, Scott Smith via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
>>
>> I've been working on improving the startup performance of lldb, and
ran into an issue with llvm::HashString.  It works a character at a time, which
creates a long dependency chain in the processor.  On the other hand, the
function is very short, which probably works well for short identifiers.
>>
>> I don't know how the mix of identifier length seen by lldb compares
with that seen by llvm/clang; I imagine they're pretty similar.
>>
>> I have to different proposals, and wanted to gauge which would be
preferred:
>
>
>> 1. Use xxhash instead.
>
I'd lean towards this option as we already have a copy of xxHash in
lib/Support (llvm/Support/xxhash.h).
We use it for computing the --build-id in lld as fast/non-crytographic
hash variant and has proven to outperform other candidates (e.g.
CityHash).

Vedant, what do you mean with unsupported?

-- 
Davide

"There are no solved problems; there are only problems that are more
or less solved" -- Henri Poincare

Sean Silva via llvm-dev

2017-Apr-28 21:57 UTC

head link

[llvm-dev] RFC: Improving performance of HashString

IIRC when I talked with Chandler (who has a lot of background in hashing),
the Bernstein hash function actually ends up being pretty good (as a hash
function, not necessarily performance) for C/C++ identifier-like strings
(despite not being that good for other types of strings), so we'll want to
verify that we don't regress hash quality (which affects efficiency of the
hash tables). In particular, IIRC this is the function used for Clang's
identifier maps (via StringMap) and so I'd like to see some measurements
that ensure that these performance improvements translate over to Clang (or
at least don't regress).

If Clang doesn't regress and xxHash is measurably better for other
HashString workloads, then I don't see a reason not to switch to it.

-- Sean Silva

On Mon, Apr 24, 2017 at 5:37 PM, Scott Smith via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> I've been working on improving the startup performance of lldb, and ran
> into an issue with llvm::HashString.  It works a character at a time, which
> creates a long dependency chain in the processor.  On the other hand, the
> function is very short, which probably works well for short identifiers.
>
> I don't know how the mix of identifier length seen by lldb compares
with
> that seen by llvm/clang; I imagine they're pretty similar.
>
> I have to different proposals, and wanted to gauge which would be
> preferred:
>
> 1. Use xxhash instead.
>
> 2. Use the Intel native crc32q instruction to process 8 bytes at a time,
> then fall back to byte at a time.  Non sse 4.2 capable processors (either
> early or non Intel/AMD x86) would use the existing algorithm, or possibly
> #1 above.
>
> For my test, both result in approximately the same # of cycles (within
> 0.5%).
>
> #1 uses 3+% more instructions.
> #2 requires (realistically) runtime detection of cpu capabilities, because
> distributions would want generic x86/x86_64 compatible binaries, not
> separate binaries per cpu feature.
>
> I'm leaning toward #1 despite the instruction increase.  My only worry
is
> the effect on compile times for code with lots of short identifiers.  I
> haven't tested that though (and I don't have a suitable benchmark
suite for
> that) so for all I know I'm worrying about nothing.
>
> FYI the improvement is approximately 11.5% reduction in cycles for my lldb
> test (b main, run, quit), so IMO it's pretty significant.
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170428/239818ec/attachment-0001.html>

Bruce Hoult via llvm-dev

2017-Apr-28 22:51 UTC

head link

[llvm-dev] RFC: Improving performance of HashString

According to...

https://github.com/rurban/smhasher/blob/master/README.md

Bernstein has quality problems (while xx is as good as you get in a
non-crypto hash), and xx is 7x (32 bit) - 12x (64 bit) faster.

That's on long strings. It would be worth checking the startup overhead for
typically short identifiers in programs.

See later on in the README:

"When used in a hash table the instruction cache will usually beat the CPU
and throughput measured here. In my tests the smallest FNV1A beats the
fastest crc32_hw1 with Perl 5 hash tables. Even if those worse hash
functions will lead to more collisions, the overall speed advantage beats
the slightly worse quality. See e.g. A Seven-Dimensional Analysis of
Hashing Methods and its Implications on Query Processing for a concise
overview of the best hash table strategies, confirming that the simpliest
Mult hashing (bernstein, FNV*, x17, sdbm) always beat "better" hash
functions (Tabulation, Murmur, Farm, ...) when used in a hash table.

The fast hash functions tested here are recommendable as fast for file
digests and maybe bigger databases, but not for 32bit hash tables."



On Sat, Apr 29, 2017 at 12:57 AM, Sean Silva via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> IIRC when I talked with Chandler (who has a lot of background in hashing),
> the Bernstein hash function actually ends up being pretty good (as a hash
> function, not necessarily performance) for C/C++ identifier-like strings
> (despite not being that good for other types of strings), so we'll want
to
> verify that we don't regress hash quality (which affects efficiency of
the
> hash tables). In particular, IIRC this is the function used for Clang's
> identifier maps (via StringMap) and so I'd like to see some
measurements
> that ensure that these performance improvements translate over to Clang (or
> at least don't regress).
>
> If Clang doesn't regress and xxHash is measurably better for other
> HashString workloads, then I don't see a reason not to switch to it.
>
> -- Sean Silva
>
> On Mon, Apr 24, 2017 at 5:37 PM, Scott Smith via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> I've been working on improving the startup performance of lldb, and
ran
>> into an issue with llvm::HashString.  It works a character at a time,
which
>> creates a long dependency chain in the processor.  On the other hand,
the
>> function is very short, which probably works well for short
identifiers.
>>
>> I don't know how the mix of identifier length seen by lldb compares
with
>> that seen by llvm/clang; I imagine they're pretty similar.
>>
>> I have to different proposals, and wanted to gauge which would be
>> preferred:
>>
>> 1. Use xxhash instead.
>>
>> 2. Use the Intel native crc32q instruction to process 8 bytes at a
time,
>> then fall back to byte at a time.  Non sse 4.2 capable processors
(either
>> early or non Intel/AMD x86) would use the existing algorithm, or
possibly
>> #1 above.
>>
>> For my test, both result in approximately the same # of cycles (within
>> 0.5%).
>>
>> #1 uses 3+% more instructions.
>> #2 requires (realistically) runtime detection of cpu capabilities,
>> because distributions would want generic x86/x86_64 compatible
binaries,
>> not separate binaries per cpu feature.
>>
>> I'm leaning toward #1 despite the instruction increase.  My only
worry is
>> the effect on compile times for code with lots of short identifiers.  I
>> haven't tested that though (and I don't have a suitable
benchmark suite for
>> that) so for all I know I'm worrying about nothing.
>>
>> FYI the improvement is approximately 11.5% reduction in cycles for my
>> lldb test (b main, run, quit), so IMO it's pretty significant.
>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170429/8b1ecf35/attachment.html>

Possibly Parallel Threads

Search for more apparently analagous threads

llvm dev - Apr 2017 - RFC: Improving performance of HashString

[llvm-dev] RFC: Improving performance of HashString

[llvm-dev] RFC: Improving performance of HashString

[llvm-dev] RFC: Improving performance of HashString

[llvm-dev] RFC: Improving performance of HashString

[llvm-dev] RFC: Improving performance of HashString

Possibly Parallel Threads