Davide Italiano via llvm-dev
2017-Apr-25 20:11 UTC
[llvm-dev] RFC: Improving performance of HashString
On Tue, Apr 25, 2017 at 12:55 PM, Vedant Kumar via llvm-dev <llvm-dev at lists.llvm.org> wrote:> >> On Apr 24, 2017, at 5:37 PM, Scott Smith via llvm-dev <llvm-dev at lists.llvm.org> wrote: >> >> I've been working on improving the startup performance of lldb, and ran into an issue with llvm::HashString. It works a character at a time, which creates a long dependency chain in the processor. On the other hand, the function is very short, which probably works well for short identifiers. >> >> I don't know how the mix of identifier length seen by lldb compares with that seen by llvm/clang; I imagine they're pretty similar. >> >> I have to different proposals, and wanted to gauge which would be preferred: > > >> 1. Use xxhash instead. >I'd lean towards this option as we already have a copy of xxHash in lib/Support (llvm/Support/xxhash.h). We use it for computing the --build-id in lld as fast/non-crytographic hash variant and has proven to outperform other candidates (e.g. CityHash). Vedant, what do you mean with unsupported? -- Davide "There are no solved problems; there are only problems that are more or less solved" -- Henri Poincare
Vedant Kumar via llvm-dev
2017-Apr-25 20:15 UTC
[llvm-dev] RFC: Improving performance of HashString
> On Apr 25, 2017, at 1:11 PM, Davide Italiano <davide at freebsd.org> wrote: > > On Tue, Apr 25, 2017 at 12:55 PM, Vedant Kumar via llvm-dev > <llvm-dev at lists.llvm.org> wrote: >> >>> On Apr 24, 2017, at 5:37 PM, Scott Smith via llvm-dev <llvm-dev at lists.llvm.org> wrote: >>> >>> I've been working on improving the startup performance of lldb, and ran into an issue with llvm::HashString. It works a character at a time, which creates a long dependency chain in the processor. On the other hand, the function is very short, which probably works well for short identifiers. >>> >>> I don't know how the mix of identifier length seen by lldb compares with that seen by llvm/clang; I imagine they're pretty similar. >>> >>> I have to different proposals, and wanted to gauge which would be preferred: >> >> >>> 1. Use xxhash instead. >> > > I'd lean towards this option as we already have a copy of xxHash in > lib/Support (llvm/Support/xxhash.h). > We use it for computing the --build-id in lld as fast/non-crytographic > hash variant and has proven to outperform other candidates (e.g. > CityHash). > > Vedant, what do you mean with unsupported?Apologies. I misread my grep results. I saw an #ifdef containing 'xxhash' and 'support', so I thought xxhash support was conditional in some way. I didn't realize it's a portable hash. vedant> > -- > Davide > > "There are no solved problems; there are only problems that are more > or less solved" -- Henri Poincare
Scott Smith via llvm-dev
2017-Apr-25 20:19 UTC
[llvm-dev] RFC: Improving performance of HashString
On Tue, Apr 25, 2017 at 1:11 PM, Davide Italiano <davide at freebsd.org> wrote:> On Tue, Apr 25, 2017 at 12:55 PM, Vedant Kumar via llvm-dev > <llvm-dev at lists.llvm.org> wrote: > > > >> On Apr 24, 2017, at 5:37 PM, Scott Smith via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> > >> I've been working on improving the startup performance of lldb, and ran > into an issue with llvm::HashString. It works a character at a time, which > creates a long dependency chain in the processor. On the other hand, the > function is very short, which probably works well for short identifiers. > >> > >> I don't know how the mix of identifier length seen by lldb compares > with that seen by llvm/clang; I imagine they're pretty similar. > >> > >> I have to different proposals, and wanted to gauge which would be > preferred: > > > > > >> 1. Use xxhash instead. > > > > I'd lean towards this option as we already have a copy of xxHash in > lib/Support (llvm/Support/xxhash.h). > We use it for computing the --build-id in lld as fast/non-crytographic > hash variant and has proven to outperform other candidates (e.g. > CityHash). >That's actually why I suggested it, though AFAIK xxHash has more collisions than other hash algorithms. xxHash outperforms CityHash? https://github.com/rurban/smhasher/ So many choices! -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170425/6f8b0d4d/attachment.html>
Davide Italiano via llvm-dev
2017-Apr-25 21:47 UTC
[llvm-dev] RFC: Improving performance of HashString
On Tue, Apr 25, 2017 at 1:19 PM, Scott Smith <scott.smith at purestorage.com> wrote:> On Tue, Apr 25, 2017 at 1:11 PM, Davide Italiano <davide at freebsd.org> wrote: >> >> On Tue, Apr 25, 2017 at 12:55 PM, Vedant Kumar via llvm-dev >> <llvm-dev at lists.llvm.org> wrote: >> > >> >> On Apr 24, 2017, at 5:37 PM, Scott Smith via llvm-dev >> >> <llvm-dev at lists.llvm.org> wrote: >> >> >> >> I've been working on improving the startup performance of lldb, and ran >> >> into an issue with llvm::HashString. It works a character at a time, which >> >> creates a long dependency chain in the processor. On the other hand, the >> >> function is very short, which probably works well for short identifiers. >> >> >> >> I don't know how the mix of identifier length seen by lldb compares >> >> with that seen by llvm/clang; I imagine they're pretty similar. >> >> >> >> I have to different proposals, and wanted to gauge which would be >> >> preferred: >> > >> > >> >> 1. Use xxhash instead. >> > >> >> I'd lean towards this option as we already have a copy of xxHash in >> lib/Support (llvm/Support/xxhash.h). >> We use it for computing the --build-id in lld as fast/non-crytographic >> hash variant and has proven to outperform other candidates (e.g. >> CityHash). > > > That's actually why I suggested it, though AFAIK xxHash has more collisions > than other hash algorithms. > > xxHash outperforms CityHash? https://github.com/rurban/smhasher/ So many > choices! >IIRC, yes. Of course, your mileage may vary but given we have a solid alternative in-tree already I wouldn't go for another non-crypto hash function unless there's a good reason for it. -- Davide "There are no solved problems; there are only problems that are more or less solved" -- Henri Poincare