Gregory Petrosyan
2010-Feb-06 15:09 UTC
[LLVMdev] [PATCH] FoldingSetNodeID: use MurmurHash2 instead of SuperFastHash
Some additional info can be found at: http://murmurhash.googlepages.com/ http://en.wikipedia.org/wiki/MurmurHash http://www.codeproject.com/KB/recipes/hash_functions.aspx as well as in the patch description itself. Patch and benchmark attached. Gregory -------------- next part -------------- A non-text attachment was scrubbed... Name: 0001-FoldingSetNodeID-use-MurmurHash2-instead-of-SuperFas.patch Type: text/x-diff Size: 4202 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20100206/b1f04413/attachment.patch> -------------- next part -------------- A non-text attachment was scrubbed... Name: hash_bench.c Type: text/x-csrc Size: 3650 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20100206/b1f04413/attachment.c>
Chandler Carruth
2010-Feb-07 00:51 UTC
[LLVMdev] [PATCH] FoldingSetNodeID: use MurmurHash2 instead of SuperFastHash
While I've not reviewed the patch in too much detail, it looks promising. Can you run some end-to-end benchmarks to make sure that cache pressure in the full program or other variables not accounted for in a micro-benchmark don't dominate performance? Specifically the nightly tester includes a number of real programs and machinery to measure total compile time. On Sat, Feb 6, 2010 at 7:09 AM, Gregory Petrosyan <gregory.petrosyan at gmail.com> wrote:> Some additional info can be found at: > > http://murmurhash.googlepages.com/ > http://en.wikipedia.org/wiki/MurmurHash > http://www.codeproject.com/KB/recipes/hash_functions.aspx > > as well as in the patch description itself. Patch and benchmark attached. > > Gregory > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > >
Gregory Petrosyan
2010-Feb-07 21:03 UTC
[LLVMdev] [PATCH] FoldingSetNodeID: use MurmurHash2 instead of SuperFastHash
On Sat, Feb 06, 2010 at 04:51:15PM -0800, Chandler Carruth wrote:> While I've not reviewed the patch in too much detail, it looks > promising. Can you run some end-to-end benchmarks to make sure that > cache pressure in the full program or other variables not accounted > for in a micro-benchmark don't dominate performance? Specifically the > nightly tester includes a number of real programs and machinery to > measure total compile time.Ok, now with some kinda-hard numbers! murmurhash2 | superfasthash | - 6.6404 seconds (6.6697 wall clock) | 6.6204 seconds (6.8557 wall clock) + 2.6722 seconds (2.7064 wall clock) | 2.7962 seconds (2.7502 wall clock) + 8.6725 seconds (8.6662 wall clock) | 8.7526 seconds (8.7162 wall clock) + 2.7362 seconds (2.7729 wall clock) | 2.8242 seconds (2.8146 wall clock) + 1.4281 seconds (1.4068 wall clock) | 1.4761 seconds (1.4198 wall clock) + 4.3163 seconds (4.3392 wall clock) | 4.3683 seconds (4.3969 wall clock) | + 6.6804 seconds (6.6916 wall clock) | 6.7804 seconds (6.6950 wall clock) + 2.7682 seconds (2.7342 wall clock) | 2.7802 seconds (2.7321 wall clock) - 8.7365 seconds (8.9505 wall clock) | 8.6965 seconds (8.7124 wall clock) + 2.7962 seconds (2.7812 wall clock) | 2.8282 seconds (2.8049 wall clock) + 1.3881 seconds (1.4103 wall clock) | 1.4001 seconds (1.4103 wall clock) - 4.3843 seconds (4.3509 wall clock) | 4.3723 seconds (4.3714 wall clock) | + 6.5724 seconds (6.7971 wall clock) | 6.6684 seconds (6.6465 wall clock) + 2.6401 seconds (2.7378 wall clock) | 2.6682 seconds (2.7965 wall clock) + 8.7005 seconds (8.7068 wall clock) | 8.7766 seconds (8.8452 wall clock) + 2.7282 seconds (2.7680 wall clock) | 2.7762 seconds (2.7904 wall clock) - 1.4561 seconds (1.4230 wall clock) | 1.4201 seconds (1.4184 wall clock) - 4.4523 seconds (4.3754 wall clock) | 4.4003 seconds (4.3706 wall clock) This table was obtaind by running 3 times 'make TEST=nightly report.html -j2 && l Output/ | grep info | xargs cat | grep 'Total Execution Time' >> timings && make clean' or something like that, on MultiSource/Applications/ClamAV. It looks like FoldingSetNodeID is not the most performance-critical part of LLVM, but changing the hash algorithm help a bit. Gregory
Török Edwin
2010-Feb-08 18:15 UTC
[LLVMdev] [PATCH] FoldingSetNodeID: use MurmurHash2 instead of SuperFastHash
On 2010-02-06 17:09, Gregory Petrosyan wrote:> Some additional info can be found at: > > http://murmurhash.googlepages.com/ > http://en.wikipedia.org/wiki/MurmurHash > http://www.codeproject.com/KB/recipes/hash_functions.aspx > > as well as in the patch description itself. Patch and benchmark attached. > >+/// This version additionally assumes that 'len % 4 == 0'. Below are the +/// original notes. +/// +/// Note - This code makes a few assumptions about how your machine behaves - +/// +/// 1. We can read a 4-byte value from any address without crashing Doesn't it only need the start of the data to be 4-byte aligned? That seems to be the case already, since this is a smallvector of unsigned. +/// 2. sizeof(int) == 4 How about using uint32_t/int32_t instead of int then? Best regards, --Edwin
Gregory Petrosyan
2010-Feb-08 19:27 UTC
[LLVMdev] [PATCH] FoldingSetNodeID: use MurmurHash2 instead of SuperFastHash
On Mon, Feb 08, 2010 at 08:15:49PM +0200, Török Edwin wrote:> +/// This version additionally assumes that 'len % 4 == 0'. Below are the > +/// original notes. > +/// > +/// Note - This code makes a few assumptions about how your machine > behaves - > +/// > +/// 1. We can read a 4-byte value from any address without crashing > > Doesn't it only need the start of the data to be 4-byte aligned? > That seems to be the case already, since this is a smallvector of unsigned. > > +/// 2. sizeof(int) == 4 > > How about using uint32_t/int32_t instead of int then?You are right in both cases -- these are notes (and source) from the original implementation, I decided to leave them as-is for easier verification of the hashing algorithm. Gregory
Reasonably Related Threads
- [LLVMdev] [PATCH] FoldingSetNodeID: use MurmurHash2 instead of SuperFastHash
- [LLVMdev] [PATCH] FoldingSetNodeID: use MurmurHash2 instead of SuperFastHash
- [LLVMdev] [PATCH] FoldingSetNodeID: use MurmurHash2 instead of SuperFastHash
- [LLVMdev] [PATCH] FoldingSetNodeID: use MurmurHash2 instead of SuperFastHash
- [LLVMdev] FoldingSet #collisions comparison