Gregory Petrosyan
2010-Feb-07 21:03 UTC
[LLVMdev] [PATCH] FoldingSetNodeID: use MurmurHash2 instead of SuperFastHash
On Sat, Feb 06, 2010 at 04:51:15PM -0800, Chandler Carruth wrote:> While I've not reviewed the patch in too much detail, it looks > promising. Can you run some end-to-end benchmarks to make sure that > cache pressure in the full program or other variables not accounted > for in a micro-benchmark don't dominate performance? Specifically the > nightly tester includes a number of real programs and machinery to > measure total compile time.Ok, now with some kinda-hard numbers! murmurhash2 | superfasthash | - 6.6404 seconds (6.6697 wall clock) | 6.6204 seconds (6.8557 wall clock) + 2.6722 seconds (2.7064 wall clock) | 2.7962 seconds (2.7502 wall clock) + 8.6725 seconds (8.6662 wall clock) | 8.7526 seconds (8.7162 wall clock) + 2.7362 seconds (2.7729 wall clock) | 2.8242 seconds (2.8146 wall clock) + 1.4281 seconds (1.4068 wall clock) | 1.4761 seconds (1.4198 wall clock) + 4.3163 seconds (4.3392 wall clock) | 4.3683 seconds (4.3969 wall clock) | + 6.6804 seconds (6.6916 wall clock) | 6.7804 seconds (6.6950 wall clock) + 2.7682 seconds (2.7342 wall clock) | 2.7802 seconds (2.7321 wall clock) - 8.7365 seconds (8.9505 wall clock) | 8.6965 seconds (8.7124 wall clock) + 2.7962 seconds (2.7812 wall clock) | 2.8282 seconds (2.8049 wall clock) + 1.3881 seconds (1.4103 wall clock) | 1.4001 seconds (1.4103 wall clock) - 4.3843 seconds (4.3509 wall clock) | 4.3723 seconds (4.3714 wall clock) | + 6.5724 seconds (6.7971 wall clock) | 6.6684 seconds (6.6465 wall clock) + 2.6401 seconds (2.7378 wall clock) | 2.6682 seconds (2.7965 wall clock) + 8.7005 seconds (8.7068 wall clock) | 8.7766 seconds (8.8452 wall clock) + 2.7282 seconds (2.7680 wall clock) | 2.7762 seconds (2.7904 wall clock) - 1.4561 seconds (1.4230 wall clock) | 1.4201 seconds (1.4184 wall clock) - 4.4523 seconds (4.3754 wall clock) | 4.4003 seconds (4.3706 wall clock) This table was obtaind by running 3 times 'make TEST=nightly report.html -j2 && l Output/ | grep info | xargs cat | grep 'Total Execution Time' >> timings && make clean' or something like that, on MultiSource/Applications/ClamAV. It looks like FoldingSetNodeID is not the most performance-critical part of LLVM, but changing the hash algorithm help a bit. Gregory
Chris Lattner
2010-Feb-08 18:31 UTC
[LLVMdev] [PATCH] FoldingSetNodeID: use MurmurHash2 instead of SuperFastHash
On Feb 7, 2010, at 1:03 PM, Gregory Petrosyan wrote:> On Sat, Feb 06, 2010 at 04:51:15PM -0800, Chandler Carruth wrote: >> While I've not reviewed the patch in too much detail, it looks >> promising. Can you run some end-to-end benchmarks to make sure that >> cache pressure in the full program or other variables not accounted >> for in a micro-benchmark don't dominate performance? Specifically the >> nightly tester includes a number of real programs and machinery to >> measure total compile time. > > Ok, now with some kinda-hard numbers!Hi Gregory, These numbers are so noisy, that they aren't particularly useful. Could you try instrumenting foldingset to keep track track of the # collisions and # hash table resizes and compare those? They should be much more stable and still correlate directly to performance. -Chris
Gregory Petrosyan
2010-Feb-08 19:28 UTC
[LLVMdev] [PATCH] FoldingSetNodeID: use MurmurHash2 instead of SuperFastHash
On Mon, Feb 08, 2010 at 10:31:23AM -0800, Chris Lattner wrote:> These numbers are so noisy, that they aren't particularly useful. > Could you try instrumenting foldingset to keep track track of the # > collisions and # hash table resizes and compare those? They should > be much more stable and still correlate directly to performance.Yup -- they are extremely noisy. Very good idea -- will give it a try. Gregory
On Mon, Feb 08, 2010 at 10:31:23AM -0800, Chris Lattner wrote:> On Feb 7, 2010, at 1:03 PM, Gregory Petrosyan wrote: > > >On Sat, Feb 06, 2010 at 04:51:15PM -0800, Chandler Carruth wrote: > >>While I've not reviewed the patch in too much detail, it looks > >>promising. Can you run some end-to-end benchmarks to make sure that > >>cache pressure in the full program or other variables not accounted > >>for in a micro-benchmark don't dominate performance? Specifically the > >>nightly tester includes a number of real programs and machinery to > >>measure total compile time. > > > >Ok, now with some kinda-hard numbers! > > Hi Gregory, > > These numbers are so noisy, that they aren't particularly useful. > Could you try instrumenting foldingset to keep track track of the # > collisions and # hash table resizes and compare those? They should > be much more stable and still correlate directly to performance.OK, now with real numbers :-) First, the main thing: SuperFastHash appears to be the hash with best distribution. Use of MurmurHash instead generates 1.28% more collisions while doing nightly test in MultiSource/Applications. Second: I've also tested lookup3 hash, and its use generated 0.1% more collisions, compared to SFH. These results were a bit surprising for me! Number of hash table resizes is independent of used hashing algorithm, because hash table grows when 'nentries > nbuckets * 2'. Gregory
Possibly Parallel Threads
- [LLVMdev] FoldingSet #collisions comparison
- [LLVMdev] [PATCH] FoldingSetNodeID: use MurmurHash2 instead of SuperFastHash
- [LLVMdev] [PATCH] FoldingSetNodeID: use MurmurHash2 instead of SuperFastHash
- [LLVMdev] [PATCH] FoldingSetNodeID: use MurmurHash2 instead of SuperFastHash
- [LLVMdev] a different hash for APInts