thr3ads.net - search: "salsa20"

[LLVMdev] [Polly] Summary of some expensive compiler passes, especially PollyDependence

2013 Aug 08

2

...s such as PollyDependence, PollyOptimization and PollyCodegen. Especially, I notice that the PollyDependence can lead to significant extra compile-time overhead. Its compile-time percentage for some expensive benchmarks can be summarized as: nestedloop: 41.4% (Polly - Calculate dependence) salsa20: 98.5% (Polly - Calculate dependence) seidel-2d: 72.1% (Polly - Calculate dependence) multiplies: 54.3% (Polly - Calculate dependence) Puzzle: 22.8% (Polly - Calculate dependence) As a result, it is critical to improve the PollyDependence pass. I have previously...

[LLVMdev] [Polly] Summary of some expensive compiler passes, especially PollyDependence

2013 Aug 08

0

...llyDependence, PollyOptimization and PollyCodegen. Especially, I notice that the PollyDependence can lead to significant extra compile-time overhead. Its compile-time percentage for some expensive benchmarks can be summarized as: > nestedloop: 41.4% (Polly - Calculate dependence) > salsa20: 98.5% (Polly - Calculate dependence) > seidel-2d: 72.1% (Polly - Calculate dependence) > multiplies: 54.3% (Polly - Calculate dependence) > Puzzle: 22.8% (Polly - Calculate dependence) > > > As a result, it is critical to improve the PollyDepe...

[LLVMdev] [Polly] Summary of some expensive compiler passes, especially PollyDependence

2013 Aug 09

2

...dence, PollyOptimization and PollyCodegen. Especially, I notice that the PollyDependence can lead to significant extra compile-time overhead. Its compile-time percentage for some expensive benchmarks can be summarized as: >> nestedloop: 41.4% (Polly - Calculate dependence) >> salsa20: 98.5% (Polly - Calculate dependence) >> seidel-2d: 72.1% (Polly - Calculate dependence) >> multiplies: 54.3% (Polly - Calculate dependence) >> Puzzle: 22.8% (Polly - Calculate dependence) >> >> >> As a result, it is critical...

[LLVMdev] [FastPolly]: Update of Polly's performance on LLVM test-suite

2013 Aug 11

2

...8 benchmarks improved and 16 benchmarks regressed. Especially, with those recent performance-oriented patch files for ScopDetect/ScopInfo/ScopDependences/..., we have significantly reduced the compile-time overhead of Polly for a large number of benchmarks, such as: SingleSource/Benchmarks/Misc/salsa20 -97.84% SingleSource/Benchmarks/Polybench/linear-algebra/solvers/lu/lu -85.01% MultiSource/Applications/obsequi/Obsequi -57.12% SingleSource/Benchmarks/Polybench/stencils/seidel-2d/seidel-2d -50.00% MultiSource/Benchmarks/MiBench/telecomm-gsm/telecomm-gs...

[LLVMdev] [FastPolly]: Update of Polly's performance on LLVM test-suite

2013 Aug 11

0

...hmarks improved and 16 benchmarks regressed. Especially, with those recent performance-oriented patch files for ScopDetect/ScopInfo/ScopDependences/..., we have significantly reduced the compile-time overhead of Polly for a large number of benchmarks, such as: > SingleSource/Benchmarks/Misc/salsa20 -97.84% > SingleSource/Benchmarks/Polybench/linear-algebra/solvers/lu/lu -85.01% > MultiSource/Applications/obsequi/Obsequi -57.12% > SingleSource/Benchmarks/Polybench/stencils/seidel-2d/seidel-2d -50.00% > MultiSource/Benchmarks/MiBench/...

[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass

2011 Nov 08

1

...66.6% slowdown SingleSource/Benchmarks/Misc/flops-8 - 64.2% slowdown (from these, I've excluded tests that took less than 0.1 seconds to run). Largest three compile-time slowdowns: MultiSource/Benchmarks/MiBench/security-rijndael/security-rijndael - 1276% slowdown SingleSource/Benchmarks/Misc/salsa20 - 1000% slowdown MultiSource/Benchmarks/Trimaran/enc-3des/enc-3des - 508% slowdown Not everything slows down, MultiSource/Benchmarks/Prolangs-C++/city/city, for example, compiles 10% faster with vectorization enabled; but, for the most part, things certainly take longer to compile with vectorizat...

[LLVMdev] [Polly] Summary of some expensive compiler passes, especially PollyDependence

2013 Aug 09

0

...s the first without the patch applied the second with. If this is the case, why does the page say "*** WARNING ***: comparison is against a different machine (pollyScopInfo__clang_DEV__x86_64,11)"? > Results show that this patch file is very effective for several benchmarks, such as salsa20 (reduced by 97.72%), Obsequi (54.62%), seidel-2d (48.64%), telecomm-gsm (33.71%). > >> I did not yet look at the nestedloop benchmark, but it sounds basically >> like a benchmark only consisting of loop nests that we can optimise. >> This is definitely interesting to look into....

[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass

2011 Nov 08

0

...Misc/flops-8 - 64.2% slowdown > Interesting. Do you understand what causes these slowdowns? Can your heuristic be improved? > Largest three compile-time slowdowns: > MultiSource/Benchmarks/MiBench/security-rijndael/security-rijndael - > 1276% slowdown > SingleSource/Benchmarks/Misc/salsa20 - 1000% slowdown > MultiSource/Benchmarks/Trimaran/enc-3des/enc-3des - 508% slowdown Yes, that is a lot. Do you understand if this time is invested well (does it give significant speedups)? If I understood correctly it seems your vectorizer has quadratic complexity which may cause large slow...

[LLVMdev] [FastPolly]: Update of Polly's performance on LLVM test-suite

2013 Aug 12

1

...ks improved and 16 benchmarks regressed. Especially, with those recent performance-oriented patch files for ScopDetect/ScopInfo/ScopDependences/..., we have significantly reduced the compile-time overhead of Polly for a large number of benchmarks, such as: >> SingleSource/Benchmarks/Misc/salsa20 -97.84% >> SingleSource/Benchmarks/Polybench/linear-algebra/solvers/lu/lu -85.01% >> MultiSource/Applications/obsequi/Obsequi -57.12% >> SingleSource/Benchmarks/Polybench/stencils/seidel-2d/seidel-2d -50.00% >> MultiSource/Ben...

Asterisk support for Bittorrent Bleep

2014 Aug 11

3

Hello, Full disclosure: my name is Farid Fadaie and I'm in charge of BitTorrent Bleep (a private P2P SIP-based messaging application in early alpha) http://blog.bittorrent.com/2014/07/30/building-an-engine-for-decentralized-communications/ I have personally been a fan of Asterisk and have been using it for years and now that we have (kind of) released Bleep, I wanted to ask you guys to let

[LLVMdev] Greedy register allocation

2011 Apr 30

2

...c4 -11.0% MultiSource/Benchmarks/ASCI_Purple/SMG2000/smg2000 -9.0% MultiSource/Applications/minisat/minisat -8.8% External/SPEC/CINT2006/473.astar/473.astar -8.7% SingleSource/Benchmarks/Misc/ReedSolomon -8.6% SingleSource/Benchmarks/BenchmarkGame/nsieve-bits -8.0% SingleSource/Benchmarks/Misc/salsa20 -7.7% MultiSource/Benchmarks/ASC_Sequoia/CrystalMk/CrystalMk -7.4% External/SPEC/CINT2006/401.bzip2/401.bzip2 -7.4% SingleSource/Benchmarks/Misc/mandel-2 -7.3% SingleSource/Benchmarks/Shootout-C++/methcall -6.5% MultiSource/Benchmarks/MiBench/consumer-lame/consumer-lame -6.4% External/SPEC/CF...

[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass

2011 Nov 08

3

...ive vector operations creates inefficient code, and I'll also look at these cases in more detail. > > > Largest three compile-time slowdowns: > > MultiSource/Benchmarks/MiBench/security-rijndael/security-rijndael - > > 1276% slowdown > > SingleSource/Benchmarks/Misc/salsa20 - 1000% slowdown > > MultiSource/Benchmarks/Trimaran/enc-3des/enc-3des - 508% slowdown > > Yes, that is a lot. Do you understand if this time is invested well > (does it give significant speedups)? No, not always. Actually, the security-rijndael test not only takes the longest to...

[LLVMdev] [Polly] Summary of some expensive compiler passes, especially PollyDependence

2013 Aug 09

2

...e page say "*** WARNING ***: comparison is against a >different machine (pollyScopInfo__clang_DEV__x86_64,11)"? LNT always reports a warning if you compare two runs with different tester names. >> Results show that this patch file is very effective for several benchmarks, such as salsa20 (reduced by 97.72%), Obsequi (54.62%), seidel-2d (48.64%), telecomm-gsm (33.71%). >> >>> I did not yet look at the nestedloop benchmark, but it sounds basically >>> like a benchmark only consisting of loop nests that we can optimise. >>> This is definitely interesti...

[LLVMdev] [RFC] AArch64: Should we disable GlobalMerge?

2015 Feb 26

5

Hi all, I've started looking at the GlobalMerge pass, enabled by default on ARM and AArch64. I think we should reconsider that, at least for AArch64. As is, the pass just merges all globals together, in groups of 4KB (AArch64, 128B on ARM). At the time it was enabled, the general thinking was "it's almost free, it doesn't affect performance much, we might as well use it".

[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass

2011 Nov 08

3

...rijndael - 114% slowdown MultiSource/Benchmarks/MiBench/network-patricia/network-patricia - 81.8% slowdown SingleSource/Benchmarks/Misc/flops-8 - 79% slowdown Top compile-time slowdowns: MultiSource/Benchmarks/MiBench/security-rijndael/security-rijndael - 832% slowdown SingleSource/Benchmarks/Misc/salsa20 - 600% slowdown MultiSource/Benchmarks/Trimaran/enc-3des/enc-3des - 263% slowdown For this comparison, however (unlike comparing to plain -O3), there are a significant number of compile-time speedups (I guess that this is because vectorization can reduce the number of instructions processed by lat...

[LLVMdev] Problem While Running Test Suite

2012 Feb 19

2

...eSource/Benchmarks/Misc/lowercase | * | * | SingleSource/Benchmarks/Misc/flops-8 | * | * | SingleSource/Benchmarks/Misc/ffbench | * | * | SingleSource/Benchmarks/Misc/salsa20 | * | * | SingleSource/Benchmarks/Misc/flops-7 | * | * | SingleSource/Benchmarks/Misc/mandel | * | * | SingleSource/Benchmarks/Misc/perlin...

[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass

2011 Nov 08

0

On 11/08/2011 03:36 PM, Hal Finkel wrote: > On Tue, 2011-11-08 at 12:12 +0100, Tobias Grosser wrote: >> On 11/08/2011 11:45 AM, Hal Finkel wrote: >>> I've attached the latest version of my autovectorization patch. >>> >>> Working through the test suite has proved to be a productive >>> experience ;) -- And almost all of the bugs that it revealed

[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass

2011 Oct 29

4

On Sat, 2011-10-29 at 14:02 -0500, Hal Finkel wrote: > On Sat, 2011-10-29 at 12:30 -0500, Hal Finkel wrote: > > Ralf, et al., > > > > Attached is the latest version of my autovectorization patch. llvmdev > > has been CC'd (as had been suggested to me); this e-mail contains > > additional benchmark results. > > > > First, these are preliminary

Compare test-suite benchmarks performance compiled without TBAA, with default TBAA and with new TBAA struct path

2018 Apr 26

0

...3886|0.158737498| -0.03| 2155883879| 0|0.158739646| -0.03| 2155883879| 0| |SingleSource/Benchmarks/Misc/richards_benchmark.test | 78|0.434660464| 4033193605|0.434746459| -0.02| 4033193598| 0|0.434897056| -0.05| 4033193598| 0| |SingleSource/Benchmarks/Misc/salsa20.test | 40|3.685973193|45487828644|3.687791537| -0.05|45487828691| 0|3.685686752| 0.01|45487828690| 0| |SingleSource/Benchmarks/Misc/whetstone.test | 46|0.784256179| 4409167761|0.784001984| 0.03| 4409167754| 0|...

[LLVMdev] 2.6 pre-release2 ready for testing

2009 Oct 20

1

... 0.84 | 0.97 1.01 n/a n/a > SingleSource/Benchmarks/Misc/richards_benchmark | > 0.0300 5756 0.0100 * 0.0300 | 1.13 1.05 > 1.17 * 1.33 | 1.08 0.97 n/a n/a > SingleSource/Benchmarks/Misc/salsa20 | > 0.0100 2692 0.0100 * 0.0100 | 9.30 9.72 > 9.27 * 9.33 | 0.96 1.00 n/a n/a > SingleSource/Benchmarks/Misc/whetstone | > 0.0199 3344 0.0199 * ...
