thr3ads.net - search: "crossingthreshold"

always allow canonicalizing to 8- and 16-bit ops?

2018 Jan 22

2


...-pc1 -17.92%
> SingleSource/Benchmarks/Adobe-C++/simple_types_loop_invariant -8.57%
> External/SPEC/CINT2000/253.perlbmk/253.perlbmk -3.43%
> MultiSource/Benchmarks/MiBench/telecomm-gsm/telecomm-gsm -3.36%
> MultiSource/Benchmarks/TSVC/CrossingThresholds-dbl/CrossingThresholds-dbl -1.34%
>
> +ve for these is bad, -ve is good. So overall looks like a good change, especially in simple_types_constant_folding. There may be some alignment issues that can cause wilder swings than they should, but the results here look...

always allow canonicalizing to 8- and 16-bit ops?

2018 Jan 22

0


...tiSource/Benchmarks/Trimaran/enc-pc1/enc-pc1 -17.92%
SingleSource/Benchmarks/Adobe-C++/simple_types_loop_invariant -8.57%
External/SPEC/CINT2000/253.perlbmk/253.perlbmk -3.43%
MultiSource/Benchmarks/MiBench/telecomm-gsm/telecomm-gsm -3.36%
MultiSource/Benchmarks/TSVC/CrossingThresholds-dbl/CrossingThresholds-dbl -1.34%
+ve for these is bad, -ve is good. So overall looks like a good change, especially in simple_types_constant_folding. There may be some alignment issues that can cause wilder swings than they should, but the results here look good. The list for aarch64 i...

[LLVMdev] [Polly] Performance comparison between Cloog and ISL code generation

2013 Sep 25

0


...e=58 It seems their execution-time performance is comparable:
Performance Regressions - Execution Time (ISL over Cloog)
MultiSource/Benchmarks/TSVC/ControlFlow-flt/ControlFlow-flt 8.49%
MultiSource/Benchmarks/TSVC/StatementReordering-flt/StatementReordering-flt 6.77%
MultiSource/Benchmarks/TSVC/CrossingThresholds-flt/CrossingThresholds-flt 2.65%
SingleSource/UnitTests/Vectorizer/gcc-loops 2.63%
Performance Improvements - Execution Time (ISL over Cloog)
MultiSource/Benchmarks/TSVC/NodeSplitting-flt/NodeSplitting-flt -6.77%
MultiSource/Benchmarks/ASC_Sequoia/AMGmk/AMGmk -3.03%
However, ISL outperforms Clo...

always allow canonicalizing to 8- and 16-bit ops?

2018 Jan 17

3


Example:
define i8 @narrow_add(i8 %x, i8 %y) {
  %x32 = zext i8 %x to i32
  %y32 = zext i8 %y to i32
  %add = add nsw i32 %x32, %y32
  %tr = trunc i32 %add to i8
  ret i8 %tr
}
With no data-layout or with an x86 target where 8-bit integer is in the data-layout, we reduce to:
$ ./opt -instcombine narrowadd.ll -S
define i8 @narrow_add(i8 %x, i8 %y) {
  %add = add i8 %x, %y
  ret i8 %add
}
But on
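The narrowing in the snippet above rests on a simple arithmetic fact: zero-extending two 8-bit values, adding them at 32 bits, and truncating back to 8 bits gives the same low byte as adding at 8 bits with wraparound. The following is a minimal Python sketch of that equivalence, not LLVM's implementation. The helper names (`zext`, `trunc`, `wide_add`, `narrow_add`) are illustrative, and the model ignores poison semantics such as the `nsw` flag.

```python
# Model of the two IR forms from the snippet, using plain integers.
# Assumption: values are treated as unsigned bit patterns; the nsw
# flag and poison values are not modelled.

def zext(value, from_bits):
    """Zero-extend: keep only the low from_bits bits of value."""
    return value & ((1 << from_bits) - 1)

def trunc(value, to_bits):
    """Truncate: keep only the low to_bits bits of value."""
    return value & ((1 << to_bits) - 1)

def wide_add(x, y):
    """zext i8 -> i32, add at 32 bits, trunc back to i8."""
    x32 = zext(x, 8)
    y32 = zext(y, 8)
    total = (x32 + y32) & 0xFFFFFFFF  # 32-bit wraparound
    return trunc(total, 8)

def narrow_add(x, y):
    """add i8 directly, with 8-bit wraparound."""
    return (x + y) & 0xFF

def forms_agree():
    """Exhaustively compare both forms over all pairs of 8-bit inputs."""
    return all(
        wide_add(x, y) == narrow_add(x, y)
        for x in range(256)
        for y in range(256)
    )
```

Calling `forms_agree()` checks all 65,536 input pairs. The exhaustive check is feasible only because the operands are 8 bits wide.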

[LLVMdev] [RFC] AArch64: Should we disable GlobalMerge?

2015 Feb 26

5


Hi all, I've started looking at the GlobalMerge pass, enabled by default on ARM and AArch64. I think we should reconsider that, at least for AArch64. As is, the pass just merges all globals together, in groups of 4KB (AArch64, 128B on ARM). At the time it was enabled, the general thinking was "it's almost free, it doesn't affect performance much, we might as well use it".

Compare test-suite benchmarks performance compiled without TBAA, with default TBAA and with new TBAA struct path

2018 Apr 26

0


...54875|2.167388513| -0.2|10131154868| 0|2.162387173| 0.03|10131154865| 0|
|MultiSource/Benchmarks/TSVC/ControlLoops-flt/ControlLoops-flt.test | 40|1.790150659| 6749980181|1.783526288| 0.37| 6749980171| 0|1.786574704| 0.2| 6749980173| 0|
|MultiSource/Benchmarks/TSVC/CrossingThresholds-dbl/CrossingThresholds-dbl.test| 40| 2.08235624|11637250479|2.082839371| -0.02|11637250478| 0| 2.0841341| -0.09|11637250478| 0|
|MultiSource/Benchmarks/TSVC/CrossingThresholds-flt/CrossingThresholds-flt.test| 40|1.513224597| 9567133532|1.509145173| 0.27| 9567133532| 0|1.5058262...

[LLVMdev] Proposal: change LNT’s regression detection algorithm and how it is used to reduce false positives

2015 May 15

6


tl;dr: in low-data situations we don’t look at past information, and that increases the false positive regression rate. We should look at the possibly incorrect recent past runs to fix that. Motivation: LNT’s current regression detection system has a false positive rate that is too high to make it useful. With test suites as large as the llvm “test-suite” a single report will show hundreds of

[LLVMdev] Proposal: change LNT’s regression detection algorithm and how it is used to reduce false positives

2015 May 18

2


...51.49s this program) nts.MultiSource/Benchmarks/ASC_Sequoia/CrystalMk/CrystalMk.exec
> 23. 46.89% cumulative (0.88% - 50.66s this program) nts.MultiSource/Benchmarks/TSVC/ControlFlow-flt/ControlFlow-flt.exec
> 24. 47.73% cumulative (0.84% - 48.74s this program) nts.MultiSource/Benchmarks/TSVC/CrossingThresholds-dbl/CrossingThresholds-dbl.exec
> 25. 48.57% cumulative (0.84% - 48.43s this program) nts.MultiSource/Benchmarks/TSVC/InductionVariable-dbl/InductionVariable-dbl.exec
> 26. 49.40% cumulative (0.83% - 47.92s this program) nts.SingleSource/Benchmarks/Polybench/datamining/correlation/correlatio...

[LLVMdev] IR Passes and TargetTransformInfo: Straw Man

2013 Jul 28

0


...972 0.45625335480408
Applications/lemon/lemon 0.6774 0.6805 0.45763212282255
Benchmarks/MiBench/telecomm-FFT/telecomm-ff 0.0209 0.021 0.47846889952154
Benchmarks/7zip/7zip-benchmark 5.9521 5.9811 0.48722299692545
Benchmarks/TSVC/CrossingThresholds-dbl/Cros 2.6449 2.6578 0.48773110514575
Applications/SPASS/SPASS 5.9442 5.9748 0.51478752397294
Benchmarks/MallocBench/cfrac/cfrac 1.2635 1.2704 0.54610209734862
Benchmarks/Ptrdist/ks/ks 0.7054 0.7117 0.8931...
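The column headers are not visible in this excerpt, but the third number in each row is consistent with the percent change from the first timing to the second. The Python sketch below reproduces that arithmetic for four of the complete rows. Treating the first column as the baseline and the second as the candidate is an assumption, and `percent_change` is an illustrative helper, not part of any LLVM or LNT tooling.

```python
# Rows copied from the excerpt: (benchmark, first timing, second timing,
# listed third column). Which build each timing belongs to is assumed.
ROWS = [
    ("Applications/lemon/lemon", 0.6774, 0.6805, 0.45763212282255),
    ("Benchmarks/7zip/7zip-benchmark", 5.9521, 5.9811, 0.48722299692545),
    ("Applications/SPASS/SPASS", 5.9442, 5.9748, 0.51478752397294),
    ("Benchmarks/MallocBench/cfrac/cfrac", 1.2635, 1.2704, 0.54610209734862),
]

def percent_change(baseline, candidate):
    """Relative change of candidate over baseline, in percent.

    For execution times, a positive result means the candidate is slower.
    """
    return (candidate - baseline) / baseline * 100.0

def max_deviation(rows):
    """Largest gap between the recomputed and the listed percentages."""
    return max(abs(percent_change(b, c) - listed) for _, b, c, listed in rows)

if __name__ == "__main__":
    for name, baseline, candidate, listed in ROWS:
        print(f"{name}: {percent_change(baseline, candidate):.6f}% (listed {listed})")
```

Under this reading, every row shown is a slowdown of roughly half a percent, which matches the sign convention used in the other threads on this page (+ve is bad for execution time).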

[LLVMdev] IR Passes and TargetTransformInfo: Straw Man

2013 Jul 18

3


Andy and I briefly discussed this the other day; we have not yet had a chance to list a detailed pass order for the pre- and post-IPO scalar optimizations. This is the wish list we have in mind: pre-IPO: based on the ordering he proposed, get rid of the inlining (or just inline tiny funcs), get rid of all loop xforms... post-IPO: get rid of inlining, or maybe we still need it, only

MachineVerifier and undef

2018 Jan 23

0


...-pc1 -17.92%
> SingleSource/Benchmarks/Adobe-C++/simple_types_loop_invariant -8.57%
> External/SPEC/CINT2000/253.perlbmk/253.perlbmk -3.43%
> MultiSource/Benchmarks/MiBench/telecomm-gsm/telecomm-gsm -3.36%
> MultiSource/Benchmarks/TSVC/CrossingThresholds-dbl/CrossingThresholds-dbl -1.34%
>
> +ve for these is bad, -ve is good. So overall looks like a good change, especially in simple_types_constant_folding. There may be some alignment issues that can cause wilder swings than they should, but the results here look...
