thr3ads.net - search: "doitgen"

Displaying 18 results from an estimated 18 matches for "doitgen".

[LLVMdev] [Polly] Performance comparison between Cloog and ISL code generation

2013 Sep 25

...ch/linear-algebra/kernels/syr2k/syr2k -11.11%
MultiSource/Benchmarks/TSVC/Packing-flt/Packing-flt -10.87%
MultiSource/Benchmarks/TSVC/Searching-dbl/Searching-dbl -10.87%
SingleSource/Benchmarks/Polybench/linear-algebra/kernels/2mm/2mm -10.74%
SingleSource/Benchmarks/Polybench/linear-algebra/kernels/doitgen/doitgen -10.66%
...
Star Tan

[LLVMdev] [Polly]GSoC Proposal: Reducing LLVM-Polly Compiling overhead

2013 Mar 19

...| 0.159 | 0.391 | 1.3% | 149.0% |
| 3mm.c | 0.103 | 0.109 | 0.122 | 5.8% | 18.4% |
| covariance.c | 0.16 | 0.163 | 1.346 | 1.9% | 741.3% |
| gramschmidt.c | 0.159 | 0.167 | 1.023 | 5.0% | 543.4% |
| seidel.c | 0.125 | 0.13 | 0.285 | 4.0% | 128.0% |
| adi.c | 0.155 | 0.156 | 0.953 | 0.6% | 514.8% |
| doitgen.c | 0.124 | 0.128 | 0.298 | 3.2% | 140.3% |
| instrument.c | 0.149 | 0.151 | 0.837 | 1.3% | 461.7% |
| atax.c | 0.135 | 0.136 | 0.917 | 0.7% | 579.3% |
| gemm.c | 0.161 | 0.162 | 1.839 | 0.6% | 1042.2% |
| jacobi-2d-imper.c | 0.16 | 0.161 | 0.649 | 0.6% | 305.6% |
| bicg.c | 0.149 | 0.152 | 0.444 |...

[LLVMdev] [Polly]GSoC Proposal: Reducing LLVM-Polly Compiling overhead

2013 Mar 18

...0.786 | 0.811 | 2.617 | 3.2% | 233.0% |
>> | covariance.c | 0.73 | 0.74 | 2.294 | 1.4% | 214.2% |
>> | gramschmidt.c | 0.63 | 0.643 | 1.134 | 2.1% | 80.0% |
>> | seidel.c | 0.632 | 0.645 | 2.036 | 2.1% | 222.2% |
>> | adi.c | 0.8 | 0.811 | 3.044 | 1.4% | 280.5% |
>> | doitgen.c | 0.742 | 0.752 | 2.32 | 1.3% | 212.7% |
>> | instrument.c | 0.445 | 0.45 | 0.495 | 1.1% | 11.2% |
>
>It is interesting to see that the only file that does not contain a
>kernel that is optimized by polly, but just some auxiliary functions has
>a very low compile time overhead...

[LLVMdev] [Polly]GSoC Proposal: Reducing LLVM-Polly Compiling overhead

2013 Mar 18

...7 | 0.806 | 2.475 | 2.4% | 214.5% |
| 3mm.c | 0.786 | 0.811 | 2.617 | 3.2% | 233.0% |
| covariance.c | 0.73 | 0.74 | 2.294 | 1.4% | 214.2% |
| gramschmidt.c | 0.63 | 0.643 | 1.134 | 2.1% | 80.0% |
| seidel.c | 0.632 | 0.645 | 2.036 | 2.1% | 222.2% |
| adi.c | 0.8 | 0.811 | 3.044 | 1.4% | 280.5% |
| doitgen.c | 0.742 | 0.752 | 2.32 | 1.3% | 212.7% |
| instrument.c | 0.445 | 0.45 | 0.495 | 1.1% | 11.2% |
| atax.c | 0.614 | 0.627 | 1.007 | 2.1% | 64.0% |
| gemm.c | 0.721 | 0.74 | 1.327 | 2.6% | 84.0% |
| jacobi-2d-imper.c | 0.721 | 0.735 | 2.211 | 1.9% | 206.7% |
| bicg.c | 0.577 | 0.597 | 1.01 | 3.5% |...
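
The column headers are cut off in this snippet, so the following is only a sketch under an assumption: that the first three numeric columns are compile times in seconds (a baseline followed by two Polly configurations) and that the two percentage columns are the overhead of the second and third times relative to the first. The function name `overhead_percent` is hypothetical. The figures in the doitgen.c row are consistent with that reading:

```python
def overhead_percent(baseline: float, measured: float) -> float:
    """Relative overhead of `measured` over `baseline`, in percent."""
    return (measured - baseline) / baseline * 100.0


# doitgen.c row from the snippet: three times in seconds (assumed meaning).
baseline, second, third = 0.742, 0.752, 2.32

second_overhead = overhead_percent(baseline, second)
third_overhead = overhead_percent(baseline, third)

# Printed with one decimal place, as in the table.
print(f"{second_overhead:.1f}%")
print(f"{third_overhead:.1f}%")
```

The same formula also matches the seidel.c and instrument.c rows, which suggests, but does not prove, that this is how the percentages were derived.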

[LLVMdev] [Polly]GSoC Proposal: Reducing LLVM-Polly Compiling overhead

2013 Mar 18

....5% |
> | 3mm.c | 0.786 | 0.811 | 2.617 | 3.2% | 233.0% |
> | covariance.c | 0.73 | 0.74 | 2.294 | 1.4% | 214.2% |
> | gramschmidt.c | 0.63 | 0.643 | 1.134 | 2.1% | 80.0% |
> | seidel.c | 0.632 | 0.645 | 2.036 | 2.1% | 222.2% |
> | adi.c | 0.8 | 0.811 | 3.044 | 1.4% | 280.5% |
> | doitgen.c | 0.742 | 0.752 | 2.32 | 1.3% | 212.7% |
> | instrument.c | 0.445 | 0.45 | 0.495 | 1.1% | 11.2% |

It is interesting to see that the only file that does not contain a kernel that is optimized by polly, but just some auxiliary functions has a very low compile time overhead. This may imply tha...

[LLVMdev] [Polly]GSoC Proposal: Reducing LLVM-Polly Compiling overhead

2013 Mar 20

...How large is the standard deviation of the results? (You can use a tool like ministat to calculate these values [1])

> | gramschmidt.c | 0.159 | 0.167 | 1.023 | 5.0% | 543.4% |
> | seidel.c | 0.125 | 0.13 | 0.285 | 4.0% | 128.0% |
> | adi.c | 0.155 | 0.156 | 0.953 | 0.6% | 514.8% |
> | doitgen.c | 0.124 | 0.128 | 0.298 | 3.2% | 140.3% |
> | instrument.c | 0.149 | 0.151 | 0.837 | 1.3% | 461.7% |

This number is surprising. In your last numbers you reported Polly-optimize as taking 0.495 sec in debug mode. The time you now report for the release mode is almost twice as much. Can you ver...

[LLVMdev] [Polly] GSoC Proposal: Reducing LLVM-Polly Compiling overhead

2013 Apr 30

...ajor compiling time, I think those timers should catch most of the Polly compile-time overhead. Unfortunately, this is not true. My experimental results show that the compile time captured by those timers accounts for less than half of the total Polly compile time. For example, when compiling doitgen.c from PolyBench with Polly, the total Polly compile-time overhead is about 0.7 seconds, but the overhead captured by our timers is only about 0.2 seconds. A lot of compile time is consumed by LLVM code outside Polly. For example, the RegisterPasses.cpp shows that PM.add(polly::createIslSc...

[LLVMdev] [RFC] AArch64: Should we disable GlobalMerge?

2015 Feb 26

Hi all, I've started looking at the GlobalMerge pass, enabled by default on ARM and AArch64. I think we should reconsider that, at least for AArch64. As is, the pass just merges all globals together, in groups of 4KB (AArch64, 128B on ARM). At the time it was enabled, the general thinking was "it's almost free, it doesn't affect performance much, we might as well use it".

[LLVMdev] [Polly] GSoC Proposal: Reducing LLVM-Polly Compiling overhead

2013 Apr 26

Hi all, I have updated my GSoC proposal: "FastPolly: Reducing LLVM-Polly Compiling overhead" (https://gist.github.com/tanstar/5441808). I think the pass ordering problem you discussed earlier can also be investigated in this project! Are there any comments or advice about my proposal? I appreciate all your help and advice. Thanks, Star Tan Proposal:

[LLVMdev] [Polly]GSoC Proposal: Reducing LLVM-Polly Compiling overhead

2013 Mar 23

[LLVMdev] [Polly] GSoC Proposal: Reducing LLVM-Polly Compiling overhead

2013 May 03

...t the .ll file with 'clang -O0'. Otherwise you run Polly on code that is already -O3 optimized, making the runs not comparable to the clang integrated ones and also unrealistic as they do not reflect what we do when running Polly from within clang. It would be interesting to understand the doitgen results better. The time in the optimizer is only 0.408 seconds, whereas the increase from pBasic to pOpt is 0.897 - 0.151 = 0.746 seconds. This seems surprising. Is this because of running Polly on -O3 optimized code, is Polly producing bigger .ll files which lead to longer object file emission...
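
The discrepancy raised in this message can be spelled out as plain arithmetic. The sketch below uses only the three figures quoted in the snippet; the variable names are hypothetical, and it assumes all three values are wall-clock seconds for the same doitgen compilation:

```python
# Figures quoted in the message (seconds).
p_basic = 0.151         # compile time reported for pBasic
p_opt = 0.897           # compile time reported for pOpt
optimizer_time = 0.408  # time attributed to the optimizer

# Total increase when going from pBasic to pOpt.
increase = p_opt - p_basic

# Portion of the increase that the optimizer time does not account for.
unexplained = increase - optimizer_time

print(f"increase:    {increase:.3f} s")
print(f"unexplained: {unexplained:.3f} s")
print(f"share of increase outside the optimizer: {unexplained / increase:.0%}")
```

Under these assumptions, a bit under half of the increase falls outside the measured optimizer time, which is the part the message asks about.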

[LLVMdev] [Polly] GSoC Proposal: Reducing LLVM-Polly Compiling overhead

2013 May 03

Dear Tobias, Thank you very much for your very helpful advice. Yes, -debug-pass and -time-passes are two very useful and powerful options when evaluating the compile-time of each compiler pass. They are exactly what I need! With these options, I can step into details of the compile-time overhead of each pass. I have finished some preliminary testing based on two randomly selected files from

[LLVMdev] [Polly] GSoC Proposal: Reducing LLVM-Polly Compiling overhead

2013 May 02

On 04/30/2013 04:13 PM, Star Tan wrote:
> Hi all, [...]
> How could I find out where the time is spent between two adjacent Polly passes? Can anyone give me some advice?

Hi Star Tan, I propose to do the performance analysis using the 'opt' tool and optimizing LLVM-IR, instead of running it from within clang. For the 'opt' tool there are two commands that should help

[LLVMdev] [Polly] GSoC Proposal: Reducing LLVM-Polly Compiling overhead

2013 May 02

> ....c 0.1062 0.1075 0.1124 0.123 0.1216 0.00% 5.84% 15.82% 14.50%
> ludcmp.c 0.157 0.1602 0.2002 1.0761 1.3175 2.04% 27.52% 585.41% 739.17%
> 3mm.c 0.1529 0.1559 0.1826 0.4134 1.0436 1.96% 19.42% 170.37% 582.54%
> bicg.c 0.1244 0.1268 0.1353 0.1977 0.2828 1.93% 8.76% 58.92% 127.33%
> doitgen.c 0.1492 0.1505 0.1644 0.3325 0.8971 0.00% 10.19% 122.86% 501.27%
> gesummv.c 0.1224 0.1279 0.134 0.1999 0.2937 4.49% 9.48% 63.32% 139.95%
> jacobi.c 0.1444 0.1506 0.1592 0.3912 0.8494 0.00% 10.25% 170.91% 488.23%
> seidel.c 0.1337 0.1353 0.1462 0.6299 0.9155 0.00% 9.35% 371.13%...

Compare test-suite benchmarks performance complied without TBAA, with default TBAA and with new TBAA struct path

2018 Apr 26

...4571983| 0|0.167430814| 0.15| 464571980| 0|
|SingleSource/Benchmarks/Polybench/linear-algebra/kernels/cholesky/cholesky.tes| 110|0.286466354| 1690933420|0.287513507| -0.36| 1690933424| 0|0.287682162| -0.42| 1690933424| 0|
|SingleSource/Benchmarks/Polybench/linear-algebra/kernels/doitgen/doitgen.test | 75|0.445250085| 3399897372|0.446000827| -0.17| 3399897368| 0|0.446003224| -0.17| 3399897368| 0|
|SingleSource/Benchmarks/Polybench/linear-algebra/kernels/gemver/gemver.test | 72|0.476362624| 714917745|0.479692636| -0.69| 714917750| 0|0.475910147| 0.1| 7149177...

[LLVMdev] [Polly] GSoC Proposal: Reducing LLVM-Polly Compiling overhead

2013 May 03

...'clang -O0'. Otherwise you run
>Polly on code that is already -O3 optimized, making the runs not
>comparable to the clang integrated ones and also unrealistic as they do
>not reflect what we do when running Polly from within clang.
>
>It would be interesting to understand the doitgen results better. The
>time in the optimizer is only 0.408 seconds, whereas the increase from
>pBasic to pOpt is 0.897 - 0.151 = 0.746 seconds. This seems surprising.
>Is this because of running Polly on -O3 optimized code, is Polly
>producing bigger .ll files which lead to longer object...

[LLVMdev] MergeFunctions: reduce complexity to O(log(N))

2014 Jan 28

Hi Stepan, Sorry for the delay. It's great that you are working on MergeFunctions as well and I agree, we should definitely try to combine our efforts to improve MergeFunctions. Just to give you some context, the pass (with the similar function merging patch) is already being used in a production setting. From my point of view, it would be better if we focus on improving its capability

[LLVMdev] MergeFunctions: reduce complexity to O(log(N))

2014 Jan 30

...8241 0 0.01 8225 0 0.01 8225
div.ll 18 13898 7 0.01 11319 7 0.01 11319
Divsol.ll 4 29265 0 0.01 29243 0 0.01 29243
djpeg.ll 5 90842 0 0.01 90811 0 0.01 90811
doborder.ll 2 45439 0 0.01 45406 0 0.01 45406
doc-proof.ll 29 56444 0 0.01 56427 0 0.02 56427
does_x_win.ll 5 91600 0 0.01 91581 0 0.02 91581
doitgen.ll 12 30417 0 0.01 30366 0 0.01 30366
dominate.ll 2 34437 0 0.01 34407 0 0.01 34407
dot.ll 1 1874 0 0.01 1845 0 0.01 1845
doublecheck.ll 1 32093 0 0.01 32060 0 0.01 32060
dp_dec.ll 2 62131 0 0.02 62108 0 0.02 62108
dp_enc.ll 4 65607 0 0.01 65584 0 0.02 65584
draw_line.ll 1 1792 0 0.01 1763 0 0.01 1...
