search for: libquantum

Displaying 20 results from an estimated 74 matches for "libquantum".

2014 Jan 28
2
[LLVMdev] Loop unrolling opportunity in SPEC's libquantum with profile info
...bled with "-mllvm -vectorize-num-stores-pred=1". Furthermore, I added a heuristic to unroll until load/store ports are saturated "-mllvm enable-loadstore-runtime-unroll" instead of the pure size based heuristic. Those two together with a patch that slightly changes the register heuristic and libquantum's three hot loops will unroll and goodness will ensue (at least for libquantum). commit 6b908b8b1084c97238cc642a3404a4285c21286f Author: Arnold Schwaighofer <aschwaighofer at apple.com> Date: Mon Jan 27 13:21:55 2014 -0800 Subtract one for loop induction variable. It is unlikely to b...
2017 Aug 16
1
Heroic LLVM optimizations
I'll be interested in seeing the improvements. As a reference, this is what I get in an Intel 6700K when I compare gcc 5.4 (Ofast flto) vs published Intel results. 23x in libquantum, and over 40% in many benchmarks. I think that it is mostly from AoS vs SoA and loop transformations.
            5.4 Ofast   ICC
perlbench   12.98       12.10    0.93
bzip2       7.64        7.85     1.03
gcc         12.30       11.00    0.89
mcf         14.08       21.78    1.55
gobmk       8.30        8.98     1.08
hmmer       9.07        27.00    2.98
sjeng       8.94        9.73     1.09
libquantum  23.10       535.00   23.16
h264ref     15.77       22.30    1.41
omnetpp...
2017 Aug 16
2
Heroic LLVM optimizations
Hi Tobias- The loop fusion you mention is the one in libquantum/cpu2006 ? Or something else in cpu2017 ? -Thx Dibyendu -----Original Message----- From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Tobias Grosser via llvm-dev Sent: Wednesday, August 16, 2017 10:10 AM To: renau at uncore.io; llvm-dev at lists.llvm.org Subject: Re: [llvm-dev...
2014 Jan 21
2
[LLVMdev] Loop unrolling opportunity in SPEC's libquantum with profile info
On Tue, Jan 21, 2014 at 2:44 PM, Arnold Schwaighofer < aschwaighofer at apple.com> wrote: > The LoopVectorizer depends on LCSSA and LoopSimplify. Both are loop > passes. We will have to make them also available as utility functions. Yuck. We still need to fix these at least, but that's much better than teaching *all* the loop passes to preserve BPI and BFI. -------------- next
2017 Feb 18
2
[RFC] Using Intel MPX to harden SafeStack
...
|445.gobmk|677.80|686.12|685.50|702.87 |
+--------------+---------+---------+---------+-------+
|456.hmmer|534.94|533.68|534.37|553.40 |
+--------------+---------+---------+---------+-------+
|458.sjeng|633.69|641.21|641.81|655.94 |
+--------------+---------+---------+---------+-------+
|462.libquantum|362.82|367.00|367.38|382.14 |
+--------------+---------+---------+---------+-------+
|464.h264ref|701.37|682.13|683.41|699.93 |
+--------------+---------+---------+---------+-------+
|471.omnetpp|397.04|407.38|407.33|411.36 |
+--------------+---------+---------+---------+-------+
|473.astar|6...
2014 Jan 16
11
[LLVMdev] Loop unrolling opportunity in SPEC's libquantum with profile info
I am starting to use the sample profiler to analyze new performance opportunities. The loop unroller has popped up in several of the benchmarks I'm running. In particular, libquantum. There is a ~12% opportunity when the runtime unroller is triggered. This helps functions like quantum_sigma_x (http://sourcecodebrowser.com/libquantum/0.2.4/gates_8c_source.html#l00149). The function accounts for ~20% of total runtime. By allowing the runtime unroller, we can speedup the program...
2010 Feb 15
0
[LLVMdev] Measurements of the new inlinehint attribute
...mcf 0.00% -1.78% 11.88% 0.61%
SPEC/CINT2006/445.gobmk/445.gobmk 0.02% 0.00% 13.86% 0.00%
SPEC/CINT2006/456.hmmer/456.hmmer 0.17% 1.72% 28.38% 1.72%
SPEC/CINT2006/458.sjeng/458.sjeng 0.19% 1.35% 8.97% 6.05%
SPEC/CINT2006/462.libquantum/462.libquantum 1.08% -20.22% 146.24% -7.26%
SPEC/CINT2006/464.h264ref/464.h264ref 0.00% -0.30% 9.22% 0.72%
SPEC/CINT2006/471.omnetpp/471.omnetpp 2.78% 1.92% 67.24% 3.92%
SPEC/CINT2006/473.astar/473.astar 4.59% 6.61% 12.90% -0.87%
SPEC/CINT2006/483.xa...
2014 Jan 21
5
[LLVMdev] Loop unrolling opportunity in SPEC's libquantum with profile info
On 16/01/2014, 23:47 , Andrew Trick wrote: > > On Jan 15, 2014, at 4:13 PM, Diego Novillo <dnovillo at google.com > <mailto:dnovillo at google.com>> wrote: > >> Chandler also pointed me at the vectorizer, which has its own >> unroller. However, the vectorizer only unrolls enough to serve the >> target, it's not as general as the runtime-triggered
2014 Jan 16
3
[LLVMdev] Loop unrolling opportunity in SPEC's libquantum with profile info
On Thu, Jan 16, 2014 at 9:26 AM, Nadav Rotem <nrotem at apple.com> wrote: > Hi Diego, > > It looks like the problem is with the code in the vectorizer that tries to estimate the most profitable vectorization factor: > >> LV: Found an estimated cost of 6 for VF 2 For instruction: %3 = load >> i64* %state, align 8, !dbg !58, !tbaa !61 > > > It looks like a
2014 Jan 21
2
[LLVMdev] Loop unrolling opportunity in SPEC's libquantum with profile info
Just to add a few notes... On Tue, Jan 21, 2014 at 1:31 PM, Andrew Trick <atrick at apple.com> wrote: > Chandler suggested a way around the problem. I'll work on that first. > > > It is very difficult to deal with the LoopPassManager. The concept doesn’t > fit with typical loop passes, which may need to rerun function level > analyses, and can affect code outside the
2017 Aug 15
2
Heroic LLVM optimizations
...lvm.org/devmtg/2015-10/slides/Gerolf-PerformanceImprovementsAndHeadroom.pdf) in LLVM. Focus on SPEC2006 but also looking at the new SPEC2017. The goal is to match, or get closer, to the Intel compiler with SPEC2006. ICC has a significant advantage. As the talk shows, there is over 10x diff in libquantum, and other benchmarks have also significant difference between latest gcc/llvm and ICC. Send me an email with your CV or questions if you want a full time job working on this (open source) and helping with other compiler optimizations for future ARMv8 servers. Something like 50% of the time o...
2016 Dec 14
0
Enabling scalarized conditional stores in the loop vectorizer
Hi Michael- Since you bring up libquantum performance can you let me know what the IR will look like for this small code snippet (libquantum-like) with –enable-cond-stores-vec ? I ask because I don’t see vectorization kicking in unless -force-vector-width=<> is specified. Let me know if I am missing something. -Thx struct nodeTy {...
2016 Dec 13
4
Enabling scalarized conditional stores in the loop vectorizer
Hi Michael, Thanks for testing this on your benchmarks and target. I think the results will help guide the direction we go. I tested the feature with spec2k/2k6 on AArch64/Kryo and saw minor performance swings, aside from a large (30%) improvement in spec2k6/libquantum. The primary loop in that benchmark has a conditional store, so I expected it to benefit. Regarding the cost model, I think the vectorizer's modeling of the conditional stores is good. We could potentially improve it by using profile information if available. But I'm not sure of the qualit...
2016 Dec 14
2
Enabling scalarized conditional stores in the loop vectorizer
...geting X86 it doesn't. You can take a look at the costs with "-mllvm -debug-only=loop-vectorize" Hope that helps. -- Matt On Wed, Dec 14, 2016 at 12:59 AM, Das, Dibyendu via llvm-dev < llvm-dev at lists.llvm.org> wrote: > Hi Michael- > > > > Since you bring up libquantum performance can you let me know what the IR > will look like for this small code snippet (libquantum-like) with > –enable-cond-stores-vec ? I ask because I don’t see vectorization kicking > in unless -force-vector-width=<> is specified. Let me know if I am missing > something. >...
2016 Dec 14
4
Enabling scalarized conditional stores in the loop vectorizer
..."-mllvm -debug-only=loop-vectorize" > > > > Hope that helps. > > > > -- Matt > > > > On Wed, Dec 14, 2016 at 12:59 AM, Das, Dibyendu via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > > Hi Michael- > > > > Since you bring up libquantum performance can you let me know what the IR > will look like for this small code snippet (libquantum-like) with > –enable-cond-stores-vec ? I ask because I don’t see vectorization kicking > in unless -force-vector-width=<> is specified. Let me know if I am missing > something. >...
2017 Jul 22
4
[RFC] Add IR level interprocedural outliner for code size.
...: > > - > > bzip2: 7.27% > - > > sphinx3: 3.65% > - > > Namd: 3.08% > - > > Gcc: 3.06% > - > > H264ref: 3.05% > > MO: > > - > > Namd: 7.8% > - > > bzip2: 7.27% > - > > libquantum: 2.99% > - > > h264ref: 2% > > > Do you understand why so? > > I'm especially interested in cases where MO managed to find redundancies > while E&O+LO didn't. For example, 2.99% on libquantum (or is it simply > below "top 5 results" for E&...
2016 Dec 14
0
Enabling scalarized conditional stores in the loop vectorizer
...oesn't. You can take a look at the costs with "-mllvm -debug-only=loop-vectorize" Hope that helps. -- Matt On Wed, Dec 14, 2016 at 12:59 AM, Das, Dibyendu via llvm-dev <llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org>> wrote: Hi Michael- Since you bring up libquantum performance can you let me know what the IR will look like for this small code snippet (libquantum-like) with –enable-cond-stores-vec ? I ask because I don’t see vectorization kicking in unless -force-vector-width=<> is specified. Let me know if I am missing something. -Thx struct nodeTy {...
2017 May 18
6
Enable vectorizer-maximize-bandwidth by default?
...
spec/2006/int/C/403.gcc 32.57 -0.43%
spec/2006/int/C/429.mcf 40.35 +0.27%
spec/2006/int/C/445.gobmk 26.96 +0.06%
spec/2006/int/C/456.hmmer 24.4 +0.19%
spec/2006/int/C/458.sjeng 27.91 -0.08%
spec/2006/int/C/462.libquantum 57.47 -0.20%
spec/2006/int/C/464.h264ref 46.52 +1.35%
geometric mean +0.29%
Scores are benchmark specific. We do have regression on 453.povray, but it's due to secondary effects as all hot functions are the same. I've also te...
2016 Dec 15
0
Enabling scalarized conditional stores in the loop vectorizer
...> Hope that helps. >> >> >> >> -- Matt >> >> >> >> On Wed, Dec 14, 2016 at 12:59 AM, Das, Dibyendu via llvm-dev < >> llvm-dev at lists.llvm.org> wrote: >> >> Hi Michael- >> >> >> >> Since you bring up libquantum performance can you let me know what the IR >> will look like for this small code snippet (libquantum-like) with >> –enable-cond-stores-vec ? I ask because I don’t see vectorization kicking >> in unless -force-vector-width=<> is specified. Let me know if I am missing >>...
2018 Jan 15
2
(no subject)
...integrating Polly closer into LLVM: https://github.com/pfaffe/llvm-project-20170507/commits/merge-polly-into-upstream (further cleanup needed)
* We are working further with ARM (Florian Hahn and Francesco) to upstream the inliner changes needed for the end-to-end optimization of SPEC 2006 libquantum. https://reviews.llvm.org/D38585
* Oleksandr, Sven and Manasij Mukherjee started to look into spatial locality
* We worked on expanding the isl C++ bindings (http://repo.or.cz/isl.git/shortlog). While a first set of patches is already open, further patches will follow over the next couple o...