thr3ads.net - search: "libquantum"

Displaying 20 results from an estimated 74 matches for "libquantum".

[LLVMdev] Loop unrolling opportunity in SPEC's libquantum with profile info

2014 Jan 28

...bled with "-mllvm -vectorize-num-stores-pred=1". Furthermore, I added a heuristic to unroll until load/store ports are saturated "-mllvm enable-loadstore-runtime-unroll" instead of the pure size based heuristic. Those two together with a patch that slightly changes the register heuristic and libquantum’s three hot loops will unroll and goodness will ensue (at least for libquantum).

commit 6b908b8b1084c97238cc642a3404a4285c21286f
Author: Arnold Schwaighofer <aschwaighofer at apple.com>
Date: Mon Jan 27 13:21:55 2014 -0800

    Subtract one for loop induction variable. It is unlikely to b...

Heroic LLVM optimizations

2017 Aug 16

I'll be interested in seeing the improvements. As a reference, this is what I get in an Intel 6700K when I compare gcc 5.4 (Ofast flto) vs published Intel results. 23x in libquantum, and over 40% in many benchmarks. I think that it is mostly from AoS vs SoA and loop transformations.

              5.4 Ofast      ICC
perlbench         12.98    12.10    0.93
bzip2              7.64     7.85    1.03
gcc               12.30    11.00    0.89
mcf               14.08    21.78    1.55
gobmk              8.30     8.98    1.08
hmmer              9.07    27.00    2.98
sjeng              8.94     9.73    1.09
libquantum        23.10   535.00   23.16
h264ref           15.77    22.30    1.41
omnetpp...
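The "AoS vs SoA" remark in the snippet above can be illustrated with a small C sketch. It is an illustration under stated assumptions, not code from libquantum or from either compiler: the type names `node_aos` and `reg_soa`, their fields, and the two `flip_*` functions are hypothetical. It shows the same bit-flip loop over an array-of-structures layout, where each 64-bit state is interleaved with amplitude data, and over a structure-of-arrays layout, where the states are contiguous.

```c
#include <stddef.h>
#include <stdint.h>

/* Array of structures (AoS): every element carries all of its fields.
   Field names are hypothetical stand-ins. */
typedef struct {
    float    amp_re;
    float    amp_im;
    uint64_t state;
} node_aos;

/* Structure of arrays (SoA): one contiguous array per field. */
typedef struct {
    float    *amp_re;
    float    *amp_im;
    uint64_t *state;
    size_t    size;
} reg_soa;

/* AoS loop: consecutive states are sizeof(node_aos) bytes apart,
   so a vector load would also pull in the unused amplitude fields. */
void flip_aos(node_aos *node, size_t size, uint64_t mask)
{
    for (size_t i = 0; i < size; i++)
        node[i].state ^= mask;
}

/* SoA loop: consecutive states are adjacent in memory (unit stride). */
void flip_soa(reg_soa *reg, uint64_t mask)
{
    for (size_t i = 0; i < reg->size; i++)
        reg->state[i] ^= mask;
}
```

Both functions compute the same result; only the memory layout differs. Whether a given compiler performs such a layout change automatically is not something this sketch demonstrates.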

Heroic LLVM optimizations

2017 Aug 16

Hi Tobias-
The loop fusion you mention is the one in libquantum/cpu2006? Or something else in cpu2017?
-Thx
Dibyendu

-----Original Message-----
From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Tobias Grosser via llvm-dev
Sent: Wednesday, August 16, 2017 10:10 AM
To: renau at uncore.io; llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev...

[LLVMdev] Loop unrolling opportunity in SPEC's libquantum with profile info

2014 Jan 21

On Tue, Jan 21, 2014 at 2:44 PM, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:
> The LoopVectorizer depends on LCSSA and LoopSimplify. Both are loop
> passes. We will have to make them also available as utility functions.

Yuck. We still need to fix these at least, but that's much better than teaching *all* the loop passes to preserve BPI and BFI.

[RFC] Using Intel MPX to harden SafeStack

2017 Feb 18

...
|445.gobmk     |677.80   |686.12   |685.50   |702.87 |
+--------------+---------+---------+---------+-------+
|456.hmmer     |534.94   |533.68   |534.37   |553.40 |
+--------------+---------+---------+---------+-------+
|458.sjeng     |633.69   |641.21   |641.81   |655.94 |
+--------------+---------+---------+---------+-------+
|462.libquantum|362.82   |367.00   |367.38   |382.14 |
+--------------+---------+---------+---------+-------+
|464.h264ref   |701.37   |682.13   |683.41   |699.93 |
+--------------+---------+---------+---------+-------+
|471.omnetpp   |397.04   |407.38   |407.33   |411.36 |
+--------------+---------+---------+---------+-------+
|473.astar     |6...

[LLVMdev] Loop unrolling opportunity in SPEC's libquantum with profile info

2014 Jan 16

I am starting to use the sample profiler to analyze new performance opportunities. The loop unroller has popped up in several of the benchmarks I'm running. In particular, libquantum. There is a ~12% opportunity when the runtime unroller is triggered. This helps functions like quantum_sigma_x (http://sourcecodebrowser.com/libquantum/0.2.4/gates_8c_source.html#l00149). The function accounts for ~20% of total runtime. By allowing the runtime unroller, we can speedup the program...
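As a rough illustration of what runtime unrolling means for a loop shaped like the one in quantum_sigma_x, the C sketch below shows a rolled loop next to a hand-unrolled equivalent. It is written under assumptions: `node_t` is a hypothetical stand-in for the register node type, the unroll factor of 4 is arbitrary, and the second function is a manual rendering of the transformation, not output from LLVM's unroller.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a quantum register node; only the
   basis state is modelled. */
typedef struct {
    uint64_t state;
} node_t;

/* Rolled form: flip bit `target` (assumed < 64) in every basis state. */
void sigma_x_rolled(node_t *node, size_t size, unsigned target)
{
    for (size_t i = 0; i < size; i++)
        node[i].state ^= (uint64_t)1 << target;
}

/* Hand-written illustration of runtime unrolling by 4. The trip
   count is only known at run time, so a remainder loop handles
   the last size % 4 iterations. */
void sigma_x_unrolled(node_t *node, size_t size, unsigned target)
{
    const uint64_t mask = (uint64_t)1 << target;
    size_t i = 0;

    for (; i + 4 <= size; i += 4) {
        node[i].state     ^= mask;
        node[i + 1].state ^= mask;
        node[i + 2].state ^= mask;
        node[i + 3].state ^= mask;
    }
    for (; i < size; i++)
        node[i].state ^= mask;
}
```

The two functions produce identical results for any `size`; the unrolled form simply executes fewer branch and induction-variable updates per element.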

[LLVMdev] Measurements of the new inlinehint attribute

2010 Feb 15

...mcf                                        0.00%   -1.78%   11.88%   0.61%
SPEC/CINT2006/445.gobmk/445.gobmk            0.02%    0.00%   13.86%   0.00%
SPEC/CINT2006/456.hmmer/456.hmmer            0.17%    1.72%   28.38%   1.72%
SPEC/CINT2006/458.sjeng/458.sjeng            0.19%    1.35%    8.97%   6.05%
SPEC/CINT2006/462.libquantum/462.libquantum  1.08%  -20.22%  146.24%  -7.26%
SPEC/CINT2006/464.h264ref/464.h264ref        0.00%   -0.30%    9.22%   0.72%
SPEC/CINT2006/471.omnetpp/471.omnetpp        2.78%    1.92%   67.24%   3.92%
SPEC/CINT2006/473.astar/473.astar            4.59%    6.61%   12.90%  -0.87%
SPEC/CINT2006/483.xa...

[LLVMdev] Loop unrolling opportunity in SPEC's libquantum with profile info

2014 Jan 21

On 16/01/2014, 23:47, Andrew Trick wrote:
>
> On Jan 15, 2014, at 4:13 PM, Diego Novillo <dnovillo at google.com
> <mailto:dnovillo at google.com>> wrote:
>
>> Chandler also pointed me at the vectorizer, which has its own
>> unroller. However, the vectorizer only unrolls enough to serve the
>> target, it's not as general as the runtime-triggered

[LLVMdev] Loop unrolling opportunity in SPEC's libquantum with profile info

2014 Jan 16

On Thu, Jan 16, 2014 at 9:26 AM, Nadav Rotem <nrotem at apple.com> wrote:
> Hi Diego,
>
> It looks like the problem is with the code in the vectorizer that tries to estimate the most profitable vectorization factor:
>
>> LV: Found an estimated cost of 6 for VF 2 For instruction: %3 = load
>> i64* %state, align 8, !dbg !58, !tbaa !61
>
>
> It looks like a

[LLVMdev] Loop unrolling opportunity in SPEC's libquantum with profile info

2014 Jan 21

Just to add a few notes...

On Tue, Jan 21, 2014 at 1:31 PM, Andrew Trick <atrick at apple.com> wrote:
> Chandler suggested a way around the problem. I'll work on that first.
>
>
> It is very difficult to deal with the LoopPassManager. The concept doesn’t
> fit with typical loop passes, which may need to rerun function level
> analyses, and can affect code outside the

Heroic LLVM optimizations

2017 Aug 15

...lvm.org/devmtg/2015-10/slides/Gerolf-PerformanceImprovementsAndHeadroom.pdf) in LLVM. Focus on SPEC2006 but also looking at the new SPEC2017. The goal is to match, or get closer, to the Intel compiler with SPEC2006. ICC has a significant advantage. As the talk shows, there is over 10x diff in libquantum, and other benchmarks have also significant difference between latest gcc/llvm and ICC. Send me an email with your CV or questions if you want a full time job working on this (open source) and helping with other compiler optimizations for future ARMv8 servers. Something like 50% of the time o...

Enabling scalarized conditional stores in the loop vectorizer

2016 Dec 14

Hi Michael-
Since you bring up libquantum performance can you let me know what the IR will look like for this small code snippet (libquantum-like) with -enable-cond-stores-vec? I ask because I don’t see vectorization kicking in unless -force-vector-width=<> is specified. Let me know if I am missing something.
-Thx

struct nodeTy {...

Enabling scalarized conditional stores in the loop vectorizer

2016 Dec 13

Hi Michael, Thanks for testing this on your benchmarks and target. I think the results will help guide the direction we go. I tested the feature with spec2k/2k6 on AArch64/Kryo and saw minor performance swings, aside from a large (30%) improvement in spec2k6/libquantum. The primary loop in that benchmark has a conditional store, so I expected it to benefit. Regarding the cost model, I think the vectorizer's modeling of the conditional stores is good. We could potentially improve it by using profile information if available. But I'm not sure of the qualit...
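To make "the primary loop in that benchmark has a conditional store" concrete, here is a minimal C sketch of a loop of that shape. It is an assumption-laden illustration: `node_t`, `cnot_branchy`, and `cnot_select` are hypothetical names, and the loop is modelled on a controlled-NOT style gate, not copied from libquantum. The second function is a branch-free reference version for comparing results. It is not what the vectorizer's scalarized conditional stores produce, because it writes every element unconditionally.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a quantum register node. */
typedef struct {
    uint64_t state;
} node_t;

/* Loop with a conditional store: an element is written only when its
   `control` bit is set. `control` and `target` are assumed < 64. */
void cnot_branchy(node_t *node, size_t size, unsigned control, unsigned target)
{
    for (size_t i = 0; i < size; i++) {
        if (node[i].state & ((uint64_t)1 << control))
            node[i].state ^= (uint64_t)1 << target;
    }
}

/* Branch-free reference: always stores, but the XOR mask is zero when
   the control bit is clear, so the stored value is unchanged. */
void cnot_select(node_t *node, size_t size, unsigned control, unsigned target)
{
    for (size_t i = 0; i < size; i++) {
        uint64_t s   = node[i].state;
        uint64_t bit = (s >> control) & 1u;
        node[i].state = s ^ (bit << target);
    }
}
```

The store in `cnot_branchy` is what prevents a straightforward wide vector store: lanes whose control bit is clear must not be written, which is why the thread discusses predicating or scalarizing that store.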

Enabling scalarized conditional stores in the loop vectorizer

2016 Dec 14

...geting X86 it doesn't. You can take a look at the costs with "-mllvm -debug-only=loop-vectorize"

Hope that helps.

-- Matt

On Wed, Dec 14, 2016 at 12:59 AM, Das, Dibyendu via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Hi Michael-
>
> Since you bring up libquantum performance can you let me know what the IR
> will look like for this small code snippet (libquantum-like) with
> -enable-cond-stores-vec? I ask because I don’t see vectorization kicking
> in unless -force-vector-width=<> is specified. Let me know if I am missing
> something.
>...

Enabling scalarized conditional stores in the loop vectorizer

2016 Dec 14

..."-mllvm -debug-only=loop-vectorize" > > > > Hope that helps. > > > > -- Matt > > > > On Wed, Dec 14, 2016 at 12:59 AM, Das, Dibyendu via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > > Hi Michael- > > > > Since you bring up libquantum performance can you let me know what the IR > will look like for this small code snippet (libquantum-like) with > -enable-cond-stores-vec? I ask because I don’t see vectorization kicking > in unless -force-vector-width=<> is specified. Let me know if I am missing > something. >...

[RFC] Add IR level interprocedural outliner for code size.

2017 Jul 22

...:
>
> - bzip2: 7.27%
> - sphinx3: 3.65%
> - Namd: 3.08%
> - Gcc: 3.06%
> - H264ref: 3.05%
>
> MO:
>
> - Namd: 7.8%
> - bzip2: 7.27%
> - libquantum: 2.99%
> - h264ref: 2%
>
>
> Do you understand why so?
>
> I'm especially interested in cases where MO managed to find redundancies
> while E&O+LO didn't. For example, 2.99% on libquantum (or is it simply
> below "top 5 results" for E&...

Enabling scalarized conditional stores in the loop vectorizer

2016 Dec 14

...oesn't. You can take a look at the costs with "-mllvm -debug-only=loop-vectorize"

Hope that helps.

-- Matt

On Wed, Dec 14, 2016 at 12:59 AM, Das, Dibyendu via llvm-dev <llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org>> wrote:

Hi Michael-
Since you bring up libquantum performance can you let me know what the IR will look like for this small code snippet (libquantum-like) with -enable-cond-stores-vec? I ask because I don’t see vectorization kicking in unless -force-vector-width=<> is specified. Let me know if I am missing something.
-Thx

struct nodeTy {...

Enable vectorizer-maximize-bandwidth by default?

2017 May 18

...
spec/2006/int/C/403.gcc          32.57   -0.43%
spec/2006/int/C/429.mcf          40.35   +0.27%
spec/2006/int/C/445.gobmk        26.96   +0.06%
spec/2006/int/C/456.hmmer        24.4    +0.19%
spec/2006/int/C/458.sjeng        27.91   -0.08%
spec/2006/int/C/462.libquantum   57.47   -0.20%
spec/2006/int/C/464.h264ref      46.52   +1.35%
geometric mean                           +0.29%

Scores are benchmark specific. We do have regression on 453.povray, but it's due to secondary effects as all hot functions are the same. I've also te...

Enabling scalarized conditional stores in the loop vectorizer

2016 Dec 15

...> Hope that helps. >> >> >> >> -- Matt >> >> >> >> On Wed, Dec 14, 2016 at 12:59 AM, Das, Dibyendu via llvm-dev < >> llvm-dev at lists.llvm.org> wrote: >> >> Hi Michael- >> >> >> >> Since you bring up libquantum performance can you let me know what the IR >> will look like for this small code snippet (libquantum-like) with >> -enable-cond-stores-vec? I ask because I don’t see vectorization kicking >> in unless -force-vector-width=<> is specified. Let me know if I am missing >>...

(no subject)

2018 Jan 15

...integrating Polly closer into LLVM: https://github.com/pfaffe/llvm-project-20170507/commits/merge-polly-into-upstream (further cleanup needed)
* We are working further with ARM (Florian Hahn and Francesco) to upstream the inliner changes needed for the end-to-end optimization of SPEC 2006 libquantum. https://reviews.llvm.org/D38585
* Oleksandr, Sven and Manasij Mukherjee started to look into spatial locality
* We worked on expanding the isl C++ bindings (http://repo.or.cz/isl.git/shortlog). While a first set of patches is already open, further patches will follow over the next couple o...
