similar to: Heroic LLVM optimizations

Displaying 20 results from an estimated 1000 matches similar to: "Heroic LLVM optimizations"

2017 Aug 16
2
Heroic LLVM optimizations
Hi Tobias- The loop fusion you mention is the one in libquantum/cpu2006? Or something else in cpu2017? -Thx Dibyendu -----Original Message----- From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Tobias Grosser via llvm-dev Sent: Wednesday, August 16, 2017 10:10 AM To: renau at uncore.io; llvm-dev at lists.llvm.org Subject: Re: [llvm-dev] Heroic LLVM optimizations Hi
2017 Aug 16
1
Heroic LLVM optimizations
I'll be interested in seeing the improvements. As a reference, this is what I get on an Intel 6700K when I compare gcc 5.4 (Ofast flto) vs published Intel results. 23x in libquantum, and over 40% in many benchmarks. I think that it is mostly from AoS vs SoA and loop transformations. 5.4
2017 Nov 02
13
[RFC] Enable Partial Inliner by default
Forgot to add that all experiments were done with '-O3 -m64 -fexperimental-new-pass-manager'. Graham Yiu LLVM Compiler Development IBM Toronto Software Lab Office: (905) 413-4077 C2-707/8200/Markham Email: gyiu at ca.ibm.com From: Graham Yiu/Toronto/IBM To: llvm-dev at lists.llvm.org Cc: junbuml at codeaurora.org, xinliangli at gmail.com Date: 11/02/2017 05:26 PM Subject: [RFC]
2017 Nov 10
5
[RFC] Enable Partial Inliner by default
Hi Graham, Thank you for offering help. I am trying to create a reproducer. The problem is that the crashes happen whilst LTO is used. One thing I am sure about: the IR is broken at compile time. Thanks, Evgeny From: Graham Yiu <gyiu at ca.ibm.com> Date: Friday, 10 November 2017 at 16:09 To: Evgeny Astigeevich <Evgeny.Astigeevich at arm.com> Cc: "junbuml at codeaurora.org"
2017 Nov 10
0
[RFC] Enable Partial Inliner by default
Hi Evgeny, I just realized that if these are compile-time errors I can help investigate on my end. Do you have something I can use to reproduce? Cheers, Graham Yiu LLVM Compiler Development IBM Toronto Software Lab Office: (905) 413-4077 C2-707/8200/Markham Email: gyiu at ca.ibm.com From: Graham Yiu/Toronto/IBM To: Evgeny Astigeevich <Evgeny.Astigeevich at arm.com> Cc:
2016 Mar 29
2
[CodeGen] CodeSize - TailMerging and BlockPlacement
Hi everyone, The code layout that TailMerging (inside BranchFolding) works on is not the final layout optimized based on the branch probability. Generally, after BlockPlacement, many new merging opportunities emerge. I did an experiment of adding additional BranchFolding and BlockPlacement after the existing BlockPlacement (i.e., -block-placement -branch-folder -block-placement) targeting
2015 Jan 16
7
[LLVMdev] proof of concept for a loop fusion pass
Hi, We are proposing a loop fusion pass that tries to proactively fuse loops across function call boundaries and arbitrary control flow. http://reviews.llvm.org/D7008 With this pass, we get 103 loop fusions in SPECCPU INT 2006 462.libquantum with rate performance improving close to 2.5X on x86 (results from AMD A10-6700). I took some liberties in patching up some of the code in
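As a rough illustration of what "fusing" two loops means, here is a minimal sketch in Python. It is not taken from the proposed pass: the function names and the arithmetic are hypothetical, and it only covers the simplest case of two adjacent loops over the same range, not fusion across call boundaries or arbitrary control flow.

```python
def unfused(values):
    """Two separate loops over the same index range (hypothetical example)."""
    doubled = [0] * len(values)
    for i in range(len(values)):
        doubled[i] = values[i] * 2      # first loop writes doubled[i]
    total = 0
    for i in range(len(values)):
        total += doubled[i] + 1         # second loop reads doubled[i]
    return doubled, total


def fused(values):
    """The same work in one loop: each element is produced and consumed together."""
    doubled = [0] * len(values)
    total = 0
    for i in range(len(values)):
        doubled[i] = values[i] * 2
        total += doubled[i] + 1         # consumed right after it is produced
    return doubled, total
```

Fusion is legal here because iteration i of the second loop depends only on what iteration i of the first loop wrote. The usual benefit is that the intermediate value is used while it is still in a register or cache, which a Python sketch cannot measure.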
2015 Oct 02
2
Register Spill Caused by the Reassociation pass
This conflict is with many optimizations incl. copy prop, coalescing, hoisting etc. Each could increase register pressure and with similar impact. Attempts to control the register pressure locally (within an optimization pass) tend to get hard to tune and maintain. Would it be better to describe, e.g. in metadata, how to undo an optimization? Optimizations that attempt to reduce pressure like
2015 Oct 01
2
Register Spill Caused by the Reassociation pass
Hi Sanjay, I observed some extra register spills when applying the reassociation pass on spec2006 benchmarks and I would like to hear your advice. For example, function get_new_point_on_quad() of tria_boundary.cc in spec2006/dealII has a sequence of code like this . X=a+b . Y=X+c . Z=Y+d . There are many other instructions between these float adds. The reassociation
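The snippet is cut off before it says what the pass produces, so the following Python sketch only contrasts the serial chain quoted above with one possible regrouping. The balanced form and the function names are assumptions for illustration, not the pass's documented output, and register pressure itself is not modeled.

```python
def serial_chain(a, b, c, d):
    """The shape quoted in the message: each add depends on the previous one."""
    x = a + b
    y = x + c
    z = y + d
    return z


def balanced_tree(a, b, c, d):
    """An assumed regrouping: (a + b) + (c + d).

    Both partial sums exist before the final add, so c and d are needed
    earlier than in the serial chain.
    """
    left = a + b
    right = c + d
    return left + right
```

For integers the two forms agree, but for floating point they need not: with a=1.0, b=1e100, c=-1e100, d=1.0 the serial chain returns 1.0 and the balanced form returns 0.0. That is why regrouping float adds is only valid when the compiler is allowed to relax strict floating-point semantics.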
2016 Mar 29
1
[PATCH 02/10] x86/cpufeature: Kill cpu_has_hypervisor
From: Borislav Petkov <bp at suse.de> Use boot_cpu_has() instead. Signed-off-by: Borislav Petkov <bp at suse.de> Cc: virtualization at lists.linux-foundation.org Cc: sparmaintainer at unisys.com --- arch/x86/events/intel/cstate.c | 2 +- arch/x86/events/intel/uncore.c | 2 +- arch/x86/include/asm/cpufeature.h | 1 -
2020 Apr 09
3
RFC: a practical mechanism for applying Machine Learning for optimization policies in LLVM
+Yundi Qian <yundi at google.com> +Eugene Brevdo <ebrevdo at google.com> , our team members from the ML side. To avoid formatting issues, here is a link to the RFC <https://docs.google.com/document/d/1BoSGQlmgAh-yUZMn4sCDoWuY6KWed2tV58P4_472mDE/edit?usp=sharing>, open to comments. Thanks! On Wed, Apr 8, 2020 at 2:34 PM Mircea Trofin <mtrofin at google.com> wrote: >
2015 Jun 11
2
[LLVMdev] BasicAA unable to analyze recursive PHI nodes
----- Original Message ----- > From: "Tobias Edler von Koch" <tobias at codeaurora.org> > To: "Daniel Berlin" <dberlin at dberlin.org> > Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu> > Sent: Thursday, June 11, 2015 10:02:37 AM > Subject: Re: [LLVMdev] BasicAA unable to analyze recursive PHI nodes > > Hi Daniel,
2017 Nov 13
2
[RFC] Enable Partial Inliner by default
Hi Graham, I created a bug report with a reproducer for the failures I’ve got: https://bugs.llvm.org/show_bug.cgi?id=35288 I have also found that LTO reverts everything the partial inliner has done. Maybe the partial inliner should not be used at the first LTO phase (compilation). I hope I’ll have a chance to look at the code size regressions this week. Thanks, Evgeny Astigeevich From:
2020 Apr 08
2
RFC: a practical mechanism for applying Machine Learning for optimization policies in LLVM
It turns out it's me, sorry. Let me see how I can sort this out. In the meantime, here is the csv:
SPEC2006 data:
binary,base -Oz size,ML -Oz size,ML size shrink by,,perf: base -Oz scores,perf: ML -Oz scores,ML improvement by
400.perlbench,2054200,2086776,-1.59%,,2.9,2.9,0.00%
401.bzip2,1129976,1095544,3.05%,,6.4,6.2,-3.13%
403.gcc,4078488,4130840,-1.28%,,11.6,11.7,0.86%
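The two percentage columns can be recomputed from the raw columns. The sketch below assumes, from the column names only, that "ML size shrink by" is (base size - ML size) / base size and that "ML improvement by" is (ML score - base score) / base score. The helper names and the half-up rounding are also assumptions, and only the three rows visible in the snippet are used.

```python
import csv
import io
from decimal import Decimal, ROUND_HALF_UP

# The three rows visible in the message, copied as posted.
CSV_TEXT = """\
binary,base -Oz size,ML -Oz size,ML size shrink by,,perf: base -Oz scores,perf: ML -Oz scores,ML improvement by
400.perlbench,2054200,2086776,-1.59%,,2.9,2.9,0.00%
401.bzip2,1129976,1095544,3.05%,,6.4,6.2,-3.13%
403.gcc,4078488,4130840,-1.28%,,11.6,11.7,0.86%
"""


def percent(change, base):
    """Format change/base as a percentage with two decimals (assumed rounding)."""
    value = (Decimal(100) * change / base).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )
    return f"{value}%"


def recompute(csv_text):
    """Return (binary, size shrink, perf improvement) recomputed from raw columns."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    results = []
    for row in rows[1:]:                      # skip the header row
        base_size, ml_size = Decimal(row[1]), Decimal(row[2])
        base_score, ml_score = Decimal(row[5]), Decimal(row[6])
        shrink = percent(base_size - ml_size, base_size)      # positive = smaller binary
        gain = percent(ml_score - base_score, base_score)     # positive = higher score
        results.append((row[0], shrink, gain))
    return results
```

Under these assumptions, `recompute(CSV_TEXT)` reproduces the posted percentages for all three rows, which suggests a negative "shrink" means the ML-built binary is larger than the base one.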
2010 Jul 20
5
[LLVMdev] LLVM and Spec2006
Hi, What are the best options to compile Spec2006 with LLVM compilers to get the best performance numbers on x86? Has anybody compared LLVM Spec2006 numbers with GCC 4.5 base? reza -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20100719/40cf38a5/attachment.html>
2020 Apr 09
2
RFC: a practical mechanism for applying Machine Learning for optimization policies in LLVM
Sorry, I wasn't aware of that. I can make the google doc view-only, keeping the current comments. I'll wait a bit (few hrs) to see if there's any pushback to that. On Thu, Apr 9, 2020 at 9:57 AM Xinliang David Li <xinliangli at gmail.com> wrote: > One suggestion : should we consolidate the discussion into the main > thread? I know some folks are not willing to comment in
2017 Jun 30
2
LoopSimplify pass prevents loop unrolling
Hi All, In the attached test case, there is an unnested loop with 2 iterations. The loop latch block is terminated by an unconditional branch, so simplifycfg folds the almost empty latch block into its predecessor which is the loop header. This results in an additional backedge in the CFG, so when LoopRotate pass is called it canonicalizes the loop into a nested loop. However, now the loop
2017 Feb 18
2
[RFC] Using Intel MPX to harden SafeStack
On 2/7/2017 20:02, Kostya Serebryany wrote: > ... > > My understanding is that BNDCU is the cheapest possible instruction, > just like XOR or ADD, > so the overhead should be relatively small. > Still my guesstimate would be >= 5% since stores are very numerous. > And such overhead will be on top of whatever overhead SafeStack has. > Do you have any measurements to
2010 Jul 20
0
[LLVMdev] LLVM and Spec2006
Hi Reza, -O4 is the highest level of LLVM optimization that I know of. But, I don't know if it has been tried on Spec2006. IIRC, Dan Gohman has run Spec. tests with LLVM, so he can provide more info. - fariborz On Jul 19, 2010, at 6:06 PM, Reza Yazdani wrote: > Hi, > > What are the best options to compile Spec2006 with LLVM compilers to > get the best performance numbers