thr3ads.net - similar to: "[LLVMdev] Enabling the vectorizer for -Os"

[LLVMdev] Enabling the vectorizer for -Os

2013 Jun 06

Hi, Thanks for the feedback. I think that we agree that vectorization on -Os can benefit many programs. Regarding -O2 vs -O3, maybe we should set a higher cost threshold for O2 to increase the likelihood of improving the performance? We have very few regressions on -O3 as is and with better cost models I believe that we can bring them close to zero, so I am not sure if it can help that much.

[LLVMdev] Enabling the vectorizer for -Os

2013 Jun 05

On 5 June 2013 13:32, David Tweed <david.tweed at arm.com> wrote: > This is what I'd like to know about: what specific potential to change > results have you seen in the vectorizer? > No changes, just conceptual. AFAIK, the difference between the passes on O2 and O3 are minimal (looking at the code where this is chosen) and they don't seem to be particularly amazing to

[LLVMdev] Enabling the vectorizer for -Os

2013 Jun 05

On 5 June 2013 11:59, David Tweed <david.tweed at arm.com> wrote: > (I've very rarely had O3 optimization, rather than some program specific > subset of the options, achieve any non-noise-level speed-up over O2 with > gcc/g++.) > Hi David, You surely remember this: http://plasma.cs.umass.edu/emery/stabilizer "We find that, while -O2 has a significant impact relative

[LLVMdev] Enabling the vectorizer for -Os

2013 Jun 06

On Wed, Jun 5, 2013 at 5:51 PM, Nadav Rotem <nrotem at apple.com> wrote: > Hi, > > Thanks for the feedback. I think that we agree that vectorization on -Os > can benefit many programs. Regarding -O2 vs -O3, maybe we should set a > higher cost threshold for O2 to increase the likelihood of improving the > performance? We have very few regressions on -O3 as is and with

[LLVMdev] Enabling the vectorizer for -Os

2013 Jun 05

On 5 June 2013 04:26, Nadav Rotem <nrotem at apple.com> wrote: > I would like to start a discussion about enabling the loop vectorizer by > default for -Os. The loop vectorizer can accelerate many workloads and > enabling it for -Os and -O2 has obvious performance benefits. Hi Nadav, As it stands, O2 is very similar to O3 with a few, more aggressive, optimizations running,

[LLVMdev] Enabling the vectorizer for -Os

2013 Jun 06

Hi Chandler, > FWIW, I don't yet agree. > > > Your tables show many programs growing in code size by over 20%. While there is associated performance improvements, it isn't clear that this is a good tradeoff. Historically, optimizations which optimize as a direct result of growing code size have *not* been an acceptable tradeoff in -Os. > > > > > I am

[LLVMdev] Regarding BOF: Vectorization in LLVM

2012 Nov 06

Hi Nadav, Unfortunately I'm not attending the dev meeting, but the BoF looks interesting. One thing that I'd like to throw into the mix is that, while dealing with autovectorisation of LLVM compiled down from C-like languages (or maybe Fortran-like languages) is clearly a very big area for fruitful work both algorithmically and in terms of practical relevance, it'd also be interesting

[LLVMdev] Regarding BOF: Vectorization in LLVM

2012 Nov 06

Hi David! On Nov 6, 2012, at 3:23 AM, David Tweed <david.tweed at gmail.com> wrote: > Hi Nadav, > > Unfortunately I'm not attending the dev meeting, but the BoF looks interesting. One thing that I'd like to throw into the mix is that, while dealing with autovectorisation of LLVM compiled down from C-like languages (or maybe Fortran-like languages) is clearly a very big

[LLVMdev] Enabling the SLP vectorizer by default for -O3

2013 Jul 15

On Jul 14, 2013, at 9:52 PM, Chris Lattner <clattner at apple.com> wrote: > > On Jul 13, 2013, at 11:30 PM, Nadav Rotem <nrotem at apple.com> wrote: > >> Hi, >> >> LLVM’s SLP-vectorizer is a new pass that combines similar independent instructions in a straight-line code. It is currently not enabled by default, and people who want to experiment with it

[LLVMdev] Regarding BOF: Vectorization in LLVM

2012 Nov 06

----- Original Message ----- > From: "Nadav Rotem" <nrotem at apple.com> > To: "David Tweed" <david.tweed at gmail.com> > Cc: llvmdev at cs.uiuc.edu > Sent: Tuesday, November 6, 2012 11:08:23 AM > Subject: Re: [LLVMdev] Regarding BOF: Vectorization in LLVM > > Hi David! > > On Nov 6, 2012, at 3:23 AM, David Tweed <david.tweed at

(RFC) Adjusting default loop fully unroll threshold

2017 Feb 17

> On Feb 16, 2017, at 4:41 PM, Xinliang David Li via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > > > On Thu, Feb 16, 2017 at 3:45 PM, Chandler Carruth via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: > First off, I just want to say wow and thank you. This kind of data is amazing. =D > > On Thu, Feb 16, 2017 at

[LLVMdev] Enabling the SLP-vectorizer by default for -O3

2013 Jul 28

Hi, Below you can see the updated benchmark results for the new SLP-vectorizer. As you can see, there is a small number of compile time regressions, a single major runtime *regression, and many performance gains. There is a tiny increase in code size: 30k for the whole test-suite. Based on the numbers below I would like to enable the SLP-vectorizer by default for -O3. Please let me know if you

[LLVMdev] Improving loop vectorizer support for loops with a volatile iteration variable

2015 Jul 15

Hi all, I would like to propose an improvement of the “almost dead” block elimination in Transforms/Local.cpp so that it will preserve the canonical loop form for loops with a volatile iteration variable. *** Problem statement Nested loops in LCALS Subset B (https://codesign.llnl.gov/LCALS.php) are not vectorized with LLVM -O3 because the LLVM loop vectorizer fails the test whether the loop

[LLVMdev] Enabling the SLP vectorizer by default for -O3

2013 Jul 23

Hi, Sorry for the delay in response. I measured the code size change and noticed small changes in both directions for individual programs. I found a 30k binary size growth for the entire testsuite + SPEC. I attached an updated performance report that includes both compile time and performance measurements. Thanks, Nadav On Jul 14, 2013, at 10:55 PM, Nadav Rotem <nrotem at apple.com>

(RFC) Adjusting default loop fully unroll threshold

2017 Feb 16

First off, I just want to say wow and thank you. This kind of data is amazing. =D On Thu, Feb 16, 2017 at 2:46 AM Kristof Beyls <Kristof.Beyls at arm.com> wrote: > The biggest relative code size increases indeed didn't happen for the > biggest programs, but instead for a few programs weighing in at about 100KB. > I'm assuming the Google benchmark set covers much bigger

[LLVMdev] Improving loop vectorizer support for loops with a volatile iteration variable

2015 Aug 13

Hi Gerolf, I think we have several (perhaps separable) issues here: 1. Do we have a canonical form for loops, preserved through the optimizer, that allows naturally-constructed loop nests to remain separable? 2. Do we forbid non-lowering transformations that turn vectorizable loops into non-vectorizable loops? 3. How do we detect cases where transformations cause a negative answer to either

[LLVMdev] RFC: Codifying (but not formalizing) the optimization levels in LLVM and Clang

2013 Jan 14

This has been an idea floating around in my head for a while and after several discussions with others it continues to hold up so I thought I would mail it out. Sorry for cross posting to both lists, but this is an issue that would significantly impact both LLVM and Clang. Essentially, LLVM provides canned optimization "levels" for frontends to re-use. This is nothing new. However, we

[LLVMdev] RFC: Callee speedup estimation in inline cost analysis

2015 Jul 30

TLDR - The proposal below is intended to allow inlining of larger callees when such inlining is expected to reduce the dynamic instructions count. Proposal ------------- LLVM inlines a function if the size growth (in the given context) is less than a threshold. The threshold is increased based on certain characteristics of the called function (inline keyword and the fraction of vector

[LLVMdev] Enabling the SLP vectorizer by default for -O3

2013 Jul 14

Hi, LLVM’s SLP-vectorizer is a new pass that combines similar independent instructions in a straight-line code. It is currently not enabled by default, and people who want to experiment with it can use the clang command line flag “-fslp-vectorize”. I ran LLVM’s test suite with and without the SLP vectorizer on a Sandybridge mac (using SSE4, w/o AVX). Based on my performance measurements

(RFC) Adjusting default loop fully unroll threshold

2017 Feb 15

Thanks for running these Kristof! I'd still like to hear from Apple, and if we can get a few more x86 micro-architectures covered that'd be great, but it looks like -O3 is uncontroversial, and the question is whether this makes sense at O2... To me, it would help a lot to know the actual breakdown of benchmarks such as yours Kristof (as they seem to have more codesize impact than others
