similar to: [LLVMdev] Reproducing clang -O3 with opt


2016 Mar 10
3
Regression in SPEC2006/gcc caused by LoopLoadElimination
On Thu, Mar 10, 2016 at 1:17 AM, Adam Nemet <anemet at apple.com> wrote:
> I’ve committed the fix in r263058. Haicheng, Eric/Benjamin, can you guys
> please give it a test with your codebase. (You need to enable the pass with
> -mllvm -enable-loop-load-elim.)
The miscompilation I was seeing is gone now, too. Thanks!
> On Mar 7, 2016, at 11:05 PM, Adam Nemet <anemet at
2016 Mar 08
3
Regression in SPEC2006/gcc caused by LoopLoadElimination
> On Mar 7, 2016, at 9:43 AM, Adam Nemet <anemet at apple.com> wrote:
>
> Hi Haicheng,
>
> Sorry about the breakage. I reverted it in r262839.
>
> I will try to reproduce it locally. Please don’t blow away your directories yet in case I need further help.
OK, I managed to reproduce this locally. Should be able to make progress from here without further help from
2016 Mar 10
2
Regression in SPEC2006/gcc caused by LoopLoadElimination
Thank you, Adam. It passes all the benchmarks I have.
Haicheng
From: anemet at apple.com [mailto:anemet at apple.com]
Sent: Wednesday, March 09, 2016 7:17 PM
To: Haicheng Wu; Eric Christopher; Benjamin Kramer
Cc: llvm-dev
Subject: Re: Regression in SPEC2006/gcc caused by LoopLoadElimination
I’ve committed the fix in r263058. Haicheng, Eric/Benjamin, can you guys please give it a
2020 Jul 02
2
flags to reproduce clang -O3 with opt -O3
Hello, I've been trying to figure out how to reproduce the results of a single clang -O3 compilation to a binary with a multi-step process using opt. Specifically I have:
    clang -O3 foo.c -o foo.exe
which I want to replicate with:
    clang -O0 -c -emit-llvm foo.c
    opt -O3 foo.bc -o foo_o.bc
    clang foo_o.bc -o foo.exe
Any hints / suggestions on what additional flags I need to produce the same
2020 Jul 03
2
flags to reproduce clang -O3 with opt -O3
Awesome, thanks! I'd like to have the last step (llc in your example) not perform additional optimization passes, such as O3, and simply use the O3 pass from opt in the previous line. Do you happen to know if I should use 'llc -O0 foo_o.bc -o foo.exe' instead to achieve this? On Thu, Jul 2, 2020 at 6:35 PM Mehdi AMINI <joker.eph at gmail.com> wrote: > > > On Thu,
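One common reason the three-step recipe in this thread diverges from a single clang -O3 run is that clang -O0 tags every function with the optnone attribute, so a later opt -O3 leaves those functions untouched. The Python sketch below only assembles and prints one possible command sequence that avoids this. The helper name o3_pipeline and the file names are illustrative assumptions, and because the replies in the thread are truncated here, this is not necessarily the recipe that was suggested. Even with these flags, bit-identical output is not guaranteed.

```python
import shlex


def o3_pipeline(source, exe):
    """Return the commands for a split clang/opt/llc build (hypothetical helper)."""
    stem = source.rsplit(".", 1)[0]
    return [
        # Emit unoptimized bitcode, but in -O3 mode so functions are not marked optnone.
        ["clang", "-O3", "-Xclang", "-disable-llvm-passes",
         "-emit-llvm", "-c", source, "-o", f"{stem}.bc"],
        # Run the middle-end -O3 pipeline on the bitcode.
        ["opt", "-O3", f"{stem}.bc", "-o", f"{stem}_o.bc"],
        # Generate an object file with the optimizing backend.
        ["llc", "-O3", "-filetype=obj", f"{stem}_o.bc", "-o", f"{stem}.o"],
        # Link the object file into the final executable.
        ["clang", f"{stem}.o", "-o", exe],
    ]


if __name__ == "__main__":
    for command in o3_pipeline("foo.c", "foo.exe"):
        print(shlex.join(command))
```

On the llc question: llc produces assembly or an object file rather than a linked executable, and passing -O0 to it selects the unoptimized code generator instead of merely skipping repeated work, which is why the sketch keeps -O3 for the backend step.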
2010 Oct 29
2
[LLVMdev] strict aliasing and LLVM
On Oct 29, 2010, at 12:26 AM, Nick Lewycky wrote:
> Xinliang David Li wrote:
>> As simple as
>>
>> void foo (int n, double *p, int *q)
>> {
>>   for (int i = 0; i < n; i++)
>>     *p += *q;
>> }
>>
>> clang -O2 -fstrict-aliasing -emit-llvm -o foo.bc -c foo.c
>> llc -enable-tbaa -O2 -filetype=asm -o foo.s foo.bc
>
2015 Feb 09
2
[LLVMdev] Is "clang -O1" the same as "clang -O0 + opt -O1"?
Hello, I encountered a bug that popped up during execution of "clang -O1". However, the bug cannot be reproduced by using "clang -O0 + opt -O1". It seems that "clang -O1" is not the same as "clang -O0 + opt -O1". Because the generated LLVM IR is large, I would like to use bugpoint with "clang -O1" directly instead of using "clang -O0"
2009 Apr 28
1
[LLVMdev] O3 passes
Thanks for the help. When I run the following (where $llvm is the path to my llvm installation):
    $llvm/bin/llvm-gcc -c -o - -O1 tmp.c -emit-llvm -mllvm --disable-llvm-optzns | $llvm/bin/opt -raiseallocs
I get the following error:
    cc1: error: unrecognized command line option "-fdisable-llvm-optzns"
I am running llvm 2.5. I performed a $llvm/libexec/gcc/i686-pc-linux-gnu/4.2.1/cc1
2019 May 09
2
Is it possible to reproduce the result of opt -O3 manually?
Dear developers, I am trying to reproduce the results of applying opt -O3 to a source file in the form of LLVM IR. I want to get the same IR by manually ordering the passes used by O3 and passing them to opt. To illustrate with an example, I use the linpack benchmark from the LLVM test suite[1] as input: 1. First I produce the intermediate representation using clang: clang -O3
2009 Apr 28
3
[LLVMdev] O3 passes
Can I pass the specific passes I want to run directly to llvm-gcc? I don't want all of -O3, for example. I tried llvm-gcc -raiseallocs ..., but that didn't work. I also tried running cc1 directly and it didn't take -raiseallocs as a parameter either.
Duncan Sands wrote:
> On Tuesday 28 April 2009 04:02:47 am Ryan M. Lefever wrote:
>> I assume that when -O3 (or O2 or O1) is
2019 May 13
2
Is it possible to reproduce the result of opt -O3 manually?
I think this has to do with how the pass manager is populated when we give -O3 versus when we give particular pass names. Some passes have multiple createXYZPass() methods that accept arguments too. These methods call non-default pass constructors, which in turn cause the passes to behave differently. E.g.:
    Pass *llvm::createLICMPass() { return new LegacyLICMPass(); }
    Pass
2009 Apr 28
0
[LLVMdev] O3 passes
On Tuesday 28 April 2009 09:19:19 am Ryan M. Lefever wrote:
> Can I specify passes that I want run directly to llvm-gcc? I don't want
> all of -O3, for example. I tried llvm-gcc -raiseallocs ..., but that
> didn't work. I also tried running cc1 directly and it didn't take
> -raiseallocs as a parameter either.
You are better off running passes explicitly using opt. Try
2016 Nov 17
2
Rewriting opt-viewer in C++
If the decision on whether this should swing Python or C++ is still open, here’s some food for thought: it’s trivially parallelizable. I lobbed some stuff in https://reviews.llvm.org/D26789 I used the pure python PyYAML and got a speedup of ~4x on my test case. I expect you might still be able to get an improvement with libYAML + a patch like this one. FWIW prior to this I also tried PyPy
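The "trivially parallelizable" point rests on each remark file being processed independently, so the per-file work can be spread over worker processes and the results merged afterwards. The Python sketch below illustrates that shape only and is not the patch from D26789. To stay self-contained it replaces real YAML parsing with a stand-in that tallies the "--- !Kind" header opening each remark document, and every function name in it is a hypothetical chosen for illustration.

```python
from collections import Counter
from multiprocessing import Pool


def count_remarks(text):
    """Stand-in for YAML parsing: tally the '--- !Kind' document headers."""
    counts = Counter()
    for line in text.splitlines():
        if line.startswith("--- !"):
            counts[line[len("--- !"):].strip()] += 1
    return counts


def count_file(path):
    """Per-file unit of work; must be a top-level function so workers can call it."""
    with open(path, encoding="utf-8") as handle:
        return count_remarks(handle.read())


def merge(per_file_counts):
    """Combine the independent per-file tallies into one total."""
    total = Counter()
    for counts in per_file_counts:
        total.update(counts)
    return total


def count_files_parallel(paths, workers=None):
    """Process each file in a worker process, then merge the results."""
    with Pool(processes=workers) as pool:
        return merge(pool.map(count_file, paths))
```

Calling count_files_parallel on a list of remark file paths would use one worker per CPU by default. Swapping count_remarks for a real YAML loader keeps the same structure, since only the per-file function changes.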
2016 Nov 15
3
Rewriting opt-viewer in C++
> On Nov 15, 2016, at 10:33 AM, Bob Haarman <inglorion at google.com> wrote: > > Thanks for your comments, everyone! I'll try to answer the questions people have asked. First, let me say that I like Python, so I would be happy to keep the tool in Python if people feel that is a better way to go and we can still get it to go fast. As for precedent, we have several Python scripts
2016 Nov 16
1
Rewriting opt-viewer in C++
That's compared to the implementation with the Python parser. So if the libYAML parser is 6x the speed of that, the C++ version would be about 10x the speed of the implementation with libYAML, instead of 60x. On Tue, Nov 15, 2016 at 10:50 AM, Adam Nemet <anemet at apple.com> wrote: > > On Nov 15, 2016, at 10:33 AM, Bob Haarman <inglorion at google.com> wrote: > >
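The arithmetic in this reply is a ratio of two speedups measured against the same baseline (the implementation using the pure-Python parser). A minimal Python sketch, with a hypothetical helper name, using the 60x and 6x figures quoted above:

```python
def relative_speedup(candidate, reference):
    """Speedup of candidate over reference, both measured against one baseline."""
    return candidate / reference


# C++ version: 60x the baseline. libYAML version: 6x the same baseline.
print(relative_speedup(60, 6))  # prints 10.0
```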
2016 Nov 17
2
Rewriting opt-viewer in C++
Adam, The test case was the Python-3.6.0b3 release, 234 input YAML files. The large majority of time is spent with processing the file input. Next ranked was rendering output. Moving the files to a tmpfs partition didn’t change the time significantly (but I would expect that experiment would yield different results with libYAML). original, single-threaded: processed input files
2019 Oct 19
3
Replicate Individual O3 optimizations
On Thu, Oct 17, 2019 at 11:22 AM David Greene via llvm-dev < llvm-dev at lists.llvm.org> wrote: > hameeza ahmed via llvm-dev <llvm-dev at lists.llvm.org> writes: > > > Hello, > > I want to study the individual O3 optimizations. For this I am using > > following commands, but unable to replicate O3 behavior. > > > > 1.
2015 Jun 11
4
[LLVMdev] Question about NoWrap flag for SCEVAddRecExpr
[+Arnold]
> On Jun 10, 2015, at 1:29 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
>
> [+CC Andy]
>
>> Can anyone familiar with ScalarEvolution tell me whether this is an
>> expected behavior or a bug?
>
> Assuming you're talking about 2*k, this is a bug. ScalarEvolution
> should be able to prove that {0,+,4} is <nsw> and
2016 Nov 14
2
Rewriting opt-viewer in C++
Again I am still undecided which way this should go but I was also wondering about the speed difference if we used the C-based parser in PyYAML (http://pyyaml.org/wiki/LibYAML).
> On Nov 13, 2016, at 12:19 AM, Adam Nemet <anemet at apple.com> wrote:
>
> Hi Bob,
>
> I am glad you’re finding opt-viewer useful. I am generally fine this
2015 May 02
5
[LLVMdev] Modifying LoopUnrollingPass
Hi Zhoulai, I am trying to modify "LoopUnrollPass" in LLVM, which produces a number of copies of the loop equal to the loop unroll factor. Currently, on a multicore architecture with, say, 3 cores and a loop of 9 iterations, execution goes like this:
    core    instruction
    1       0,3,6
    2       1,4,7
    3       2,5,8
But I want to
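The current assignment described in this message is a cyclic (round-robin) distribution: core c receives iterations c-1, c-1+cores, c-1+2*cores, and so on. The Python sketch below reproduces just that table. The function name cyclic_schedule is a hypothetical, and the distribution the poster would prefer is not shown because the message is cut off.

```python
def cyclic_schedule(iterations, cores):
    """Map each 1-based core number to the loop iterations it executes."""
    return {
        core + 1: list(range(core, iterations, cores))
        for core in range(cores)
    }


if __name__ == "__main__":
    for core, assigned in cyclic_schedule(9, 3).items():
        print(core, ",".join(str(i) for i in assigned))
```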