similar to: [LLVMdev] [PROPOSAL] Improve uses of LEA on Atom

Displaying 20 results from an estimated 2000 matches similar to: "[LLVMdev] [PROPOSAL] Improve uses of LEA on Atom"

2013 Sep 30
0
[LLVMdev] [PROPOSAL] Improve uses of LEA on Atom
Was there any development on this? I noticed that clang still produces a lea for the testcase in llvm.org/pr13320. On 28 September 2012 11:36, Nowicki, Tyler <tyler.nowicki at intel.com> wrote: > Hi, > > > > Here is an update on our proposal to improve the uses of LEA on Atom > processors. > > > > 1. Disable current generation of LEAs > > > > Due to
2012 Aug 10
0
[LLVMdev] RFC: Adding pass in X86PassConfig::addPreEmitPass for LEA optimization on Atom
Hi, We are getting ready to implement several heuristics for correctly using LEAs to avoid stalls in the address generator on Atom. Our plan is to: 1. Disabling LEA generation on Atom in X86ISelDAGToDAG:: SelectLEAAddr() for all but a few pseudo-instructions 2. Identify loads and stores in a X86PassConfig::addPreEmitPass() pass and examine several preceding instructions to
2013 Oct 03
2
[LLVMdev] Codegen performance issue: LEA vs. INC.
The two address pass is only concerned about register pressure. It sounds like it should be taught about profitability. In cases where profitability can only be determined with something machinetracemetric then it probably should live it to more sophisticated pass like regalloc. In this case, we probably need a profitability target hook which knows about lea. We should also consider disabling
2013 Oct 05
1
[LLVMdev] Codegen performance issue: LEA vs. INC.
> The lea->cmp problem is fixed by switching to the MI scheduler. Please run with -mllvm -misched-bench to confirm. I get the same output in the testcase in pr13320. The leaq is in between the cmp and the jmp, preventing macro-fusion. Cheers, Rafael
2013 Sep 17
2
[LLVMdev] Codegen performance issue: LEA vs. INC.
Hi all. I'm looking for an advice on how to deal with inefficient code generation for Intel Nehalem/Westmere architecture on 64-bit platform for the attached test.cpp (LLVM IR is in test.cpp.ll). The inner loop has 11 iterations and eventually unrolled. Test.lea.s is the assembly code of the outer loop. It simply has 11 loads, 11 FP add, 11 FP mull, 1 FP store and lea+mov for index
2013 Oct 02
0
[LLVMdev] Codegen performance issue: LEA vs. INC.
This sounds like llvm.org/pr13320. On 17 September 2013 18:20, Bader, Aleksey A <aleksey.a.bader at intel.com> wrote: > Hi all. > > > > I’m looking for an advice on how to deal with inefficient code generation > for Intel Nehalem/Westmere architecture on 64-bit platform for the attached > test.cpp (LLVM IR is in test.cpp.ll). > > The inner loop has 11 iterations
2013 Oct 05
0
[LLVMdev] Codegen performance issue: LEA vs. INC.
On Oct 2, 2013, at 11:48 PM, Evan Cheng <evan.cheng at apple.com> wrote: > The two address pass is only concerned about register pressure. It sounds like it should be taught about profitability. In cases where profitability can only be determined with something machinetracemetric then it probably should live it to more sophisticated pass like regalloc. > > In this case, we
2013 Jan 07
4
[LLVMdev] instruction scheduling issue
On 1/7/2013 2:15 PM, Xu Liu wrote: > > This would be ideal. How can I do the instrumentation pass after the > instruction scheduling? You could derive your own class from TargetPassConfig, and add the annotation pass in YourDerivedTargetPassConfig::addPreEmitPass. This will add your annotation pass very late, just before the final code is emitted. If you're using the X86 target,
2014 Jan 22
2
[LLVMdev] How to force a MachineFunctionPass to be the last one ?
On Jan 21, 2014, at 3:20 PM, Andrew Trick <atrick at apple.com> wrote: > > On Jan 21, 2014, at 2:20 PM, sebastien riou <matic at nimp.co.uk> wrote: > >> Hi, >> >> I would like to execute a MachineFunctionPass after all other passes >> which modify the machine code. >> In other words, if we call llc to generate assembly file, that pass >>
2012 Jun 24
1
how to find out lea instruction causes skype crash when starting
Hi david, I find you signed off a patch about "x86: emulate lea with two register operands correctly" .In this patch,you described skype does a lea instruction and will crash when starting if it does not get the exception.I have used a tool named mentorKG.exe to make a LICENSE.TXT.but the software crashes when starting.I used your patch,and find it works well.I wonder how can you
2013 Jan 07
2
[LLVMdev] instruction scheduling issue
On 1/7/2013 1:53 PM, Sergei Larin wrote: > > Also, how much performance are you willing to sacrifice to do what you > do? Maybe turning off scheduling all together is an acceptable solution? Or insert the calls after scheduling. -Krzysztof -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
2013 Jan 07
0
[LLVMdev] instruction scheduling issue
Krzysztof, This would be ideal. How can I do the instrumentation pass after the instruction scheduling? Xu Liu Quoting Krzysztof Parzyszek <kparzysz at codeaurora.org>: > On 1/7/2013 1:53 PM, Sergei Larin wrote: >> >> Also, how much performance are you willing to sacrifice to do what you >> do? Maybe turning off scheduling all together is an acceptable solution?
2014 Jan 21
2
[LLVMdev] How to force a MachineFunctionPass to be the last one ?
Hi, I would like to execute a MachineFunctionPass after all other passes which modify the machine code. In other words, if we call llc to generate assembly file, that pass should run right before the "Assembly Printer" pass. Is there any official way to enforce this ? Best regards, Sebastien
2020 Jul 02
2
flags to reproduce clang -O3 with opt -O3
Hello, I've been trying to figure out how to reproduce the results of a single clang -O3 compilation to a binary with a multi-step process using opt. Specifically I have: clang -O3 foo.c -o foo.exe which I want to replicate with: clang -O0 -c -emit-llvm foo.c opt -O3 foo.bc -o foo_o.bc clang foo_o.bc -o foo.exe Any hints / suggestions on what additional flags I need to produce the same
2015 Nov 17
2
Confused on how to do a machinefunction pass
Hi, So, I run my pass in X86 target with llc command and it printed out "hello****". Now I am trying to do the same pass for ARM target. So I did exactly what I did for X86 as mentioned in my previous posts. When I run the following command: llc -march=arm test.ll -o test nothing prints out. I did the same for MIPS target too and I got no result. Can anyone tell me what I'm doing
2020 Jul 03
2
flags to reproduce clang -O3 with opt -O3
Awesome, thanks! I'd like to have the last step (llc in your example) not perform additional optimization passes, such as O3, and simply use the O3 pass from opt in the previous line. Do you happen to know if I should use 'llc -O0 foo_o.bc -o foo.exe' instead to achieve this? On Thu, Jul 2, 2020 at 6:35 PM Mehdi AMINI <joker.eph at gmail.com> wrote: > > > On Thu,
2015 Nov 04
3
Confused on how to do a machinefunction pass
Thank you so much. That helped alot. Fami On Wed, Nov 4, 2015 at 9:40 AM, John Criswell <jtcriswel at gmail.com> wrote: > On 11/3/15 7:54 PM, fateme Hoseini wrote: > > Dear John, > Thank you so much for your help. I looked at those documents. Could you > kindly answer the following questions: > > Does it mean that I have to make my own backend target in order to write
2015 Nov 17
2
Confused on how to do a machinefunction pass
Yes, I have done exactly the same. The wawanalyzer is the same. I changed ARM.h and ARMTargetMachine.cpp in the tager/arm folder. then I make tool/llc and lib folder. On Tue, Nov 17, 2015 at 10:55 AM, John Criswell <jtcriswel at gmail.com> wrote: > On 11/17/15 12:16 AM, fateme Hoseini via llvm-dev wrote: > > Hi, > So, I run my pass in X86 target with llc command and it printed
2013 Apr 04
1
[LLVMdev] Packed instructions generaetd by LoopVectorize?
Thanks, that did it! Are there any plans to enable the loop vectorizer by default? From: Nadav Rotem [mailto:nrotem at apple.com] Sent: Wednesday, April 03, 2013 13:33 PM To: Nowicki, Tyler Cc: LLVM Developers Mailing List Subject: Re: Packed instructions generaetd by LoopVectorize? Hi Tyler, Try adding -ffast-math. We can only vectorize reduction variables if it is safe to reorder floating
2013 Feb 21
2
[LLVMdev] Generate scalar SSE instructions instead of packed instructions
On Thu, Feb 21, 2013 at 12:14 PM, Nadav Rotem <nrotem at apple.com> wrote: > You can change the input LLVM-IR. > > On Feb 21, 2013, at 7:16 AM, "Nowicki, Tyler" <tyler.nowicki at intel.com> > wrote: > > Hi,**** > > ** ** > > I am interested in evaluating the performance of packed vs scalar > double-precision floating point instructions on