thr3ads.net - similar to: "[LLVMdev] Instruction Selection"

Displaying 20 results from an estimated 4000 matches similar to: "[LLVMdev] Instruction Selection"

[LLVMdev] Rematerialization and spilling

2013 Jun 03

[LLVMdev] Rematerialization and spilling

Hi Jakob, thanks for the advice. I'll do as you suggest and make sure that CCR is never live. I can use pseudo-instructions to bundle cmp+jump but it's not ideal because I might also have to bundle cmp+jump+jump+... into a pseudo. Also, there are several flavours of cmp instruction so I might need a lot of pseudos. That's what led me to wonder whether MachineInstrBundles might be a

[LLVMdev] Load serialisation during selection DAG building

2012 Aug 14

[LLVMdev] Load serialisation during selection DAG building

I looked into those patches but I don't think they will help in my situation because my problems occur during instruction selection rather than scheduling. A simple and concrete example is a pattern like: [(set GR:$dst (add GR:$src (nvload addr:$mem)))] where nvload matches a load provided that isVolatile() is false. If the selection DAG looks like: | | LD1 LD2 ^

[LLVMdev] [PROPOSAL] Improve uses of LEA on Atom

2013 Sep 30

[LLVMdev] [PROPOSAL] Improve uses of LEA on Atom

Was there any development on this? I noticed that clang still produces a lea for the testcase in llvm.org/pr13320. On 28 September 2012 11:36, Nowicki, Tyler <tyler.nowicki at intel.com> wrote: > Hi, > > > > Here is an update on our proposal to improve the uses of LEA on Atom > processors. > > > > 1. Disable current generation of LEAs > > > > Due to

Convert MachineInstr to MCInst in AsmPrinter.cpp

2017 Dec 27

Convert MachineInstr to MCInst in AsmPrinter.cpp

Hello everyone, In the file *lib/CodeGen/AsmPrinter/AsmPrinter.cpp*, I would like to obtain an MCInst corresponding to its MachineInstr. Can anyone tell me a way to do that? If that is not possible, then, I would like to know if a given MachineInstr is an *lea *instruction and I would like to know if the symbol involved with this lea instruction is a jump-table. For instance, given a

[LLVMdev] [PROPOSAL] Improve uses of LEA on Atom

2012 Sep 28

[LLVMdev] [PROPOSAL] Improve uses of LEA on Atom

Hi, Here is an update on our proposal to improve the uses of LEA on Atom processors. 1. Disable current generation of LEAs Due to a 3 cycle stall between the ALU and the AGU any address generation done using math instruction will cause a stall on loads and stores which are within 3 cycles of the address generation. Consequently, the heuristics for using LEAs efficiently must know how many

[LLVMdev] trunk's optimizer generates slower code than 3.5

2015 Feb 13

[LLVMdev] trunk's optimizer generates slower code than 3.5

I submitted the problem report to clang's bugzilla but no one seems to care so I have to send it to the mailing list. clang 3.7 svn (trunk 229055 as the time I was to report this problem) generates slower code than 3.5 (Apple LLVM version 6.0 (clang-600.0.56) (based on LLVM 3.5svn)) for the following code. It is a "8 queens puzzle" solver written as an educational example. As

[LLVMdev] trunk's optimizer generates slower code than 3.5

2015 Feb 14

[LLVMdev] trunk's optimizer generates slower code than 3.5

The regressions in the performance of generated code, introduced by the llvm 3.6 release, don't seem to be limited to this 8 queens puzzle" solver test case. See... http://www.phoronix.com/scan.php?page=article&item=llvm-clang-3.5-3.6-rc1&num=1 where a bit hit in the performance of the Sparse Matrix Multiply test of the SciMark v2.0 benchmark was observed as well as others.

[LLVMdev] trunk's optimizer generates slower code than 3.5

2015 Feb 14

[LLVMdev] trunk's optimizer generates slower code than 3.5

Using the SciMark 2.0 code from http://math.nist.gov/scimark2/scimark2_1c.zip compiled with the same... make CFLAGS="-O3 -march=native" I am able to reproduce the 22% performance regression in the run time of the Sparse matmult benchmark. For 10 runs of the scimark2 benechmark, I get 998.439+/-0.4828 with the release llvm clang 3.5.1 compiler and 1217.363+/-1.1004 for the current

[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)

2013 Sep 12

[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)

> Anyway, thanks very much for the information. Hopefully that'll let me > track things down. Let me know if you need some more information or dumps. > Would you mind me taking a day or so to investigate what's going on > here properly? Introducing a volatile to work around a bug in Clang > itself just seems perverse to me. (And we shouldn't let a CodeGen bug >

[LLVMdev] Codegen performance issue: LEA vs. INC.

2013 Sep 17

[LLVMdev] Codegen performance issue: LEA vs. INC.

Hi all. I'm looking for an advice on how to deal with inefficient code generation for Intel Nehalem/Westmere architecture on 64-bit platform for the attached test.cpp (LLVM IR is in test.cpp.ll). The inner loop has 11 iterations and eventually unrolled. Test.lea.s is the assembly code of the outer loop. It simply has 11 loads, 11 FP add, 11 FP mull, 1 FP store and lea+mov for index

[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)

2013 Sep 13

[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)

Pretty sure you need to check EAX>=7 from cpuid leaf 0 before calling leaf 7 and you need to use the pass ECX=0 to leaf 7. See lib/Target/X86/X86Subtarget.cpp which uses a GetX86CpuIDAndInfoEx function to pass EAX and ECX to cpuid. I don't think it explains your compiler bug though. On Thu, Sep 12, 2013 at 2:12 PM, Adam Strzelecki <ono at java.pl> wrote: > > Anyway, thanks

[LLVMdev] Load serialisation during selection DAG building

2012 Aug 13

[LLVMdev] Load serialisation during selection DAG building

Steve, I had created a patch last year that does something similar to what you describe for regular loads and stores using aliasing information. I think that the last message in the thread was: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120402/140299.html This approach has worked for me, but it is not the preferred solution going forward. The preferred solution is to keep the

[LLVMdev] Fwd: Prevention register promotion at the isel codegen phase

2012 Nov 24

[LLVMdev] Fwd: Prevention register promotion at the isel codegen phase

Sorry, forgot to Reply-All. Begin forwarded message: > From: Steve Montgomery <stephen.montgomery3 at btinternet.com> > Subject: Re: [LLVMdev] Prevention register promotion at the isel codegen phase > Date: 24 November 2012 17:09:58 GMT > To: Joseph Pusdesris <joe at pusdesris.com> > > I had a similar problem trying to implement reg-mem operations. The solution I

[LLVMdev] Codegen performance issue: LEA vs. INC.

2013 Oct 03

[LLVMdev] Codegen performance issue: LEA vs. INC.

The two address pass is only concerned about register pressure. It sounds like it should be taught about profitability. In cases where profitability can only be determined with something machinetracemetric then it probably should live it to more sophisticated pass like regalloc. In this case, we probably need a profitability target hook which knows about lea. We should also consider disabling

[LLVMdev] Codegen performance issue: LEA vs. INC.

2013 Oct 02

[LLVMdev] Codegen performance issue: LEA vs. INC.

This sounds like llvm.org/pr13320. On 17 September 2013 18:20, Bader, Aleksey A <aleksey.a.bader at intel.com> wrote: > Hi all. > > > > I’m looking for an advice on how to deal with inefficient code generation > for Intel Nehalem/Westmere architecture on 64-bit platform for the attached > test.cpp (LLVM IR is in test.cpp.ll). > > The inner loop has 11 iterations

[LLVMdev] Load serialisation during selection DAG building

2012 Aug 13

[LLVMdev] Load serialisation during selection DAG building

I've got a question about how SelectionDAGBuilder treats loads. The LLVM Language Reference Manual explicitly states that the order of volatile operations may be changed relative to non-volatile operations. However, when the SelectionDAGBuilder in LLVM 3.1 encounters a volatile load, it flushes all pending loads and then chains the volatile load onto them meaning that the volatile load must

how to find out lea instruction causes skype crash when starting

2012 Jun 24

how to find out lea instruction causes skype crash when starting

Hi david, I find you signed off a patch about "x86: emulate lea with two register operands correctly" .In this patch,you described skype does a lea instruction and will crash when starting if it does not get the exception.I have used a tool named mentorKG.exe to make a LICENSE.TXT.but the software crashes when starting.I used your patch,and find it works well.I wonder how can you

[LLVMdev] "make check" failures: leaq in fold-mul-lohi.ll, stride-nine-with-base-reg.ll, stride-reuse.ll

2008 Feb 11

[LLVMdev] "make check" failures: leaq in fold-mul-lohi.ll, stride-nine-with-base-reg.ll, stride-reuse.ll

I'm seeing the following failures with "make check" (x86-32 linux): FAIL: test/CodeGen/X86/fold-mul-lohi.ll Failed with exit(1) at line 2 while running: llvm-as < test/CodeGen/X86/fold-mul-lohi.ll | llc -march=x86-64 | not grep lea leaq B, %rsi leaq A, %r8 leaq P, %rsi child process exited abnormally FAIL:

[LLVMdev] Fwd: Prevention register promotion at the isel codegen phase

2012 Nov 24

[LLVMdev] Fwd: Prevention register promotion at the isel codegen phase

Yes, this is very helpful! Thank you! How does this work when exiting a variable's liveness range? Will it automatically know to free the stack slot for reuse? -Joe On Sat, Nov 24, 2012 at 12:23 PM, Steve Montgomery < stephen.montgomery3 at btinternet.com> wrote: > Sorry, forgot to Reply-All. > > Begin forwarded message: > > *From: *Steve Montgomery

[LLVMdev] Use of custom operations after DAG legalization

2014 Sep 26

[LLVMdev] Use of custom operations after DAG legalization

I've been working on a backend for a 16-bit microcontroller and I've just updated my base from LLVM 3.4 to LLVM 3.5.0. This threw up a regression failure in my test suite, and having tracked down the cause, I'm now confused about the DAG legalization and optimization process which I thought I understood. I'd be really grateful for advice on whether I've misunderstood how

similar to: [LLVMdev] Instruction Selection