thr3ads.net - similar to: "Missed optimization - spill/load generated instead of reg-to-reg move (and two other questions)"

Missed optimization - spill/load generated instead of reg-to-reg move (and two other questions)

2018 Feb 28

On 02/27/2018 10:21 AM, Alex Wang via llvm-dev wrote:
> Hello all!
>
> I was looking through the results of disassembling a heavily-used short function in the program I'm working on, and ended up wondering why LLVM was generating that assembly and what changes would be necessary to improve the code. I asked on #llvm, but it seems that the people with

[LLVMdev] help with X86 DAG->DAG Instruction Selection

2013 Feb 08

I have an LLVM IR, which generates the following machine code using llc (llvm 3.0 on win32) after # *** IR Dump After X86 DAG->DAG Instruction Selection ***: The first three lines and the last two lines together are used to compute "sin" for some double number.
- line 1: move the stack pointer down 8
- line 2: copy the updated stack pointer to a base register
- line 3: copy a

[LLVMdev] help with X86 DAG->DAG Instruction Selection

2013 Feb 08

Hi Peng, Can you please open a bugzilla and attach the LL file? Can you please reproduce it on ToT? Thanks, Nadav

On Feb 7, 2013, at 9:08 PM, Peng Cheng <gm4cheng at gmail.com> wrote:
> I have an llvm ir, which generates the following machine code using llc (llvm 3.0 on win32) after # *** IR Dump After X86 DAG->DAG Instruction Selection ***:
>
> The first three lines

Mischeduler: Unknown reason for peak register pressure increase

2017 Aug 12

I am working on a project where we are integrating an existing pre-RA scheduler into LLVM, and we are trying to match our peak register pressure values with the machine instruction scheduler's values while using X86. I am finding some mismatches in test cases like the one attached. The registers "AH" and "AL" are live-out but not live-in and I don't see that they are defined

[LLVMdev] AVX spill alignment

2011 Aug 25

Hey guys, Are spills/reloads of AVX registers using aligned stores/loads? I can't seem to find the code that aligns the stack slots to 32 bytes. Could someone point me in the right direction? Thanks, Cameron

reg coalescing improvements

2017 Aug 17

Hi, I am seeing cases of poorly coalesced IV updates on SystemZ: In the final IR, it is obvious that

    %R4D<def> = LA %R2D<kill>, 4, %noreg // R4 = R2 + 4
    %R2D<def> = LGR %R4D<kill> // R2 = R4

could be optimized to ->

    %R2D<def> = LA %R2D<kill>, 4, %noreg // R2 = R2 + 4

The reason this wasn't coalesced, is

[LLVMdev] AVX spill alignment

2011 Sep 01

On Aug 25, 2011, at 4:17 PM, Cameron McInally wrote:
> Hey guys,
>
> Are spills/reloads of AVX registers using aligned stores/loads?

Yes.

> I can't seem to find the code that aligns the stack slots to 32-bytes. Could someone point me in the right direction?

The register class has 256-bit spill alignment:

    def VR256 : RegisterClass<"X86", [v32i8, v16i16,

2019 Oct 25

Hello, I have studied the theory of register allocation and am now exploring it at the implementation level. I need a minimal test case for register spilling to analyze the spilling procedure in LLVM. I tried a test case with 20 variables, but all 20 variables are stored on the stack using %rbp. Maybe my live variable analysis is wrong. Please help me with a minimal testcase

[LLVMdev] X86 sub_ss and sub_sd sub-register indexes

2012 Jul 26

All, I've been trying to simplify the way LLVM models sub-register relationships a bit, and the X86 sub_ss and sub_sd sub-register indices are getting in the way. I want to get rid of them. These sub-registers are special, they are only mentioned here:

    let CompositeIndices = [(sub_ss), (sub_sd)] in {
    def XMM0: Register<"xmm0">, DwarfRegNum<[17, 21, 21]>;
    def

[LLVMdev] X86 sub_ss and sub_sd sub-register indexes

2012 Jul 26

Jakob Stoklund Olesen <jolesen at apple.com> writes:
> These sub-registers are special, they are only mentioned here:
>
> let CompositeIndices = [(sub_ss), (sub_sd)] in {
> def XMM0: Register<"xmm0">, DwarfRegNum<[17, 21, 21]>;
> def XMM1: Register<"xmm1">, DwarfRegNum<[18, 22, 22]>;
> ...

I'm confused. Below you

[LLVMdev] Register class intersection

2009 Apr 28

When the coalescer is run with -join-cross-class-copies it needs to determine the register class of the joined virtual registers. The new register class must be compatible with both old register classes. The current implementation chooses the register class with the larger spill size, or the less populous class. This works with the current targets, but it can produce illegal machine code

[LLVMdev] X86 sub_ss and sub_sd sub-register indexes

2012 Jul 26

On Jul 26, 2012, at 9:43 AM, dag at cray.com wrote:
> Jakob Stoklund Olesen <jolesen at apple.com> writes:
>
>> As far as I can tell, all sub-register operations involving sub_ss and sub_sd can simply be replaced with COPY_TO_REGCLASS:
>>
>> def : Pat<(v4i32 (X86Movsd VR128:$src1, VR128:$src2)),
>> (VMOVSDrr VR128:$src1,

[LLVMdev] Patterns with Multiple Stores

2008 Nov 17

I want to write a pattern that looks something like this:

    def : Pat<(unalignedstore (v2f64 VR128:$src), addr:$dst),
    (MOVSDmr ADD64ri8(addr:$dst, imm:8), ( SHUFPDrri (VR128:$src, (MOVSDmr addr:$dst, FR64:$src))), imm:3)

So I want to convert an unaligned vector store to a scalar store, a shuffle and a scalar store. There are several questions I have:
- Is the imm:3 syntax

[LLVMdev] Call instruction

2007 Sep 07

My home e-mail is down, which is where I get my llvm feeds, so please copy any replies to this address as well as the list. The call instruction can define implicit defs. What are the semantics when the call includes a use with a kill of some register and also an implicit def of that register? Is the register to be considered live out at that point? I've found a failing testcase where

How to get the case value from Machine Instruction

2018 Apr 09

Hi guys, I am interested in how to get the switch case value from the Machine Instruction. I know the switch will be converted to a jump table in the Machine Instruction. In the CodeGen phase, the case value of a SwitchInst can be obtained easily, but there seems to be no case value in the Machine Instruction. The MI is as follows:

    Frame Objects:
    fi#0: size=1, align=0, at location [SP]
    fi#1: size=4,

[LLVMdev] Problem with MachineFunctionPass and JMP

2013 May 13

Hi! I'm trying to modify the code in a machine function pass… I added a new basic block and I want to add a jump from my new BB to another BB. Here is my code:

    bool Obfuscation::runOnMachineFunction(MachineFunction &MF) {
      MachineBasicBlock *newEntry = MF.CreateMachineBasicBlock();
      MF.insert(MF.begin(), newEntry);
      std::vector<MachineBasicBlock*> origBB;

How to get the case value from Machine Instruction

2018 Apr 09

Some glitch in the emailer? I have received this message 3 times in a row!? I think that by the time it gets as far as MI-level there is no reversible method of determining the 'case' label at all. The reason I say this, is that I have often seen optimisations that coalesce groups of values into interesting logical tests and jump-tables are completely avoided. For example, a simple

[LLVMdev] Missing optimization - constant parameter

2013 Aug 02

For the little C test program where a constant is stored in memory and also used as a parameter:

    #include <stdint.h>
    uint64_t val, *p;
    extern uint64_t xtr( uint64_t);
    uint64_t caller() {
      uint64_t x;
      p = &val;
      x = 12345123400L;
      *p = x;
      return xtr(x);
    }

clang (3.2, 3.3 and svn) generates the following X86 code (at -O3):

    caller:
    movq

How to get the case value from Machine Instruction

2018 Apr 10

Thanks for your help. Is it possible to get the real case value from the MI? For the case in https://bugs.llvm.org/show_bug.cgi?id=34902, as follows:

    #############################
    * GCC v7.1 generated assembly
    #############################
    ** Options: -Os -marm -march=armv7-a
    foo:
    @ args = 0, pretend = 0, frame = 0
    @ frame_needed = 0, uses_anonymous_args = 0
    sub

[LLVMdev] Reducing .td redundancy

2009 Mar 24

On Tuesday 24 March 2009 10:43, Chris Lattner wrote:
> On Mar 23, 2009, at 5:56 PM, David Greene wrote:
> > Is it legal to do something like a !strconcat on a non-string entity? That is, is there some operation that will let me do this (replace SOME_CONCAT with an appropriate operator):
>
> I don't get it, can you try a simpler example on
