thr3ads.net - similar to: "[LLVMdev] [RFC] [X86] Mov to push transformation in x86-32 call sequences"

[LLVMdev] [RFC] [X86] Mov to push transformation in x86-32 call sequences

2014 Dec 21

Which performance guidelines are you referring to? I'm not that familiar with decade-old CPUs, but to the best of my knowledge, this is not true on current hardware. There is one specific circumstance where PUSHes should be avoided - for Atom/Silvermont processors, the memory form of PUSH is inefficient, so the register-freeing optimization below may not be profitable (see 14.3.3.6 and

[LLVMdev] help with X86 DAG->DAG Instruction Selection

2013 Feb 08

I have an llvm ir, which generates the following machine code using llc (llvm 3.0 on win32) after # *** IR Dump After X86 DAG->DAG Instruction Selection ***:

The first three lines and the last two lines alone together are used to compute "sin" for some double number.
- line 1: move the stack pointer down 8
- line 2: copy the updated stack pointer to a base register
- line 3: copy a

[LLVMdev] help with X86 DAG->DAG Instruction Selection

2013 Feb 08

Hi Peng,

Can you please open a bugzilla and attach the LL file? Can you please reproduce it on ToT?

Thanks,
Nadav

On Feb 7, 2013, at 9:08 PM, Peng Cheng <gm4cheng at gmail.com> wrote:
> I have an llvm ir, which generates the following machine code using llc (llvm 3.0 on win32) after # *** IR Dump After X86 DAG->DAG Instruction Selection ***:
>
> The first three lines

[InstCombine] Simplification sometimes only transforms but doesn't simplify instruction, causing side effect in other pass

2017 Aug 02

Hi, We recently found a testcase showing that simplifications in instcombine sometimes change an instruction without reducing its cost, while causing problems in the TwoAddressInstruction pass. The problem looks generic, and other simplifications may have the same issue. I want to get some ideas about the best way to fix this kind of problem. The testcase:

[LLVMdev] Virtual register problem in X86 backend

2014 Dec 10

Hi, Thx for your help... Here is the IR code:

; ModuleID = 'foo_bar.c'
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

@.str = private unnamed_addr constant [6 x i8] c"MAIN\0A\00", align 1

; Function Attrs: nounwind uwtable
define i32 @main(i32 %argc, i8** %argv) #0 {
entry:
  %retval = alloca i32,

[LLVMdev] Virtual register problem in X86 backend

2014 Dec 08

Hi, I'm having trouble using a virtual register in the X86 backend. I implemented a new intrinsic and I use a custom inserter. The goal of the intrinsic is to set the content of the stack to zero at the end of each function. Here is my code:

MachineBasicBlock * X86TargetLowering::EmitBURNSTACKWithCustomInserter( MachineInstr *MI, MachineBasicBlock

[InstCombine] Simplification sometimes only transforms but doesn't simplify instruction, causing side effect in other pass

2017 Aug 02

On Wed, Aug 2, 2017 at 3:36 PM, Matthias Braun <mbraun at apple.com> wrote:
> So to write this in a more condensed form, you have:
>
> %v0 = ...
> %v1 = and %v0, 255
> %v2 = and %v1, 31
> use %v1
> use %v2
>
> and transform this to
> %v0 = ...
> %v1 = and %v0, 255
> %v2 = and %v0, 31
> ...
>
> This is a classical problem with instruction
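As a side illustration of the quoted example (not code from the thread), the two forms can be checked for value equivalence with a small Python sketch. The function names `before`, `after`, and `equivalent_on_range` are hypothetical; the sketch only models the two `and` chains and does not model register allocation or the TwoAddressInstruction pass.

```python
# Toy model of the two 'and' chains quoted above; all names are hypothetical.

def before(v0):
    """Original form: %v2 is computed from %v1."""
    v1 = v0 & 255
    v2 = v1 & 31
    return v1, v2

def after(v0):
    """Transformed form: %v2 is computed directly from %v0."""
    v1 = v0 & 255
    v2 = v0 & 31
    return v1, v2

def equivalent_on_range(limit=1 << 16):
    """Check that both forms produce the same (v1, v2) for 0 <= v0 < limit."""
    return all(before(x) == after(x) for x in range(limit))
```

Both forms yield the same values because every bit set in 31 is also set in 255. What differs is only which value `%v2` reads: in the transformed form `%v0` has two users instead of one.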

[LLVMdev] MachineSink and EFLAGS

2011 Jun 03

On Jun 3, 2011, at 2:59 AM, Galanov, Sergey wrote:
> Hi, Bill and Jakob.
>
> I don't quite understand. I am talking about CMOV_GR* instructions which are conservatively marked as clobbering EFLAGS in X86InstrCompiler.td. Doesn't that mean there cannot be any use of EFLAGS in subsequent instructions before it is defined by some other instruction?
>
> I also don't

[LLVMdev] MachineSink and EFLAGS

2011 Jun 05

Thanks for spelling it out, now I understand.

On Jun 5, 2011, at 6:11 AM, Galanov, Sergey wrote:
> Well, the point is CMOV_GR* are marked clobbering EFLAGS conservatively just in case they turn out to be lowered into a sequence containing XOR %reg,%reg which indeed clobbers EFLAGS. This means there might not be any instruction which actually uses this EFLAGS value. This actually looks like a

[LLVMdev] MachineSink and EFLAGS

2011 Jun 05

Well, the point is CMOV_GR* are marked clobbering EFLAGS conservatively just in case they turn out to be lowered into a sequence containing XOR %reg,%reg which indeed clobbers EFLAGS. This means there might not be any instruction which actually uses this EFLAGS value. For an example we can look no further than the actual test which has been disabled after the fix

[LLVMdev] Register scavenger and SP/FP adjustments

2013 Sep 26

Consider this example:

--- ex.ll ---
declare void @bar()

; Function Attrs: nounwind optsize
define void @main() {
entry:
  %hin = alloca [256 x i32], align 4
  %xin = alloca [256 x i32], align 4
  call void @bar()
  ret void
}
-------------

Freshly built llc:

llc -O2 -march=x86 < ex.ll -print-before-all
# *** IR Dump Before Prologue/Epilogue Insertion & Frame Finalization ***: #

[LLVMdev] Register scavenger and SP/FP adjustments

2013 Sep 26

CallFrameSetupOpcode is a pseudo opcode like X86::ADJCALLSTACKDOWN64. That means the code is expected to be called before the pseudo instructions are eliminated. I don't know why that's not the case for you. A quick look at the PEI code indicates the pseudos should not have been removed by the time replaceFrameIndices is run.

Evan

On Sep 25, 2013, at 8:57 AM, Krzysztof

[newbie] trouble with global variables and CreateLoad/Store in JIT

2017 Jun 06

On Mon, Jun 5, 2017 at 1:34 PM, Nikodemus Siivola < nikodemus at random-state.net> wrote: > Uh. Turns out that if I hide the pointer to @foo from LLVM by passing it > through an opaque identity function ... then everything works fine. > > Is this a bug in LLVM or is there some magic involving globals I'm > misunderstanding? > This looks like a bug in the handling of

[newbie] trouble with global variables and CreateLoad/Store in JIT

2017 Jun 06

That's useful to know that the static compilation code path works. Furthermore, as expected from that:

52: c7 05 04 00 00 00 d5 00 00 00 movl $213, 4
00000054: IMAGE_REL_I386_DIR32 _foo

It looks like the offset `4` of the second field of your struct is correct in the object file, so this does seem to be a problem in the JIT-specific linking/loading.

[LLVMdev] Register scavenger and SP/FP adjustments

2013 Sep 26

The code has changed a lot over the years. It looks like the assumption was broken at some point. calculateCallsInformation() may have already eliminated the pseudo setup instructions.

// If call frames are not being included as part of the stack frame, and

[LLVMdev] liveness assertion problem in llc

2012 Sep 18

On Sep 18, 2012, at 1:45 PM, Bjorn De Sutter <bjorn.desutter at elis.ugent.be> wrote: > I am working on a backend for a CGRA architecture with advanced predicate support (as on EPIC machines and as first used in the OpenIMPACT compiler). Until last month, the backend was working fine, but since the r161643 commit by stoklund, my backend doesn't work anymore. I think I noticed some

[LLVMdev] Register scavenger and SP/FP adjustments

2013 Sep 25

Hi All, I'm dealing with a problem where the spill/restore instructions inserted during scavenging span an adjustment of the SP/FP register. The result is that despite the base register (SP/FP) being changed between the spill and the restore, both store and load use the same immediate offset. I see code in the PEI (replaceFrameIndices) that is supposed to track the SP/FP adjustment:
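The snippet above is cut off before the PEI code it refers to. Separately from that code, the problem it describes can be illustrated with a toy Python model, which is not LLVM code: a value is stored at `[SP + offset]`, SP then changes, and a load at the same immediate offset no longer reaches the stored slot. The function name `spill_then_restore` and its parameters are hypothetical.

```python
# Toy model of a spill/restore pair spanning an SP adjustment.
# Not LLVM code; the function and parameter names are hypothetical.

def spill_then_restore(sp, offset, sp_adjust, compensate):
    memory = {}
    memory[sp + offset] = "spilled"   # spill: store at [SP + offset]
    sp += sp_adjust                   # SP is adjusted between spill and restore
    # Without compensation the restore reuses the same immediate offset;
    # with compensation the offset is corrected by the SP change.
    load_offset = offset - sp_adjust if compensate else offset
    return memory.get(sp + load_offset)  # restore: load from [SP + load_offset]
```

With `sp=1000`, `offset=8`, and `sp_adjust=-16`, the uncompensated restore reads address 992 and misses the slot at 1008, while the compensated restore uses offset 24 and reads 1008.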

[LLVMdev] Register scavenger and SP/FP adjustments

2013 Sep 26

Thanks, I'll look into that. Still, the case where the function does not call anything remains---in such a situation there are no ADJCALLSTACK pseudos, so regardless of what that function you pointed at does, there won't be any target-independent information about the SP adjustment by the time the frame index elimination runs. Would it make sense to have ADJCALLSTACK pseudos every

[newbie] trouble with global variables and CreateLoad/Store in JIT

2017 Jun 07

My code was hinky, but only in the sense that I was accidentally duplicating the variable definition in the module where the function was. With only the declaration in the second module, loading the bitcode reproduces the issue. Managed an lli reproduction:

$ cat jit-0.ll
target datalayout = "e-m:x-p:32:32-i64:64-f80:32-n8:16:32-a:0:32-S32"
target triple =

[LLVMdev] RegAllocFast uses too much stack

2011 Jul 11

I discovered recently that RegAllocFast spills all the registers before every function call. This is the root cause of one of our recursive functions that takes about 150 bytes of stack when built with gcc (same at -O0 and -O2, or 120 bytes at llc -O2) taking 960 bytes of stack when built by llc -O0. That's pretty bad for situations where you have small stacks, which is not uncommon for
