similar to: [LLVMdev] X86 - Help on fixing a poor code generation bug

Displaying 20 results from an estimated 600 matches similar to: "[LLVMdev] X86 - Help on fixing a poor code generation bug"

2013 Dec 05
0
[LLVMdev] X86 - Help on fixing a poor code generation bug
Hi Andrea, Thanks for working on this. I can see two approaches to solving this problem. The first one (that you suggested) is to catch this pattern after register allocation. The second approach is to eliminate this redundancy during instruction selection. Can you please look into catching this pattern during iSel? The idea is that ADDSS does an ADD plus BLEND operations, and you can easily
2012 Oct 24
3
[LLVMdev] RegisterCoalescing Pass seems to ignore part of CFG.
Hi, I don't know if my llvm ir code is faulty, or if I spot a bug in the RegisterCoalescing Pass, so I'm posting my issue on the ML. Shader and print-before-all dump are given below. The interessing part is the vreg6/vreg48 reduction : before RegCoalescing, the machine code is : // BEFORE LOOP ... Some COPYs.... 400B%vreg47<def> = COPY %vreg2<kill>; R600_Reg32:%vreg47,%vreg2
2016 Mar 10
2
Greedy register allocator allocates live sub-register
Hi all, I've come across a problem with register allocation which I have been unable to track down the root cause of. 6728B %vreg304<def> = COPY %vreg278; VRF128:%vreg304,%vreg278 6736B %vreg302<def> = COPY %vreg278; VRF128:%vreg302,%vreg278 6752B %vreg278<def,tied1> = foo %vreg278<tied0>, %vreg277, 14, pred:1, pred:%noreg, 5; VRF128:%vreg278 VRF64_l:%vreg277 * bar
2015 Aug 16
2
[LLVMdev] Adding a stack probe function attribute
I started to implement inlining of the stack probe function based on Microsoft's inlined stack probes in https://github.com/Microsoft/llvm/tree/MS. Do we know why the stack pointer cannot be updated in a loop (which results in ideal code)? I noticed that was commented in Microsoft's code. I suspect this is due to debug or unwinding information, since it is allowed on Windows x86-32. I
2017 Nov 30
2
TwoAddressInstructionPass bug?
Hi, we are in the midst of an interesting work that begun with setting 'guessInstructionProperties = 0' in the SystemZ backend. We have found this to be useful, and discovered many instructions where the hasSideEffects flag was incorrectly set while it actually shouldn't. The attached patch and test case triggers an assert in TwoAddress.  (bin/llc ./tc_TwoAddr_crash.ll
2014 Sep 05
3
[LLVMdev] [PATCH] [MachineSinking] Conservatively clear kill flags after coalescing.
Hi Quentin, Jonas looked further into the problem below, and asked me to submit his patch. Note the we have our own out-of-tree target, and we have not been able to reproduce this problem on an in-tree target. /Patrik Hägglund [MachineSinking] Conservatively clear kill flags after coalescing. This solves the problem of having a kill flag inside a loop with a definition of the register prior to
2013 Sep 26
2
[LLVMdev] Register scavenger and SP/FP adjustments
Consider this example: --- ex.ll --- declare void @bar() ; Function Attrs: nounwind optsize define void @main() { entry: %hin = alloca [256 x i32], align 4 %xin = alloca [256 x i32], align 4 call void @bar() ret void } ------------- Freshly built llc: llc -O2 -march=x86 < ex.ll -print-before-all # *** IR Dump Before Prologue/Epilogue Insertion & Frame Finalization ***: #
2013 Sep 26
0
[LLVMdev] Register scavenger and SP/FP adjustments
The code has changed a lot over the years. Looks like at some point of time the assumption was broken. calculateCallsInformation() may have eliminated the pseudo set up instructions already. // If call frames are not being included as part of the stack frame, and
2013 Sep 26
1
[LLVMdev] Register scavenger and SP/FP adjustments
Thanks, I'll look into that. Still, the case where the function does not call anything remains---in such a situation there are no ADJCALLSTACK pseudos, so regardless of what that function you pointed at does, there won't be any target-independent information about the SP adjustment by the time the frame index elimination runs. Would it make sense to have ADJCALLSTACK pseudos every
2017 Jun 05
2
[newbie] trouble with global variables and CreateLoad/Store in JIT
Since the getelementptrs were implicitly generated by the CreateStore/Load I'm not sure how to get access to them. So I hacked the assignment to be done thrice: once using a manual decomposition into two GEPs and stores, once using the "big" CreateStore, once via the setGlobal function, printing addresses and memory contents at each point to the degree that I have access to them.
2014 Sep 05
5
[LLVMdev] [PATCH] [MachineSinking] Conservatively clear kill flags after coalescing.
On Sep 5, 2014, at 10:21 AM, Juergen Ributzka <juergen at apple.com> wrote: > clearKillFlags seems a little "overkill" to me. In this case you could just simply transfer the value of the kill flag from the SrcReg to the DstReg. We are extending the live-range of SrcReg. I do not see how you could relate that to the kill flag of DstReg. Therefore, I still think, this is the
2015 Aug 10
2
ARM: Predicated returns considered analyzable?
Hello, The function ARMBaseInstrInfo::AnalyzeBranch contains the following piece of code: } else if (I->isReturn()) { // Returns can't be analyzed, but we should run cleanup. CantAnalyze = !isPredicated(I); } else { This could lead to cases where for a block that ends with a conditional return, AnalyzeBranch returns false (i.e. analyzed), both TBB and FBB are
2015 Oct 13
2
MachineSink optimization in code containing a setjmp
Hello LLVM-dev, I think I've found an issue with the MachineSink optimization on a program that uses setjmp. It looks like MachineSink will happily move a machine instruction into a following machine basic block (not necessarily a successor), even when that later block can be reached through a setjmp. Here is some example debug output from llc that I'm seeing: Sinking along critical
2017 Jun 06
2
[newbie] trouble with global variables and CreateLoad/Store in JIT
On Mon, Jun 5, 2017 at 1:34 PM, Nikodemus Siivola < nikodemus at random-state.net> wrote: > Uh. Turns out that if I hide the pointer to @foo from LLVM by passing it > through an opaque identity function ... then everything works fine. > > Is this a bug in LLVM or is there some magic involving globals I'm > misunderstanding? > This looks like a bug in the handling of
2017 Aug 02
3
[InstCombine] Simplification sometimes only transforms but doesn't simplify instruction, causing side effect in other pass
Hi, We recently found a testcase showing that simplifications in instcombine sometimes change the instruction without reducing the instruction cost, but causing problems in TwoAddressInstruction pass. And it looks like the problem is generic and other simplification may have the same issue. I want to get some ideas about what is the best way to fix such kind of problem. The testcase:
2017 Jun 06
2
[newbie] trouble with global variables and CreateLoad/Store in JIT
That's useful to know that the static compilation code path works. Furthermore, as expected from that: 52: c7 05 04 00 00 00 d5 00 00 00 movl $213, 4 00000054: IMAGE_REL_I386_DIR32 _foo It looks like the offset `4` of the second field of your struct is correct in the object file, so this does seem to be a problem in the JIT-specific linking/loading.
2015 Aug 12
2
ARM: Predicated returns considered analyzable?
Doh. I missed the list in my first reply... Here's the replay of the conversation: ----- Renato: On 10 August 2015 at 14:05, Krzysztof Parzyszek via llvm-dev <llvm-dev at lists.llvm.org> wrote: > --> %SP<def,tied1> = t2LDMIA_RET %SP<tied0>, pred:8, pred:%CPSR, > %R7<def>, %PC<def>, %SP<imp-use,undef>, %R7<imp-use,undef>, >
2017 Jun 07
2
[newbie] trouble with global variables and CreateLoad/Store in JIT
My code was hinky, but only in the sense that I was accidentally duplicating the definition variable in the module where the function was. With only the declaration in the second module loading the bitcode reproduces the issue. Managed an lli reproduction: $ cat jit-0.ll target datalayout = "e-m:x-p:32:32-i64:64-f80:32-n8:16:32-a:0:32-S32" target triple =
2017 Aug 02
3
[InstCombine] Simplification sometimes only transforms but doesn't simplify instruction, causing side effect in other pass
On Wed, Aug 2, 2017 at 3:36 PM, Matthias Braun <mbraun at apple.com> wrote: > So to write this in a more condensed form, you have: > > %v0 = ... > %v1 = and %v0, 255 > %v2 = and %v1, 31 > use %v1 > use %v2 > > and transform this to > %v0 = ... > %v1 = and %v0, 255 > %v2 = and %v0, 31 > ... > > This is a classical problem with instruction
2013 May 09
2
[LLVMdev] Predicated Vector Operations
On May 9, 2013, at 3:05 PM, Jeff Bush <jeffbush001 at gmail.com> wrote: > On Thu, May 9, 2013 at 8:10 AM, <dag at cray.com> wrote: >> Jeff Bush <jeffbush001 at gmail.com> writes: >> >>> %tx = select %mask, %x, <0.0, 0.0, 0.0 ...> >>> %ty = select %mask, %y, <0.0, 0.0, 0.0 ...> >>> %sum = fadd %tx, %ty >>> %newvalue