Displaying 20 results from an estimated 600 matches similar to: "[LLVMdev] X86 - Help on fixing a poor code generation bug"
2013 Dec 05
0
[LLVMdev] X86 - Help on fixing a poor code generation bug
Hi Andrea,
Thanks for working on this. I can see two approaches to solving this problem. The first one (that you suggested) is to catch this pattern after register allocation. The second approach is to eliminate this redundancy during instruction selection. Can you please look into catching this pattern during iSel? The idea is that ADDSS does an ADD plus BLEND operations, and you can easily
2012 Oct 24
3
[LLVMdev] RegisterCoalescing Pass seems to ignore part of CFG.
Hi,
I don't know if my llvm ir code is faulty, or if I spot a bug in the RegisterCoalescing Pass, so I'm posting my issue on the ML. Shader and print-before-all dump are given below.
The interessing part is the vreg6/vreg48 reduction : before RegCoalescing, the machine code is :
// BEFORE LOOP
... Some COPYs....
400B%vreg47<def> = COPY %vreg2<kill>; R600_Reg32:%vreg47,%vreg2
2016 Mar 10
2
Greedy register allocator allocates live sub-register
Hi all,
I've come across a problem with register allocation which I have been
unable to track down the root cause of.
6728B %vreg304<def> = COPY %vreg278; VRF128:%vreg304,%vreg278
6736B %vreg302<def> = COPY %vreg278; VRF128:%vreg302,%vreg278
6752B %vreg278<def,tied1> = foo %vreg278<tied0>, %vreg277, 14, pred:1,
pred:%noreg, 5; VRF128:%vreg278 VRF64_l:%vreg277
* bar
2015 Aug 16
2
[LLVMdev] Adding a stack probe function attribute
I started to implement inlining of the stack probe function based on
Microsoft's inlined stack probes in
https://github.com/Microsoft/llvm/tree/MS.
Do we know why the stack pointer cannot be updated in a loop (which
results in ideal code)? I noticed that was commented in Microsoft's
code.
I suspect this is due to debug or unwinding information, since it is
allowed on Windows x86-32.
I
2017 Nov 30
2
TwoAddressInstructionPass bug?
Hi,
we are in the midst of an interesting work that begun with setting
'guessInstructionProperties = 0' in the SystemZ backend. We have found
this to be useful, and discovered many instructions where the
hasSideEffects flag was incorrectly set while it actually shouldn't.
The attached patch and test case triggers an assert in TwoAddress.
(bin/llc ./tc_TwoAddr_crash.ll
2014 Sep 05
3
[LLVMdev] [PATCH] [MachineSinking] Conservatively clear kill flags after coalescing.
Hi Quentin,
Jonas looked further into the problem below, and asked me to submit his patch. Note the we have our own out-of-tree target, and we have not been able to reproduce this problem on an in-tree target. /Patrik Hägglund
[MachineSinking] Conservatively clear kill flags after coalescing.
This solves the problem of having a kill flag inside a loop
with a definition of the register prior to
2013 Sep 26
2
[LLVMdev] Register scavenger and SP/FP adjustments
Consider this example:
--- ex.ll ---
declare void @bar()
; Function Attrs: nounwind optsize
define void @main() {
entry:
%hin = alloca [256 x i32], align 4
%xin = alloca [256 x i32], align 4
call void @bar()
ret void
}
-------------
Freshly built llc:
llc -O2 -march=x86 < ex.ll -print-before-all
# *** IR Dump Before Prologue/Epilogue Insertion & Frame Finalization ***:
#
2013 Sep 26
0
[LLVMdev] Register scavenger and SP/FP adjustments
The code has changed a lot over the years. Looks like at some point of time the assumption was broken. calculateCallsInformation() may have eliminated the pseudo set up instructions already.
// If call frames are not being included as part of the stack frame, and
2013 Sep 26
1
[LLVMdev] Register scavenger and SP/FP adjustments
Thanks, I'll look into that. Still, the case where the function does
not call anything remains---in such a situation there are no
ADJCALLSTACK pseudos, so regardless of what that function you pointed at
does, there won't be any target-independent information about the SP
adjustment by the time the frame index elimination runs.
Would it make sense to have ADJCALLSTACK pseudos every
2017 Jun 05
2
[newbie] trouble with global variables and CreateLoad/Store in JIT
Since the getelementptrs were implicitly generated by the CreateStore/Load
I'm not sure how to get access to them.
So I hacked the assignment to be done thrice: once using a manual
decomposition into two GEPs and stores, once using the "big" CreateStore,
once via the setGlobal function, printing addresses and memory contents at
each point to the degree that I have access to them.
2014 Sep 05
5
[LLVMdev] [PATCH] [MachineSinking] Conservatively clear kill flags after coalescing.
On Sep 5, 2014, at 10:21 AM, Juergen Ributzka <juergen at apple.com> wrote:
> clearKillFlags seems a little "overkill" to me. In this case you could just simply transfer the value of the kill flag from the SrcReg to the DstReg.
We are extending the live-range of SrcReg. I do not see how you could relate that to the kill flag of DstReg.
Therefore, I still think, this is the
2015 Aug 10
2
ARM: Predicated returns considered analyzable?
Hello,
The function ARMBaseInstrInfo::AnalyzeBranch contains the following
piece of code:
} else if (I->isReturn()) {
// Returns can't be analyzed, but we should run cleanup.
CantAnalyze = !isPredicated(I);
} else {
This could lead to cases where for a block that ends with a
conditional return, AnalyzeBranch returns false (i.e. analyzed),
both TBB and FBB are
2015 Oct 13
2
MachineSink optimization in code containing a setjmp
Hello LLVM-dev,
I think I've found an issue with the MachineSink optimization on a program
that uses setjmp. It looks like MachineSink will happily move a machine
instruction into a following machine basic block (not necessarily a
successor), even when that later block can be reached through a setjmp.
Here is some example debug output from llc that I'm seeing:
Sinking along critical
2017 Jun 06
2
[newbie] trouble with global variables and CreateLoad/Store in JIT
On Mon, Jun 5, 2017 at 1:34 PM, Nikodemus Siivola <
nikodemus at random-state.net> wrote:
> Uh. Turns out that if I hide the pointer to @foo from LLVM by passing it
> through an opaque identity function ... then everything works fine.
>
> Is this a bug in LLVM or is there some magic involving globals I'm
> misunderstanding?
>
This looks like a bug in the handling of
2017 Aug 02
3
[InstCombine] Simplification sometimes only transforms but doesn't simplify instruction, causing side effect in other pass
Hi,
We recently found a testcase showing that simplifications in
instcombine sometimes change the instruction without reducing the
instruction cost, but causing problems in TwoAddressInstruction pass.
And it looks like the problem is generic and other simplification may
have the same issue. I want to get some ideas about what is the best
way to fix such kind of problem.
The testcase:
2017 Jun 06
2
[newbie] trouble with global variables and CreateLoad/Store in JIT
That's useful to know that the static compilation code path works.
Furthermore, as expected from that:
52: c7 05 04 00 00 00 d5 00 00 00 movl $213, 4
00000054: IMAGE_REL_I386_DIR32 _foo
It looks like the offset `4` of the second field of your struct is correct
in the object file, so this does seem to be a problem in the JIT-specific
linking/loading.
2015 Aug 12
2
ARM: Predicated returns considered analyzable?
Doh. I missed the list in my first reply... Here's the replay of the
conversation:
----- Renato:
On 10 August 2015 at 14:05, Krzysztof Parzyszek via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> --> %SP<def,tied1> = t2LDMIA_RET %SP<tied0>, pred:8, pred:%CPSR,
> %R7<def>, %PC<def>, %SP<imp-use,undef>, %R7<imp-use,undef>,
>
2017 Jun 07
2
[newbie] trouble with global variables and CreateLoad/Store in JIT
My code was hinky, but only in the sense that I was accidentally
duplicating the definition variable in the module where the function was.
With only the declaration in the second module loading the bitcode
reproduces the issue.
Managed an lli reproduction:
$ cat jit-0.ll
target datalayout = "e-m:x-p:32:32-i64:64-f80:32-n8:16:32-a:0:32-S32"
target triple =
2017 Aug 02
3
[InstCombine] Simplification sometimes only transforms but doesn't simplify instruction, causing side effect in other pass
On Wed, Aug 2, 2017 at 3:36 PM, Matthias Braun <mbraun at apple.com> wrote:
> So to write this in a more condensed form, you have:
>
> %v0 = ...
> %v1 = and %v0, 255
> %v2 = and %v1, 31
> use %v1
> use %v2
>
> and transform this to
> %v0 = ...
> %v1 = and %v0, 255
> %v2 = and %v0, 31
> ...
>
> This is a classical problem with instruction
2013 May 09
2
[LLVMdev] Predicated Vector Operations
On May 9, 2013, at 3:05 PM, Jeff Bush <jeffbush001 at gmail.com> wrote:
> On Thu, May 9, 2013 at 8:10 AM, <dag at cray.com> wrote:
>> Jeff Bush <jeffbush001 at gmail.com> writes:
>>
>>> %tx = select %mask, %x, <0.0, 0.0, 0.0 ...>
>>> %ty = select %mask, %y, <0.0, 0.0, 0.0 ...>
>>> %sum = fadd %tx, %ty
>>> %newvalue