Displaying 20 results from an estimated 4000 matches similar to: "[LLVMdev] Marking implicit registers as "trashed""
2012 Oct 24
3
[LLVMdev] RegisterCoalescing Pass seems to ignore part of CFG.
Hi,
I don't know if my llvm ir code is faulty, or if I spot a bug in the RegisterCoalescing Pass, so I'm posting my issue on the ML. Shader and print-before-all dump are given below.
The interessing part is the vreg6/vreg48 reduction : before RegCoalescing, the machine code is :
// BEFORE LOOP
... Some COPYs....
400B%vreg47<def> = COPY %vreg2<kill>; R600_Reg32:%vreg47,%vreg2
2013 May 13
1
[LLVMdev] Tracking down a SELECT expansion to predicated moves
I've inherited some target code, but there is no SELECT lowering in my
target. But somewhere/somehow SELECT is being transformed into a
predicated move. I've traced SELECT everywhere in Codegen/SelectionDAG.
Any ideas on tracking this down to the point in Codegen
lowering/dag-conversion to a predicated series? Again, I do not have a
lowering rule in my target for SELECT.
If I do a IR
2014 Oct 27
4
[LLVMdev] Problem in X86 backend
Hi,
I'm having some trouble wirting an instruction in the X86 backend.
I made a new intrinsic and I wrote a custom inserter for my intrinsic in the X86 backend.
Everything works fine, except for one instruction that I can't find how to write.
I want to add this instruction in one of my machine basic block: mov [rdi], 0
How can I achieve that with the LLVM api? I tried several
2017 Nov 30
2
TwoAddressInstructionPass bug?
Hi,
we are in the midst of an interesting work that begun with setting
'guessInstructionProperties = 0' in the SystemZ backend. We have found
this to be useful, and discovered many instructions where the
hasSideEffects flag was incorrectly set while it actually shouldn't.
The attached patch and test case triggers an assert in TwoAddress.
(bin/llc ./tc_TwoAddr_crash.ll
2013 Sep 26
2
[LLVMdev] Register scavenger and SP/FP adjustments
Consider this example:
--- ex.ll ---
declare void @bar()
; Function Attrs: nounwind optsize
define void @main() {
entry:
%hin = alloca [256 x i32], align 4
%xin = alloca [256 x i32], align 4
call void @bar()
ret void
}
-------------
Freshly built llc:
llc -O2 -march=x86 < ex.ll -print-before-all
# *** IR Dump Before Prologue/Epilogue Insertion & Frame Finalization ***:
#
2017 Aug 02
3
[InstCombine] Simplification sometimes only transforms but doesn't simplify instruction, causing side effect in other pass
Hi,
We recently found a testcase showing that simplifications in
instcombine sometimes change the instruction without reducing the
instruction cost, but causing problems in TwoAddressInstruction pass.
And it looks like the problem is generic and other simplification may
have the same issue. I want to get some ideas about what is the best
way to fix such kind of problem.
The testcase:
2013 Sep 26
0
[LLVMdev] Register scavenger and SP/FP adjustments
CallFrameSetupOpcode is a pseudo opcode like X86::ADJCALLSTACKDOWN64. That means when the code is expected to be called before the pseudo instructions are eliminated. I don't know why it's not the case for you. A quick look at PEI code indicates the pseudo's should not have been removed at the time when replaceFrameIndices are run.
Evan
On Sep 25, 2013, at 8:57 AM, Krzysztof
2013 Sep 25
2
[LLVMdev] Register scavenger and SP/FP adjustments
Hi All,
I'm dealing with a problem where the spill/restore instructions inserted
during scavenging span an adjustment of the SP/FP register. The result
is that despite the base register (SP/FP) being changed between the
spill and the restore, both store and load use the same immediate offset.
I see code in the PEI (replaceFrameIndices) that is supposed to track
the SP/FP adjustment:
2013 Sep 26
0
[LLVMdev] Register scavenger and SP/FP adjustments
The code has changed a lot over the years. Looks like at some point of time the assumption was broken. calculateCallsInformation() may have eliminated the pseudo set up instructions already.
// If call frames are not being included as part of the stack frame, and
2014 Oct 24
2
[LLVMdev] Adding masked vector load and store intrinsics
> So %passthrough can *only* be undef or zeroinitializer?
No, that wasn't the intent. %passthrough can be any other definition that is needed. Zero and undef were simply two possible values that illustrated some interesting behavior.
Mapping of the %passthrough to the actual semantics of many vector instruction sets where the masked instructions leave the masked-off elements of the
2017 Aug 02
3
[InstCombine] Simplification sometimes only transforms but doesn't simplify instruction, causing side effect in other pass
On Wed, Aug 2, 2017 at 3:36 PM, Matthias Braun <mbraun at apple.com> wrote:
> So to write this in a more condensed form, you have:
>
> %v0 = ...
> %v1 = and %v0, 255
> %v2 = and %v1, 31
> use %v1
> use %v2
>
> and transform this to
> %v0 = ...
> %v1 = and %v0, 255
> %v2 = and %v0, 31
> ...
>
> This is a classical problem with instruction
2013 Sep 26
1
[LLVMdev] Register scavenger and SP/FP adjustments
Thanks, I'll look into that. Still, the case where the function does
not call anything remains---in such a situation there are no
ADJCALLSTACK pseudos, so regardless of what that function you pointed at
does, there won't be any target-independent information about the SP
adjustment by the time the frame index elimination runs.
Would it make sense to have ADJCALLSTACK pseudos every
2016 Mar 10
2
Greedy register allocator allocates live sub-register
Hi all,
I've come across a problem with register allocation which I have been
unable to track down the root cause of.
6728B %vreg304<def> = COPY %vreg278; VRF128:%vreg304,%vreg278
6736B %vreg302<def> = COPY %vreg278; VRF128:%vreg302,%vreg278
6752B %vreg278<def,tied1> = foo %vreg278<tied0>, %vreg277, 14, pred:1,
pred:%noreg, 5; VRF128:%vreg278 VRF64_l:%vreg277
* bar
2012 Oct 25
0
[LLVMdev] RegisterCoalescing Pass seems to ignore part of CFG.
Hi Vincent,
On 24/10/2012 23:26, Vincent Lejeune wrote:
> Hi,
>
> I don't know if my llvm ir code is faulty, or if I spot a bug in the RegisterCoalescing Pass, so I'm posting my issue on the ML. Shader and print-before-all dump are given below.
>
> The interessing part is the vreg6/vreg48 reduction : before RegCoalescing, the machine code is :
>
> // BEFORE LOOP
>
2016 Mar 05
2
[VSXFMAMutate] OldFMAReg may be wrongly rewritten
I wonder if we can do this in a separate analysis MachineFunction SSA pass.
1) SelectionDAG will generate a pseudo instruction MutatingFMA. When it's
generated it's allowed to have d = a * b + c form, where d doesn't have to
be in {a, b, c}.
2) Later, the proposed pass uses an algorithm to decide for instruction MI:
`%vreg0 = MutatingFMA %vreg1, %vreg2, %vreg3`, it should tie %vreg0
2014 Dec 10
2
[LLVMdev] Virtual register problem in X86 backend
Hi,
Thx for your help...
Here is the IR code:
; ModuleID = 'foo_bar.c'
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
@.str = private unnamed_addr constant [6 x i8] c"MAIN\0A\00", align 1
; Function Attrs: nounwind uwtable
define i32 @main(i32 %argc, i8** %argv) #0 {
entry:
%retval = alloca i32,
2016 Feb 29
2
[VSXFMAMutate] OldFMAReg may be wrongly rewritten
Ping?
On Mon, Feb 22, 2016 at 1:06 PM Tim Shen <timshen at google.com> wrote:
> On Fri, Feb 19, 2016 at 5:10 PM Tim Shen <timshen at google.com> wrote:
>
>> I wonder if we can fix this by making the transformation simpler, that
>> is, instead of doing:
>>
>
> I wrote a prototype (see attach) for this idea, it actually improves some
> of the test cases
2016 Mar 16
2
[VSXFMAMutate] OldFMAReg may be wrongly rewritten
I implemented a proof of concept of a new generic MachineFunction SSA pass.
The code is not readable and not efficient yet, but it shows interesting
results:
In fma.ll @test_FMSUB2 (return dummy(A * B + C, A * B - D)):
before:
fmr 0, 1
xsmaddadp 3, 0, 2
xsmsubmdp 0, 2, 4
fmr 1, 3
fmr 2, 0
bl dummy2
after:
xsmsubadp 4, 1, 2
xsmaddmdp
2013 Dec 05
3
[LLVMdev] X86 - Help on fixing a poor code generation bug
Hi all,
I noticed that the x86 backend tends to emit unnecessary vector insert
instructions immediately after sse scalar fp instructions like
addss/mulss.
For example:
/////////////////////////////////
__m128 foo(__m128 A, __m128 B) {
_mm_add_ss(A, B);
}
/////////////////////////////////
produces the sequence:
addss %xmm0, %xmm1
movss %xmm1, %xmm0
which could be easily optimized into
2014 Sep 05
3
[LLVMdev] [PATCH] [MachineSinking] Conservatively clear kill flags after coalescing.
Hi Quentin,
Jonas looked further into the problem below, and asked me to submit his patch. Note the we have our own out-of-tree target, and we have not been able to reproduce this problem on an in-tree target. /Patrik Hägglund
[MachineSinking] Conservatively clear kill flags after coalescing.
This solves the problem of having a kill flag inside a loop
with a definition of the register prior to