Displaying 20 results from an estimated 56 matches for "r9d".
2015 Sep 01
2
[RFC] New pass: LoopExitValues
...unsigned int Val) {
for (int Outer = 0; Outer < Size; ++Outer)
for (int Inner = 0; Inner < Size; ++Inner)
Dst[Outer * Size + Inner] = Src[Outer * Size + Inner] * Val;
}
With LoopExitValues
-------------------------------
matrix_mul:
testl %edi, %edi
je .LBB0_5
xorl %r9d, %r9d
xorl %r8d, %r8d
.LBB0_2:
xorl %r11d, %r11d
.LBB0_3:
movl %r9d, %r10d
movl (%rdx,%r10,4), %eax
imull %ecx, %eax
movl %eax, (%rsi,%r10,4)
incl %r11d
incl %r9d
cmpl %r11d, %edi
jne .LBB0_3
incl %r8d
cmpl %edi, %r8d
jne .LBB0_2
.LBB0_5:
retq...
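To make the listing above easier to follow, here is a minimal C sketch contrasting the source loop with a hand-written version that keeps one running index, which is the role %r9d appears to play in the assembly (zeroed once, incremented on every inner iteration, never reset). The function signature is elided in the snippet, so the parameter order and types below are assumptions, and matrix_mul_naive and matrix_mul_running are hypothetical names. It illustrates the shape of the generated code, not the pass itself.

```c
#include <assert.h>
#include <stddef.h>

/* Source-level form: the flat index is recomputed from Outer and Inner.
 * Parameter order and types are assumed; the snippet elides the signature. */
void matrix_mul_naive(unsigned int Size, unsigned int *Dst,
                      const unsigned int *Src, unsigned int Val) {
  assert(Dst != NULL && Src != NULL);
  for (unsigned int Outer = 0; Outer < Size; ++Outer)
    for (unsigned int Inner = 0; Inner < Size; ++Inner)
      Dst[Outer * Size + Inner] = Src[Outer * Size + Inner] * Val;
}

/* Hand-written analogue of the listing: a single running index replaces
 * the Outer * Size + Inner computation. When the inner loop exits, Flat
 * already holds the start offset of the next row. */
void matrix_mul_running(unsigned int Size, unsigned int *Dst,
                        const unsigned int *Src, unsigned int Val) {
  assert(Dst != NULL && Src != NULL);
  unsigned int Flat = 0;
  for (unsigned int Outer = 0; Outer < Size; ++Outer)
    for (unsigned int Inner = 0; Inner < Size; ++Inner) {
      Dst[Flat] = Src[Flat] * Val;
      ++Flat;
    }
}
```

Both functions write the same values; the second simply carries the index across the two loops instead of rebuilding it.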
2015 Aug 31
2
[RFC] New pass: LoopExitValues
Hello LLVM,
This is a proposal for a new pass that improves performance and code
size in some nested loop situations. The pass is target independent.
From the description in the file header:
This optimization finds loop exit values reevaluated after the loop
execution and replaces them by the corresponding exit values if they
are available. Such sequences can arise after the
2019 Nov 08
2
Register Dataflow Analysis on X86
Do you know whether it has been fixed on the 8.0.1 release?
Scott
On Fri, Nov 8, 2019 at 9:45 AM Krzysztof Parzyszek <kparzysz at quicinc.com> wrote:
The one blocking issue that existed in the past has been fixed. I haven’t had time to do any work on it lately, but I’m not aware of any fundamental problems that would make it not work on x86.
--
2019 Dec 23
2
Register Dataflow Analysis on X86
Hi Scott,
That #1073741833 is a register mask. They are treated as aggregate registers (essentially sets of registers), so if it includes R9D and R11D, it will be treated as being aliased with both.
These separate defs are there because they reach disjoint registers.
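The "set of registers" idea can be sketched in a few lines of C. This is only an illustration of the concept described above: the register ids, the reg_set type, and the helper names are hypothetical and do not reflect LLVM's numbering, its regmask encoding, or the RDF implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register ids for illustration only. */
enum { REG_R8D, REG_R9D, REG_R10D, REG_R11D, REG_COUNT };

/* An aggregate register modelled as a bit set, one bit per register. */
typedef uint32_t reg_set;

/* Returns the single-element set containing reg. */
reg_set reg_bit(int reg) {
  assert(reg >= 0 && reg < REG_COUNT);
  return (reg_set)1u << reg;
}

/* A register aliases the aggregate when it is a member of the set. */
bool aliases(reg_set aggregate, int reg) {
  return (aggregate & reg_bit(reg)) != 0;
}
```

With a set built as reg_bit(REG_R9D) | reg_bit(REG_R11D), aliases() is true for both R9D and R11D and false for R8D, which mirrors the "aliased with both" behaviour described in the message.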
--
Krzysztof Parzyszek kparzysz at quicinc.com AI tools development
From: Scott Douglas Constable <sdconsta at syr.edu>
Se...
2012 Jul 26
1
[LLVMdev] Question about ExpandPostRAPseudos.cpp
...er ***
- function: autogen_SD24657
- basic block: BB 0x2662d60 (BB#0)
- instruction: %XMM0<def> = MOV64toPQIrr %RAX<kill>
- operand 1: %RAX<kill>
LLVM ERROR: Found 1 machine code errors.
This happens because, on entry to the pass, we have
%RAX<def> = SUBREG_TO_REG 0, %R9D, 4
%XMM0<def> = MOV64toPQIrr %RAX<kill>
The pass converts (around about line 132 in ExpandPostRAPseudos.cpp) the SUBREG_TO_REG pseudo op to
%EAX<def> = MOV32rr %R9D
Because of "-mcpu=atom", post RA scheduling is enabled, so is post RA li...
2015 Feb 13
2
[LLVMdev] trunk's optimizer generates slower code than 3.5
...CODE XREF: _main+107 j
mov ebx, 1
jmp loc_100000D30
; ---------------------------------------------------------------------------
loc_100000DFA: ; CODE XREF: _main+5E j
mov ecx, [rax+r8*4]
lea r9d, [rcx+1]
mov [rax+r8*4], r9d
cmp ecx, r8d
jge loc_100000F0E
lea r12, [rax+4]
xor r14d, r14d
db 2Eh
nop word ptr [rax+rax+00000000h]
loc_100000E20:...
2015 Feb 14
2
[LLVMdev] trunk's optimizer generates slower code than 3.5
...1
>> jmp loc_100000D30
>> ; ---------------------------------------------------------------------------
>>
>> loc_100000DFA: ; CODE XREF: _main+5E j
>> mov ecx, [rax+r8*4]
>> lea r9d, [rcx+1]
>> mov [rax+r8*4], r9d
>> cmp ecx, r8d
>> jge loc_100000F0E
>> lea r12, [rax+4]
>> xor r14d, r14d
>> db 2Eh
>>...
2015 Feb 14
2
[LLVMdev] trunk's optimizer generates slower code than 3.5
...00000D30
>>>> ; ---------------------------------------------------------------------------
>>>>
>>>> loc_100000DFA: ; CODE XREF: _main+5E j
>>>> mov ecx, [rax+r8*4]
>>>> lea r9d, [rcx+1]
>>>> mov [rax+r8*4], r9d
>>>> cmp ecx, r8d
>>>> jge loc_100000F0E
>>>> lea r12, [rax+4]
>>>> xor r14d, r14d
>>>>...
2020 Jan 10
2
Register Dataflow Analysis on X86
...<RSP>!(d1555):, u1568<SSP>!(d1556):]
s1569: MOV32r0 [d1570<R10D>(~d1554",,u3776"):d1565, d1571<EFLAGS>!(d1565,d1574,):]
s1572: MOV32r0 [d1573<R8D>(~d1554",,u3775"):d1570, d1574<EFLAGS>!(d1571,d1577,):]
s1575: MOV32r0 [d1576<R9D>(~d1554",,u3774"):d1573, d1577<EFLAGS>!(d1574,,u3773"):]
---> s1578: MOV64rm [d1579<R11>(~d1554",,u3226"):d1576]
b1580: --- %bb.37 --- preds(3): %bb.36, %bb.49, %bb.64 succs(1): %bb.38
p3209: phi [+d3210<RBP>(,d1731,u3212):, u3211<RB...
2015 Jul 28
1
[LLVMdev] Splice and undefined physical reg
...*** Bad machine code: Using an undefined physical register ***
- function: foo
- basic block: BB#126 (null) (0x6127658)
- instruction: CALL64r %vreg41, <regmask>, %RSP<imp-use>, %EDI<imp-use>,
%RSI<imp-use>, %RDX<imp-use>, %RCX<imp-use>, %R8<imp-use>, %R9D<imp-use>,
%AL<imp-use>, %RSP<imp-def>, %EAX<imp-def>; GR64:%vreg41
How can I get rid of these errors?
Thank you very much,
-- Jon
2016 Nov 17
4
RFC: Insertion of nops for performance stability
...7: 01 c8 addl %ecx, %eax
9: 44 39 c0 cmpl %r8d, %eax
c: 75 0f jne 15 <foo+0x1D>
e: ff 05 00 00 00 00 incl (%rip)
14: ff 05 00 00 00 00 incl (%rip)
1a: 31 c0 xorl %eax, %eax
1c: c3 retq
1d: 44 39 c9 cmpl %r9d, %ecx
20: 74 ec je -20 <foo+0xE>
22: 48 8b 44 24 30 movq 48(%rsp), %rax
27: 2b 08 subl (%rax), %ecx
29: 39 d1 cmpl %edx, %ecx
2b: 7f e1 jg -31 <foo+0xE>
2d: 31 c0 xorl %eax, %eax
2f: c3 retq
Note: the first...
2013 Sep 12
1
[LLVMdev] bug in X86 disasm code?
...de in X86DisassemblerDecoder.h
#define EA_BASES_32BIT \
ENTRY(EAX) \
ENTRY(ECX) \
ENTRY(EDX) \
ENTRY(EBX) \
ENTRY(sib) \
ENTRY(EBP) \
ENTRY(ESI) \
ENTRY(EDI) \
ENTRY(R8D) \
ENTRY(R9D) \
ENTRY(R10D) \
ENTRY(R11D) \
ENTRY(R12D) \
ENTRY(R13D) \
ENTRY(R14D) \
ENTRY(R15D)
the ENTRY(sib) looks suspicious. that should be ENTRY(ESP), no?
thanks.
J
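For readers unfamiliar with this kind of list, here is a small C sketch of how an ENTRY-style macro list is typically expanded, and where the sib entry ends up. The EA_BASES_DEMO list copies the entries quoted above, but the DEMO_ prefix, the enum, and ea_base_demo_name are hypothetical names chosen for illustration, not the decoder's actual definitions. One relevant fact: in the x86 ModRM byte, an r/m field of 4 (binary 100) in a memory-addressing mode means a SIB byte follows rather than naming ESP as the base, so position 4 in the list lines up with that encoding, which may be why the entry is not ESP.

```c
#include <assert.h>

/* Same entries as the quoted list; the macro name is hypothetical. */
#define EA_BASES_DEMO \
  ENTRY(EAX)          \
  ENTRY(ECX)          \
  ENTRY(EDX)          \
  ENTRY(EBX)          \
  ENTRY(sib)          \
  ENTRY(EBP)          \
  ENTRY(ESI)          \
  ENTRY(EDI)          \
  ENTRY(R8D)          \
  ENTRY(R9D)          \
  ENTRY(R10D)         \
  ENTRY(R11D)         \
  ENTRY(R12D)         \
  ENTRY(R13D)         \
  ENTRY(R14D)         \
  ENTRY(R15D)

/* First expansion: one enumerator per entry, numbered from 0. */
enum ea_base_demo {
#define ENTRY(x) DEMO_##x,
  EA_BASES_DEMO
#undef ENTRY
  DEMO_COUNT
};

/* Second expansion: a parallel table of printable names. */
static const char *const demo_names[] = {
#define ENTRY(x) #x,
  EA_BASES_DEMO
#undef ENTRY
};

/* Maps an index from the enum back to its entry name. */
const char *ea_base_demo_name(int index) {
  assert(index >= 0 && index < DEMO_COUNT);
  return demo_names[index];
}
```

Because enumerators are numbered by position, replacing the fifth entry would change which name index 4 maps to without shifting any other entry.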
2015 Jul 28
1
[LLVMdev] splice and undefined physical reg
...code: Using an undefined physical register ***
- function: parse_and_dump_tv_tag
- basic block: BB#126 (null) (0x6127658)
- instruction: CALL64r %vreg41, <regmask>, %RSP<imp-use>, %EDI<imp-use>,
%RSI<imp-use>, %RDX<imp-use>, %RCX<imp-use>, %R8<imp-use>, %R9D<imp-use>,
%AL<imp-use>, %RSP<imp-def>, %EAX<imp-def>; GR64:%vreg41
How can I get rid of these errors?
Thank you very much,
-- Jon
2016 May 02
3
[PATCH] MSVC2015U2 workaround, version 2
...esidual_partition_sums[partition] = (FLAC__uint32)_mm_cvtsi128_si32(mm_sum);
into this:
movq QWORD PTR [rsi], xmm2
while it should be
movd eax, xmm2
mov QWORD PTR [rsi], rax
With this patch, MSVC emits
movq QWORD PTR [rsi], xmm2
mov DWORD PTR [rsi+4], r9d
so the price of this workaround is 1 extra write instruction per partition.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: msvc_bug_v2.patch
Type: application/octet-stream
Size: 2700 bytes
Desc: not available
URL: <http://lists.xiph.org/pipermail/flac-dev/at...
2016 Jan 02
13
[Bug 93557] New: Kernel Panic on Linux Kernel 4.4 when loading KDE/KDM on Nvidia GeForce 7025 / nForce 630a
https://bugs.freedesktop.org/show_bug.cgi?id=93557
Bug ID: 93557
Summary: Kernel Panic on Linux Kernel 4.4 when loading KDE/KDM
on Nvidia GeForce 7025 / nForce 630a
Product: xorg
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
Severity: blocker
2018 Apr 26
2
windows ABI problem with i128?
...%xmm0,-0x30(%rbp)
89: 66 0f d6 4d d8 movq %xmm1,-0x28(%rbp)
8e: 0f 10 45 d0 movups -0x30(%rbp),%xmm0
92: 0f 10 4d e0 movups -0x20(%rbp),%xmm1
96: 66 0f 74 c1 pcmpeqb %xmm1,%xmm0
9a: 66 44 0f d7 c8 pmovmskb %xmm0,%r9d
9f: 41 81 e9 ff ff 00 00 sub $0xffff,%r9d
a6: 44 89 4d ac mov %r9d,-0x54(%rbp)
aa: 74 06 je b2 <_start+0xa2>
ac: eb 00 jmp ae <_start+0x9e>
ae: eb 00 jmp b0 <_start+0xa0...
2010 Jun 03
4
[LLVMdev] Passing structures by value on Windows
...Some research showed that { i16 i16 } addition also fails on x86, so
I guess the problem is in passing structures as values.
On x64, VC++ passes two { i32 i32 } structs in RCX and RDX respectively
and reads the result from RAX, but it seems LLVM reads the parameters from
ECX, EDX (first vector) and R8D, R9D (second vector).
Currently, I can't figure out how to dump IR, but there is a link with
disassembly shown by Visual Studio for generated functions:
comparing { i32 i32 } add on 32-bit and 64-bit (first one works):
http://pastebin.com/ijjCNWKJ
Best regards,
Milovanov Victor.
2013 Apr 08
6
[Bug 63263] New: X server crash in nouveau_xv.c:NVPutImage (NVCopyNV12ColorPlanes)
...887>: mov %rdx,%r11
0x000000000000a62a <+4890>: mov %eax,%r10d
0x000000000000a62d <+4893>: nopl (%rax)
0x000000000000a630 <+4896>: lea (%rsi,%r13,1),%rdi
0x000000000000a634 <+4900>: xor %edx,%edx
0x000000000000a636 <+4902>: test %r9d,%r9d
0x000000000000a639 <+4905>: jle 0xaa2a <NVPutImage+5914>
0x000000000000a63f <+4911>: nop
0x000000000000a640 <+4912>: movzbl (%rdi,%rdx,2),%eax
--> 0x000000000000a644 <+4916>: movzbl 0x1(%rsi,%rdx,2),%ecx
0x000000000000a649 <+4921>: s...
2018 Apr 26
0
windows ABI problem with i128?
...89: 66 0f d6 4d d8 movq %xmm1,-0x28(%rbp)
> 8e: 0f 10 45 d0 movups -0x30(%rbp),%xmm0
> 92: 0f 10 4d e0 movups -0x20(%rbp),%xmm1
> 96: 66 0f 74 c1 pcmpeqb %xmm1,%xmm0
> 9a: 66 44 0f d7 c8 pmovmskb %xmm0,%r9d
> 9f: 41 81 e9 ff ff 00 00 sub $0xffff,%r9d
> a6: 44 89 4d ac mov %r9d,-0x54(%rbp)
> aa: 74 06 je b2 <_start+0xa2>
> ac: eb 00 jmp ae <_start+0x9e>
> ae: eb 00...
2014 Feb 19
2
[LLVMdev] better code for IV
...br label %2
; <label>:2 ; preds = %L_exit
ret void
}
Asm code:
ArrayAdd1: # @ArrayAdd1
.cfi_startproc
# BB#0: # %Entry
xorl %r9d, %r9d
movabsq $4294967296, %r8 # imm = 0x100000000
.align 16, 0x90
.LBB0_1: # %L_entry
# =>This Inner Loop Header: Depth=1
movq %r9, %rax
sarq $32, %rax...