Displaying 20 results from an estimated 50 matches for "r15d".
2016 Jun 30
4
Help required regarding IPRA and Local Function optimization
...rge function with recursion and object-oriented code,
so I am not able to find a pattern that is causing the failure. So I tried
the following simple case to understand the expected behavior of this
optimization.
Consider the following code:
define void @bar() #0 {
call void asm sideeffect "movl %ecx, %r15d", "~{r15}"() #0
call void @foo()
call void asm sideeffect "movl %r15d, %ebx", "~{rbx}"() #0
ret void
}
define internal void @foo() #0 {
call void asm sideeffect "movl %r14d, %r15d", "~{r15}"() #0
ret void
}
and its generated assembl...
2015 Feb 13
2
[LLVMdev] trunk's optimizer generates slower code than 3.5
...ov rsi, offset __mh_execute_header
add rsi, rax
sar rsi, 20h ; size_t
mov edi, 4 ; size_t
call _calloc
lea edx, [r15-1]
movsxd r8, edx
mov ecx, r15d
add ecx, 0FFFFFFFEh
js loc_100000DFA
test r15d, r15d
mov r11d, [rax+r8*4]
jle loc_100000EAE
mov ecx, r15d
add ecx, 0FFFFFFFEh
mov [rsp+48h+...
2015 Feb 14
2
[LLVMdev] trunk's optimizer generates slower code than 3.5
...add rsi, rax
>> sar rsi, 20h ; size_t
>> mov edi, 4 ; size_t
>> call _calloc
>> lea edx, [r15-1]
>> movsxd r8, edx
>> mov ecx, r15d
>> add ecx, 0FFFFFFFEh
>> js loc_100000DFA
>> test r15d, r15d
>> mov r11d, [rax+r8*4]
>> jle loc_100000EAE
>> mov ecx, r15d
>>...
2015 Feb 14
2
[LLVMdev] trunk's optimizer generates slower code than 3.5
...sar rsi, 20h ; size_t
>>>> mov edi, 4 ; size_t
>>>> call _calloc
>>>> lea edx, [r15-1]
>>>> movsxd r8, edx
>>>> mov ecx, r15d
>>>> add ecx, 0FFFFFFFEh
>>>> js loc_100000DFA
>>>> test r15d, r15d
>>>> mov r11d, [rax+r8*4]
>>>> jle loc_100000EAE
>>>>...
2016 Jun 30
0
Help required regarding IPRA and Local Function optimization
...19508 <+8>: pushq %r13
0x10001950a <+10>: pushq %r12
0x10001950c <+12>: pushq %rbx
0x10001950d <+13>: pushq %rax
0x10001950e <+14>: movl %ecx, %r12d
0x100019511 <+17>: movl %edx, %r13d
0x100019514 <+20>: movl %esi, %r15d
0x100019517 <+23>: movq %rdi, %rbx
-> 0x10001951a <+26>: movl 0x18(%rbx), %r14d
Please correct me if anything is wrong, and please provide some help.
-Vivek
2016-06-30 14:21 GMT+05:30 vivek pandya <vivekvpandya at gmail.com>:
> Hello Mentors,
>
> I...
2016 Jul 06
3
IPRA, interprocedural register allocation, question
...orrect register mask (and
that also means that skipping the clobbers list while IPRA is enabled may break
the executable)
For example, in the following code:
int gcd( int a, int b ) {
int result ;
/* Compute Greatest Common Divisor using Euclid's Algorithm */
__asm__ __volatile__ ( "movl %1, %%r15d;"
"movl %2, %%ecx;"
"CONTD: cmpl $0, %%ecx;"
"je DONE;"
"xorl %%r13d, %%r13d;"
"idivl %%ecx;"...
2016 Jul 08
2
IPRA, interprocedural register allocation, question
...clobbers list while IPRA is enabled may break
> the executable)
>
> For example, in the following code:
>
> int gcd( int a, int b ) {
>
> int result ;
>
> /* Compute Greatest Common Divisor using Euclid's Algorithm */
>
> __asm__ __volatile__ ( "movl %1, %%r15d;"
>
> "movl %2, %%ecx;"
>
> "CONTD: cmpl $0, %%ecx;"
>
> "je DONE;"
>
> "xorl %%r13d, %%r13d;"
>
>...
2018 Feb 06
3
What does a dead register mean?
...see the following
sequence:
ADJCALLSTACKDOWN64 0, 0, 0, *implicit-def dead %rsp*, implicit-def dead
%eflags, implicit-def dead %ssp, implicit %rsp, implicit %ssp
CALL64pcrel32 @foo, <regmask %bh %bl %bp %bpl %bx %ebp %ebx %rbp %rbx %r12
%r13 %r14 %r15 %r12b %r13b %r14b %r15b %r12d %r13d %r14d %r15d %r12w %r13w
%r14w %r15w>, *implicit %rsp*, implicit %ssp, implicit-def %rsp,
implicit-def %ssp
ADJCALLSTACKUP64 0, 0, implicit-def dead %rsp, implicit-def dead %eflags,
implicit-def dead %ssp, implicit %rsp, implicit %ssp
RET 0
The ADJCALLSTACKDOWN64 has implicit-def dead %rsp. However the nex...
2016 Jul 09
3
IPRA, interprocedural register allocation, question
...orrect register mask (and that also means that skipping the clobbers list while IPRA is enabled may break the executable)
For example, in the following code:
int gcd( int a, int b ) {
int result ;
/* Compute Greatest Common Divisor using Euclid's Algorithm */
__asm__ __volatile__ ( "movl %1, %%r15d;"
"movl %2, %%ecx;"
"CONTD: cmpl $0, %%ecx;"
"je DONE;"
"xorl %%r13d, %%r13d;"
"idivl %%ecx;"...
2013 Sep 12
1
[LLVMdev] bug in X86 disasm code?
...\
ENTRY(sib) \
ENTRY(EBP) \
ENTRY(ESI) \
ENTRY(EDI) \
ENTRY(R8D) \
ENTRY(R9D) \
ENTRY(R10D) \
ENTRY(R11D) \
ENTRY(R12D) \
ENTRY(R13D) \
ENTRY(R14D) \
ENTRY(R15D)
The ENTRY(sib) looks suspicious. That should be ENTRY(ESP), no?
thanks.
J
2020 May 26
6
[RFC] Loading Bitfields with Smallest Needed Types
...;
...
}
Right before the "same_flow = ... ->same_flow;" statement is executed,
a store is made to the bitfield at the end of a called function:
NAPI_GRO_CB(skb)->same_flow = 1;
The store is a byte:
orb $0x1,0x4a(%rbx)
while the read is a word:
movzwl 0x4a(%r12),%r15d
The problem is that between the store and the load the value hasn't
been retired / placed in the cache. One would expect store-to-load
forwarding to kick in, but on x86 that doesn't happen because x86
requires the store to be of equal or greater size than the load. So
instead the load take...
2017 Mar 14
3
llvm-stress crash
...fi#10>, 0, %noreg; mem:ST16[FixedStack10](align=8)
GR128Bit:%vreg265
-> regalloc
%R5L<def> = LLCRMux %R6L, %R4Q<imp-def>
ST128 %R4Q<kill>, <fi#10>, 0, %noreg; mem:ST16[FixedStack10](align=8)
-> pseudo expansion
%R5L<def> = LLCR %R6L
STG %R4D<kill>, %R15D, 200, %noreg; mem:ST16[FixedStack7](align=8)
STG %R5D<kill>, %R15D, 208, %noreg; mem:ST16[FixedStack7](align=8)
*** Bad machine code: Using an undefined physical register ***
- function: autogen_SD29355
- basic block: BB#19 CF257 (0x4cb6b00)
- instruction: STG
- operand 0: %R4D<kill&...
2016 Jul 12
2
IPRA, interprocedural register allocation, question
...orrect register mask (and that also means that skipping the clobbers list while IPRA is enabled may break the executable)
For example, in the following code:
int gcd( int a, int b ) {
int result ;
/* Compute Greatest Common Divisor using Euclid's Algorithm */
__asm__ __volatile__ ( "movl %1, %%r15d;"
"movl %2, %%ecx;"
"CONTD: cmpl $0, %%ecx;"
"je DONE;"
"xorl %%r13d, %%r13d;"
"idivl %%ecx;"...
2020 May 27
2
[cfe-dev] [RFC] Loading Bitfields with Smallest Needed Types
...McCall via cfe-dev <
cfe-dev at lists.llvm.org> wrote:
> On 26 May 2020, at 18:28, Bill Wendling via llvm-dev wrote:
> > [...] The store is a byte:
> >
> > orb $0x1,0x4a(%rbx)
> >
> > while the read is a word:
> >
> > movzwl 0x4a(%r12),%r15d
> >
> > The problem is that between the store and the load the value hasn't
> > been retired / placed in the cache. One would expect store-to-load
> > forwarding to kick in, but on x86 that doesn't happen because x86
> > requires the store to be of equal or gre...
2016 Jun 25
3
Tail call optimization is getting affected due to local function related optimization with IPRA
...cted by RegUsageInfoCollector pass.
Function Name : bitrv2
Clobbered Registers:
AH AL AX BH BL BP BPL BX CH CL CX DI DIL EAX EBP EBX ECX EDI EFLAGS ESI ESP RAX
RBP RBX RCX RDI RSI RSP SI SIL SP SPL R8 R9 R10 R11 R12 R13 R14 R15 R8B R9B R10B
R11B R12B R13B R14B R15B R8D R9D R10D R11D R12D R13D R14D R15D R8W R9W R10W R11W
R12W R13W R14W R15W
However, the caller of bitrv2, makewt, has callee-saved registers as per the CC,
but this code results in a segmentation fault when compiled with O1 because
makewt has the value of *ip in the R14 register, and that is stored and restored
by makewt at the beginning of the call, but due to t...
2018 Feb 06
0
What does a dead register mean?
...:
>
> ADJCALLSTACKDOWN64 0, 0, 0, *implicit-def dead %rsp*, implicit-def dead
> %eflags, implicit-def dead %ssp, implicit %rsp, implicit %ssp
> CALL64pcrel32 @foo, <regmask %bh %bl %bp %bpl %bx %ebp %ebx %rbp %rbx
> %r12 %r13 %r14 %r15 %r12b %r13b %r14b %r15b %r12d %r13d %r14d %r15d
> %r12w %r13w %r14w %r15w>, *implicit %rsp*, implicit %ssp, implicit-def
> %rsp, implicit-def %ssp
> ADJCALLSTACKUP64 0, 0, implicit-def dead %rsp, implicit-def dead
> %eflags, implicit-def dead %ssp, implicit %rsp, implicit %ssp
> RET 0
>
>
> The ADJCALLSTACKDOWN64...
2016 Jan 02
13
[Bug 93557] New: Kernel Panic on Linux Kernel 4.4 when loading KDE/KDM on Nvidia GeForce 7025 / nForce 630a
https://bugs.freedesktop.org/show_bug.cgi?id=93557
Bug ID: 93557
Summary: Kernel Panic on Linux Kernel 4.4 when loading KDE/KDM
on Nvidia GeForce 7025 / nForce 630a
Product: xorg
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
Severity: blocker
2016 Jul 12
3
IPRA, interprocedural register allocation, question
...orrect register mask (and that also means that skipping the clobbers list while IPRA is enabled may break the executable)
For example, in the following code:
int gcd( int a, int b ) {
int result ;
/* Compute Greatest Common Divisor using Euclid's Algorithm */
__asm__ __volatile__ ( "movl %1, %%r15d;"
"movl %2, %%ecx;"
"CONTD: cmpl $0, %%ecx;"
"je DONE;"
"xorl %%r13d, %%r13d;"
"idivl %%ecx;"...
2016 Jun 27
3
Finding caller-saved registers at a function call site
...storage location rbp - 0x8) is used
in the addition to calculate the returned value. However, when I print the
RegMask operand for the call machine instruction, I get the following:
<regmask %BH %BL %BP %BPL %BX %EBP %EBX %RBP %RBX %R12 %R13 %R14 %R15 %R12B
%R13B %R14B %R15B %R12D %R13D %R14D %R15D %R12W %R13W %R14W %R15W>
I don't see xmm1 as being preserved across this call. Am I missing
something? Thanks for your help!
On Wed, Jun 22, 2016 at 5:01 PM, Sanjoy Das <sanjoy at playingwithpointers.com>
wrote:
> Hi Rob,
>
> Rob Lyerly via llvm-dev wrote:
> > I'...
2011 Mar 19
2
[LLVMdev] Apparent optimizer bug on X86_64
...code:
1300 /*-----------------------------.
1301 | yyreduce -- Do a reduction. |
1302 `-----------------------------*/
1303 yyreduce:
1304 /* yyn is the number of a rule to reduce with. */
1305 yylen = yyr2[yyn];
0x0000000000400c14 <rpcalc_parse+628>: mov r15d,r14d
0x0000000000400c17 <rpcalc_parse+631>: movzx r12d,BYTE PTR [r15+0x4015e2]
0x0000000000400c1f <rpcalc_parse+639>: mov eax,0x1
0x0000000000400c24 <rpcalc_parse+644>: mov r13,rax
0x0000000000400c27 <rpcalc_parse+647>: sub r13,r...