Displaying 20 results from an estimated 410 matches for "jne".
2016 May 13
4
RFC: callee saved register verifier
...p
movabsq $0xCA5FCA5FCA5FCA5F, %rbx # can also be movq %rbp, %rbx etc.
movabsq $0xCA5FCA5FCA5FCA5F, %r12
movabsq $0xCA5FCA5FCA5FCA5F, %r13
movabsq $0xCA5FCA5FCA5FCA5F, %r14
movabsq $0xCA5FCA5FCA5FCA5F, %r15
callq foo
movabsq $0xCA5FCA5FCA5FCA5F, %rax
cmpq %rax, %rbp
jne .LBB1_5
movabsq $0xCA5FCA5FCA5FCA5F, %rax
cmpq %rax, %rbx
jne .LBB1_5
movabsq $0xCA5FCA5FCA5FCA5F, %rax
cmpq %rax, %r12
jne .LBB1_5
movabsq $0xCA5FCA5FCA5FCA5F, %rax
cmpq %rax, %r13
jne .LBB1_5
movabsq $0xCA5FCA5FCA5FCA5F, %rax
cmpq %rax,...
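The fill/call/compare sequence above can be sketched at the C level. This is a hedged illustration of the verifier's idea, not the generated code: all names here (`regs`, `callee_saved_ok`, the two callees) are made up, and memory slots stand in for the callee-saved registers (rbp, rbx, r12-r15 on x86-64 SysV).

```c
#include <assert.h>
#include <stdint.h>

/* The RFC's sentinel value. */
#define SENTINEL 0xCA5FCA5FCA5FCA5FULL

/* C-level model (hypothetical): one slot per callee-saved register. */
static uint64_t regs[6];

static void well_behaved_callee(void) { /* preserves every "register" */ }
static void clobbering_callee(void)   { regs[3] = 0; /* trashes "r13" */ }

/* Fill every slot with the sentinel, call foo, then verify each slot --
 * the same fill/call/compare shape as the movabsq/cmpq/jne sequence. */
static int callee_saved_ok(void (*foo)(void)) {
    for (int i = 0; i < 6; i++)
        regs[i] = SENTINEL;          /* movabsq $0xCA5F..., %reg */
    foo();                           /* callq foo */
    for (int i = 0; i < 6; i++)
        if (regs[i] != SENTINEL)
            return 0;                /* corresponds to jne .LBB1_5 */
    return 1;
}
```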
2009 Aug 02
2
[LLVMdev] code-altering Passes for llc
Greetings,
I am extending llc to include runtime checks for calls (in X86). So a call
'call target' is altered to look like this:
[some check]
jne error_function
call target
I've done this by implementing a MachineFunctionPass that is instantiated
and added to the PassManager in X86TargetMachine::addPreRegAlloc.
In order to create the jne-instruction I need some BasicBlock that contains
the error routine. So I tried to create a Module...
2018 Nov 06
4
Rather poor code optimisation of current clang/LLVM targeting Intel x86 (both -64 and -32)
...// these 4 lines are
crc >>= 1; // rather poor!
}
return ~crc;
}
See <https://godbolt.org/z/eYJeWt> (-O1) and <https://godbolt.org/z/zeExHm> (-O2)
crc32be: # @crc32be
xor eax, eax
test esi, esi
jne .LBB0_2
jmp .LBB0_5
.LBB0_4: # in Loop: Header=BB0_2 Depth=1
add rdi, 1
test esi, esi
je .LBB0_5
.LBB0_2: # =>This Loop Header: Depth=1
add esi, -1
movzx edx, byte ptr [rdi]
shl edx, 24
xor edx, eax
m...
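The C source is truncated in the snippet above; below is a hedged sketch of a bit-at-a-time CRC-32 with the same `crc >>= 1` inner-loop shape being complained about. The polynomial, init value, and final inversion are assumptions (the thread's `crc32be` itself uses the left-shifting form visible in the `shl edx, 24` assembly); this is the pattern, not the poster's exact code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative reflected bit-at-a-time CRC-32 (poly 0xEDB88320 assumed). */
uint32_t crc32_sketch(const uint8_t *buf, size_t len) {
    uint32_t crc = ~0u;
    while (len--) {
        crc ^= *buf++;
        for (int i = 0; i < 8; i++) {
            if (crc & 1)
                crc = (crc >> 1) ^ 0xEDB88320u; /* the inner-loop lines */
            else                                /* the post calls poor  */
                crc >>= 1;
        }
    }
    return ~crc;
}
```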
2007 Apr 30
0
[LLVMdev] Bootstrap Failure -- Expected Differences?
...> + 3bf: 73 65 jae 426 <__FUNCTION__.20866+0xa>
> 3c1: 5f pop %edi
> 3c2: 64 65 63 6c 00 2f arpl %bp,%fs:%gs:0x2f(%eax,%eax,1)
>
> 000003c7 <.str>:
> 3c7: 2f das
> - 3c8: 75 73 jne 43d <__FUNCTION__.21160+0x4>
> + 3c8: 75 73 jne 43d <__FUNCTION__.21073+0x4>
> 3ca: 65 gs
> - 3cb: 72 73 jb 440 <__FUNCTION__.21160+0x7>
> + 3cb: 72 73 jb 440 <__FUNCTION__.21073+0x7>...
2020 Oct 09
1
[PATCH] drm/nouveau/kms: Fix NULL pointer dereference in nouveau_connector_detect_depth
...0a 00 or (%rax),%al
2: 00 48 8b add %cl,-0x75(%rax)
5: 49 rex.WB
6: 48 c7 87 b8 00 00 00 movq $0x6,0xb8(%rdi)
d: 06 00 00 00
11: 80 b9 4d 0a 00 00 00 cmpb $0x0,0xa4d(%rcx)
18: 75 1e jne 0x38
1a: 83 fa 41 cmp $0x41,%edx
1d: 75 05 jne 0x24
1f: 48 85 c0 test %rax,%rax
22: 75 29 jne 0x4d
24: 8b 81 10 0d 00 00 mov 0xd10(%rcx),%eax
2a:* 39 06 cmp %eax,(%rs...
2013 Dec 20
2
[LLVMdev] Commutability of X86 FMA3 instructions.
...or the curious, the reason that I'm asking is that we currently
always select the 213 variant, but this introduces an extra copy in
accumulator-style loops. Something like:
while (...)
accumulator = x * y + accumulator;
yields:
loop:
vfmadd.213 y, x, acc
vmovaps acc, x
decl count
jne loop
instead of
loop:
vfmadd.231 acc, x, y
decl count
jne loop
I have started writing a patch to generate the 231 variant by default,
and I want to know whether I need to go to the trouble of adding
custom commute logic. If these things aren't commutable then I don't
need to worry...
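A C loop of the accumulator shape quoted above (a hypothetical example; the function name and shapes are made up): each iteration is `acc = x*y + acc`, so with FMA codegen the 213 form overwrites one of the multiplicands with the result, forcing the `vmovaps` to restore it, while the 231 form accumulates in place.

```c
#include <assert.h>

/* Accumulator-style loop: acc = x[i] * y[i] + acc each iteration. */
float dot(const float *x, const float *y, int n) {
    float acc = 0.0f;
    for (int i = 0; i < n; i++)
        acc = x[i] * y[i] + acc;
    return acc;
}
```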
2020 Oct 13
1
[PATCH v2] drm/nouveau/kms: Fix NULL pointer dereference in nouveau_connector_detect_depth
...0a 00 or (%rax),%al
2: 00 48 8b add %cl,-0x75(%rax)
5: 49 rex.WB
6: 48 c7 87 b8 00 00 00 movq $0x6,0xb8(%rdi)
d: 06 00 00 00
11: 80 b9 4d 0a 00 00 00 cmpb $0x0,0xa4d(%rcx)
18: 75 1e jne 0x38
1a: 83 fa 41 cmp $0x41,%edx
1d: 75 05 jne 0x24
1f: 48 85 c0 test %rax,%rax
22: 75 29 jne 0x4d
24: 8b 81 10 0d 00 00 mov 0xd10(%rcx),%eax
2a:* 39 06 cmp %eax,(%rs...
2007 Apr 27
2
[LLVMdev] Bootstrap Failure -- Expected Differences?
The saga continues.
I've been tracking the interface changes and merging them with
the refactoring work I'm doing. I got as far as building stage3
of llvm-gcc but the object files from stage2 and stage3 differ:
warning: ./cc1-checksum.o differs
warning: ./cc1plus-checksum.o differs
(Are the above two ok?)
The list below is clearly bad. I think it's every object file in
the
2018 Nov 27
2
Rather poor code optimisation of current clang/LLVM targeting Intel x86 (both -64 and -32)
...poor!
>> }
>> return ~crc;
>> }
>>
>> See <https://godbolt.org/z/eYJeWt> (-O1) and <https://godbolt.org/z/zeExHm>
>> (-O2)
>>
>> crc32be: # @crc32be
>> xor eax, eax
>> test esi, esi
>> jne .LBB0_2
>> jmp .LBB0_5
>> .LBB0_4: # in Loop: Header=BB0_2 Depth=1
>> add rdi, 1
>> test esi, esi
>> je .LBB0_5
>> .LBB0_2: # =>This Loop Header: Depth=1
>> add esi, -1
>> movzx edx,...
2018 Apr 04
0
SCEV and LoopStrengthReduction Formulae
> cmpq %rbx, %r14
> jne .LBB0_1
>
> LLVM can perform compare-jump fusion, it already does in certain cases, but
> not in the case above. We can remove the cmp above if we were to perform
> the following transformation:
Do you mean branch-fusion (https://en.wikichip.org/wiki/macro-operation_fusion)?
Is there...
2016 May 13
2
RFC: callee saved register verifier
...%rbx etc.
> > movabsq $0xCA5FCA5FCA5FCA5F, %r12
> > movabsq $0xCA5FCA5FCA5FCA5F, %r13
> > movabsq $0xCA5FCA5FCA5FCA5F, %r14
> > movabsq $0xCA5FCA5FCA5FCA5F, %r15
> > callq foo
> > movabsq $0xCA5FCA5FCA5FCA5F, %rax
> > cmpq %rax, %rbp
> > jne .LBB1_5
> > movabsq $0xCA5FCA5FCA5FCA5F, %rax
> > cmpq %rax, %rbx
> > jne .LBB1_5
> > movabsq $0xCA5FCA5FCA5FCA5F, %rax
> > cmpq %rax, %r12
> > jne .LBB1_5
> > movabsq $0xCA5FCA5FCA5FCA5F, %rax
> > cmpq %rax, %r13
> >...
2015 Sep 01
2
[RFC] New pass: LoopExitValues
...-----------------
matrix_mul:
testl %edi, %edi
je .LBB0_5
xorl %r9d, %r9d
xorl %r8d, %r8d
.LBB0_2:
xorl %r11d, %r11d
.LBB0_3:
movl %r9d, %r10d
movl (%rdx,%r10,4), %eax
imull %ecx, %eax
movl %eax, (%rsi,%r10,4)
incl %r11d
incl %r9d
cmpl %r11d, %edi
jne .LBB0_3
incl %r8d
cmpl %edi, %r8d
jne .LBB0_2
.LBB0_5:
retq
Without LoopExitValues:
-----------------------------------
matrix_mul:
pushq %rbx # Eliminated by L.E.V. pass
.Ltmp0:
.Ltmp1:
testl %edi, %edi
je .LBB0_5
xorl %r8d, %r8d
xorl %r9d, %r9d
.LB...
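The C source of `matrix_mul` is not shown in the snippet; this is a hedged reconstruction matching the assembly (nested Size-trip loops, a linear element index, `imull` by a scalar). Parameter names are assumptions.

```c
#include <assert.h>

/* Hypothetical reconstruction: scale each of Size*Size elements of Src
 * by Val into Dest, via two nested loops each running Size iterations. */
void matrix_mul(unsigned Size, unsigned *Dest, const unsigned *Src,
                unsigned Val) {
    for (unsigned i = 0; i != Size; i++)
        for (unsigned j = 0; j != Size; j++)
            Dest[i * Size + j] = Src[i * Size + j] * Val;
}
```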
2009 Aug 02
0
[LLVMdev] code-altering Passes for llc
On Aug 2, 2009, at 7:09 AM, Artjom Kochtchi wrote:
>
> Greetings,
>
> I am extending llc to include runtime checks for calls (in X86). So
> a call
> 'call target' is altered to look like this:
>
> [some check]
> jne error_function
> call target
>
> I've done this by implementing a MachineFunctionPass that is
> instantiated
> and added to the PassManager in X86TargetMachine::addPreRegAlloc.
>
> In order to create the jne-instruction I need some BasicBlock that
> contains
> th...
2017 Oct 20
3
[PATCH v1 06/27] x86/entry/64: Adapt assembly for PIE support
...a pointer to the C function implementing the syscall.
>> * IRQs are on.
>> */
>> - cmpq $.Lentry_SYSCALL_64_after_fastpath_call, (%rsp)
>> + leaq .Lentry_SYSCALL_64_after_fastpath_call(%rip), %r11
>> + cmpq %r11, (%rsp)
>> jne 1f
>>
>> /*
>> @@ -1172,7 +1176,8 @@ ENTRY(error_entry)
>> movl %ecx, %eax /* zero extend */
>> cmpq %rax, RIP+8(%rsp)
>> je .Lbstep_iret
>> - cmpq $.Lgs_change, RIP+8(%rsp)
>> + l...
2020 Jul 17
1
[PATCH] drm/nouveau: Accept 'legacy' format modifiers
...51 08 mov 0x8(%rcx),%edx
3: 48 89 c8 mov %rcx,%rax
6: 65 48 03 05 d4 0e ca add %gs:0x70ca0ed4(%rip),%rax # 0x70ca0ee2
d: 70
e: 48 8b 70 08 mov 0x8(%rax),%rsi
12: 48 39 f2 cmp %rsi,%rdx
15: 75 e7 jne 0xfffffffffffffffe
17: 4c 8b 38 mov (%rax),%r15
1a: 4d 85 ff test %r15,%r15
1d: 0f 84 8f 01 00 00 je 0x1b2
23: 8b 45 20 mov 0x20(%rbp),%eax
26: 48 8b 7d 00 mov 0x0(%rbp),%rdi
2a:* 49 8b 1c 07 mov (%r15,%...
2018 Apr 03
4
SCEV and LoopStrengthReduction Formulae
...n be optimized via cmp/jmp fusion.
// clang -O3 -S test.c
extern void g(int);
void f(int *p, long long n) {
do {
g(*p++);
} while (--n);
}
LLVM currently generates the following sequence for x86_64 targets:
LBB0_1:
movl (%r15,%rbx,4), %edi
callq g
addq $1, %rbx
cmpq %rbx, %r14
jne .LBB0_1
LLVM can perform compare-jump fusion, it already does in certain cases, but not
in the case above. We can remove the cmp above if we were to perform
the following transformation:
1.0) Initialize the induction variable, %rbx, to be 'n' instead of zero.
1.1) Negate the induction var...
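Steps 1.0/1.1 can be illustrated in source form (a sketch of the intended transformation, not compiler output; `f_transformed`, `end`, and the logging `g` are made up for the example): counting the induction variable up from -n toward zero lets `addq $1, %rbx` set ZF itself, so the `cmpq` disappears and the remaining add+jne pair can fuse.

```c
#include <assert.h>

/* Test scaffolding (hypothetical): record what g() receives. */
static int log_buf[16], log_n;
static void g(int v) { log_buf[log_n++] = v; }

/* Original loop: induction variable counts up from 0, so a cmpq
 * against n is needed before the jne. */
static void f(int *p, long long n) {
    do {
        g(*p++);
    } while (--n);
}

/* After the transformation: start at -n, count toward zero; the add
 * sets the flags the branch needs, so the compare is gone. */
static void f_transformed(int *p, long long n) {
    int *end = p + n;
    for (long long i = -n; i != 0; i++)
        g(end[i]);
}
```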
2011 Feb 18
1
[PATCH] core: Allow pasting from a VMware host by typing Ctrl-P
...inc b/core/ui.inc
index 2d44447..0e82779 100644
--- a/core/ui.inc
+++ b/core/ui.inc
@@ -125,6 +125,8 @@ not_ascii:
je print_version
cmp al,'X' & 1Fh ; <Ctrl-X>
je force_text_mode
+ cmp al,'P' & 1Fh ; <Ctrl-P>
+ je paste
cmp al,08h ; Backspace
jne get_char
backspace: cmp di,command_line ; Make sure there is anything
@@ -143,6 +145,10 @@ force_text_mode:
call vgaclearmode
jmp enter_command
+paste:
+ call vmware_paste
+ jmp get_char
+
set_func_flag:
mov byte [FuncFlag],1
jmp short get_char_2
@@ -568,6 +574,72 @@ getchar_time...
2015 Aug 31
2
[RFC] New pass: LoopExitValues
Hello LLVM,
This is a proposal for a new pass that improves performance and code
size in some nested loop situations. The pass is target independent.
From the description in the file header:
This optimization finds loop exit values reevaluated after the loop
execution and replaces them by the corresponding exit values if they
are available. Such sequences can arise after the
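A hypothetical example of the pattern that description targets (names made up): after the loop, `i`'s exit value equals `n`, which is already available, so the use of `i` below the loop can be rewritten to use `n` and the induction variable need not stay live past the loop.

```c
#include <assert.h>

/* Loop exit value example: *iterations reevaluates i after the loop,
 * but the exit value is simply n, which is already available. */
int sum_first_n(const int *a, int n, int *iterations) {
    int s = 0, i;
    for (i = 0; i < n; i++)
        s += a[i];
    *iterations = i; /* exit value; replaceable by n */
    return s;
}
```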
2013 Dec 20
0
[LLVMdev] Commutability of X86 FMA3 instructions.
...ently
> always select the 213 variant, but this introduces an extra copy in
> accumulator-style loops. Something like:
>
> while (...)
> accumulator = x * y + accumulator;
>
> yields:
>
> loop:
> vfmadd.213 y, x, acc
> vmovaps acc, x
> decl count
> jne loop
>
> instead of
>
> loop:
> vfmadd.231 acc, x, y
> decl count
> jne loop
>
> I have started writing a patch to generate the 231 variant by default,
> and I want to know whether I need to go to the trouble of adding
> custom commute logic. If these things a...