Displaying 20 results from an estimated 21 matches for "cmoveq".
Did you mean:
moveq
2006 Aug 21
5
[LLVMdev] selecting select_cc
...truction has no
results. It only alters the CPSR (current program status register).
ARM::SELECT would expand to a conditional move (moveq for example).
Something similar is done by the Alpha backend:
---------------------------------------------------------------------------------------------
def CMOVEQ : OForm4< 0x11, 0x24, "cmoveq $RCOND,$RTRUE,$RDEST",
[(set GPRC:$RDEST, (select (seteq GPRC:$RCOND, 0),
GPRC:$RTRUE, GPRC:$RFALSE))], s_cmov>;
----------------------------------------------------------------------------------------------
One thing that I don'...
2016 Jun 29
2
avx512 JIT backend generates wrong code on <4 x float>
...hle,xsaveopt,-sha,sse2,sse3,-avx512dq,
Assembly:
.text
.file "module_KFxOBX_i4_after.ll"
.globl adjmul
.align 16, 0x90
.type adjmul,@function
adjmul:
.cfi_startproc
leaq (%rdi,%r8), %rdx
addq %rsi, %r8
testb $1, %cl
cmoveq %rdi, %rdx
cmoveq %rsi, %r8
movq %rdx, %rax
sarq $63, %rax
shrq $62, %rax
addq %rdx, %rax
sarq $2, %rax
movq %r8, %rcx
sarq $63, %rcx
shrq $62, %rcx
addq %r8, %rcx
sarq $2, %rcx
movq %rax, %rdx
s...
2016 Jun 29
0
avx512 JIT backend generates wrong code on <4 x float>
....text
> .file "module_KFxOBX_i4_after.ll"
> .globl adjmul
> .align 16, 0x90
> .type adjmul,@function
> adjmul:
> .cfi_startproc
> leaq (%rdi,%r8), %rdx
> addq %rsi, %r8
> testb $1, %cl
> cmoveq %rdi, %rdx
> cmoveq %rsi, %r8
> movq %rdx, %rax
> sarq $63, %rax
> shrq $62, %rax
> addq %rdx, %rax
> sarq $2, %rax
> movq %r8, %rcx
> sarq $63, %rcx
> shrq $62, %rcx
> addq %r8, %rcx
&g...
2016 Jun 30
1
avx512 JIT backend generates wrong code on <4 x float>
...xOBX_i4_after.ll"
>> .globl adjmul
>> .align 16, 0x90
>> .type adjmul,@function
>> adjmul:
>> .cfi_startproc
>> leaq (%rdi,%r8), %rdx
>> addq %rsi, %r8
>> testb $1, %cl
>> cmoveq %rdi, %rdx
>> cmoveq %rsi, %r8
>> movq %rdx, %rax
>> sarq $63, %rax
>> shrq $62, %rax
>> addq %rdx, %rax
>> sarq $2, %rax
>> movq %r8, %rcx
>> sarq $63, %rcx
>> shr...
2016 Jun 23
2
AVX512 instruction generated when JIT compiling for an avx2 architecture
...tion which shouldn't
be there.
Assembly:
.text
.file "module"
.globl main
.align 16, 0x90
.type main,@function
main:
.cfi_startproc
movq 8(%rsp), %r10
leaq (%rdi,%r8), %rdx
addq %rsi, %r8
testb $1, %cl
cmoveq %rdi, %rdx
cmoveq %rsi, %r8
movq %rdx, %rax
sarq $63, %rax
shrq $62, %rax
addq %rdx, %rax
sarq $2, %rax
movq %r8, %rcx
sarq $63, %rcx
shrq $62, %rcx
addq %r8, %rcx
sarq $2, %rcx
movq (%r10), %r8...
2016 Jun 23
2
AVX512 instruction generated when JIT compiling for an avx2 architecture
...ule"
> .globl main
> .align 16, 0x90
> .type main,@function
> main:
> .cfi_startproc
> movq 8(%rsp), %r10
> leaq (%rdi,%r8), %rdx
> addq %rsi, %r8
> testb $1, %cl
> cmoveq %rdi, %rdx
> cmoveq %rsi, %r8
> movq %rdx, %rax
> sarq $63, %rax
> shrq $62, %rax
> addq %rdx, %rax
> sarq $2, %rax
> movq %r8, %rcx
> sarq $63, %rcx
> shrq $62, %rcx
>...
2015 Jan 23
2
[LLVMdev] X86TargetLowering::LowerToBT
> icc generates testq for 0-30 and btq for 31-63.
> That seems like a small bug in the bit 31 case.
You can’t use testq for bit 31, because the immediate gets sign-extended. You *can* use the 32b form, of course.
2018 Feb 28
1
Missed opportunity in the midend, unsigned comparison
...plement it at? Thank you very much in advance.
Best,
Alex
--------------------------
GCC x86 ASM:
testl %edi, %edi
movl $0, %edx
movl $arr, %eax
cmovne %rdx, %rax
ret
LLVM x86 ASM:
xorl %eax, %eax
testl %edi, %edi
movl %edi, %ecx
leaq arr(%rcx), %rcx
cmoveq %rcx, %rax
retq
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180228/0dbb1182/attachment.html>
2006 Nov 03
4
[LLVMdev] is createCFGSimplificationPass unused?
...tures that use conditional moves
to implement select (alpha and ARM). For example, on 2006/09/03 a "if
(a) return 0; else return 1;" compiled to
----------------------------------------
zapnot $17,15,$1
zapnot $16,15,$2
bis $31,$31,$0
cmpeq $2,$1,$1
cmoveq $1,1,$0
ret $31,($26),1
----------------------------------------
Now it compiles to
----------------------------------
zapnot $17,15,$0
zapnot $16,15,$1
cmpeq $1,$0,$0
beq $0,$BB1_2 #return
$BB1_1: #cond_true
bis $31,$31,$0
ret $31,($26),1...
2006 Nov 03
0
[LLVMdev] is createCFGSimplificationPass unused?
.... For example:
int %foo(int %x) {
%b = seteq int %x, 5
%r = select bool %b, int 3, int 7
ret int %r
}
int %bar(int %x) {
%b = seteq int %x, 5
br bool %b, label %t, label %f
t:
ret int 1
f:
ret int 2
}
compiles to:
foo:
lda $0,3($31)
zapnot $16,15,$1
cmpeq $1,5,$1
cmoveq $1,7,$0
ret $31,($26),1
bar:
zapnot $16,15,$0
cmpeq $0,5,$0
beq $0,$BB2_2 #f
$BB2_1: #t
lda $0,1($31)
ret $31,($26),1
$BB2_2: #f
lda $0,2($31)
ret $31,($26),1
Which is not a problem with the instruction selector's use of cmov.
I...
2012 Oct 10
2
[LLVMdev] Solicit code review (change to CodeGen)
...x86_64-apple-darwin10 -mcpu=corei7 | FileCheck %s
+
+define i64 @test1(i64 %x) nounwind {
+entry:
+ %cmp = icmp eq i64 %x, 2
+ %add = add i64 %x, 1
+ %retval.0 = select i1 %cmp, i64 2, i64 %add
+ ret i64 %retval.0
+
+; CHECK: test1:
+; CHECK: leaq 1(%rdi), %rax
+; CHECK: cmpq $2, %rdi
+; CHECK: cmoveq %rdi, %rax
+; CHECK: ret
+
+}
2007 Aug 08
2
[PATCH] x86-64: syscall/sysenter support for 32-bit apps
...2007-08-08 11:37:08.000000000 +0200
@@ -26,15 +26,19 @@
ALIGN
/* %rbx: struct vcpu */
switch_to_kernel:
- leaq VCPU_trap_bounce(%rbx),%rdx
+ cmpw $FLAT_USER_CS32,UREGS_cs(%rsp)
movq VCPU_syscall_addr(%rbx),%rax
+ leaq VCPU_trap_bounce(%rbx),%rdx
+ cmoveq VCPU_syscall32_addr(%rbx),%rax
+ btl $_VGCF_syscall_disables_events,VCPU_guest_context_flags(%rbx)
movq %rax,TRAPBOUNCE_eip(%rdx)
- movb $0,TRAPBOUNCE_flags(%rdx)
- bt $_VGCF_syscall_disables_events,VCPU_guest_context_flags(%rbx)
- jnc 1f
- movb...
2017 Oct 11
1
[PATCH v1 01/27] x86/crypto: Adapt assembly for PIE support
...SK
movaps IV, CTR
PSHUFB_XMM BSWAP_MASK CTR
mov $1, TCTR_LOW
@@ -2850,12 +2852,12 @@ ENTRY(aesni_xts_crypt8)
cmpb $0, %cl
movl $0, %ecx
movl $240, %r10d
- leaq _aesni_enc4, %r11
- leaq _aesni_dec4, %rax
+ leaq _aesni_enc4(%rip), %r11
+ leaq _aesni_dec4(%rip), %rax
cmovel %r10d, %ecx
cmoveq %rax, %r11
- movdqa .Lgf128mul_x_ble_mask, GF128MUL_MASK
+ movdqa .Lgf128mul_x_ble_mask(%rip), GF128MUL_MASK
movups (IVP), IV
mov 480(KEYP), KLEN
diff --git a/arch/x86/crypto/aesni-intel_avx-x86_64.S b/arch/x86/crypto/aesni-intel_avx-x86_64.S
index faecb1518bf8..488605b19fe8 100644
--- a/ar...
2018 Mar 23
5
RFC: Speculative Load Hardening (a Spectre variant #1 mitigation)
...# Conditionally update predicate
state.
testl %esi, %esi
jne .LBB0_1
# %bb.3: # %then2
cmovneq %r8, %rax # Conditionally update predicate
state.
testl %edx, %edx
je .LBB0_4
.LBB0_1:
cmoveq %r8, %rax # Conditionally update predicate
state.
popq %rax
retq
.LBB0_4: # %danger
cmovneq %r8, %rax # Conditionally update predicate
state.
...
```
Here we create the "empty" or "correct...
2018 Mar 13
32
[PATCH v2 00/27] x86: PIE support and option to extend KASLR randomization
Changes:
- patch v2:
- Adapt patch to work post KPTI and compiler changes
- Redo all performance testing with latest configs and compilers
- Simplify mov macro on PIE (MOVABS now)
- Reduce GOT footprint
- patch v1:
- Simplify ftrace implementation.
- Use gcc mstack-protector-guard-reg=%gs with PIE when possible.
- rfc v3:
- Use --emit-relocs instead of -pie to reduce
2018 Mar 13
32
[PATCH v2 00/27] x86: PIE support and option to extend KASLR randomization
Changes:
- patch v2:
- Adapt patch to work post KPTI and compiler changes
- Redo all performance testing with latest configs and compilers
- Simplify mov macro on PIE (MOVABS now)
- Reduce GOT footprint
- patch v1:
- Simplify ftrace implementation.
- Use gcc mstack-protector-guard-reg=%gs with PIE when possible.
- rfc v3:
- Use --emit-relocs instead of -pie to reduce
2017 Oct 04
28
x86: PIE support and option to extend KASLR randomization
These patches make the changes necessary to build the kernel as Position
Independent Executable (PIE) on x86_64. A PIE kernel can be relocated below
the top 2G of the virtual address space. It allows to optionally extend the
KASLR randomization range from 1G to 3G.
Thanks a lot to Ard Biesheuvel & Kees Cook on their feedback on compiler
changes, PIE support and KASLR in general. Thanks to
2017 Oct 04
28
x86: PIE support and option to extend KASLR randomization
These patches make the changes necessary to build the kernel as Position
Independent Executable (PIE) on x86_64. A PIE kernel can be relocated below
the top 2G of the virtual address space. It allows to optionally extend the
KASLR randomization range from 1G to 3G.
Thanks a lot to Ard Biesheuvel & Kees Cook on their feedback on compiler
changes, PIE support and KASLR in general. Thanks to
2018 May 23
33
[PATCH v3 00/27] x86: PIE support and option to extend KASLR randomization
Changes:
- patch v3:
- Update on message to describe longer term PIE goal.
- Minor change on ftrace if condition.
- Changed code using xchgq.
- patch v2:
- Adapt patch to work post KPTI and compiler changes
- Redo all performance testing with latest configs and compilers
- Simplify mov macro on PIE (MOVABS now)
- Reduce GOT footprint
- patch v1:
- Simplify ftrace
2017 Oct 11
32
[PATCH v1 00/27] x86: PIE support and option to extend KASLR randomization
Changes:
- patch v1:
- Simplify ftrace implementation.
- Use gcc mstack-protector-guard-reg=%gs with PIE when possible.
- rfc v3:
- Use --emit-relocs instead of -pie to reduce dynamic relocation space on
mapped memory. It also simplifies the relocation process.
- Move the start the module section next to the kernel. Remove the need for
-mcmodel=large on modules. Extends