Displaying 17 results from an estimated 17 matches for "sarq".
2015 Jul 27
3
[LLVMdev] i1* function argument on x86-64
I am running into a problem with 'i1*' as a function argument, which
seems to have appeared since I switched to LLVM 3.6 (though it may have
another cause, of course). If I look at the assembly that MCJIT generates
for an x86-64 target, I see that the 'i1*' array is treated as a sequence
of 1-bit-wide elements. (I guess that's correct.) However, I used to
call the function
2016 Jun 29
2
avx512 JIT backend generates wrong code on <4 x float>
..."module_KFxOBX_i4_after.ll"
.globl adjmul
.align 16, 0x90
.type adjmul,@function
adjmul:
.cfi_startproc
leaq (%rdi,%r8), %rdx
addq %rsi, %r8
testb $1, %cl
cmoveq %rdi, %rdx
cmoveq %rsi, %r8
movq %rdx, %rax
sarq $63, %rax
shrq $62, %rax
addq %rdx, %rax
sarq $2, %rax
movq %r8, %rcx
sarq $63, %rcx
shrq $62, %rcx
addq %r8, %rcx
sarq $2, %rcx
movq %rax, %rdx
shlq $5, %rdx
leaq 16(%r9,%rdx), %rsi
orq $16, %rdx...
2016 Jun 29
0
avx512 JIT backend generates wrong code on <4 x float>
...mul
> .align 16, 0x90
> .type adjmul,@function
> adjmul:
> .cfi_startproc
> leaq (%rdi,%r8), %rdx
> addq %rsi, %r8
> testb $1, %cl
> cmoveq %rdi, %rdx
> cmoveq %rsi, %r8
> movq %rdx, %rax
> sarq $63, %rax
> shrq $62, %rax
> addq %rdx, %rax
> sarq $2, %rax
> movq %r8, %rcx
> sarq $63, %rcx
> shrq $62, %rcx
> addq %r8, %rcx
> sarq $2, %rcx
> movq %rax, %rdx
> shlq $5, %rdx
>...
2016 Jun 30
1
avx512 JIT backend generates wrong code on <4 x float>
...type adjmul,@function
>> adjmul:
>> .cfi_startproc
>> leaq (%rdi,%r8), %rdx
>> addq %rsi, %r8
>> testb $1, %cl
>> cmoveq %rdi, %rdx
>> cmoveq %rsi, %r8
>> movq %rdx, %rax
>> sarq $63, %rax
>> shrq $62, %rax
>> addq %rdx, %rax
>> sarq $2, %rax
>> movq %r8, %rcx
>> sarq $63, %rcx
>> shrq $62, %rcx
>> addq %r8, %rcx
>> sarq $2, %rcx
>> movq...
2016 Jun 23
2
AVX512 instruction generated when JIT compiling for an avx2 architecture
..."module"
.globl main
.align 16, 0x90
.type main,@function
main:
.cfi_startproc
movq 8(%rsp), %r10
leaq (%rdi,%r8), %rdx
addq %rsi, %r8
testb $1, %cl
cmoveq %rdi, %rdx
cmoveq %rsi, %r8
movq %rdx, %rax
sarq $63, %rax
shrq $62, %rax
addq %rdx, %rax
sarq $2, %rax
movq %r8, %rcx
sarq $63, %rcx
shrq $62, %rcx
addq %r8, %rcx
sarq $2, %rcx
movq (%r10), %r8
movq 8(%r10), %r10
movq %r8, %rdi
shrq $32, %rdi...
2016 Jun 23
2
AVX512 instruction generated when JIT compiling for an avx2 architecture
...t function
> main:
> .cfi_startproc
> movq 8(%rsp), %r10
> leaq (%rdi,%r8), %rdx
> addq %rsi, %r8
> testb $1, %cl
> cmoveq %rdi, %rdx
> cmoveq %rsi, %r8
> movq %rdx, %rax
> sarq $63, %rax
> shrq $62, %rax
> addq %rdx, %rax
> sarq $2, %rax
> movq %r8, %rcx
> sarq $63, %rcx
> shrq $62, %rcx
> addq %r8, %rcx
> sarq $2, %rcx
> movq (%r10), %r8
>...
2003 Jun 27
2
Probs with smbfs
Hi all
I am having trouble with SMBFS. The problem is the following:
Every time I try to connect to another machine on my network with the mount command, the following error appears. I have already looked at the man page, but without success.
[root@backup_sp bin] mount -t smbfs //sarq/c /mnt/windows
Password:
ERROR: smbfs filesystem not supported by the kernel
Please refer to the smbmnt(8) manual page
smbmnt failed: 255
I would like to point out that the smb service is running, and last week it was working properly.
Please, I need help with this.
I would be very grateful for your attention. Thank...
2014 Mar 12
2
[LLVMdev] Autovectorization questions
...-march=core-avx2 -S -o bb.S
-fslp-vectorize-aggressive
and loop body looks like:
.LBB1_2: # %for.body
# =>This Inner Loop Header: Depth=1
cltq
vmovsd (%rsi,%rax,8), %xmm0
movq %r9, %r10
sarq $32, %r10
vaddsd (%rdi,%r10,8), %xmm0, %xmm0
vmovsd %xmm0, (%rdi,%r10,8)
addq %r8, %r9
addl %ecx, %eax
decl %edx
jne .LBB1_2
so vector instructions for scalars (vaddsd, vmovsd) were used in the loop
and no real gather/scatter emitte...
2011 Sep 09
1
[LLVMdev] Reserved call frame
...f the testcase with the reservedCallFrame
enabled:
# BB#26: # %L51
movq 536(%rsp), %rbp
movq $-5, 536(%rsp)
movq 552(%rsp), %rsi
movq 560(%rsp), %rdx
movq 544(%rsp), %rcx
movq %rdi, (%rsp)
movq $2, 8(%rsp)
sarq $4, %rbp
leaq *128*(%rsp), %rdi
movq %rbp, %r8
movq *40*(%rsp), %r9 # 8-byte Reload
callq bs_put_big_integer # a function call with 5 arguments
movq 128(%rsp), %rax
movq 136(%rsp), %rcx
cmpq $0, 144(%rsp)
movq %rcx, 560(%rsp...
2013 Oct 30
1
[LLVMdev] Optimization bug - spurious shift in partial word test
...let's say >0, by shifting
left to get the sign bit into the MSB and testing, LLVM is inserting a
spurious right-shift instruction.
For example this IR:
...
%0 = load i64* %a.addr, align 8
%shl = shl i64 %0, 28
%cmp = icmp sgt i64 %shl, 0
...
results in
...
shlq $28, %rdi
sarq $28, %rdi ; <<< spurious shift
testq %rdi, %rdi
gcc doesn't have this problem. It just emits the shift and test.
The reason appears to be that the instruction combining pass decides that
the shift and test is equivalent to a test on the partial word, in this
case an i36.
>...
2015 Jul 24
0
[LLVMdev] SIMD for sdiv <2 x i64>
...nt
vmovaps %xmm2, 96(%rsp) # 16-byte Spill
vmovdqa 48(%rsp), %xmm0 # 16-byte Reload
vpsubq %xmm0, %xmm2, %xmm0
vpextrq $1, %xmm0, %rax
movabsq $3074457345618258603, %rcx # imm = 0x2AAAAAAAAAAAAAAB
imulq %rcx
movq %rdx, %rax
shrq $63, %rax
sarq $2, %rdx
addq %rax, %rdx
vmovq %rdx, %xmm1
vmovq %xmm0, %rax
imulq %rcx
movq %rdx, %rax
shrq $63, %rax
sarq $2, %rdx
addq %rax, %rdx
vmovq %rdx, %xmm0
vpunpcklqdq %xmm1, %xmm0, %xmm1 # xmm1 = xmm0[0],xmm1[0]
vpxor %xmm4, %xmm1,...
2015 Jul 24
1
[LLVMdev] SIMD for sdiv <2 x i64>
...16-byte Spill
> vmovdqa 48(%rsp), %xmm0 # 16-byte Reload
> vpsubq %xmm0, %xmm2, %xmm0
> vpextrq $1, %xmm0, %rax
> movabsq $3074457345618258603, %rcx # imm = 0x2AAAAAAAAAAAAAAB
> imulq %rcx
> movq %rdx, %rax
> shrq $63, %rax
> sarq $2, %rdx
> addq %rax, %rdx
> vmovq %rdx, %xmm1
> vmovq %xmm0, %rax
> imulq %rcx
> movq %rdx, %rax
> shrq $63, %rax
> sarq $2, %rdx
> addq %rax, %rdx
> vmovq %rdx, %xmm0
> vpunpcklqdq %xmm1, %xmm0, %xmm1...
2015 Jul 24
2
[LLVMdev] SIMD for sdiv <2 x i64>
On 07/24/2015 03:42 AM, Benjamin Kramer wrote:
>> On 24.07.2015, at 08:06, zhi chen <zchenhn at gmail.com> wrote:
>>
>> It seems that it's hard to vectorize int64 in LLVM. For example, LLVM 3.4 generates very complicated code for the following IR. I am running on a Haswell processor. Is it because there are no alternative AVX/2 instructions for int64? The same thing
2014 Mar 12
4
[LLVMdev] Autovectorization questions
...oop body looks like:
>>
>> .LBB1_2: # %for.body
>> # =>This Inner Loop Header: Depth=1
>> cltq
>> vmovsd (%rsi,%rax,8), %xmm0
>> movq %r9, %r10
>> sarq $32, %r10
>> vaddsd (%rdi,%r10,8), %xmm0, %xmm0
>> vmovsd %xmm0, (%rdi,%r10,8)
>> addq %r8, %r9
>> addl %ecx, %eax
>> decl %edx
>> jne .LBB1_2
>>
>> so vector instructions for scalars...
2014 Feb 19
2
[LLVMdev] better code for IV
...xorl %r9d, %r9d
movabsq $4294967296, %r8 # imm = 0x100000000
.align 16, 0x90
.LBB0_1: # %L_entry
# =>This Inner Loop Header: Depth=1
movq %r9, %rax
sarq $32, %rax
movss (%rdi,%rax,4), %xmm0
addss (%rsi,%rax,4), %xmm0
movss %xmm0, (%rdx,%rax,4)
addq %r8, %r9
decq %rcx
jne .LBB0_1
# BB#2:
ret
This is what I want to get:
ArrayAdd2:...
2019 Jun 13
2
[RFC] Coding Standards: "prefer `int` for regular arithmetic, use `unsigned` only for bitmask and when you intend to rely on wrapping behavior."
FWIW, the talks linked by Mehdi really do talk about these things and why I
don't think they really are the correct trade-off.
Even if you imagine an unsigned type that doesn't allow wrapping, I think
this is a really bad type. The problem is that you have made the most
common value of the type (zero in every study I'm aware of) be a boundary
condition. Today, it wraps to a huge value
2018 Mar 23
5
RFC: Speculative Load Hardening (a Spectre variant #1 mitigation)
...-speculated state value of `-1`):
```
...
.LBB0_4: # %danger
cmovneq %r8, %rax # Conditionally update predicate state.
shlq $47, %rax
orq %rax, %rsp
callq other_function
movq %rsp, %rax
sarq $63, %rax # Sign extend the high bit to all bits.
```
This first puts the predicate state into the high bits of `%rsp` before
calling the function and then reads it back out of high bits of `%rsp`
afterward. When correctly executing (speculatively or not), these are all
no-ops. Wh...
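The round trip through `%rsp` described above can be modeled with plain integer arithmetic. This is a hedged sketch rather than the actual mitigation: the function names are hypothetical, registers are emulated as 64-bit values in Python, the predicate state is assumed to be either 0 or -1 (all ones), and the stack pointer is assumed to be a user-space address whose high bits are zero.

```python
MASK64 = (1 << 64) - 1

def to_signed64(value):
    """Interpret a 64-bit pattern as a two's-complement integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value

def pack_state_into_sp(sp, state):
    """Model of `shlq $47, %rax; orq %rax, %rsp` before the call."""
    shifted = (state << 47) & MASK64  # state 0 stays 0; state -1 sets bits 47..63
    return (sp | shifted) & MASK64

def extract_state_from_sp(sp):
    """Model of `movq %rsp, %rax; sarq $63, %rax` after the call."""
    return to_signed64(sp) >> 63  # arithmetic shift smears bit 63 over all bits

# Hypothetical user-space stack pointer; the high bits are zero by assumption.
stack_pointer = 0x00007FFD12345678

clean_sp = pack_state_into_sp(stack_pointer, 0)
poisoned_sp = pack_state_into_sp(stack_pointer, -1)
```

With a state of 0 the stack pointer is left untouched and the extracted state is 0 again, matching the "no-op" behavior in the text; with a state of -1 the high bits are set and the extracted state comes back as -1.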