Displaying 20 results from an estimated 31 matches for "incq".
2015 Aug 08
2
RFC: PGO Late instrumentation for LLVM
...llbacks)
>> can incur overhead due to indirect call target profiling.
>>
>>
>> 1.1 Redundant Counter Update
>>
>> Checking the assembly of an instrumented binary generated by the current
>> LLVM implementation, we find many sequences of consecutive 'incq'
>> instructions that update different counters in the same basic block. An
>> example extracted from a real binary:
>> ...
>> incq 0xa91d80(%rip) # 14df4b8
>> <__llvm_profile_counters__ZN13LowLevelAlloc5ArenaC2Ev+0x1b8>
>> incq...
2015 Aug 10
3
RFC: PGO Late instrumentation for LLVM
...target profiling.
> >>>
> >>>
> >>> 1.1 Redundant Counter Update
> >>>
> >>> Checking the assembly of an instrumented binary generated by the current
> >>> LLVM implementation, we find many sequences of consecutive 'incq'
> >>> instructions that update different counters in the same basic block. An
> >>> example extracted from a real binary:
> >>> ...
> >>> incq 0xa91d80(%rip) # 14df4b8
> >>> <__llvm_profile_counters__ZN13L...
2015 Aug 08
3
RFC: PGO Late instrumentation for LLVM
.... Small and hot callee functions taking function pointer (callbacks)
can incur overhead due to indirect call target profiling.
1.1 Redundant Counter Update
Checking the assembly of an instrumented binary generated by the current
LLVM implementation, we find many sequences of consecutive 'incq'
instructions that update different counters in the same basic block. An
example extracted from a real binary:
...
incq 0xa91d80(%rip) # 14df4b8
<__llvm_profile_counters__ZN13LowLevelAlloc5ArenaC2Ev+0x1b8>
incq 0xa79011(%rip) # 14c6750
<__llvm_profile_cou...
2019 Sep 02
3
AVX2 codegen - question reg. FMA generation
...'s the snippet in the output it generates:
$ llc -O3 -mcpu=skylake
---------------------
.LBB0_2: # =>This Inner Loop Header: Depth=1
vbroadcastss (%rsi,%rdx,4), %ymm0
vmulps (%rdi,%rcx), %ymm0, %ymm0
vaddps (%rax,%rcx), %ymm0, %ymm0
vmovups %ymm0, (%rax,%rcx)
incq %rdx
addq $32, %rcx
cmpq $15, %rdx
jle .LBB0_2
-----------------------
$ llc --version
LLVM (http://llvm.org/):
LLVM version 8.0.0
Optimized build.
Default target: x86_64-unknown-linux-gnu
Host CPU: skylake
(llvm commit 198009ae8db11d7c0b0517f17358870dc486fcfb from Aug 31)
Using opt -O3 f...
2014 Feb 18
2
[LLVMdev] asan coverage
...ompared it with AsanCoverage.
AsanCoverage produces code like this:
mov 0xe86cce(%rip),%al
test %al,%al
je 48b4a0 # to call __sanitizer_cov
...
callq 4715b0 <__sanitizer_cov>
A simple counter-based thing (which just increments counters and does
nothing else useful) produces this:
incq 0xe719c6(%rip)
The performance is more or less the same, although the issue with false
sharing still remains
(http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-October/066116.html)
Do you have any more details about the planned clang coverage?
Thanks,
--kcc
On Tue, Feb 18, 2014 at 1:00 PM, K...
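The two instrumentation styles compared in this message can be sketched in C++ roughly as follows. This is only an illustration of the shape of the generated code: `covered_guard`, `hit_counter`, `reports`, and `report_coverage` are hypothetical names, not the AsanCoverage runtime, and setting the guard inline is an assumption (the snippet does not show where the guard byte is written).

```cpp
#include <cstdint>

// Hypothetical per-block state; real instrumentation emits such globals itself.
static bool covered_guard = false;      // guard style: has this block been seen?
static std::uint64_t hit_counter = 0;   // counter style: how many executions?
static int reports = 0;                 // stands in for work done by the callback

// Stand-in for the coverage callback (hypothetical, not the real runtime API).
void report_coverage() { ++reports; }

// Guard style: load the guard, test it, and call the callback only the
// first time the block runs (assumption: the guard is set right here).
void block_with_guard() {
    if (!covered_guard) {
        covered_guard = true;
        report_coverage();
    }
}

// Counter style: one read-modify-write on memory, which x86-64 compilers
// typically emit as a single incq on a RIP-relative address.
void block_with_counter() { ++hit_counter; }
```

Calling each function three times leaves `reports` at 1 and `hit_counter` at 3: the guard records only whether the block ran, while the counter records how often.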
2014 Feb 19
2
[LLVMdev] asan coverage
...produces code like this:
> mov 0xe86cce(%rip),%al
> test %al,%al
> je 48b4a0 # to call __sanitizer_cov
> ...
> callq 4715b0 <__sanitizer_cov>
>
> A simple counter-based thing (which just increments counters and does
> nothing else useful) produces this:
> incq 0xe719c6(%rip)
>
> The performance is more or less the same, although the issue with false
> sharing still remains
> (http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-October/066116.html)
>
> Do you have any more details about the planned clang coverage?
>
> Thanks,
>
...
2014 Apr 18
4
[LLVMdev] multithreaded performance disaster with -fprofile-instr-generate (contention on profile counters)
On Fri, Apr 18, 2014 at 12:13 AM, Dmitry Vyukov <dvyukov at google.com> wrote:
> Hi,
>
> This is a long thread, so I will combine several comments into a single email.
>
>
> >> - 8-bit per-thread counters, dumping into central counters on overflow.
> >The overflow will happen very quickly with an 8-bit counter.
>
> Yes, but it reduces contention by 256x (a thread
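The idea of 8-bit per-thread counters that spill into central counters on overflow might look like the minimal C++ sketch below. The names `central_counter`, `local_counter`, `count_event`, and `flush_local` are hypothetical, and the flush step is an assumption added so that no counts are lost; the thread does not say how the proposal would handle the remainder.

```cpp
#include <atomic>
#include <cstdint>

// Shared counter that every thread eventually updates (the contended one).
std::atomic<std::uint64_t> central_counter{0};

// Thread-private 8-bit counter; incrementing it causes no cross-thread traffic.
thread_local std::uint8_t local_counter = 0;

// Count one event. The shared counter is touched only when the 8-bit
// counter wraps around to 0, that is, on every 256th event of this thread.
void count_event() {
    if (++local_counter == 0) {
        central_counter.fetch_add(256, std::memory_order_relaxed);
    }
}

// Assumption: flush the remainder (for example at thread exit) so the
// central counter ends up exact.
void flush_local() {
    central_counter.fetch_add(local_counter, std::memory_order_relaxed);
    local_counter = 0;
}
```

With this scheme a thread performs one shared-memory update per 256 events, which is where the 256x reduction in contention mentioned above comes from.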
2017 Jul 17
2
A bug related with undef value when bootstrap MemorySSA.cpp
....LBB1_1: # =>This Inner Loop Header: Depth=1
testb $1, %sil
je .LBB1_3
# BB#2: # in Loop: Header=BB1_1 Depth=1
movq b(%rip), %rsi
addq %rax, %rsi
movq %rsi, c(%rip)
movq $3, i_hasval(%rip)
incq %rdx
xorl %esi, %esi
cmpq %rcx, %rdx
jl .LBB1_1
.LBB1_3:
retq
```
IMHO, enhancing `isGuaranteedNotToBeUndefOrPoison` and using it as a
precondition in loop unswitching is
not enough. undef (and poison) value can be stored into memory, and also be
passed by a function arg...
2015 Jun 26
2
[LLVMdev] Can LLVM vectorize <2 x i32> type
...label %middle.block, label %vector.ph
The corresponding assembly code is:
# BB#3: # %for.cond.preheader
imull %r9d, %ebx
testl %ebx, %ebx
jle .LBB10_63
# BB#4: # %for.body.preheader
leal -1(%rbx), %eax
incq %rax
xorl %edx, %edx
movabsq $8589934584, %rcx # imm = 0x1FFFFFFF8
andq %rax, %rcx
je .LBB10_8
I changed all the scalar operands to <2 x ValueType> ones. The IR becomes
the following
for.cond.preheader: ; preds = %if.end18
%...
2017 Jul 17
3
A bug related with undef value when bootstrap MemorySSA.cpp
...>> testb $1, %sil
>> je .LBB1_3
>> # BB#2: # in Loop: Header=BB1_1 Depth=1
>> movq b(%rip), %rsi
>> addq %rax, %rsi
>> movq %rsi, c(%rip)
>> movq $3, i_hasval(%rip)
>> incq %rdx
>> xorl %esi, %esi
>> cmpq %rcx, %rdx
>> jl .LBB1_1
>> .LBB1_3:
>> retq
>> ```
>>
>> IMHO, enhancing `isGuaranteedNotToBeUndefOrPoison` and using it as a
>> precondition in loop unswitching is
>> not enough....
2014 Jul 23
4
[LLVMdev] the clang 3.5 loop optimizer seems to jump in unintentional for simple loops
...fferent (not just inlined) to the_func
clang -DITER -O2
clang -DITER -O3
gives:
the_func:
leaq 12(%rdi), %rcx
leaq 4(%rdi), %rax
cmpq %rax, %rcx
cmovaq %rcx, %rax
movq %rdi, %rsi
notq %rsi
addq %rax, %rsi
shrq $2, %rsi
incq %rsi
xorl %edx, %edx
movabsq $9223372036854775800, %rax # imm = 0x7FFFFFFFFFFFFFF8
andq %rsi, %rax
pxor %xmm0, %xmm0
je .LBB0_1
# BB#2: # %vector.body.preheader
leaq (%rdi,%rax,4), %r8
addq $16, %rdi...
2017 Jul 17
3
A bug related with undef value when bootstrap MemorySSA.cpp
...>> # BB#2: # in Loop: Header=BB1_1 Depth=1
>> >> movq b(%rip), %rsi
>> >> addq %rax, %rsi
>> >> movq %rsi, c(%rip)
>> >> movq $3, i_hasval(%rip)
>> >> incq %rdx
>> >> xorl %esi, %esi
>> >> cmpq %rcx, %rdx
>> >> jl .LBB1_1
>> >> .LBB1_3:
>> >> retq
>> >> ```
>> >>
>> >> IMHO, enhancing `isGuaranteedNotToBeUndefOrPoison` and using i...
2014 Apr 23
4
[LLVMdev] multithreaded performance disaster with -fprofile-instr-generate (contention on profile counters)
...) { v[0] = 42; }
>
> Here we have a single basic block and a call, but since the coverage is emitted by the
> FE before inlining (and is also emitted for std::vector methods) we get this assembler at -O2:
> 0000000000400b90 <_Z3foov>:
> 400b90: 48 ff 05 11 25 20 00 incq 0x202511(%rip) # 6030a8 <__llvm_profile_counters__Z3foov>
> 400b97: 48 ff 05 42 25 20 00 incq 0x202542(%rip) # 6030e0 <__llvm_profile_counters__ZNSt6vectorIiSaIiEEixEm>
> 400b9e: 48 8b 05 4b 26 20 00 mov 0x20264b(%rip),%rax # 6031f...
2010 Jun 13
2
[LLVMdev] Bignum development
...> # => This Inner Loop Header: Depth=2
> addq (%rbx,%rsi,8), %rdi
> movl $0, %r8d
> adcq $0, %r8
> addq (%r14,%rsi,8), %rdi
> adcq $0, %r8
> movq %rdi, (%r15,%rsi,8)
> incq %rsi
> cmpq $1000, %rsi # imm = 0x3E8
> movq %r8, %rdi
> jne .LBB1_7
>
> So it basically tries to keep track of the carry in %r8 instead of in
> the carry flag.
>
> As hinted, the other optimisation missed here, is that instead o...
2010 Jun 12
0
[LLVMdev] Bignum development
...# Parent Loop BB1_6 Depth=1
# => This Inner Loop Header: Depth=2
addq (%rbx,%rsi,8), %rdi
movl $0, %r8d
adcq $0, %r8
addq (%r14,%rsi,8), %rdi
adcq $0, %r8
movq %rdi, (%r15,%rsi,8)
incq %rsi
cmpq $1000, %rsi # imm = 0x3E8
movq %r8, %rdi
jne .LBB1_7
So it basically tries to keep track of the carry in %r8 instead of in
the carry flag.
As hinted, the other optimisation missed here, is that instead of
comparing with $1000 it can start...
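The explicit-carry pattern that the assembly above appears to implement can be written in portable C++ roughly as follows. This is a sketch under assumptions: the function name `add_limbs` and the limb layout (least significant limb first) are hypothetical and not taken from the thread, whose source code is not shown in the snippet.

```cpp
#include <cstddef>
#include <cstdint>

// Adds two n-limb numbers (least significant limb first) into out and
// returns the final carry (0 or 1). The carry lives in an ordinary
// variable, much like %r8 in the listing, rather than in the CPU carry flag.
std::uint64_t add_limbs(const std::uint64_t* a, const std::uint64_t* b,
                        std::uint64_t* out, std::size_t n) {
    std::uint64_t carry = 0;
    for (std::size_t i = 0; i < n; ++i) {
        std::uint64_t sum = a[i] + carry;
        std::uint64_t c1 = sum < carry;   // wrapped while adding the carry
        sum += b[i];
        std::uint64_t c2 = sum < b[i];    // wrapped while adding b[i]
        out[i] = sum;
        carry = c1 + c2;                  // at most one of c1 and c2 is set
    }
    return carry;
}
```

Ideally a compiler would turn each iteration into an add/adc chain that keeps the carry in the flags register; the complaint in the message is that the carry is instead materialized in a general-purpose register on every iteration.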
2017 Jul 18
4
A bug related with undef value when bootstrap MemorySSA.cpp
...# in Loop: Header=BB1_1 Depth=1
>>>> >> movq b(%rip), %rsi
>>>> >> addq %rax, %rsi
>>>> >> movq %rsi, c(%rip)
>>>> >> movq $3, i_hasval(%rip)
>>>> >> incq %rdx
>>>> >> xorl %esi, %esi
>>>> >> cmpq %rcx, %rdx
>>>> >> jl .LBB1_1
>>>> >> .LBB1_3:
>>>> >> retq
>>>> >> ```
>>>> >>
>>>> >>...
2011 Dec 22
1
[LLVMdev] tail call optimization question
...## %if.no
movq %rdi, %rbx
testq %rsi, %rsi
jle LBB1_4
## BB#2: ## %if.no2
decq %rsi
movq %rbx, %rdi
callq _ack.15
movq %rbx, %rdi
decq %rdi
movq %rax, %rsi
popq %rbx
jmp _ack.15 ## TAILCALL
LBB1_3: ## %if.yes
incq %rsi
movq %rsi, %rax
popq %rbx
ret
LBB1_4: ## %if.yes1
movq %rbx, %rdi
decq %rdi
movl $1, %esi
popq %rbx
jmp _ack.15 ## TAILCALL
Leh_func_end1:
<snip>
Thanks very much,
N
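The control flow in this listing matches the shape of the Ackermann function, which could be written in C++ roughly as below. This is a reconstruction under assumptions: the original source and the first comparison are cut off in the snippet, so the `m <= 0` test and the name `ack` (borrowed from the `_ack.15` symbol) are guesses.

```cpp
#include <cstdint>

// Ackermann function, a sketch matching the visible blocks of the listing.
std::int64_t ack(std::int64_t m, std::int64_t n) {
    if (m <= 0) {
        return n + 1;                  // %if.yes: incq %rsi, then return it
    }
    if (n <= 0) {
        return ack(m - 1, 1);          // %if.yes1: eligible for a tail call
    }
    // %if.no2: the inner call must return here (callq), while the outer
    // call is in tail position and can be emitted as a jmp.
    return ack(m - 1, ack(m, n - 1));
}
```

For example, `ack(2, 3)` evaluates to 9. Only calls in tail position can be turned into jumps, which is why the listing contains one `callq` and two `jmp ... ## TAILCALL` instructions.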
2010 Jun 13
0
[LLVMdev] Bignum development
...# => This Inner Loop Header: Depth=2
>> addq (%rbx,%rsi,8), %rdi
>> movl $0, %r8d
>> adcq $0, %r8
>> addq (%r14,%rsi,8), %rdi
>> adcq $0, %r8
>> movq %rdi, (%r15,%rsi,8)
>> incq %rsi
>> cmpq $1000, %rsi # imm = 0x3E8
>> movq %r8, %rdi
>> jne .LBB1_7
>>
>> So it basically tries to keep track of the carry in %r8 instead of in
>> the carry flag.
>>
>> As hinted, the other optimisat...
2014 Feb 18
2
[LLVMdev] asan coverage
On Feb 17, 2014, at 5:13 AM, Kostya Serebryany <kcc at google.com> wrote:
> Then my question: will there be any objection if I disentangle AsanCoverage from ASan and make it a separate LLVM phase with the proper clang driver support?
> Or it will be an unwelcome competition with the planned clang coverage?
I don’t view it as a competition, but assuming that we both succeed in our
2014 Feb 19
2
[LLVMdev] better code for IV
....LBB1_1: # %L_entry
# =>This Inner Loop Header: Depth=1
movslq %eax, %r8
movss (%rdi,%r8,4), %xmm0
addss (%rsi,%r8,4), %xmm0
movss %xmm0, (%rdx,%r8,4)
incq %rax
cmpq %rax, %rcx
jne .LBB1_1
# BB#2:
ret