Displaying 8 results from an estimated 8 matches for "vaddsd".
2015 Oct 02
2
Register Spill Caused by the Reassociation pass
This conflict arises with many optimizations, including copy propagation, coalescing, and hoisting. Each can increase register pressure, with similar impact. Attempts to control register pressure locally (within an optimization pass) tend to be hard to tune and maintain. Would it be better to describe, e.g. in metadata, how to undo an optimization? Optimizations that attempt to reduce pressure like
2014 Mar 12
2
[LLVMdev] Autovectorization questions
....S
-fslp-vectorize-aggressive
and loop body looks like:
.LBB1_2:                                # %for.body
                                        # =>This Inner Loop Header: Depth=1
        cltq
        vmovsd  (%rsi,%rax,8), %xmm0
        movq    %r9, %r10
        sarq    $32, %r10
        vaddsd  (%rdi,%r10,8), %xmm0, %xmm0
        vmovsd  %xmm0, (%rdi,%r10,8)
        addq    %r8, %r9
        addl    %ecx, %eax
        decl    %edx
        jne     .LBB1_2
So scalar AVX instructions (vaddsd, vmovsd) were used in the loop,
and no real gather/scatter was emitted.
The question is why this...
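For context, a loop of roughly the following shape could plausibly compile to the scalar code above. This is a hypothetical reconstruction from the assembly only, not the original poster's source: the function name, parameter names, and the use of int strides are all assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reconstruction: accumulate src into dst, each walked with its
 * own element stride. Names and types are assumptions, not from the post. */
void strided_add(double *dst, const double *src, int n,
                 int dst_stride, int src_stride)
{
    assert(dst != NULL && src != NULL);
    for (int i = 0; i < n; ++i) {
        /* One scalar load, one scalar add, one scalar store per iteration,
         * matching the vmovsd / vaddsd / vmovsd sequence in the listing. */
        dst[i * dst_stride] += src[i * src_stride];
    }
}
```

With strides known only at run time, consecutive iterations touch non-contiguous elements, so a vectorized version would need gather and scatter accesses rather than plain vector loads and stores.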
2014 Mar 12
4
[LLVMdev] Autovectorization questions
...> .LBB1_2: # %for.body
>> # =>This Inner Loop Header: Depth=1
>> cltq
>> vmovsd (%rsi,%rax,8), %xmm0
>> movq %r9, %r10
>> sarq $32, %r10
>> vaddsd (%rdi,%r10,8), %xmm0, %xmm0
>> vmovsd %xmm0, (%rdi,%r10,8)
>> addq %r8, %r9
>> addl %ecx, %eax
>> decl %edx
>> jne .LBB1_2
>>
>> so vector instructions for scalars (vaddsd, vmovsd) were used in the loo...
2012 Jul 26
0
[LLVMdev] X86 FMA4
Ah, bad example. This is a general problem for all (maybe most) SSE and AVX
SS/SD patterns though, which is why I mentioned Sandybridge. You can swap
out VFMADDSD in my example for VADDSD or whatever you like.
I have a lion's share of such a change implemented already and performance
is greatly affected. If the community is interested in this change, I would
be happy to prepare a patch.
-Cameron
On Thu, Jul 26, 2012 at 2:27 PM, Jan Sjodin <jan_sjodin at yahoo.com> wrote...
2012 Jul 27
2
[LLVMdev] X86 FMA4
...to make that vmovsd has the same stats as well.
Michael
On Jul 26, 2012, at 11:46 AM, Cameron McInally wrote:
> Ah, bad example. This is a general problem for all (maybe most) SSE and AVX SS/SD patterns though, which is why I mentioned Sandybridge. You can swap out VFMADDSD in my example for VADDSD or whatever you like.
>
> I have a lion's share of such a change implemented already and performance is greatly affected. If the community is interested in this change, I would be happy to prepare a patch.
>
> -Cameron
>
> On Thu, Jul 26, 2012 at 2:27 PM, Jan Sjodin <jan...
2012 Jul 25
6
[LLVMdev] X86 FMA4
We're migrating to LLVM 3.1 and trying to use the upstream FMA patterns.
Why is VFMADDSD4 defined with vector types? Is this simply because the
gcc intrinsic uses vector types? It's quite unnatural if you have a
compiler that generates FMAs as opposed to requiring user intrinsics.
-Dave
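To illustrate what "defined with vector types" means for a scalar instruction, here is a small portable model. It uses VADDSD rather than VFMADDSD4 because its lane behavior is simpler to state; the struct vec2d and all function names are hypothetical stand-ins, not LLVM or intrinsic APIs.

```c
#include <assert.h>

/* Hypothetical stand-in for a 128-bit XMM register holding two doubles. */
typedef struct { double lane[2]; } vec2d;

/* Models a vector-typed scalar add with VADDSD-like semantics: only lane 0
 * is computed; lane 1 is copied from the first source operand. */
vec2d add_sd_model(vec2d a, vec2d b)
{
    vec2d r;
    r.lane[0] = a.lane[0] + b.lane[0];
    r.lane[1] = a.lane[1];
    return r;
}

/* What a compiler that generates scalar arithmetic directly wants to express. */
double add_scalar(double a, double b)
{
    return a + b;
}

/* Helper to read one lane of the modeled register. */
double lane_of(vec2d v, int i)
{
    assert(i == 0 || i == 1);
    return v.lane[i];
}
```

The vector-typed form forces a caller that only has two doubles to wrap them in registers and extract lane 0 afterwards, which is the mismatch the question points at.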
2012 Jul 27
0
[LLVMdev] X86 FMA4
...e stats as well.
>
> Michael
>
> On Jul 26, 2012, at 11:46 AM, Cameron McInally wrote:
>
> Ah, bad example. This is a general problem for all (maybe most) SSE and
> AVX SS/SD patterns though, which is why I mentioned Sandybridge. You can
> swap out VFMADDSD in my example for VADDSD or whatever you like.
>
> I have a lion's share of such a change implemented already and performance
> is greatly affected. If the community is interested in this change, I would
> be happy to prepare a patch.
>
> -Cameron
>
> On Thu, Jul 26, 2012 at 2:27 PM, Jan Sjodin...
2012 Jul 27
3
[LLVMdev] X86 FMA4
...same stats as well.
>
> Michael
>
> On Jul 26, 2012, at 11:46 AM, Cameron McInally wrote:
>
>> Ah, bad example. This is a general problem for all (maybe most) SSE and AVX SS/SD patterns though, which is why I mentioned Sandybridge. You can swap out VFMADDSD in my example for VADDSD or whatever you like.
>>
>> I have a lion's share of such a change implemented already and performance is greatly affected. If the community is interested in this change, I would be happy to prepare a patch.
>>
>> -Cameron
>>
>> On Thu, Jul 26, 2012 at 2:2...