Displaying 20 results from an estimated 23 matches for "mulsd".
2017 Mar 01
2
[Codegen bug in LLVM 3.8?] br following `fcmp une` is present in ll, absent in asm
...cx
movq %rax, 728(%rcx)
movq 184(%rsp), %rax
movq 728(%rax), %rcx
movq %rcx, 736(%rax)
movq 184(%rsp), %rax
movq $0, 744(%rax)
movq 184(%rsp), %rax
movq $0, 752(%rax)
movq 184(%rsp), %rax
movq $0, 760(%rax)
movq 176(%rsp), %rax
movsd 5608(%rax), %xmm0 # xmm0 = mem[0],zero
movq 184(%rsp), %rax
mulsd 648(%rax), %xmm0
movsd 160(%rsp), %xmm1 # 8-byte Reload
# xmm1 = mem[0],zero
addsd %xmm0, %xmm1
movsd %xmm1, 672(%rax)
movq 176(%rsp), %rax
movsd 5648(%rax), %xmm0 # xmm0 = mem[0],zero
movq 184(%rsp), %rax
mulsd 648(%rax), %xmm0
movsd %xmm0, 704(...
2016 Oct 12
4
[test-suite] making polybench/symm succeed with "-Ofast" and "-ffp-contract=on"
On Wed, Oct 12, 2016 at 10:53 AM, Hal Finkel <hfinkel at anl.gov> wrote:
> I don't think that Clang/LLVM uses it by default on x86_64. If you're using -Ofast, however, that would explain it. I recommend looking at -O3 vs -O0 and make sure those are the same. -Ofast enables -ffast-math, which can legitimately cause differences.
>
The following tests pass at "-O3" and
2009 Dec 07
4
[LLVMdev] 2.6 JIT using wrong address for external functions
...6: mov %r14d,%esi
>>> 0xfffffd7ff9302549: callq 0xfffffd800066f690
0xfffffd7ff930254e: cvtsi2sd %rax,%xmm0
0xfffffd7ff9302553: mov $0xfffffd7ff93024d0,%rax
0xfffffd7ff930255d: movsd (%rax),%xmm1
0xfffffd7ff9302561: movsd %xmm1,(%rsp)
0xfffffd7ff9302566: mulsd %xmm1,%xmm0
(gdb) x/i 0xfffffd800066f690
0xfffffd800066f690: Cannot access memory at address 0xfffffd800066f690
(gdb) disassemble 0x66f690
Dump of assembler code for function _ZN12ContextFrame13GetInt64ValueEPKS_ix:
0x000000000066f690 <_ZN12ContextFrame13GetInt64ValueEPKS_ix+0>: push %...
2012 Jun 30
2
[LLVMdev] llc -O# / opt -O# differences
...x - a[0].y;
return dx * dx;
}
Running through opt
$ llvm-as < x.ll | opt -O3 | llc > y.s
Produces the following:
_foo: ## @foo
.cfi_startproc
## BB#0: ## %entry
movsd (%rdi), %xmm0
subsd (%rsi), %xmm0
mulsd %xmm0, %xmm0
ret
.cfi_endproc
This also matches what clang compiles from the C function. However,
running through llc with the same optimization flag
$ llc -O3 x.ll -o z.s
_foo: ## @foo
.cfi_startproc
## BB#0: ## %en...
2012 Jan 04
1
[LLVMdev] How can I compile a c source file to use SSE2 Data Movement Instructions?
...32;
.endef
.text
.globl _f
.align 16, 0x90
_f: # @f
# BB#0:
movl $-800, %eax # imm = 0xFFFFFFFFFFFFFCE0
movsd _DA, %xmm0
.align 16, 0x90
LBB0_1: # =>This Inner Loop Header: Depth=1
movsd _X+800(%eax), %xmm1
mulsd %xmm0, %xmm1
movsd _Y+800(%eax), %xmm2
subsd %xmm1, %xmm2
movsd %xmm2, _Y+800(%eax)
addl $8, %eax
jne LBB0_1
# BB#2:
xorl %eax, %eax
ret
.data
.globl _DA # @DA
.align 8
_DA:
.quad 4599075939470750515 # double 3.000000e-01
.comm _Y,800,3 # @Y
.comm...
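A minimal C sketch of what the loop at LBB0_1 appears to compute, reconstructed from the assembly rather than taken from the poster's source (which the excerpt does not show). The array length of 100 is inferred from the 800-byte `.comm` size and the 8-byte stride; the function names `scale_and_subtract` and `double_bits` are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define N 100            /* assumption: 800 bytes / 8 bytes per double */

double DA = 0.3;         /* value given by the .quad comment in the listing */
double X[N], Y[N];

/* One full pass of the loop: Y[i] = Y[i] - DA * X[i] for every element.
   Each iteration maps to movsd/mulsd/movsd/subsd/movsd in the listing. */
void scale_and_subtract(void) {
    for (int i = 0; i < N; ++i)
        Y[i] -= DA * X[i];
}

/* Raw IEEE-754 bit pattern of a double, for comparison with the .quad value. */
uint64_t double_bits(double d) {
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}
```

`double_bits(0.3)` yields 4599075939470750515, the integer emitted for `_DA`, which is one way to check which constant a `.quad` directive encodes.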
2013 Jul 15
3
[LLVMdev] Enabling the SLP vectorizer by default for -O3
...king at some performance counters on Friday, but I did not find anything suspicious yet.
+0x00 movupd 16(%rsi), %xmm0
+0x05 movupd 16(%rsp), %xmm1
+0x0b subpd %xmm1, %xmm0 <———— 18% of the runtime of bh ?
+0x0f movapd %xmm0, %xmm2
+0x13 mulsd %xmm2, %xmm2
+0x17 xorpd %xmm1, %xmm1
+0x1b addsd %xmm2, %xmm1
I spent less time on Bullet. Bullet also has one hot function (“resolveSingleConstraintRowLowerLimit”). On this code the vectorizer generates several trees that use the <3 x float>...
2010 Jun 07
1
[LLVMdev] XMM in X86 Backend
Hi all,
I am observing excessive use of xmm registers in the output assembly
produced by the x86 backend. Basically, for code like this
double test(double a, double b) {
double c;
c = 1.0 + sin (a + b*b);
return c;
}
llc produced something like....
movsd 16(%ebp), %xmm0
mulsd %xmm0, %xmm0
addsd 8(%ebp), %xmm0
movsd %xmm0, (%esp)
.......
fstpl -8(%ebp)
movsd -8(%ebp), %xmm0
addsd .LC1, %xmm0
movsd %xmm0, -8(%ebp)
fldl -8(%ebp)
The LLVM backend is using xmm registers, which involves a lot of register moves. llc has...
2013 Jan 05
0
[LLVMdev] RuntimeDyld bug in resolving addresses with offset?
...std::string fileName = "rtdyldbug.o";
myFun fptr = (myFun)getFunctionPointer(funName, fileName);
double w[5] = {0, 0, 0, 0, 0};
fptr(4, w);
printf("%f \n", w[2]);
return 0;
}
The printed result should be 148, but it is 132. The instruction that reads numbers[4] is
mulsd _numbers+0x00000020(%rip),%xmm0
When I did debugging at the assembly level, I found that the offset 0x20 is ignored. The resolved address points to numbers[0] instead of numbers[4].
I compiled the attached rtdyldbug.c as "clang -c -o rtdyldbug.o rtdyldbug.c". I compiled myrtdyld.cpp as...
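The 0x20 displacement in that instruction is the byte offset of element 4 in an array of 8-byte doubles. The sketch below only illustrates this arithmetic; the array here is a hypothetical stand-in (the real contents of `numbers` are not shown in the excerpt), and `element_offset` and `resolve` are illustrative names, not RuntimeDyld functions.

```c
#include <stddef.h>

/* Hypothetical stand-in for the `numbers` array named in the post. */
static double numbers[5];

/* Byte offset of numbers[index] from the start of the array. */
size_t element_offset(size_t index) {
    return index * sizeof numbers[0];
}

/* Address a correct resolution must produce: symbol address plus addend.
   Dropping the addend (passing 0) yields &numbers[0] instead. */
const double *resolve(size_t addend_bytes) {
    return (const double *)((const char *)numbers + addend_bytes);
}
```

With 8-byte doubles, `element_offset(4)` is 32 (0x20), so `resolve(0x20)` points at `numbers[4]` while `resolve(0)` points at `numbers[0]`, which matches the symptom described.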
2013 Jul 23
0
[LLVMdev] Enabling the SLP vectorizer by default for -O3
...ers on Friday, but I did not find anything suspicious yet.
>
> +0x00 movupd 16(%rsi), %xmm0
> +0x05 movupd 16(%rsp), %xmm1
> +0x0b subpd %xmm1, %xmm0 <———— 18% of the runtime of bh ?
> +0x0f movapd %xmm0, %xmm2
> +0x13 mulsd %xmm2, %xmm2
> +0x17 xorpd %xmm1, %xmm1
> +0x1b addsd %xmm2, %xmm1
>
> I spent less time on Bullet. Bullet also has one hot function (“resolveSingleConstraintRowLowerLimit”). On this code the vectorizer generates several trees that use th...
2009 Dec 07
0
[LLVMdev] 2.6 JIT using wrong address for external functions
...>>> 0xfffffd7ff9302549: callq 0xfffffd800066f690
> 0xfffffd7ff930254e: cvtsi2sd %rax,%xmm0
> 0xfffffd7ff9302553: mov $0xfffffd7ff93024d0,%rax
> 0xfffffd7ff930255d: movsd (%rax),%xmm1
> 0xfffffd7ff9302561: movsd %xmm1,(%rsp)
> 0xfffffd7ff9302566: mulsd %xmm1,%xmm0
>
> (gdb) x/i 0xfffffd800066f690
> 0xfffffd800066f690: Cannot access memory at address 0xfffffd800066f690
>
> (gdb) disassemble 0x66f690
> Dump of assembler code for function _ZN12ContextFrame13GetInt64ValueEPKS_ix:
> 0x000000000066f690 <_ZN12Conte...
2016 Jun 27
3
Finding caller-saved registers at a function call site
...th clang/LLVM 3.8 (-O3) on Ubuntu 14.04 looks like this:
...
400694: ff c7 inc %edi # Add 1 to depth
400696: f2 0f 10 05 a2 92 05 movsd 0x592a2(%rip),%xmm0 # Move constant 1.2 into xmm0
40069d: 00
40069e: f2 0f 59 c1 mulsd %xmm1,%xmm0 # val * 1.2
4006a2: f2 0f 11 4d f8 movsd %xmm1,-0x8(%rbp) # Spill val to the stack
4006a7: e8 d4 ff ff ff callq 400680 <recurse>
4006ac: f2 0f 58 45 f8 addsd -0x8(%rbp),%xmm0 # recurse's return value + val...
2012 Apr 03
1
[LLVMdev] pb05 results for current llvm/dragonegg
Attached are the Polyhedron 2005 benchmark results for current llvm/dragonegg svn
on x86_64-apple-darwin11 built against Xcode 4.3.2 and FSF gcc 4.6.3. The benchmarks
for -msse3 and -msse4 appear identical (at least for degg+optnz). This is fortunate
since there seems to be a bug in -msse4 on 2.33 GHz (T7600) Intel Core 2 Duo Merom
(http://llvm.org/bugs/show_bug.cgi?id=12434). I've added two
2018 Nov 15
2
[RFC][llvm-mca] Adding binary support to llvm-mca.
...rmed into assembly labels. While the markers are presented as
function calls, in reality they are no-ops.
test:
pushq %rbp
movq %rsp, %rbp
movsd %xmm0, -8(%rbp)
movsd %xmm1, -16(%rbp)
.Lmca_code_region_start_0: # LLVM-MCA-START ID: 42
xorps %xmm0, %xmm0
movsd %xmm0, -24(%rbp)
movsd -8(%rbp), %xmm0
mulsd -16(%rbp), %xmm0
addsd -24(%rbp), %xmm0
movsd %xmm0, -24(%rbp)
.Lmca_code_region_end_0: # LLVM-MCA-END ID: 42
movsd -24(%rbp), %xmm0
popq %rbp
retq
.section .mca_code_regions,"",@progbits
.quad 42
.quad .Lmca_code_region_start_0
.quad .Lmca_code_region_end_0-.Lmca_code_region_start_0...
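For orientation, a rough C equivalent of the instructions between the two region labels, reconstructed from the unoptimized assembly above rather than taken from the RFC. The marker calls themselves are not named in this excerpt, so they appear only as comments; `region_body` is a hypothetical name for the function labelled `test` in the listing.

```c
/* Sketch of the function labelled `test`, assuming it was compiled
   without optimization (every value is spilled to the stack). */
double region_body(double a, double b) {
    /* region start marker (ID 42) would be placed here */
    double acc = 0.0;      /* xorps + movsd to -24(%rbp) */
    acc = a * b + acc;     /* movsd, mulsd, addsd, movsd */
    /* region end marker (ID 42) would be placed here */
    return acc;
}
```

Only the three statements between the markers fall inside the recorded code region; the prologue and the final reload of the result lie outside it.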
2016 Jun 22
0
Finding caller-saved registers at a function call site
Hi Rob,
Rob Lyerly via llvm-dev wrote:
> I'm looking for a way to get all the caller-saved registers (both the
> register and the stack slot at which it was saved) for a given function
> call site in the backend. What's the best way to grab this
> information? Is it possible to get this information if I have the
> MachineInstr of the function call? I'm currently
2016 Jun 22
3
Finding caller-saved registers at a function call site
Hi everyone,
I'm looking for a way to get all the caller-saved registers (both the
register and the stack slot at which it was saved) for a given function
call site in the backend. What's the best way to grab this information?
Is it possible to get this information if I have the MachineInstr of the
function call? I'm currently targeting the AArch64 & X86 backends.
Thanks!
--
2013 Jul 15
0
[LLVMdev] Enabling the SLP vectorizer by default for -O3
On Jul 13, 2013, at 11:30 PM, Nadav Rotem <nrotem at apple.com> wrote:
> Hi,
>
> LLVM’s SLP-vectorizer is a new pass that combines similar independent instructions in straight-line code. It is currently not enabled by default, and people who want to experiment with it can use the clang command line flag “-fslp-vectorize”. I ran LLVM’s test suite with and without the SLP
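As an illustration of "similar independent instructions in straight-line code", here is a small C sketch of the kind of pattern an SLP vectorizer looks for. The function `add4` is a hypothetical example, not code from the thread, and whether a given compiler version actually vectorizes it depends on the target and cost model.

```c
/* Four isomorphic scalar additions on adjacent elements, with no loop
   and no dependence between them. An SLP vectorizer may pack such
   statements into vector operations; this is not guaranteed. */
void add4(double *restrict out,
          const double *restrict a,
          const double *restrict b) {
    out[0] = a[0] + b[0];
    out[1] = a[1] + b[1];
    out[2] = a[2] + b[2];
    out[3] = a[3] + b[3];
}
```

Comparing the assembly produced with and without the “-fslp-vectorize” flag mentioned above is one way to see whether the pass fired on a given build.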
2016 Jun 27
0
Finding caller-saved registers at a function call site
...looks like this:
>
> ...
> 400694: ff c7 inc %edi # Add 1 to depth
> 400696: f2 0f 10 05 a2 92 05 movsd 0x592a2(%rip),%xmm0 # Move constant 1.2 into xmm0
> 40069d: 00
> 40069e: f2 0f 59 c1 mulsd %xmm1,%xmm0 # val * 1.2
> 4006a2: f2 0f 11 4d f8 movsd %xmm1,-0x8(%rbp) # Spill val to the stack
> 4006a7: e8 d4 ff ff ff callq 400680 <recurse>
> 4006ac: f2 0f 58 45 f8 addsd -0x8(%rbp),%xmm0 # re...
2018 Nov 21
2
[RFC][llvm-mca] Adding binary support to llvm-mca.
...; test:
> > pushq %rbp
> > movq %rsp, %rbp
> > movsd %xmm0, -8(%rbp)
> > movsd %xmm1, -16(%rbp)
> > .Lmca_code_region_start_0: # LLVM-MCA-START ID: 42
> > xorps %xmm0, %xmm0
> > movsd %xmm0, -24(%rbp)
> > movsd -8(%rbp), %xmm0
> > mulsd -16(%rbp), %xmm0
> > addsd -24(%rbp), %xmm0
> > movsd %xmm0, -24(%rbp)
> > .Lmca_code_region_end_0: # LLVM-MCA-END ID: 42
> > movsd -24(%rbp), %xmm0
> > popq %rbp
> > retq
> > .section .mca_code_regions,"",@progbits
> >...
2013 Jul 14
6
[LLVMdev] Enabling the SLP vectorizer by default for -O3
Hi,
LLVM’s SLP-vectorizer is a new pass that combines similar independent instructions in straight-line code. It is currently not enabled by default, and people who want to experiment with it can use the clang command line flag “-fslp-vectorize”. I ran LLVM’s test suite with and without the SLP vectorizer on a Sandybridge mac (using SSE4, w/o AVX). Based on my performance measurements
2018 Nov 27
2
[RFC][llvm-mca] Adding binary support to llvm-mca.
...> > movsd %xmm0, -8(%rbp)
> > > > movsd %xmm1, -16(%rbp)
> > > > .Lmca_code_region_start_0: # LLVM-MCA-START ID: 42
> > > > xorps %xmm0, %xmm0
> > > > movsd %xmm0, -24(%rbp)
> > > > movsd -8(%rbp), %xmm0
> > > > mulsd -16(%rbp), %xmm0
> > > > addsd -24(%rbp), %xmm0
> > > > movsd %xmm0, -24(%rbp)
> > > > .Lmca_code_region_end_0: # LLVM-MCA-END ID: 42
> > > > movsd -24(%rbp), %xmm0
> > > > popq %rbp
> > > > retq
> > > >...