Displaying 8 results from an estimated 8 matches for "lcpi0_1".
2014 Mar 14
3
[LLVMdev] [ARM] [PIC] optimizing the loading of hidden global variable
>> Any thoughts?
>
> I'm now struggling to see how GCC justifies it. What if a different
> translation-unit declared those variables in a different order? I also
> can't get the same behaviour here, do you have a more complete
> command-line?
Ah, I see; the translation-unit that does the optimisation needs to
have them as a definition (i.e. "= {0}") rather
2013 Jun 25
2
[LLVMdev] Contants generation
Hi again,
Actually, I've just been looking at the existing code and the ARM
solution may be over-complicated for this situation.
You should be able to override EmitConstantPool directly, or possibly
even just override getSectionForConstantKind in
X86LinuxTargetObjectFile (and perhaps others) to return .text.
Tim.
2013 Jun 25
0
[LLVMdev] Contants generation
That's what I actually did now, locally in the code.
But I still see the "movabsq":
.text
.align 8, 0x90
.LCPI0_0:
.quad 4606281698874543309 # double 0.9
.LCPI0_1:
.quad 4631147119616759172 # double 42.2794408
.LCPI0_2:
.long 1065353216 # float 1
.zero 4
...
movabsq $.LCPI0_1, %rax # encoding: [0x48,0xb8,A,A,A,A,A,A,A,A]
# fixup A - offset: 2, value: .LCPI0_1, kind: FK_Data_8
vbroadcasts...
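The `.quad` and `.long` directives in the listing above store raw IEEE 754 bit patterns, and the trailing comment gives the decoded value. A minimal Python sketch of that decoding follows, in case it helps when reading such constant pools. The helper names `quad_to_double` and `long_to_float` are made up for illustration and are not part of LLVM.

```python
import struct

def quad_to_double(bits: int) -> float:
    """Reinterpret a 64-bit integer (signed or unsigned) as an IEEE 754 double."""
    # Mask to 64 bits so negative .quad values (as printed by llc) also work.
    return struct.unpack("<d", struct.pack("<Q", bits & 0xFFFFFFFFFFFFFFFF))[0]

def long_to_float(bits: int) -> float:
    """Reinterpret a 32-bit integer as an IEEE 754 single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]

# Constants copied from the listing above.
print(quad_to_double(4606281698874543309))  # .LCPI0_0, annotated "double 0.9"
print(quad_to_double(4631147119616759172))  # .LCPI0_1, annotated "double 42.2794408"
print(long_to_float(1065353216))            # .LCPI0_2, annotated "float 1"
```

The mask is what lets the same helper accept the negative `.quad` values that appear in other listings.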
2014 Oct 13
2
[LLVMdev] Unexpected spilling of vector register during lane extraction on some x86_64 targets
...w-raw-insn ./a.out
...
00000000004005c0 <main>:
4005c0: movdqa 0x1a58(%rip),%xmm0 # 402020 <x>
4005c8: psrld $0x17,%xmm0
4005cd: paddd 0x12b(%rip),%xmm0 # 400700 <.LCPI0_0>
4005d5: cvtdq2ps %xmm0,%xmm1
4005d8: divps 0x131(%rip),%xmm1 # 400710 <.LCPI0_1>
4005df: cvttps2dq %xmm1,%xmm1
4005e3: pmullw 0x135(%rip),%xmm1 # 400720 <.LCPI0_2>
4005eb: psubd %xmm1,%xmm0
4005ef: movq %xmm0,%rax
4005f4: movslq %eax,%rcx
4005f7: sar $0x20,%rax
4005fb: punpckhqdq %xmm0,%xmm0
4005ff: movq %xmm0,%rdx
400604: movslq %edx,...
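The `# 400700 <.LCPI0_0>` style annotations in the disassembly above follow from x86-64 RIP-relative addressing: the displacement is added to the address of the next instruction. A small Python check of that arithmetic, using only addresses visible in the listing (the function name `rip_relative_target` is illustrative):

```python
def rip_relative_target(next_insn_addr: int, disp: int) -> int:
    """Effective address of a disp(%rip) operand on x86-64.

    %rip holds the address of the instruction *after* the one being executed.
    """
    return next_insn_addr + disp

# (address of the following instruction, displacement, annotated target)
rows = [
    (0x4005C8, 0x1A58, 0x402020),  # movdqa 0x1a58(%rip) -> <x>
    (0x4005D5, 0x12B,  0x400700),  # paddd  0x12b(%rip)  -> <.LCPI0_0>
    (0x4005DF, 0x131,  0x400710),  # divps  0x131(%rip)  -> <.LCPI0_1>
    (0x4005EB, 0x135,  0x400720),  # pmullw 0x135(%rip)  -> <.LCPI0_2>
]

for next_addr, disp, annotated in rows:
    target = rip_relative_target(next_addr, disp)
    assert target == annotated
    print(f"{next_addr:#x} + {disp:#x} = {target:#x}")
```

The three constant-pool entries land 16 bytes apart, which is consistent with each being a 128-bit vector constant.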
2014 Mar 14
2
[LLVMdev] [ARM] [PIC] optimizing the loading of hidden global variable
...), %eax movl %eax, (%esp) calll bar@PLT
Which is OK, since the add of ebx is folded and the constant is an immediate on x86.
On ARM, that is not the case. We produce
ldr r0, .LCPI0_0
add r4, pc, r0 // r4 is the equivalent of ebx in the x86 case.
ldr r0, .LCPI0_1 // r0 is the constant that is an immediate in x86.
add r0, r0, r4 // that is the add that is folded in x86
...
.LCPI0_0:
.long _GLOBAL_OFFSET_TABLE_-(.LPC0_0+8)
.LCPI0_1:
.long g0(GOTOFF)
For ARM, codegen already keeps track of offsets so it can implement the consta...
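The address arithmetic in the four-instruction ARM sequence quoted above can be sketched in Python. This is only an illustration under stated assumptions: that `.LPC0_0` labels the `add r4, pc, r0` instruction, that the code runs in ARM (not Thumb) state so reading `pc` yields the instruction address plus 8, and that the concrete addresses used in the usage example are invented. The function name `materialise_g0` is hypothetical.

```python
MASK32 = 0xFFFFFFFF
PC_READ_AHEAD = 8  # ARM state: reading pc gives the instruction address + 8

def materialise_g0(lpc0_0: int, got_base: int, g0_addr: int):
    """Model the ldr/add/ldr/add sequence; returns (r4, r0)."""
    # Literal pool contents, as the assembler/linker would resolve them.
    lcpi0_0 = (got_base - (lpc0_0 + PC_READ_AHEAD)) & MASK32  # _GLOBAL_OFFSET_TABLE_-(.LPC0_0+8)
    lcpi0_1 = (g0_addr - got_base) & MASK32                   # g0(GOTOFF)

    r0 = lcpi0_0                                   # ldr r0, .LCPI0_0
    r4 = ((lpc0_0 + PC_READ_AHEAD) + r0) & MASK32  # add r4, pc, r0  -> GOT base
    r0 = lcpi0_1                                   # ldr r0, .LCPI0_1
    r0 = (r0 + r4) & MASK32                        # add r0, r0, r4  -> address of g0
    return r4, r0

# Hypothetical addresses, chosen only to exercise the arithmetic.
r4, r0 = materialise_g0(lpc0_0=0x8000, got_base=0x20000, g0_addr=0x1F000)
assert r4 == 0x20000   # r4 ends up holding the GOT base
assert r0 == 0x1F000   # r0 ends up holding the address of g0
```

The second load and add exist only to form `GOT base + offset`, which is the step x86 folds into an immediate.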
2014 Oct 07
4
[LLVMdev] Stange behavior in fp arithmetics on x86 (bug possibly)
...es -0.5 is equal
to -0.0, so correct exit code is 1.
llvm-3.4.2 on x86 linux target produced the following assembly:
.file "fpfail.ll"
.section .rodata.cst8,"aM",@progbits,8
.align 8
.LCPI0_0:
.quad -4620693217682128896 # double -0.5
.LCPI0_1:
.quad -9223372036854775808 # double -0
.text
.globl main
.align 16, 0x90
.type main,@function
main: # @main
.cfi_startproc
# BB#0:
vmovsd g, %xmm0
vmulsd .LCPI0_0, %xmm0, %xmm0...
2017 Jul 01
2
KNL Assembly Code for Matrix Multiplication
...>>> .quad 11 # 0xb
>>>>> .quad 12 # 0xc
>>>>> .quad 13 # 0xd
>>>>> .quad 14 # 0xe
>>>>> .quad 15 # 0xf
>>>>> .LCPI0_1:
>>>>> .quad 0 # 0x0
>>>>> .quad 1 # 0x1
>>>>> .quad 2 # 0x2
>>>>> .quad 3 # 0x3
>>>>> .quad 4 # 0x4
>>>...
2015 Jul 14
4
[LLVMdev] Poor register allocation (constants causing spilling)
...gh the constant is clearly rematerializable, the allocator has
gone to some length to keep it in a register, and it has spilled a
value to the stack. It would have been cheaper to simply fold the
constant load into the 3 uses.
This is not the only example. Later on we can see this:
vmovaps .LCPI0_1(%rip), %xmm6 # xmm6 = [2147483648,2147483648,...]
vxorps %xmm6, %xmm2, %xmm3
...
vandps %xmm6, %xmm5, %xmm2
...
vmovaps %xmm1, -56(%rsp) # 16-byte Spill
vmovaps %xmm6, %xmm1
...
vmovaps -56(%rsp), %xmm0 # 16-byte Reload
...
vxorps %xmm1, %x...