Displaying 20 results from an estimated 88 matches for "p2align".
2017 Aug 21
3
DragonEgg for GCC v8.x and LLVM v6.x is just able to work
....file "hello.s"
# Start of file scope inline assembly
.ident "GCC: (GNU) 6.4.1 20170727 (Red Hat 6.4.1-1) LLVM: 3.9.1"
# End of file scope inline assembly
.globl foo
.p2align 4, 0x90
.type foo,@function
foo: # @foo
.cfi_startproc
# BB#0: # %entry
pushq %rbp
.Ltmp0:
.cfi_def_cfa_offset 16
.Ltmp1:
.cfi_offset %rbp, -16
movq %rsp, %rbp
.Ltmp2:
.cfi_def_cfa_r...
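A note on the directive in the listing above: `.p2align 4, 0x90` asks the assembler to advance the location counter to the next multiple of 2^4 = 16 bytes, with 0x90 (the one-byte x86 NOP opcode) given as the fill value. The arithmetic can be sketched in Python as below. The function names are hypothetical, and this is only a simplified model: a real assembler may emit longer multi-byte NOP encodings in code sections instead of repeating the fill byte.

```python
def p2align_padding(offset: int, power: int) -> int:
    """Bytes needed to advance `offset` to the next multiple of 2**power."""
    alignment = 1 << power
    # Python's % always returns a non-negative result for a positive modulus,
    # so (-offset) % alignment is the distance to the next boundary (0 if aligned).
    return (-offset) % alignment


def p2align_fill(offset: int, power: int, fill: int = 0x00) -> bytes:
    """Simplified model of the padding bytes: the fill byte repeated."""
    return bytes([fill]) * p2align_padding(offset, power)


if __name__ == "__main__":
    # Model `.p2align 4, 0x90` at a few hypothetical section offsets.
    for offset in (0, 1, 13, 16):
        pad = p2align_fill(offset, 4, 0x90)
        print(f"offset {offset:2d}: {len(pad):2d} padding byte(s) {pad.hex()}")
```

An already aligned offset needs no padding, which is why a function that happens to start on a 16-byte boundary shows no NOPs before its label.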
2005 Jul 31
1
Updating to nlme 3.1-62 failing from source (OS X)
...:** Removing '/Library/Frameworks/
R.framework/Versions/2.1.1/Resources/library/nlme'
** Restoring previous '/Library/Frameworks/R.framework/Versions/2.1.1/
Resources/library/nlme'
The downloaded packages are in
/private/tmp/RtmpL1j8Sa/downloaded_packages
Unknown pseudo-op: .p2align
/var/tmp//ccOTBSDa.s:6088:Rest of line ignored. 1st junk character
valued 50 (2).
/var/tmp//ccOTBSDa.s:6414:Unknown pseudo-op: .p2align
/var/tmp//ccOTBSDa.s:6414:Rest of line ignored. 1st junk character
valued 50 (2).
/var/tmp//ccOTBSDa.s:6608:Unknown pseudo-op: .p2align
/var/tmp//ccOTBSDa.s:66...
2017 Jul 12
5
[LLD] Linker Relaxation
...ks but isn't
optimal. Since we intend to contribute this target back to upstream
later on, we'd like to discuss how this should be properly handled.
Note that RISC-V also handles alignment as part of relaxation, so it
isn't really optional. For example:
_start:
mv a0, a0
.p2align 2
li a0, 0
The assembler inserts a 3-byte padding (note: this behavior isn't
merged yet, see: https://github.com/riscv/riscv-binutils-gdb/pull/88):
00000000 <_start>:
0: 852a mv a0,a0
2: 00 01 00 # R_RISCV_ALIGN...
2011 Mar 24
2
[LLVMdev] GCC vs. LLVM difference on simple code example
...I had always thought that
it was legal to hoist the load of a global variable outside of the loop as
long as it was not declared volatile....
Here is the x86 assembly code generated by gcc 4.5.2. The load of "b" is
highlighted:
.file "foo.c"
.text
.p2align 4,,15
.globl foo
.type foo, @function
foo:
* movl b, %ecx*
movl $1, %eax
movl a, %edx
pushl %ebp
movl %esp, %ebp
.p2align 4,,7
.p2align 3
.L2:
addl (%ecx,%eax,4), %edx
addl $1, %eax
cmpl...
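The pair `.p2align 4,,7` followed by `.p2align 3` in the GCC output above uses the three-operand form of the directive: the empty second operand leaves the fill value at its default, and the third operand is the maximum number of bytes the assembler may skip. If reaching the boundary would need more padding than that, the directive does nothing. A rough Python sketch of the combined effect, with hypothetical function names and offsets chosen only for illustration:

```python
from typing import Optional


def p2align(offset: int, power: int, max_skip: Optional[int] = None) -> int:
    """Return the location counter after `.p2align power,,max_skip`."""
    alignment = 1 << power
    padding = (-offset) % alignment
    # If more than max_skip bytes would be needed, no alignment is done at all.
    if max_skip is not None and padding > max_skip:
        return offset
    return offset + padding


def gcc_loop_align(offset: int) -> int:
    """Model `.p2align 4,,7` followed by `.p2align 3`."""
    offset = p2align(offset, 4, max_skip=7)  # 16-byte align if it costs <= 7 bytes
    return p2align(offset, 3)                # otherwise fall back to 8-byte alignment


if __name__ == "__main__":
    for start in (0x10, 0x11, 0x19, 0x1f):
        print(f"{start:#04x} -> {gcc_loop_align(start):#04x}")
```

The effect is that a loop head gets 16-byte alignment only when that is cheap, and 8-byte alignment otherwise.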
2011 Dec 14
2
[LLVMdev] Failure to optimize ? operator
...e same experiment with gcc I get identical code for the two functions:
==============================================
_f1:
pushl %ebp
xorl %eax, %eax
movl %esp, %ebp
movl 8(%ebp), %edx
testl %edx, %edx
jle L5
popl %ebp
ret
.p2align 4,,7
L5:
movl %edx, %ecx
imull %edx, %ecx
popl %ebp
leal 3(%ecx,%ecx,4), %eax
imull %edx, %eax
leal 1(%eax,%ecx,2), %eax
ret
.p2align 4,,15
_f2:
pushl %ebp
xorl %eax, %eax
movl %esp, %ebp
movl 8(%ebp)...
2020 Jun 11
2
Issue with __attribute__((constructor)) and -Os -fno-common
...ough to cause the issue.
----8<--------8<--------8<--------8<--------8<--------8<--------
$ clang --target=arm-linux-gnueabihf -Os -fno-common -S ctor.c \
-o /dev/stdout | grep init_fn
$ clang --target=arm-linux-gnueabihf -Os -S ctor.c \
-o /dev/stdout | grep init_fn
.p2align 2 @ -- Begin function init_fn
.type init_fn,%function
.code 32 @ @init_fn
init_fn:
.size init_fn, .Lfunc_end0-init_fn
.long init_fn(target1)
.addrsig_sym init_fn
$ clang --target=arm-linux-gnueabihf -fno-commo...
2017 Jul 01
2
KNL Assembly Code for Matrix Multiplication
...>> here is the .s file: *the code that I want to ask about is in red.*
>>>>>
>>>>> .text
>>>>> .intel_syntax noprefix
>>>>> .file "matn_o3.ll"
>>>>> .section .rodata,"a",@progbits
>>>>> .p2align 6
>>>>> .LCPI0_0:
>>>>> .quad 8 # 0x8
>>>>> .quad 9 # 0x9
>>>>> .quad 10 # 0xa
>>>>> .quad 11 # 0xb
>>>>> .quad 12...
2018 Sep 20
3
Comparing Clang and GCC: only clang stores updated value in each iteration.
...nd
stores it then once to 'a'.
/Jonas
int a = 1;
void b() {
do
if (a)
a++;
while (a != 0);
}
bin/clang -O3 -march=z13 -mllvm -unroll-count=1
.text
.file "testfun.i"
.globl b # -- Begin function b
.p2align 4
.type b,@function
b: # @b
# %bb.0: # %entry
lrl %r0, a
.LBB0_1: # %do.body
# =>This Inner Loop Header: Depth=1
...
2017 Nov 21
2
question about xray tls data initialization
...y_fn_idx[]
I replaced them with __declspec(selectany), but I'm not sure it
has the same meaning.
Some random generated code:
.text
.intel_syntax noprefix
.def call;
.scl 2;
.type 32;
.endef
.globl call # -- Begin function call
.p2align 4, 0x90
call: # @call
.seh_proc call
# BB#0: # %entry
.p2align 1, 0x90
.Lxray_sled_0:
.ascii "\353\t"
nop word ptr [rax + rax + 512]
sub rsp, 16
.seh_stackalloc 16
.seh_endprologue...
2018 Sep 11
3
OpenJDK8 failed to work after compiled by LLVM 8 for X86
Hi Dimitry,
Thanks for your kind response!
Thanks to the commit message of Jung's patch, I found that the bug had
been fixed in OpenJDK 12 by Zhengyu
https://bugs.openjdk.java.net/browse/JDK-8205965 but only backported to
11. So Jung could backport it to OpenJDK 8, thanks a lot!
But I would argue that the root cause might be on the compiler side: why
clang-3.9.1, gcc-6.4.1 couldn't
2020 Jun 12
2
Issue with __attribute__((constructor)) and -Os -fno-common
...<--------8<--------8<--------8<--------8<--------
>> $ clang --target=arm-linux-gnueabihf -Os -fno-common -S ctor.c \
>> -o /dev/stdout | grep init_fn
>> $ clang --target=arm-linux-gnueabihf -Os -S ctor.c \
>> -o /dev/stdout | grep init_fn
>> .p2align 2 @ -- Begin function init_fn
>> .type init_fn,%function
>> .code 32 @ @init_fn
>> init_fn:
>> .size init_fn, .Lfunc_end0-init_fn
>> .long init_fn(target1)
>> .addrsig_sym i...
2013 Sep 04
6
[LLVMdev] Aliasing of volatile and non-volatile
...ne .LBB0_1
.LBB0_2: # %for.end
ret
.Ltmp0:
.size foo, .Ltmp0-foo
.cfi_endproc
.section ".note.GNU-stack","",@progbits
For comparison, GCC has only one load in the loop:
.text
.p2align 4,,15
.globl foo
.type foo, @function
foo:
.LFB0:
.cfi_startproc
xorl %eax, %eax
testl %edx, %edx
jle .L3
movl (%rdi), %r8d
xorl %ecx, %ecx
.p2align 4,,10
.p2align 3
.L4:
movl (%rsi), %edi...
2017 Jul 17
2
A bug related with undef value when bootstrap MemorySSA.cpp
...mpiled your code (1.c) with LLVM r308173 with the 5 patches applied,
and it generated assembly like this. Now it contains store to c(%rip).
It tries to store a(%rip) + b(%rip) to c(%rip). I hope this translation is
now correct.
```
.globl hoo # -- Begin function hoo
.p2align 4, 0x90
.type hoo,@function
hoo: # @hoo
.cfi_startproc
# BB#0:
movq a(%rip), %rax
movq cnt(%rip), %rcx
cmpq $0, i_hasval(%rip)
sete %sil
xorl %edx, %edx
.p2align 4, 0x90
.LBB1_1:...
2011 Dec 14
0
[LLVMdev] Failure to optimize ? operator
On Tue, Dec 13, 2011 at 5:59 AM, Brent Walker <brenthwalker at gmail.com> wrote:
> The following seemingly identical functions get compiled to quite
> different machine code. The first is correctly optimized (the
> computation of var y is nicely moved into the else branch of the "if"
> statement), while the second one is not (the full computation of var y
> is
2012 Jan 04
1
[LLVMdev] How can I compile a c source file to use SSE2 Data Movement Instructions?
...B#2:
xorl %eax, %eax
ret
.data
.globl _DA # @DA
.align 8
_DA:
.quad 4599075939470750515 # double 3.000000e-01
.comm _Y,800,3 # @Y
.comm _X,800,3 # @X
gcc -S -O3 -o test2.s test.c -march=native
result:
.file "test.c"
.text
.p2align 4,,15
.globl _f
.def _f; .scl 2; .type 32; .endef
_f:
pushl %ebp
movddup _DA, %xmm2
movl %esp, %ebp
xorl %eax, %eax
.p2align 4,,10
L2:
movapd _Y(%eax), %xmm0
movapd _X(%eax), %xmm1
mulpd %xmm2, %xmm1
subpd %xmm1, %xmm0
movapd %xmm0, _Y(%eax)
addl $16, %eax
cmpl $800, %eax
jne L2
xorw...
2016 Jun 30
4
Help required regarding IPRA and Local Function optimization
...", "~{rbx}"() #0
ret void
}
define internal void @foo() #0 {
call void asm sideeffect "movl %r14d, %r15d", "~{r15}"() #0
ret void
}
and its generated assembly code when IPRA enabled:
.section __TEXT,__text,regular,pure_instructions
.macosx_version_min 10, 12
.p2align 4, 0x90
_foo: ## @foo
.cfi_startproc
## BB#0:
## InlineAsm Start
movl %r14d, %r15d
## InlineAsm End
retq
.cfi_endproc
.globl _bar
.p2align 4, 0x90
_bar: ## @bar
.cfi_startproc
## BB#0:
pushq %r15
Ltmp0:
.cfi_def_cfa_offset 16
push...
2019 Aug 08
2
Suboptimal code generated by clang+llc in quite a common scenario (?)
...t %esi, -12
movb 16(%ebp), %al
movb 12(%ebp), %cl
movb 8(%ebp), %dl
movl _scscx, %esi
movb %dl, (%esi)
movl _scscx, %edx
movb %cl, 1(%edx)
movl _scscx, %ecx
movb %al, 2(%ecx)
xorl %eax, %eax
popl %esi
popl %ebp
retl
.cfi_endproc
.comm _pp,3,0
.section __DATA,__data
.globl _scscx
.p2align 3
_scscx:
.long _pp
Again, the _scscx is loaded three times instead of reusing a register, which is suboptimal.
NOW, if I replace the original code by this:
int pp[3];
int *scscx = pp;
int tst( int i, int j, int k )
{
scscx[0] = i;
scscx[1] = j;
scscx[2] = k;
return 0;
}
I get the f...
2017 Jul 17
3
A bug related with undef value when bootstrap MemorySSA.cpp
...plied,
>> and it generated assembly like this. Now it contains store to c(%rip).
>> It tries to store a(%rip) + b(%rip) to c(%rip). I hope this translation is
>> now correct.
>>
>> ```
>> .globl hoo # -- Begin function hoo
>> .p2align 4, 0x90
>> .type hoo,@function
>> hoo: # @hoo
>> .cfi_startproc
>> # BB#0:
>> movq a(%rip), %rax
>> movq cnt(%rip), %rcx
>> cmpq $0, i_hasval(%rip)
>> sete %sil
>> ...
2017 Nov 16
2
question about xray tls data initialization
I'm learning the xray library and trying to see whether it can be built on Windows. In
xray_fdr_logging_impl.h,
at line 152, a comment reads
// Using pthread_once(...) to initialize the thread-local data structures
but at lines 175 and 183, the code reads
thread_local pthread_key_t key;
// Ensure that we only actually ever do the pthread initialization once.
thread_local bool UNUSED Unused = [] {
2013 Sep 07
0
[LLVMdev] Aliasing of volatile and non-volatile
...# %for.end
> ret
> .Ltmp0:
> .size foo, .Ltmp0-foo
> .cfi_endproc
> .section ".note.GNU-stack","",@progbits
>
>
> For comparison, GCC has only one load in the loop:
>
> .text
> .p2align 4,,15
> .globl foo
> .type foo, @function
> foo:
> .LFB0:
> .cfi_startproc
> xorl %eax, %eax
> testl %edx, %edx
> jle .L3
> movl (%rdi), %r8d
> xorl %ecx, %ecx
> .p2align 4,,10
>...