search for: p2align

Displaying 20 results from an estimated 88 matches for "p2align".

2017 Aug 21
3
DragonEgg for GCC v8.x and LLVM v6.x is just able to work
....file "hello.s" # Start of file scope inline assembly .ident "GCC: (GNU) 6.4.1 20170727 (Red Hat 6.4.1-1) LLVM: 3.9.1" # End of file scope inline assembly .globl foo .p2align 4, 0x90 .type foo,@function foo: # @foo .cfi_startproc # BB#0: # %entry pushq %rbp .Ltmp0: .cfi_def_cfa_offset 16 .Ltmp1: .cfi_offset %rbp, -16 movq %rsp, %rbp .Ltmp2: .cfi_def_cfa_r...
2005 Jul 31
1
Updating to nlme 3.1-62 failing from source (OS X)
...:** Removing '/Library/Frameworks/R.framework/Versions/2.1.1/Resources/library/nlme' ** Restoring previous '/Library/Frameworks/R.framework/Versions/2.1.1/Resources/library/nlme' The downloaded packages are in /private/tmp/RtmpL1j8Sa/downloaded_packages Unknown pseudo-op: .p2align /var/tmp//ccOTBSDa.s:6088:Rest of line ignored. 1st junk character valued 50 (2). /var/tmp//ccOTBSDa.s:6414:Unknown pseudo-op: .p2align /var/tmp//ccOTBSDa.s:6414:Rest of line ignored. 1st junk character valued 50 (2). /var/tmp//ccOTBSDa.s:6608:Unknown pseudo-op: .p2align /var/tmp//ccOTBSDa.s:66...
2017 Jul 12
5
[LLD] Linker Relaxation
...ks but isn't optimal. Since we intend to contribute this target back to upstream later on, we'd like to discuss how this should be properly handled. Note that RISC-V also handles alignment as part of relaxation, so it isn't really optional. For example: _start: mv a0, a0 .p2align 2 li a0, 0 The assembler inserts a 3-byte padding (note: this behavior isn't merged yet, see: https://github.com/riscv/riscv-binutils-gdb/pull/88): 00000000 <_start>: 0: 852a mv a0,a0 2: 00 01 00 # R_RISCV_ALIGN...
2011 Mar 24
2
[LLVMdev] GCC vs. LLVM difference on simple code example
...I had always thought that it was legal to hoist the load of a global variable outside of the loop as long as it was not declared volatile.... Here is the x86 assembly code generated by gcc 4.5.2. The load of "b" is highlighted: .file "foo.c" .text .p2align 4,,15 .globl foo .type foo, @function foo: * movl b, %ecx* movl $1, %eax movl a, %edx pushl %ebp movl %esp, %ebp .p2align 4,,7 .p2align 3 .L2: addl (%ecx,%eax,4), %edx addl $1, %eax cmpl...
2011 Dec 14
2
[LLVMdev] Failure to optimize ? operator
...e same experiment with gcc I get identical code for the two functions: ============================================== _f1: pushl %ebp xorl %eax, %eax movl %esp, %ebp movl 8(%ebp), %edx testl %edx, %edx jle L5 popl %ebp ret .p2align 4,,7 L5: movl %edx, %ecx imull %edx, %ecx popl %ebp leal 3(%ecx,%ecx,4), %eax imull %edx, %eax leal 1(%eax,%ecx,2), %eax ret .p2align 4,,15 _f2: pushl %ebp xorl %eax, %eax movl %esp, %ebp movl 8(%ebp)...
2020 Jun 11
2
Issue with __attribute__((constructor)) and -Os -fno-common
...ough to cause the issue. ----8<--------8<--------8<--------8<--------8<--------8<-------- $ clang --target=arm-linux-gnueabihf -Os -fno-common -S ctor.c \ -o /dev/stdout | grep init_fn $ clang --target=arm-linux-gnueabihf -Os -S ctor.c \ -o /dev/stdout | grep init_fn .p2align 2 @ -- Begin function init_fn .type init_fn,%function .code 32 @ @init_fn init_fn: .size init_fn, .Lfunc_end0-init_fn .long init_fn(target1) .addrsig_sym init_fn $ clang --target=arm-linux-gnueabihf -fno-commo...
2017 Jul 01
2
KNL Assembly Code for Matrix Multiplication
...>> here .s file: * the code that i want to ask is in red color.* >>>>> >>>>> .text >>>>> .intel_syntax noprefix >>>>> .file "matn_o3.ll" >>>>> .section .rodata,"a",@progbits >>>>> .p2align 6 >>>>> .LCPI0_0: >>>>> .quad 8 # 0x8 >>>>> .quad 9 # 0x9 >>>>> .quad 10 # 0xa >>>>> .quad 11 # 0xb >>>>> .quad 12...
2018 Sep 20
3
Comparing Clang and GCC: only clang stores updated value in each iteration.
...nd stores it then once to 'a'. /Jonas int a = 1; void b() { do if (a) a++; while (a != 0); } bin/clang -O3 -march=z13 -mllvm -unroll-count=1 .text .file "testfun.i" .globl b # -- Begin function b .p2align 4 .type b,@function b: # @b # %bb.0: # %entry lrl %r0, a .LBB0_1: # %do.body # =>This Inner Loop Header: Depth=1 ...
2017 Nov 21
2
question about xray tls data initialization
...y_fn_idx[] I replace them with __declspec(selectany), but I'm not sure they have the same meaning. Some randomly generated code: .text .intel_syntax noprefix .def call; .scl 2; .type 32; .endef .globl call # -- Begin function call .p2align 4, 0x90 call: # @call .seh_proc call # BB#0: # %entry .p2align 1, 0x90 .Lxray_sled_0: .ascii "\353\t" nop word ptr [rax + rax + 512] sub rsp, 16 .seh_stackalloc 16 .seh_endprologue...
2018 Sep 11
3
OpenJDK8 failed to work after compiled by LLVM 8 for X86
Hi Dimitry, Thanks for your kind response! Thanks for the commit message of Jung's patch, I found that the bug had been fixed in OpenJDK 12 by Zhengyu https://bugs.openjdk.java.net/browse/JDK-8205965 But only backported to 11. So Jung could backport it for OpenJDK 8, thanks a lot! But I argue that the root cause might be in the compiler side, why clang-3.9.1, gcc-6.4.1 couldn't
2020 Jun 12
2
Issue with __attribute__((constructor)) and -Os -fno-common
...<--------8<--------8<--------8<--------8<-------- >> $ clang --target=arm-linux-gnueabihf -Os -fno-common -S ctor.c \ >> -o /dev/stdout | grep init_fn >> $ clang --target=arm-linux-gnueabihf -Os -S ctor.c \ >> -o /dev/stdout | grep init_fn >> .p2align 2 @ -- Begin function init_fn >> .type init_fn,%function >> .code 32 @ @init_fn >> init_fn: >> .size init_fn, .Lfunc_end0-init_fn >> .long init_fn(target1) >> .addrsig_sym i...
2013 Sep 04
6
[LLVMdev] Aliasing of volatile and non-volatile
...ne .LBB0_1 .LBB0_2: # %for.end ret .Ltmp0: .size foo, .Ltmp0-foo .cfi_endproc .section ".note.GNU-stack","",@progbits For comparison, GCC has only one load in the loop: .text .p2align 4,,15 .globl foo .type foo, @function foo: .LFB0: .cfi_startproc xorl %eax, %eax testl %edx, %edx jle .L3 movl (%rdi), %r8d xorl %ecx, %ecx .p2align 4,,10 .p2align 3 .L4: movl (%rsi), %edi...
2017 Jul 17
2
A bug related with undef value when bootstrap MemorySSA.cpp
...mpiled your code (1.c) with LLVM r308173 with the 5 patches applied, and it generated assembly like this. Now it contains a store to c(%rip). It tries to store a(%rip) + b(%rip) to c(%rip). I hope this translation is now correct. ``` .globl hoo # -- Begin function hoo .p2align 4, 0x90 .type hoo,@function hoo: # @hoo .cfi_startproc # BB#0: movq a(%rip), %rax movq cnt(%rip), %rcx cmpq $0, i_hasval(%rip) sete %sil xorl %edx, %edx .p2align 4, 0x90 .LBB1_1:...
2011 Dec 14
0
[LLVMdev] Failure to optimize ? operator
On Tue, Dec 13, 2011 at 5:59 AM, Brent Walker <brenthwalker at gmail.com> wrote: > The following seemingly identical functions get compiled to quite > different machine code. The first is correctly optimized (the > computation of var y is nicely moved into the else branch of the "if" > statement), while the second one is not (the full computation of var y > is
2012 Jan 04
1
[LLVMdev] How can I compile a c source file to use SSE2 Data Movement Instructions?
...B#2: xorl %eax, %eax ret .data .globl _DA # @DA .align 8 _DA: .quad 4599075939470750515 # double 3.000000e-01 .comm _Y,800,3 # @Y .comm _X,800,3 # @X gcc -S -O3 -o test2.s test.c -march=native result: .file "test.c" .text .p2align 4,,15 .globl _f .def _f; .scl 2; .type 32; .endef _f: pushl %ebp movddup _DA, %xmm2 movl %esp, %ebp xorl %eax, %eax .p2align 4,,10 L2: movapd _Y(%eax), %xmm0 movapd _X(%eax), %xmm1 mulpd %xmm2, %xmm1 subpd %xmm1, %xmm0 movapd %xmm0, _Y(%eax) addl $16, %eax cmpl $800, %eax jne L2 xorw...
2016 Jun 30
4
Help required regarding IPRA and Local Function optimization
...", "~{rbx}"() #0 ret void } define internal void @foo() #0 { call void asm sideeffect "movl %r14d, %r15d", "~{r15}"() #0 ret void } and its generated assembly code when IPRA enabled: .section __TEXT,__text,regular,pure_instructions .macosx_version_min 10, 12 .p2align 4, 0x90 _foo: ## @foo .cfi_startproc ## BB#0: ## InlineAsm Start movl %r14d, %r15d ## InlineAsm End retq .cfi_endproc .globl _bar .p2align 4, 0x90 _bar: ## @bar .cfi_startproc ## BB#0: pushq %r15 Ltmp0: .cfi_def_cfa_offset 16 push...
2019 Aug 08
2
Suboptimal code generated by clang+llc in quite a common scenario (?)
...t %esi, -12 movb 16(%ebp), %al movb 12(%ebp), %cl movb 8(%ebp), %dl movl _scscx, %esi movb %dl, (%esi) movl _scscx, %edx movb %cl, 1(%edx) movl _scscx, %ecx movb %al, 2(%ecx) xorl %eax, %eax popl %esi popl %ebp retl .cfi_endproc .comm _pp,3,0 .section __DATA,__data .globl _scscx .p2align 3 _scscx: .long _pp Again, the _scscx is loaded three times instead of reusing a register, which is suboptimal. NOW, if I replace the original code by this: int pp[3]; int *scscx = pp; int tst( int i, int j, int k ) { scscx[0] = i; scscx[1] = j; scscx[2] = k; return 0; } I get the f...
2017 Jul 17
3
A bug related with undef value when bootstrap MemorySSA.cpp
...plied, >> and it generated assembly like this. Now it contains a store to c(%rip). >> It tries to store a(%rip) + b(%rip) to c(%rip). I hope this translation is >> now correct. >> >> ``` >> .globl hoo # -- Begin function hoo >> .p2align 4, 0x90 >> .type hoo,@function >> hoo: # @hoo >> .cfi_startproc >> # BB#0: >> movq a(%rip), %rax >> movq cnt(%rip), %rcx >> cmpq $0, i_hasval(%rip) >> sete %sil >> ...
2017 Nov 16
2
question about xray tls data initialization
I'm learning the XRay library and trying to see whether it can be built on Windows. In xray_fdr_logging_impl.h, at line 152, a comment reads // Using pthread_once(...) to initialize the thread-local data structures but at lines 175 and 183, the code is written as thread_local pthread_key_t key; // Ensure that we only actually ever do the pthread initialization once. thread_local bool UNUSED Unused = [] {
2013 Sep 07
0
[LLVMdev] Aliasing of volatile and non-volatile
...# %for.end > ret > .Ltmp0: > .size foo, .Ltmp0-foo > .cfi_endproc > .section ".note.GNU-stack","",@progbits > > > For comparison, GCC has only one load in the loop: > > .text > .p2align 4,,15 > .globl foo > .type foo, @function > foo: > .LFB0: > .cfi_startproc > xorl %eax, %eax > testl %edx, %edx > jle .L3 > movl (%rdi), %r8d > xorl %ecx, %ecx > .p2align 4,,10 >...
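
Several of the GCC listings in these results pair `.p2align 4,,10` with `.p2align 3`. As a rough illustration of what those operands mean under GNU as semantics (first operand: log2 of the alignment; second: optional fill value; third: optional maximum number of bytes to skip, beyond which the directive does nothing), the following Python sketch computes the resulting padding. The names `p2align_padding` and `gcc_loop_align` are hypothetical helpers written for this note, not part of any assembler, and the sketch ignores target-specific behavior such as the RISC-V relaxation padding discussed in the LLD thread.

```python
def p2align_padding(offset, power, max_skip=None):
    """Return how many padding bytes `.p2align power,,max_skip` emits at `offset`.

    Assumption: plain GNU as semantics, no linker relaxation.
    """
    alignment = 1 << power              # .p2align 4 means 2**4 = 16-byte alignment
    padding = (-offset) % alignment     # bytes needed to reach the next boundary
    if max_skip is not None and padding > max_skip:
        return 0                        # too expensive: the directive is ignored
    return padding


def gcc_loop_align(offset):
    """Model the `.p2align 4,,10` + `.p2align 3` pair seen before GCC loop labels."""
    offset += p2align_padding(offset, 4, 10)  # 16-byte align if it costs <= 10 bytes
    offset += p2align_padding(offset, 3)      # otherwise settle for 8-byte alignment
    return offset


if __name__ == "__main__":
    for start in (0x10, 0x11, 0x16):
        print(hex(start), "->", hex(gcc_loop_align(start)))
```

Read this way, the pair asks for a 16-byte-aligned loop head when that costs at most 10 bytes of padding, and falls back to 8-byte alignment otherwise.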