search for: upper16

Displaying 20 results from an estimated 25 matches for "upper16".

2011 Nov 12
2
[LLVMdev] Thumb-2 code generation error in Apple LLVM at all optimization levels
...tinuously]": Ltmp265: Lfunc_begin24: .loc 1 380 0 .loc 1 380 1 prologue_end push {r4, r5, r6, r7, lr} add r7, sp, #12 push.w {r8, r10, r11} vpush {d8} sub sp, #4 .loc 1 382 2 Ltmp266: movw r1, :lower16:(L_OBJC_SELECTOR_REFERENCES_7-(LPC24_0+4)) Ltmp267: mov r4, r0 Ltmp268: movt r1, :upper16:(L_OBJC_SELECTOR_REFERENCES_7-(LPC24_0+4)) movw r0, :lower16:(L_OBJC_CLASSLIST_REFERENCES_$_62-(LPC24_1+4)) movt r0, :upper16:(L_OBJC_CLASSLIST_REFERENCES_$_62-(LPC24_1+4)) LPC24_0: add r1, pc LPC24_1: add r0, pc ldr r1, [r1] ldr r0, [r0] blx _objc_msgSend movw r1, :lower16:(L_OBJC_SELECTOR...
2011 Feb 18
0
[LLVMdev] Adding "S" suffixed ARM/Thumb2 instructions
On Feb 17, 2011, at 10:35 PM, Вадим Марковцев wrote: > Hello everyone, > > I've added the "S" suffixed versions of ARM and Thumb2 instructions to tablegen. Those are, for example, "movs" or "muls". > Of course, some instructions already had their twins, such as add/adds, and I left them untouched. Adding separate "s" instructions is
2011 Feb 18
2
[LLVMdev] Adding "S" suffixed ARM/Thumb2 instructions
Hello everyone, I've added the "S" suffixed versions of ARM and Thumb2 instructions to tablegen. Those are, for example, "movs" or "muls". Of course, some instructions already had their twins, such as add/adds, and I left them untouched. In addition, I propose a codegen optimization based on them, which removes the redundant comparison in patterns like orr
2010 Nov 12
2
[LLVMdev] Simple NEON optimization
...es (just a bit) the code generated. The case is simple: uint32x2_t x, res; res = vceq_u32(x, vcreate_u32(0)); This will generate the following code: ; zero d16 vmov.i32 d16, #0x0 ; load a into d17 movw r0, :lower16:a movt r0, :upper16:a vld1.32 {d17}, [r0] ; compare two registers vceq.i32 d17, d17, d16 But, because the vector is zero, and there is a NEON instruction to compare against an immediate zero (VCEQZ), we could combine the two instructions: ; load a into d17 movw r0, :...
2013 Feb 03
2
[LLVMdev] A bug in LLVM-GCC 4.2 with inlining __exchange_and_add
...add r7, sp, #12 00000004 e92d0d00 stmdb sp!, {r8, sl, fp} 00000008 ed2d8b10 vstmdb sp!, {d8-d15} 0000000c b094 sub sp, #8 0000000e f2405088 movw r0, :lower16:__ZN5boost10statechart6detail9id_holderI10EvActivateE11idProvider_E-0x24+0xfffffffc 00000012 2300 movs r3, #0 00000014 f2c00000 movt r0, :upper16:__ZN5boost10statechart6detail9id_holderI10EvActivateE11idProvider_E-0x24+0xfffffffc 00000018 f2407140 movw r1, :lower16:0x770-0x2c+0xfffffffc 0000001c f2c00100 movt r1, :upper16:0x770-0x2c+0xfffffffc 00000020 f24052c8 movw r2, :lower16:__ZTV10EvActivate-0x34+0xfffffffc 00000024 4478 add r0, pc 00000...
2014 Sep 11
2
[LLVMdev] Is shortening a load a bug?
...32-i64:32:32" target triple = "thumbv7m-unknown-unknown" @f = external global i32 define zeroext i8 @bar() nounwind { L.0: %rv.0 = alloca i8 %0 = load i32* @f %1 = trunc i32 %0 to i8 ret i8 %1 } ---- Which for the ARM Cortex-M3 generates: ---- bar: movw r0, :lower16:f movt r0, :upper16:f ldrb r0, [r0] bx lr ---- Although we are only interested in the low 8 bits, the load MUST be a 32-bit load. Using a "load volatile" fixes this, but this is overkill as the memory location is not volatile. Am I missing something, or is this a bug? brian
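The narrowing described in the snippet above can be illustrated with a small Python model. This is only a sketch under stated assumptions: memory is modeled as a plain little-endian byte array, and the helper names `load32_then_trunc` and `load8` are hypothetical, not LLVM functions. It shows that a 32-bit load followed by a truncation to 8 bits yields the same value as a single byte load from the same address, which is presumably why a non-volatile load may be shortened to `ldrb`.

```python
import struct


def load32_then_trunc(memory: bytearray, addr: int) -> int:
    """Model of `load i32` followed by `trunc i32 ... to i8` (little-endian)."""
    (word,) = struct.unpack_from("<I", memory, addr)
    return word & 0xFF


def load8(memory: bytearray, addr: int) -> int:
    """Model of the shortened access, i.e. a single byte load (`ldrb`)."""
    return memory[addr]


# Hypothetical contents of the global `f`, stored little-endian.
memory = bytearray(struct.pack("<I", 0xAABBCCDD))

wide = load32_then_trunc(memory, 0)
narrow = load8(memory, 0)
print(hex(wide), hex(narrow))
```

Both paths produce the same value in this model. The model cannot capture memory whose behavior depends on the access width, which is the kind of case where a volatile load is needed to keep the 32-bit access.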
2013 Oct 15
0
[LLVMdev] MI scheduler produce badly code with inline function
On Oct 14, 2013, at 3:27 AM, Zakk <zakk0610 at gmail.com> wrote: > Hi all, > I ran into this problem when compiling the STREAM benchmark (http://www.cs.virginia.edu/stream/FTP/Code/) with enable-misched > > The small function is scheduled well on its own, but if opt inlines it, the inlined part is scheduled badly. A bug for this is welcome. Pretty soon, I’ll
2013 Oct 14
2
[LLVMdev] MI scheduler produce badly code with inline function
Hi all, I ran into this problem when compiling the STREAM benchmark ( http://www.cs.virginia.edu/stream/FTP/Code/) with enable-misched The small function is scheduled well on its own, but if opt inlines it, the inlined part is scheduled badly. So I wrote a simple test case (foo.c, linked as an attachment) and compiled it in two different ways: *method A:* *$clang -O3 foo.c -static -S
2011 Jan 10
2
[LLVMdev] ARM/MC/ELF Support for pcrel movw/movt coming soon
...e nontrivial, so what I will do is commit the "hack" patch to 8721 separately, and then the main patch, as 8721 is blocking the testing. The interim hack for 8721 can then be rolled back separately once someone (ddunbar? pdox? me? :) gets around to refactoring MCExpr so that :lower16: and :upper16: can apply to arbitrary expressions. Thanks -jason
2011 Jan 10
2
[LLVMdev] ARM/MC/ELF Support for pcrel movw/movt coming soon
...te subject? Hi Renato, If I am understanding you correctly, then the answer is no, because .s output doesn't care about relocations per se... BUT.. it's also yes because the asmwriter will sometimes need to generate sequences like the one below foo: movw r0, :lower16:bar-foo movt r0, :upper16:bar-foo The subtraction implies that the value bar-foo is implicitly pc-relative (at least according to GNU as). Thanks! -jason
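The movw/movt pairing in the snippet above can be sketched in Python. This is an illustrative model rather than LLVM or GNU as code: the function names and the addresses chosen for `foo` and `bar` are hypothetical. It assumes only the architectural behavior that movw writes a 16-bit immediate and clears the top half of the register, while movt replaces the top half and preserves the bottom half.

```python
MASK32 = 0xFFFFFFFF


def lower16(value: int) -> int:
    """Model of the :lower16: operator: bits 15..0 of a 32-bit value."""
    return value & 0xFFFF


def upper16(value: int) -> int:
    """Model of the :upper16: operator: bits 31..16 of a 32-bit value."""
    return (value >> 16) & 0xFFFF


def movw(imm16: int) -> int:
    """movw rd, #imm16: write the immediate, zeroing the top 16 bits."""
    return imm16 & 0xFFFF


def movt(reg: int, imm16: int) -> int:
    """movt rd, #imm16: replace the top 16 bits, keep the bottom 16."""
    return ((imm16 & 0xFFFF) << 16) | (reg & 0xFFFF)


def materialize(value: int) -> int:
    """Build a 32-bit constant the way a movw/movt pair does."""
    reg = movw(lower16(value))
    return movt(reg, upper16(value))


# Hypothetical symbol addresses, chosen only for illustration.
foo = 0x00010000
bar = 0x00012345

forward = materialize(bar - foo)   # positive difference
backward = materialize(foo - bar)  # negative difference wraps to 32 bits
print(hex(forward), hex(backward))
```

Because the two halves are extracted from the same expression, any difference of symbols such as `bar-foo` round-trips through the pair, including negative differences, which wrap modulo 2^32 in this model.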
2010 Nov 12
0
[LLVMdev] Simple NEON optimization
...simple: > > uint32x2_t x, res; > res = vceq_u32(x, vcreate_u32(0)); > > This will generate the following code: > > ; zero d16 > vmov.i32 d16, #0x0 > ; load a into d17 > movw r0, :lower16:a > movt r0, :upper16:a > vld1.32 {d17}, [r0] > ; compare two registers > vceq.i32 d17, d17, d16 > > But, because the vector is zero, and there is a NEON instruction to > compare against an immediate zero (VCEQZ), we could combine the two > instructions: > >...
2015 Apr 20
2
[LLVMdev] question about alignment of structures on the stack (arm 32)
Dear community, I am faced with code generated by LLVM whose assembly instructions rely on 8-byte alignment for structures on the stack. The relevant part of the Objective-C code is the following: -(void)getCharacters:(unichar *)unicode {     NSRange range;     range.location = 0;     range.length = [self length];     printf("%p, %p\n", &range.location, &range.length); And
2010 Nov 17
1
[LLVMdev] [llvm-commits] [patch] ARM/MC/ELF add new stub for movt/movw in ARMFixupKinds
...hod string to declare a special case handler. At the current time, for the assembly printing, MCAsmStreamer::EmitInstruction(const MCInst &Inst) calls out to MCExpr::print(raw_ostream &OS) which then calls out to MCSymbolRefExpr::getVariantKindName() to print the magic :lower16: and :upper16: asm tags for .s emission. Currently, movt/movw emission works correctly in .s, but not in .o emission. This led me to believe that the correct place to put the code to handle MCSymbolRefExpr::VK_ARM_(HI||LO)16 for the .o path was to place a case in getMachineOpValue() (i.e. not ARMMCCodeEmitter::g...
2016 Jun 02
2
PBQP register allocation and copy propagation
...BQP allocator for Thumb-2 and have run into a problem I'd love to get your input on. The problem is exemplified in the codegen for the function @bar in the attached IR file: bar: push {r4, lr} sub sp, #12 (1) movw r2, :lower16:.L_MergedGlobals (1) movt r2, :upper16:.L_MergedGlobals ldm.w r2, {r0, r1, r3, r12, lr} ldrd r4, r2, [r2, #20] strd lr, r4, [sp] str r2, [sp, #8] (2) mov r2, r3 **** mov r3, r12 **** bl baz add sp, #12 pop {r4, pc} The tw...
2013 Oct 16
3
[LLVMdev] MI scheduler produce badly code with inline function
....c -static -S -o foo.s -mllvm -unroll-count=4 -mcpu=cortex-a9 -fno-vectorize -fno-slp-vectorize --target=arm -mfloat-abi=hard -mllvm -enable-misched -mllvm -scheditins=false per-operand cost model : Scale: push {lr} movw r12, :lower16:c movw lr, :lower16:b movw r3, #9216 movt r12, :upper16:c mov r1, #0 vmov.f64 d16, #3.000000e+00 movt lr, :upper16:b movt r3, #244 .LBB0_1: add r0, r12, r1 add r2, lr, r1 *vldr d17, [r0]* add r1, r1, #32 vmul.f64 d17, d17, d16 cmp r1, r3 vstr d17, [r2] * vldr d17, [r0, #8]* vmul.f64 d17, d17, d16 * * vstr d17, [r2, #8]...
2015 Apr 21
2
[LLVMdev] question about alignment of structures on the stack (arm 32)
...t: armv7l-unknown-linux-gnueabi Thread model: posix ----- And we get the following assembly code: main:     push    {r11, lr}     mov    r11, sp     sub    sp, sp, #24     mov    r0, #0     str    r0, [r11, #-4]     add    r1, sp, #8     movw    r2, :lower16:.Lmain.mStruct     movt    r2, :upper16:.Lmain.mStruct     vldr    d16, [r2]     vstr    d16, [sp, #8]     orr    r2, r1, #4     movw    r3, :lower16:.L.str     movt    r3, :upper16:.L.str     str    r0, [sp, #4]     mov    r0, r3     bl    printf     ldr    r1, [sp, #4]     str    r0, [sp]     mov    r0, r1     mov    sp, r11     pop   ...
2011 Jan 10
0
[LLVMdev] ARM/MC/ELF Support for pcrel movw/movt coming soon
On 10 January 2011 22:59, Jason Kim <jasonwkim at google.com> wrote: > Hi everyone, happy new year. > > This note is to announce that support for PC relative reloc tags for > movw/movt is nearing completion (hopefully <48hrs!). This work is > from Jan Voung, David Meyer and myself. Hi Jason, Happy new year! That seems a long patch... with many changes... can't
2011 Jan 11
0
[LLVMdev] ARM/MC/ELF Support for pcrel movw/movt coming soon
...arget1) and exception handling table symbols (prel31) are clearly disregarded by gas and subsequently discarded by armlink. > it's also yes because the asmwriter will sometimes need to > generate sequences like the one below > > foo: >   movw r0, :lower16:bar-foo >   movt r0, :upper16:bar-foo > > The subtraction implies that the value bar-foo is implicitly > pc-relative (at least according to GNU as). That was the other part of my question: will your new MC-relocationator also print the current ASM relocations? ;) cheers, --renato
2016 Jun 03
2
PBQP register allocation and copy propagation
...BQP allocator for Thumb-2 and have run into a problem I'd love to get your input on. The problem is exemplified in the codegen for the function @bar in the attached IR file: bar: push {r4, lr} sub sp, #12 (1) movw r2, :lower16:.L_MergedGlobals (1) movt r2, :upper16:.L_MergedGlobals ldm.w r2, {r0, r1, r3, r12, lr} ldrd r4, r2, [r2, #20] strd lr, r4, [sp] str r2, [sp, #8] (2) mov r2, r3 **** mov r3, r12 **** bl baz add sp, #12 pop {r4, pc} The tw...
2013 Mar 08
0
[LLVMdev] ARM assembler's syntax in clang
...thumb .thumb_func foo: /* these lines are from compiler's assembly output($(CC) -S): * extern int data_table[]; * int *wheres_data_table(void) { * return &data_table[0]; * } */ movw r1, :lower16:(L_data_table$non_lazy_ptr-(LPC0_0+4)) movt r1, :upper16:(L_data_table$non_lazy_ptr-(LPC0_0+4)) LPC0_0: add r1, pc ldr r1, [r1] bx lr .section __DATA,__nl_symbol_ptr,non_lazy_symbol_pointers .align 2 L_data_table$non_lazy_ptr: .indirect_symbol _data_table .long 0 .subsections_via_symbols /* ==e...