Displaying 20 results from an estimated 25 matches for "upper16".
2011 Nov 12
2
[LLVMdev] Thumb-2 code generation error in Apple LLVM at all optimization levels
...tinuously]":
Ltmp265:
Lfunc_begin24:
.loc 1 380 0
.loc 1 380 1 prologue_end
push {r4, r5, r6, r7, lr}
add r7, sp, #12
push.w {r8, r10, r11}
vpush {d8}
sub sp, #4
.loc 1 382 2
Ltmp266:
movw r1, :lower16:(L_OBJC_SELECTOR_REFERENCES_7-(LPC24_0+4))
Ltmp267:
mov r4, r0
Ltmp268:
movt r1, :upper16:(L_OBJC_SELECTOR_REFERENCES_7-(LPC24_0+4))
movw r0, :lower16:(L_OBJC_CLASSLIST_REFERENCES_$_62-(LPC24_1+4))
movt r0, :upper16:(L_OBJC_CLASSLIST_REFERENCES_$_62-(LPC24_1+4))
LPC24_0:
add r1, pc
LPC24_1:
add r0, pc
ldr r1, [r1]
ldr r0, [r0]
blx _objc_msgSend
movw r1, :lower16:(L_OBJC_SELECTOR...
2011 Feb 18
0
[LLVMdev] Adding "S" suffixed ARM/Thumb2 instructions
On Feb 17, 2011, at 10:35 PM, Вадим Марковцев wrote:
> Hello everyone,
>
> I've added the "S" suffixed versions of ARM and Thumb2 instructions to tablegen. Those are, for example, "movs" or "muls".
> Of course, some instructions already had their twins, such as add/adds, and I left them untouched.
Adding separate "s" instructions is
2011 Feb 18
2
[LLVMdev] Adding "S" suffixed ARM/Thumb2 instructions
Hello everyone,
I've added the "S" suffixed versions of ARM and Thumb2 instructions to
tablegen. Those are, for example, "movs" or "muls".
Of course, some instructions already had their twins, such as add/adds,
and I left them untouched.
I also propose a codegen optimization based on them, which removes the
redundant comparison in patterns like
orr
2010 Nov 12
2
[LLVMdev] Simple NEON optimization
...es (just
a bit) the code generated.
The case is simple:
uint32x2_t x, res;
res = vceq_u32(x, vcreate_u32(0));
This will generate the following code:
; zero d16
vmov.i32 d16, #0x0
; load a into d17
movw r0, :lower16:a
movt r0, :upper16:a
vld1.32 {d17}, [r0]
; compare two registers
vceq.i32 d17, d17, d16
But, because the vector is zero, and there is a NEON instruction to
compare against an immediate zero (VCEQZ), we could combine the two
instructions:
; load a into d17
movw r0, :...
2013 Feb 03
2
[LLVMdev] A bug in LLVM-GCC 4.2 with inlining __exchange_and_add
...add r7, sp, #12
00000004 e92d0d00 stmdb sp!, {r8, sl, fp}
00000008 ed2d8b10 vstmdb sp!, {d8-d15}
0000000c b094 sub sp, #80
0000000e f2405088 movw r0, :lower16:__ZN5boost10statechart6detail9id_holderI10EvActivateE11idProvider_E-0x24+0xfffffffc
00000012 2300 movs r3, #0
00000014 f2c00000 movt r0, :upper16:__ZN5boost10statechart6detail9id_holderI10EvActivateE11idProvider_E-0x24+0xfffffffc
00000018 f2407140 movw r1, :lower16:0x770-0x2c+0xfffffffc
0000001c f2c00100 movt r1, :upper16:0x770-0x2c+0xfffffffc
00000020 f24052c8 movw r2, :lower16:__ZTV10EvActivate-0x34+0xfffffffc
00000024 4478 add r0, pc
00000...
2014 Sep 11
2
[LLVMdev] Is shortening a load a bug?
...32-i64:32:32"
target triple = "thumbv7m-unknown-unknown"
@f = external global i32
define zeroext i8 @bar() nounwind {
L.0:
%rv.0 = alloca i8
%0 = load i32* @f
%1 = trunc i32 %0 to i8
ret i8 %1
}
----
Which for the arm cortex-m3 generates:
----
bar:
movw r0, :lower16:f
movt r0, :upper16:f
ldrb r0, [r0]
bx lr
----
Although we are only interested in the low 8 bits, the load MUST be a 32-bit load.
Using a "load volatile" fixes this, but this is overkill as the memory location
is not volatile.
Am I missing something, or is this a bug?
brian
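Why the narrowed load returns the same value for ordinary memory can be sketched in Python. This is only an illustration: it assumes a little-endian target (the datalayout string above is truncated, so that is an assumption), and the contents of @f are a made-up value. It models the value returned, not the width of the memory access, which is the property that matters if f must be read with a 32-bit access.

```python
import struct

# Hypothetical contents of @f, chosen only for illustration.
f_value = 0xDEADBEEF

# Lay the 32-bit value out in memory as a little-endian target would.
memory = struct.pack("<I", f_value)

# Original IR: 32-bit load followed by a truncation to 8 bits.
via_word = struct.unpack_from("<I", memory, 0)[0] & 0xFF

# Generated code: a single byte load (ldrb) from the same address.
via_byte = struct.unpack_from("<B", memory, 0)[0]
```

Both paths yield the low byte of f, which is why a compiler may treat the two as interchangeable for non-volatile memory; the sketch cannot show the difference in bus access width.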
2013 Oct 15
0
[LLVMdev] MI scheduler produce badly code with inline function
On Oct 14, 2013, at 3:27 AM, Zakk <zakk0610 at gmail.com> wrote:
> Hi all,
> I ran into this problem when compiling the STREAM benchmark (http://www.cs.virginia.edu/stream/FTP/Code/) with enable-misched
>
> The small function is scheduled well on its own, but if opt inlines this function, the inlined part is scheduled badly.
A bug for this is welcome. Pretty soon, I’ll
2013 Oct 14
2
[LLVMdev] MI scheduler produce badly code with inline function
Hi all,
I ran into this problem when compiling the STREAM benchmark (
http://www.cs.virginia.edu/stream/FTP/Code/) with enable-misched
The small function is scheduled well on its own, but if opt inlines this
function, the inlined part is scheduled badly.
So I wrote a simple reproducer, attached as a link (foo.c), and compiled it
with two different methods:
*method A:*
*$clang -O3 foo.c -static -S
2011 Jan 10
2
[LLVMdev] ARM/MC/ELF Support for pcrel movw/movt coming soon
...e nontrivial, so what I will do is
commit the "hack" patch to 8721 separately, and then the main patch,
as 8721 is blocking the testing.
The interim hack for 8721 can then be rolled back separately once
someone (ddunbar? pdox? me? :) gets around to refactoring MCExpr so
that :lower16: and :upper16: can apply to arbitrary expressions.
Thanks
-jason
2011 Jan 10
2
[LLVMdev] ARM/MC/ELF Support for pcrel movw/movt coming soon
...te subject?
Hi Renato,
If I am understanding you correctly, then the answer is no, because .s
output doesn't care about relocations per se... BUT..
it's also yes, because the asmwriter will sometimes need to
generate sequences like the one below
foo:
movw r0, :lower16:bar-foo
movt r0, :upper16:bar-foo
The subtraction implies that the value bar-foo is implicitly
pc-relative (at least according to GNU as).
Thanks!
-jason
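The arithmetic behind the :lower16:/:upper16: pair can be sketched in Python. This is an illustration under assumptions, not LLVM or GNU as code: the addresses for foo and bar are made-up values, and the helper names lower16, upper16, and movw_movt are hypothetical. movw writes the low 16 bits of the expression and clears the top half of the register; movt then replaces only the top 16 bits.

```python
MASK32 = 0xFFFFFFFF

def lower16(value):
    """Low half of a 32-bit value, as selected by :lower16:."""
    return value & 0xFFFF

def upper16(value):
    """High half of a 32-bit value, as selected by :upper16:."""
    return (value >> 16) & 0xFFFF

def movw_movt(value):
    """Model a movw/movt pair building a 32-bit value in a register."""
    value &= MASK32
    reg = lower16(value)                            # movw: low half, top half cleared
    reg = (reg & 0xFFFF) | (upper16(value) << 16)   # movt: replace top half only
    return reg

# Hypothetical addresses, chosen only for illustration.
foo = 0x00010400
bar = 0x0002A5C8

# bar-foo is a difference of two symbols, so it stays the same if both
# symbols move by the same amount (assumed here: they are relocated together).
delta = (bar - foo) & MASK32
rebased = ((bar + 0x1000) - (foo + 0x1000)) & MASK32
```

Because the difference does not change when both symbols are shifted together, the value loaded by the pair is an offset rather than an absolute address, which is consistent with treating bar-foo as pc-relative.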
2010 Nov 12
0
[LLVMdev] Simple NEON optimization
...simple:
>
> uint32x2_t x, res;
> res = vceq_u32(x, vcreate_u32(0));
>
> This will generate the following code:
>
> ; zero d16
> vmov.i32 d16, #0x0
> ; load a into d17
> movw r0, :lower16:a
> movt r0, :upper16:a
> vld1.32 {d17}, [r0]
> ; compare two registers
> vceq.i32 d17, d17, d16
>
> But, because the vector is zero, and there is a NEON instruction to
> compare against an immediate zero (VCEQZ), we could combine the two
> instructions:
>
>...
2015 Apr 20
2
[LLVMdev] question about alignment of structures on the stack (arm 32)
Dear community,
I came across code generated by LLVM whose assembly instructions rely on 8-byte alignment for structures on the stack.
The relevant part of the Objective-C code is the following:
-(void)getCharacters:(unichar *)unicode {
NSRange range;
range.location = 0;
range.length = [self length];
printf("%p, %p\n", &range.location, &range.length);
And
2010 Nov 17
1
[LLVMdev] [llvm-commits] [patch] ARM/MC/ELF add new stub for movt/movw in ARMFixupKinds
...hod string to declare a special case handler.
At the current time, for the assembly printing,
MCAsmStreamer::EmitInstruction(const MCInst &Inst) calls out to
MCExpr::print(raw_ostream &OS)
which then calls out to MCSymbolRefExpr::getVariantKindName() to
print the magic :lower16: and :upper16: asm tags for .s emission
Currently, movt/movw emission works correctly in .s, but not in .o emission
This led me to believe that the correct place to put the code to handle
MCSymbolRefExpr::VK_ARM_(HI||LO)16 for the .o path was to place a case
in getMachineOpValue() (i.e. not
ARMMCCodeEmitter::g...
2016 Jun 02
2
PBQP register allocation and copy propagation
...BQP allocator for Thumb-2 and have run into a
problem I'd love to get your input on.
The problem is exemplified in the codegen for the function @bar in the
attached IR file:
bar:
push {r4, lr}
sub sp, #12
(1) movw r2, :lower16:.L_MergedGlobals
(1) movt r2, :upper16:.L_MergedGlobals
ldm.w r2, {r0, r1, r3, r12, lr}
ldrd r4, r2, [r2, #20]
strd lr, r4, [sp]
str r2, [sp, #8]
(2) mov r2, r3 ****
mov r3, r12 ****
bl baz
add sp, #12
pop {r4, pc}
The tw...
2013 Oct 16
3
[LLVMdev] MI scheduler produce badly code with inline function
....c -static -S -o foo.s -mllvm -unroll-count=4
-mcpu=cortex-a9 -fno-vectorize -fno-slp-vectorize --target=arm
-mfloat-abi=hard -mllvm -enable-misched -mllvm -scheditins=false
per-operand cost model :
Scale:
push {lr}
movw r12, :lower16:c
movw lr, :lower16:b
movw r3, #9216
movt r12, :upper16:c
mov r1, #0
vmov.f64 d16, #3.000000e+00
movt lr, :upper16:b
movt r3, #244
.LBB0_1:
add r0, r12, r1
add r2, lr, r1
*vldr d17, [r0]*
add r1, r1, #32
vmul.f64 d17, d17, d16
cmp r1, r3
vstr d17, [r2]
* vldr d17, [r0, #8]*
vmul.f64 d17, d17, d16
* * vstr d17, [r2, #8]...
2015 Apr 21
2
[LLVMdev] question about alignment of structures on the stack (arm 32)
...t: armv7l-unknown-linux-gnueabi
Thread model: posix
-----
And we get the following assembly code:
main:
push {r11, lr}
mov r11, sp
sub sp, sp, #24
mov r0, #0
str r0, [r11, #-4]
add r1, sp, #8
movw r2, :lower16:.Lmain.mStruct
movt r2, :upper16:.Lmain.mStruct
vldr d16, [r2]
vstr d16, [sp, #8]
orr r2, r1, #4
movw r3, :lower16:.L.str
movt r3, :upper16:.L.str
str r0, [sp, #4]
mov r0, r3
bl printf
ldr r1, [sp, #4]
str r0, [sp]
mov r0, r1
mov sp, r11
pop ...
2011 Jan 10
0
[LLVMdev] ARM/MC/ELF Support for pcrel movw/movt coming soon
On 10 January 2011 22:59, Jason Kim <jasonwkim at google.com> wrote:
> Hi everyone, happy new year.
>
> This note is to announce that support for PC relative reloc tags for
> movw/movt is nearing completion (hopefully <48hrs!). This work is
> from Jan Voung, David Meyer and myself.
Hi Jason,
Happy new year!
That seems a long patch... with many changes... can't
2011 Jan 11
0
[LLVMdev] ARM/MC/ELF Support for pcrel movw/movt coming soon
...arget1) and exception
handling table symbols (prel31) are clearly disregarded by gas and
subsequently discarded by armlink.
> it's also yes, because the asmwriter will sometimes need to
> generate sequences like the one below
>
> foo:
> movw r0, :lower16:bar-foo
> movt r0, :upper16:bar-foo
>
> The subtraction implies that the value bar-foo is implicitly
> pc-relative (at least according to GNU as).
That was the other part of my question: will your new
MC-relocationator also print the current ASM relocations? ;)
cheers,
--renato
2016 Jun 03
2
PBQP register allocation and copy propagation
...BQP allocator for Thumb-2 and have run into a problem I'd love to get your input on.
The problem is exemplified in the codegen for the function @bar in the attached IR file:
bar:
push {r4, lr}
sub sp, #12
(1) movw r2, :lower16:.L_MergedGlobals
(1) movt r2, :upper16:.L_MergedGlobals
ldm.w r2, {r0, r1, r3, r12, lr}
ldrd r4, r2, [r2, #20]
strd lr, r4, [sp]
str r2, [sp, #8]
(2) mov r2, r3 ****
mov r3, r12 ****
bl baz
add sp, #12
pop {r4, pc}
The tw...
2013 Mar 08
0
[LLVMdev] ARM assembler's syntax in clang
...thumb
.thumb_func
foo:
/* these lines are from compiler's assembly output($(CC) -S):
* extern int data_table[];
* int *wheres_data_table(void) {
* return &data_table[0];
* }
*/
movw r1, :lower16:(L_data_table$non_lazy_ptr-(LPC0_0+4))
movt r1, :upper16:(L_data_table$non_lazy_ptr-(LPC0_0+4))
LPC0_0:
add r1, pc
ldr r1, [r1]
bx lr
.section __DATA,__nl_symbol_ptr,non_lazy_symbol_pointers
.align 2
L_data_table$non_lazy_ptr:
.indirect_symbol _data_table
.long 0
.subsections_via_symbols
/* ==e...