Displaying 20 results from an estimated 27 matches for "ldrh".
2017 Dec 06
2
[LLD] Slow callstacks in gdb
...With most current host architectures, handling
packed_endian_specific_integral is fairly efficient. For example, on
x86_64, reading 32 bits with 1-, 2-, and 4-byte alignment produces in all
cases:
movl (%rdi), %eax
But on armv6 the aligned case is
ldr r0, [r0]
the 2 byte aligned case is
ldrh r1, [r0, #2]
ldrh r0, [r0]
orr r0, r0, r1, lsl #16
and the unaligned case is
ldrb r1, [r0]
ldrb r2, [r0, #1]
ldrb r3, [r0, #2]
ldrb r0, [r0, #3]
orr r1, r1, r2, lsl #8
orr r0, r3, r0, lsl #8
orr r0, r1, r0, lsl #16
On armv7 it is a single ldr o...
2014 Feb 08
3
[PATCH 1/2] arm: Use the UAL syntax for ldr<cc>h instructions
On Fri, 7 Feb 2014, Timothy B. Terriberry wrote:
> Martin Storsjo wrote:
>> This is required in order to build using the built-in assembler
>> in clang.
>
> These patches break the gcc build (with "Error: bad instruction").
Ah, right, sorry about that.
> Documentation I've seen is contradictory on which order ({cond}{size} or
> {size}{cond}) is correct.
2018 Feb 22
2
Sink redundant spill after RA
...// 8-byte Folded Spill
ldrsw x8, [x0, #4424]
sxtw x10, w2 <------------- w2 is used here before
being spilled.
sxtw x12, w1
madd x8, x8, x10, x12
ldr x9, [x0, #8]
add x9, x9, x8, lsl #2
ldrh w11, [x9]
ldrh w10, [x0, #16]
str x2, [sp, #120] // 8-byte Folded Spill
<------------- spill !!!
cmp w11, w10
b.eq .LBB2_32
// %bb.1: // %if.end
ldr x13, [sp, #120] // 8-byte Fol...
2013 Mar 04
1
[LLVMdev] Custom Lowering of ARM zero-extending loads
Hi,
For my research, I need to reshape the current ARM backend to support
armv2a. The zero-extending halfword load (ldrh) is not supported by armv2a, so I
need to make code generation stop emitting ldrh instructions. I want
to replace all those instances with a 32-bit load (ldr) and then AND the
result with 0xffff to mask out the upper bits.
These are the modifications that I have made to accomplish that:
1....
2018 Feb 22
2
Sink redundant spill after RA
...gt; sxtw x10, w2 <------------- w2 is the
> use of spilled value before spill.
> sxtw x12, w1
> madd x8, x8, x10, x12
> ldr x9, [x0, #8]
> add x9, x9, x8, lsl #2
> ldrh w11, [x9]
> ldrh w10, [x0, #16]
> str x2, [sp, #120] // 8-byte Folded Spill
> <------------- spill !!!
> cmp w11, w10
> b.eq .LBB2_32
> // %bb.1: // %if.end
>...
2014 Feb 08
0
[PATCH v2] arm: Use the UAL syntax for instructions
...--- a/celt/arm/celt_pitch_xcorr_arm.s
+++ b/celt/arm/celt_pitch_xcorr_arm.s
@@ -309,7 +309,7 @@ xcorr_kernel_edsp_process4_done
SUBS r2, r2, #1 ; j--
; Stall
SMLABB r6, r12, r10, r6 ; sum[0] = MAC16_16(sum[0],x,y_0)
- LDRGTH r14, [r4], #2 ; r14 = *x++
+ LDRHGT r14, [r4], #2 ; r14 = *x++
SMLABT r7, r12, r10, r7 ; sum[1] = MAC16_16(sum[1],x,y_1)
SMLABB r8, r12, r11, r8 ; sum[2] = MAC16_16(sum[2],x,y_2)
SMLABT r9, r12, r11, r9 ; sum[3] = MAC16_16(sum[3],x,y_3)
@@ -319,7 +319,7 @@ xcorr_kernel_edsp_process4_done...
2018 Feb 22
0
Sink redundant spill after RA
...// 8-byte Folded Spill
ldrsw x8, [x0, #4424]
sxtw x10, w2 <------------- w2 is used here before
being spilled.
sxtw x12, w1
madd x8, x8, x10, x12
ldr x9, [x0, #8]
add x9, x9, x8, lsl #2
ldrh w11, [x9]
ldrh w10, [x0, #16]
str x2, [sp, #120] // 8-byte Folded Spill
<------------- spill !!!
cmp w11, w10
b.eq .LBB2_32
// %bb.1: // %if.end
Presumably there is a redefinition of x2 somewhere...
2018 Feb 22
0
Sink redundant spill after RA
...;------------- w2 is the
> > use of spilled value before spill.
> > sxtw x12, w1
> > madd x8, x8, x10, x12
> > ldr x9, [x0, #8]
> > add x9, x9, x8, lsl #2
> > ldrh w11, [x9]
> > ldrh w10, [x0, #16]
> > str x2, [sp, #120] // 8-byte Folded Spill
> > <------------- spill !!!
> > cmp w11, w10
> > b.eq .LBB2_32
> > // %...
2014 Feb 07
3
[PATCH 1/2] arm: Use the UAL syntax for ldr<cc>h instructions
...--- a/celt/arm/celt_pitch_xcorr_arm.s
+++ b/celt/arm/celt_pitch_xcorr_arm.s
@@ -309,7 +309,7 @@ xcorr_kernel_edsp_process4_done
SUBS r2, r2, #1 ; j--
; Stall
SMLABB r6, r12, r10, r6 ; sum[0] = MAC16_16(sum[0],x,y_0)
- LDRGTH r14, [r4], #2 ; r14 = *x++
+ LDRHGT r14, [r4], #2 ; r14 = *x++
SMLABT r7, r12, r10, r7 ; sum[1] = MAC16_16(sum[1],x,y_1)
SMLABB r8, r12, r11, r8 ; sum[2] = MAC16_16(sum[2],x,y_2)
SMLABT r9, r12, r11, r9 ; sum[3] = MAC16_16(sum[3],x,y_3)
@@ -319,7 +319,7 @@ xcorr_kernel_edsp_process4_done...
2017 Dec 05
2
[LLD] Slow callstacks in gdb
Martin Richtarsky <s at martinien.de> writes:
> Output looks as follows [1]. It seems sh_offset is missing?
That is what readelf prints as Off
> [17] .rela.text RELA 0000000000000000 071423 001728 18
> 1 4 8
The offset of .rela.text should have been aligned, but it is not. Can you
report a bug on icc? As a workaround, use the GNU assembler if
possible.
2007 Dec 12
2
Speex crashing on ARM with assembler optimization enabled.
...: blt 0x40030474
<open_loop_nbest_pitch+992>
0x40030328 <open_loop_nbest_pitch+660>: mov r6, r3
0x4003032c <open_loop_nbest_pitch+664>: mov r7, #0 ; 0x0
0x40030330 <open_loop_nbest_pitch+668>: ldr lr, [r11, #-84]
0x40030334 <open_loop_nbest_pitch+672>: ldrh r3, [r7, lr]
0x40030338 <open_loop_nbest_pitch+676>: smulbb r3, r3, r3
0x4003033c <open_loop_nbest_pitch+680>: mov r3, r3, lsl #16
0x40030340 <open_loop_nbest_pitch+684>: mov r12, r3, lsr #16
0x40030344 <open_loop_nbest_pitch+688>: ldr r3, [r4, r10, lsl #2]
0...
2007 Dec 12
2
Speex crashing on ARM with assembler optimization enabled.
Hi,
I'm trying to get speex working on an ARM board (ARM926EJ-Sid(wb) core,
ARM 5TE architecture) and getting segfaults if built with the "--enable-fixed-point
--enable-arm5e-asm" options. If I use just "--enable-fixed-point", then
it runs fine, but once I add "--enable-arm5e-asm" it starts crashing
(I use testenc to test it).
Further investigation showed that it
2007 Dec 02
2
Optimised qmf_synth and iir_mem16
...stmia r9!, { r11-r12 }
bne 0b
@ Copy alternate members of mem1 and mem2 to last part of xx1 and xx2
mov r14, r5 @ Loop counter is M
add r6, r6, #2
add r7, r7, #2
stmdb sp!, { r6-r7 } @ Stack &mem1[1], &mem2[1]
0:
ldrh r10, [r6], #4
ldrh r11, [r6], #4
ldrh r12, [r7], #4
@ 1 cycle stall on Xscale
orr r10, r10, r11, lsl #16
ldrh r11, [r7], #4
str r10, [r8], #4
subs r14, r14, #4
orr r11, r12, r11, lsl #16
str r11, [r9], #4
bne 0b
sub...
2018 Jan 18
0
[RFC] Half-Precision Support in the Arm Backends
...en FullFP16 is not supported. This is best illustrated with
this existing test which is a simple upconvert of f16 to f32:
define float @test_extend32(half* %addr) {
%val16 = load half, half* %addr
%val32 = fpext half %val16 to float
ret float %val32
}
It should generate this code::
ldrh r0, [r0] ; integer half word load
vmov s0, r0
vcvtb.f32.f16 s0, s0
vmov r0, s0
bx lr
when we don't have the Armv8.2-A FP16 instructions available, and thus only
have the conversion instructions.
The problem is in the conversion rules, s...
2017 Dec 06
2
[RFC] Half-Precision Support in the Arm Backends
Thanks a lot for the suggestions! I will look into using vld1/vst1, sounds good.
I am custom lowering the bitcasts; that's now the only place where FP_TO_FP16
and FP16_TO_FP nodes are created, to avoid inefficient code generation. I will
double check whether I can achieve the same without using these nodes (because I
would really like to get rid of them completely).
Cheers,
Sjoerd.
2018 Jan 18
1
[RFC] Half-Precision Support in the Arm Backends
...en FullFP16 is not supported. This is best illustrated with
this existing test which is a simple upconvert of f16 to f32:
define float @test_extend32(half* %addr) {
%val16 = load half, half* %addr
%val32 = fpext half %val16 to float
ret float %val32
}
It should generate this code::
ldrh r0, [r0] ; integer half word load
vmov s0, r0
vcvtb.f32.f16 s0, s0
vmov r0, s0
bx lr
when we don't have the Armv8.2-A FP16 instructions available, and thus only
have the conversion instructions.
The problem is in the conversion rules, s...
2006 Jun 26
0
[klibc 22/43] arm support for klibc
...0, #0
+ strcs r2, [r3]
+ ldmfd sp!,{r4,r5,r7,pc}
+
+ .balign 4
+1:
+ .word errno
+
+#else
+ /* Thumb version - must still load r4 and r5 and run swi */
+
+ .thumb_func
+ .balign 2
+__syscall_common:
+ mov r7, lr
+ ldr r4, [sp,#16]
+ sub r7, #1 /* Remove the Thumb bit */
+ ldr r5, [sp,#20]
+ ldrh r7, [r7]
+ swi 0
+ ldr r1, 2f
+ cmp r0, r1
+ bcc 1f
+ ldr r1, 3f
+ neg r2, r0
+ mov r0, #1
+ str r2, [r1]
+ neg r0, r0
+1:
+ pop {r4,r5,r7,pc}
+
+ .balign 4
+2:
+ .word -4095
+3:
+ .word errno
+
+#endif
diff --git a/usr/klibc/arch/arm/sysstub.ph b/usr/klibc/arch/arm/sysstub.ph
new file mode 100644...
2018 Jan 24
2
[PATCH] D41675: Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1)
...memset-with-neon.ll
> llvm/trunk/test/CodeGen/ARM/2012-04-24-SplitEHCriticalEdge.ll
> llvm/trunk/test/CodeGen/ARM/Windows/memset.ll
> llvm/trunk/test/CodeGen/ARM/Windows/no-aeabi.ll
> llvm/trunk/test/CodeGen/ARM/arm-eabi.ll
> llvm/trunk/test/CodeGen/ARM/constantpool-promote-ldrh.ll
> llvm/trunk/test/CodeGen/ARM/constantpool-promote.ll
> llvm/trunk/test/CodeGen/ARM/crash-O0.ll
> llvm/trunk/test/CodeGen/ARM/debug-info-blocks.ll
> llvm/trunk/test/CodeGen/ARM/dyn-stackalloc.ll
> llvm/trunk/test/CodeGen/ARM/fast-isel-intrinsic.ll
> llvm/trunk/test/...
2018 Jan 24
0
[PATCH] D41675: Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1)
...t/CodeGen/ARM/2011-10-26-memset-with-neon.ll
llvm/trunk/test/CodeGen/ARM/2012-04-24-SplitEHCriticalEdge.ll
llvm/trunk/test/CodeGen/ARM/Windows/memset.ll
llvm/trunk/test/CodeGen/ARM/Windows/no-aeabi.ll
llvm/trunk/test/CodeGen/ARM/arm-eabi.ll
llvm/trunk/test/CodeGen/ARM/constantpool-promote-ldrh.ll
llvm/trunk/test/CodeGen/ARM/constantpool-promote.ll
llvm/trunk/test/CodeGen/ARM/crash-O0.ll
llvm/trunk/test/CodeGen/ARM/debug-info-blocks.ll
llvm/trunk/test/CodeGen/ARM/dyn-stackalloc.ll
llvm/trunk/test/CodeGen/ARM/fast-isel-intrinsic.ll
llvm/trunk/test/CodeGen/ARM/interval-update-re...
2018 Jan 25
2
[PATCH] D41675: Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1)
...>> llvm/trunk/test/CodeGen/ARM/2012-04-24-SplitEHCriticalEdge.ll
>> llvm/trunk/test/CodeGen/ARM/Windows/memset.ll
>> llvm/trunk/test/CodeGen/ARM/Windows/no-aeabi.ll
>> llvm/trunk/test/CodeGen/ARM/arm-eabi.ll
>> llvm/trunk/test/CodeGen/ARM/constantpool-promote-ldrh.ll
>> llvm/trunk/test/CodeGen/ARM/constantpool-promote.ll
>> llvm/trunk/test/CodeGen/ARM/crash-O0.ll
>> llvm/trunk/test/CodeGen/ARM/debug-info-blocks.ll
>> llvm/trunk/test/CodeGen/ARM/dyn-stackalloc.ll
>> llvm/trunk/test/CodeGen/ARM/fast-isel-intrinsic.ll
>...