Martin Storsjö
2014-Feb-08 10:57 UTC
[opus] [PATCH 1/2] arm: Use the UAL syntax for ldr<cc>h instructions
On Fri, 7 Feb 2014, Timothy B. Terriberry wrote:
> Martin Storsjo wrote:
>> This is required in order to build using the built-in assembler
>> in clang.
>
> These patches break the gcc build (with "Error: bad instruction").

Ah, right, sorry about that.

> Documentation I've seen is contradictory on which order ({cond}{size} or
> {size}{cond}) is correct.

The reason you're finding contradictory information is that there are two syntaxes: the legacy syntax (where ARM and Thumb instructions have quite different syntaxes) and the new unified syntax (aka UAL, Unified Assembler Language). Most modern tools default to interpreting the source as UAL (see e.g. http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0473j/dom1359731145130.html - the armasm tool defaults to this syntax as well), but GNU binutils defaults to the old syntax unless otherwise specified.

(Btw, what build setups actually use the original unconverted asm sources within opus?)

By injecting a ".syntax unified" at the start of the generated assembly sources, the code assembles as intended with GNU binutils as well as clang. I'll post an updated patch that does this.

// Martin
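For illustration, here is a minimal sketch of the difference in GNU as syntax. It is not taken from the opus sources; the label ual_example and the register choices are hypothetical. It assumes an assembler that accepts the ".syntax unified" directive (GNU binutils and clang's built-in assembler both do).

```asm
        .syntax unified             @ parse everything below as UAL
        .arm                        @ ARM (not Thumb) encoding
        .text
        .global ual_example         @ hypothetical label, for illustration only
ual_example:
        subs    r2, r2, #1          @ sets the flags tested by the next lines
        ldrhgt  r0, [r1], #2        @ UAL order: ldr + h (size) + gt (condition)
                                    @ legacy (divided) spelling: ldrgth
        subsge  r3, r3, #1          @ UAL order: sub + s (set flags) + ge (condition)
                                    @ legacy (divided) spelling: subges
        mov     pc, lr              @ return to the caller
```

Without the ".syntax unified" line, GNU as falls back to the legacy syntax and would expect the ldrgth and subges spellings instead, which is why the directive has to come before any UAL-style instruction.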
Martin Storsjö
2014-Feb-08 10:59 UTC
[opus] [PATCH v2] arm: Use the UAL syntax for instructions
This is required in order to build using the built-in assembler in clang.
---
I squashed the two changes since it would break the normal gcc build otherwise.
---
 celt/arm/arm2gnu.pl             |  2 ++
 celt/arm/celt_pitch_xcorr_arm.s | 18 +++++++++---------
 2 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/celt/arm/arm2gnu.pl b/celt/arm/arm2gnu.pl
index eab42ef..5c24758 100755
--- a/celt/arm/arm2gnu.pl
+++ b/celt/arm/arm2gnu.pl
@@ -25,6 +25,8 @@ $n=0;
 $thumb = 0; # ARM mode by default, not Thumb.
 @proc_stack = ();
 
+printf (" .syntax unified\n");
+
 LINE:
 while (<>) {
 
diff --git a/celt/arm/celt_pitch_xcorr_arm.s b/celt/arm/celt_pitch_xcorr_arm.s
index 09917b1..598e45b 100644
--- a/celt/arm/celt_pitch_xcorr_arm.s
+++ b/celt/arm/celt_pitch_xcorr_arm.s
@@ -309,7 +309,7 @@ xcorr_kernel_edsp_process4_done
   SUBS     r2, r2, #1 ; j--
   ; Stall
   SMLABB   r6, r12, r10, r6 ; sum[0] = MAC16_16(sum[0],x,y_0)
-  LDRGTH   r14, [r4], #2 ; r14 = *x++
+  LDRHGT   r14, [r4], #2 ; r14 = *x++
   SMLABT   r7, r12, r10, r7 ; sum[1] = MAC16_16(sum[1],x,y_1)
   SMLABB   r8, r12, r11, r8 ; sum[2] = MAC16_16(sum[2],x,y_2)
   SMLABT   r9, r12, r11, r9 ; sum[3] = MAC16_16(sum[3],x,y_3)
@@ -319,7 +319,7 @@ xcorr_kernel_edsp_process4_done
   SMLABB   r7, r14, r11, r7 ; sum[1] = MAC16_16(sum[1],x,y_2)
   LDRH     r10, [r5], #2 ; r10 = y_4 = *y++
   SMLABT   r8, r14, r11, r8 ; sum[2] = MAC16_16(sum[2],x,y_3)
-  LDRGTH   r12, [r4], #2 ; r12 = *x++
+  LDRHGT   r12, [r4], #2 ; r12 = *x++
   SMLABB   r9, r14, r10, r9 ; sum[3] = MAC16_16(sum[3],x,y_4)
   BLE      xcorr_kernel_edsp_done
   SMLABB   r6, r12, r11, r6 ; sum[0] = MAC16_16(sum[0],tmp,y_2)
@@ -327,7 +327,7 @@ xcorr_kernel_edsp_process4_done
   SMLABT   r7, r12, r11, r7 ; sum[1] = MAC16_16(sum[1],tmp,y_3)
   LDRH     r2, [r5], #2 ; r2 = y_5 = *y++
   SMLABB   r8, r12, r10, r8 ; sum[2] = MAC16_16(sum[2],tmp,y_4)
-  LDRGTH   r14, [r4] ; r14 = *x
+  LDRHGT   r14, [r4] ; r14 = *x
   SMLABB   r9, r12, r2, r9 ; sum[3] = MAC16_16(sum[3],tmp,y_5)
   BLE      xcorr_kernel_edsp_done
   SMLABT   r6, r14, r11, r6 ; sum[0] = MAC16_16(sum[0],tmp,y_3)
@@ -387,11 +387,11 @@ celt_pitch_xcorr_edsp_process1u_loop4
 celt_pitch_xcorr_edsp_process1u_loop4_done
   ADDS     r12, r12, #4
 celt_pitch_xcorr_edsp_process1u_loop1
-  LDRGEH   r6, [r4], #2
+  LDRHGE   r6, [r4], #2
   ; Stall
   SMLABBGE r14, r6, r8, r14 ; sum = MAC16_16(sum, *x, *y)
-  SUBGES   r12, r12, #1
-  LDRGTH   r8, [r5], #2
+  SUBSGE   r12, r12, #1
+  LDRHGT   r8, [r5], #2
   BGT      celt_pitch_xcorr_edsp_process1u_loop1
   ; Restore _x
   SUB      r4, r4, r3, LSL #1
@@ -474,7 +474,7 @@ celt_pitch_xcorr_edsp_process2_1
   ADDS     r12, r12, #1
   ; Stall
   SMLABB   r10, r6, r8, r10 ; sum0 = MAC16_16(sum0, x_0, y_0)
-  LDRGTH   r7, [r4], #2
+  LDRHGT   r7, [r4], #2
   SMLABT   r11, r6, r8, r11 ; sum1 = MAC16_16(sum1, x_0, y_1)
   BLE      celt_pitch_xcorr_edsp_process2_done
   LDRH     r9, [r5], #2
@@ -527,8 +527,8 @@ celt_pitch_xcorr_edsp_process1a_loop_done
   SUBGE    r12, r12, #2
   SMLATTGE r14, r6, r8, r14 ; sum = MAC16_16(sum, x_1, y_1)
   ADDS     r12, r12, #1
-  LDRGEH   r6, [r4], #2
-  LDRGEH   r8, [r5], #2
+  LDRHGE   r6, [r4], #2
+  LDRHGE   r8, [r5], #2
   ; Stall
   SMLABBGE r14, r6, r8, r14 ; sum = MAC16_16(sum, *x, *y)
   ; maxcorr = max(maxcorr, sum)
-- 
1.8.3.4 (Apple Git-47)
Martin Storsjö
2014-Feb-13 12:13 UTC
[opus] [PATCH v2] arm: Use the UAL syntax for instructions
On Sat, 8 Feb 2014, Martin Storsjo wrote:
> This is required in order to build using the built-in assembler
> in clang.
> ---
> I squashed the two changes since it would break the normal gcc
> build otherwise.
> ---
> celt/arm/arm2gnu.pl | 2 ++
> celt/arm/celt_pitch_xcorr_arm.s | 18 +++++++++---------
> 2 files changed, 11 insertions(+), 9 deletions(-)

Ping, any further comments on this one?

The place in arm2gnu.pl where ".syntax unified" is added could probably be moved somewhere better if there are suggestions, but this works at least.

// Martin
Jean-Marc Valin
2014-Feb-24 20:59 UTC
[opus] [PATCH v2] arm: Use the UAL syntax for instructions
Your patch is now merged in master.

Thanks,

Jean-Marc

On 08/02/14 05:59 AM, Martin Storsjo wrote:
> This is required in order to build using the built-in assembler
> in clang.
> ---
> I squashed the two changes since it would break the normal gcc
> build otherwise.
> ---
> celt/arm/arm2gnu.pl | 2 ++
> celt/arm/celt_pitch_xcorr_arm.s | 18 +++++++++---------
> 2 files changed, 11 insertions(+), 9 deletions(-)