thr3ads.net - search: "vmovl"

[LLVMdev] RE : Vector argument passing abi for ARM ?

2012 Jul 05

2

....1 generated code contains a misaligned load: bar: @ @bar @ BB#0: @ %L.entry push {r11, lr} add r0, r1, #2 vldr s0, [r1] vldr s2, [r0] # <= here load is misaligned vmovl.u8 q8, d0 vmovl.u8 q9, d1 vmovl.u16 q8, d16 vmovl.u16 q9, d18 vmov r0, r1, d16 vmov r2, r3, d18 bl zzz(PLT) pop {r11, pc} with LLVM trunk, assembly looks like: bar: @ @bar @ BB#0: @ %L....
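The misalignment flagged in the snippet above comes down to address arithmetic: a vldr of an S register needs a word-aligned (4-byte) address, and r1 + 2 cannot be word-aligned if r1 is. A minimal Python sketch of that check follows. The base address and the helper name are assumptions made for illustration, not values from the thread.

```python
def is_aligned(address: int, alignment: int) -> bool:
    """Return True if address is a multiple of alignment (in bytes)."""
    return address % alignment == 0


WORD_BYTES = 4  # a vldr of an S register transfers one 32-bit word

r1 = 0x1000  # hypothetical word-aligned base address (assumption)
r0 = r1 + 2  # add r0, r1, #2

first_load_ok = is_aligned(r1, WORD_BYTES)   # vldr s0, [r1]
second_load_ok = is_aligned(r0, WORD_BYTES)  # vldr s2, [r0]

print(f"vldr s0, [r1] aligned: {first_load_ok}")
print(f"vldr s2, [r0] aligned: {second_load_ok}")
```

Under that assumption the first load is aligned and the second is not, which matches the "# <= here load is misaligned" annotation in the generated code.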

[LLVMdev] RE : Vector argument passing abi for ARM ?

2012 Jul 05

0

...misaligned load: > > bar: @ @bar > @ BB#0: @ %L.entry > push {r11, lr} > add r0, r1, #2 > vldr s0, [r1] > vldr s2, [r0] # <= here load is misaligned > vmovl.u8 q8, d0 > vmovl.u8 q9, d1 > vmovl.u16 q8, d16 > vmovl.u16 q9, d18 > vmov r0, r1, d16 > vmov r2, r3, d18 > bl zzz(PLT) > pop {r11, pc} > > with LLVM trunk, assembly looks like: > > bar:...

[LLVMdev] Vector argument passing abi for ARM ?

2012 Jul 05

0

Hi Sebastien, > Thanks for the quick answer. How do I know which types are legal/illegal with respect to the calling convention? The code generators are supposed to produce working code no matter what the parameter type is. The fact that the ARM ABI doesn't specify how <2 x i8> is passed just means that the code generators can pass it using whatever technique they feel like (since it

[LLVMdev] Unaligned vector memory access for ARM/NEON.

2012 Sep 06

1

...ely possible that it's LLVM that's confused about the alignment requirements here. :) > > I think I see, in general, where. I twiddled the IR to give it higher alignment (16 bytes) and get: > extend: @ @extend > @ BB#0: > vldr d16, [r0] > vmovl.s16 q8, d16 > vstmia r1, {d16, d17} > vldr d16, [r0, #8] > add r0, r1, #16 > vmovl.s16 q8, d16 > vstmia r0, {d16, d17} > bx lr > > Note that we're using a plain vldr instruction here to load the d register, not a vld1 instruction. Similarly for the stores. Accordi...

[LLVMdev] Unaligned vector memory access for ARM/NEON.

2012 Sep 05

0

Hmmm. Well, it's entirely possible that it's LLVM that's confused about the alignment requirements here. :) I think I see, in general, where. I twiddled the IR to give it higher alignment (16 bytes) and get: extend: @ @extend @ BB#0: vldr d16, [r0] vmovl.s16 q8, d16 vstmia r1, {d16, d17} vldr d16, [r0, #8] add r0, r1, #16 vmovl.s16 q8, d16 vstmia r0, {d16, d17} bx lr Note that we're using a plain vldr instruction here to load the d register, not a vld1 instruction. Similarly for the stores. According to the ARM ARM (DDI 0406C), you'...

[LLVMdev] Vector argument passing abi for ARM ?

2012 Jul 05

3

Hi Rotem, Thanks for the quick answer. How do I know which types are legal/illegal with respect to the calling convention? Best Regards Seb > -----Original Message----- > From: Rotem, Nadav [mailto:nadav.rotem at intel.com] > Sent: Thursday, July 05, 2012 11:21 AM > To: Sebastien DELDON-GNB; llvmdev at cs.uiuc.edu > Subject: RE: Vector argument passing abi for ARM ? > > The

[LLVMdev] Unaligned vector memory access for ARM/NEON.

2012 Sep 05

3

Hello Jim, Thank you for the response. I may be confused about the alignment rules here. I had been looking at the ARM RVCT Assembler Guide, which seems to indicate that vld1.16 operates on 16-bit aligned data, unless I am misinterpreting their table (Table 5-11 in ARM DUI 0204H, pp. 5-70 to 5-71). Prior to the table, it does mention that accesses need to be "element" aligned, where I took

[LLVMdev] ARM aapcs calling convention for small vectors

2012 Sep 21

2

...ret void } declare arm_aapcscc void @bar(<2 x i8> %a) and we compile it using llc -march=arm -mcpu=cortex-a9, we get the following assembly: ... foo: @ @foo @ BB#0: @ %L.entry push {lr} vld1.16 {d16[0]}, [r0, :16] vmovl.u8 q8, d16 vmovl.u16 q8, d16 vmov r0, r1, d16 bl bar pop {lr} bx lr .Ltmp0: .size foo, .Ltmp0-foo ... The high and low parts of the vector are passed to bar in registers r0 and r1. I was thinking that only r0 would be needed, since the vector is 16 bits wide. So is this behavior occurring because vect...
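To see why both r0 and r1 end up holding data, it can help to trace the widening in the snippet above: the two vmovl steps zero-extend each 8-bit element first to 16 and then to 32 bits, so the two elements fill a full 64-bit D register before vmov splits it across two core registers. The Python sketch below models that sequence on byte strings. The function names and input values are hypothetical, and the unused lanes of d16 are assumed to be zero (the real vld1.16 lane load leaves them unchanged).

```python
def vmovl_unsigned(d_reg: bytes, src_bits: int) -> bytes:
    """Zero-extend each src_bits-wide lane of a 64-bit D register
    to twice its width, giving a 128-bit Q register (little-endian)."""
    assert len(d_reg) == 8
    src_bytes = src_bits // 8
    out = bytearray()
    for i in range(0, 8, src_bytes):
        lane = int.from_bytes(d_reg[i:i + src_bytes], "little")
        out += lane.to_bytes(2 * src_bytes, "little")
    return bytes(out)


def pass_v2i8(a0: int, a1: int) -> tuple[int, int]:
    """Model the register contents after the sequence in the snippet."""
    # vld1.16 {d16[0]}, [r0, :16] -- other lanes assumed zero here
    d16 = bytes([a0, a1]) + bytes(6)
    q8 = vmovl_unsigned(d16, 8)       # vmovl.u8  q8, d16
    q8 = vmovl_unsigned(q8[:8], 16)   # vmovl.u16 q8, d16 (d16 = low half of q8)
    d16 = q8[:8]
    # vmov r0, r1, d16 -- low word to r0, high word to r1
    r0 = int.from_bytes(d16[:4], "little")
    r1 = int.from_bytes(d16[4:], "little")
    return r0, r1


r0, r1 = pass_v2i8(0x12, 0xAB)
print(hex(r0), hex(r1))
```

In this model each 8-bit element arrives zero-extended in its own 32-bit register, which is consistent with the <2 x i8> argument being widened before it is passed rather than packed into the low 16 bits of r0.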

[LLVMdev] Unaligned vector memory access for ARM/NEON.

2012 Sep 06

2

...rely possible that it's LLVM that's confused about the alignment requirements here. :) > > I think I see, in general, where. I twiddled the IR to give it higher alignment (16 bytes) and get: > extend: @ @extend > @ BB#0: > vldr d16, [r0] > vmovl.s16 q8, d16 > vstmia r1, {d16, d17} > vldr d16, [r0, #8] > add r0, r1, #16 > vmovl.s16 q8, d16 > vstmia r0, {d16, d17} > bx lr > > Note that we're using a plain vldr instruction here to load the d register, not a vld1 instruction. Similarly for the stores. Accordin...

[LLVMdev] Unaligned vector memory access for ARM/NEON.

2012 Sep 06

0

...ossible that it's LLVM that's confused > about the alignment requirements here. :) > > I think I see, in general, where. I twiddled the IR to give it higher alignment (16 bytes) and get: > extend: @ @extend > @ BB#0: > vldr d16, [r0] > vmovl.s16 q8, d16 > vstmia r1, {d16, d17} > vldr d16, [r0, #8] > add r0, r1, #16 > vmovl.s16 q8, d16 > vstmia r0, {d16, d17} > bx lr > > Note that we're using a plain vldr instruction here to load the d register, not a vld1 instruction. Similarly for the stores. Accordin...

[LLVMdev] Unaligned vector memory access for ARM/NEON.

2012 Sep 06

2

...'s confused >> about the alignment requirements here. :) >> >> I think I see, in general, where. I twiddled the IR to give it higher > alignment (16 bytes) and get: >> extend: @ @extend >> @ BB#0: >> vldr d16, [r0] >> vmovl.s16 q8, d16 >> vstmia r1, {d16, d17} >> vldr d16, [r0, #8] >> add r0, r1, #16 >> vmovl.s16 q8, d16 >> vstmia r0, {d16, d17} >> bx lr >> >> Note that we're using a plain vldr instruction here to load the d > register, not a vld1 instructi...

[LLVMdev] Unaligned vector memory access for ARM/NEON.

2012 Sep 06

0

...'s confused >> about the alignment requirements here. :) >> >> I think I see, in general, where. I twiddled the IR to give it higher > alignment (16 bytes) and get: >> extend: @ @extend >> @ BB#0: >> vldr d16, [r0] >> vmovl.s16 q8, d16 >> vstmia r1, {d16, d17} >> vldr d16, [r0, #8] >> add r0, r1, #16 >> vmovl.s16 q8, d16 >> vstmia r0, {d16, d17} >> bx lr >> >> Note that we're using a plain vldr instruction here to load the d > register, not a vld1 instructi...

[LLVMdev] Unaligned vector memory access for ARM/NEON.

2012 Sep 07

2

...the alignment requirements here. :) >>> >>> I think I see, in general, where. I twiddled the IR to give it higher >> alignment (16 bytes) and get: >>> extend: @ @extend >>> @ BB#0: >>> vldr d16, [r0] >>> vmovl.s16 q8, d16 >>> vstmia r1, {d16, d17} >>> vldr d16, [r0, #8] >>> add r0, r1, #16 >>> vmovl.s16 q8, d16 >>> vstmia r0, {d16, d17} >>> bx lr >>> >>> Note that we're using a plain vldr instruction here to load the d >...

[LLVMdev] Unaligned vector memory access for ARM/NEON.

2012 Sep 07

0

...> > >>> I think I see, in general, where. I twiddled the IR to give it > >>> higher > >> alignment (16 bytes) and get: > >>> extend: @ @extend > >>> @ BB#0: > >>> vldr d16, [r0] > >>> vmovl.s16 q8, d16 > >>> vstmia r1, {d16, d17} > >>> vldr d16, [r0, #8] > >>> add r0, r1, #16 > >>> vmovl.s16 q8, d16 > >>> vstmia r0, {d16, d17} > >>> bx lr > >>> > >>> Note that we're using a plain...
