thr3ads.net - llvm dev - [LLVMdev] Question about LLVM NEON intrinsics [Sep 2012]

Sebastien DELDON-GNB

2012-Sep-21 09:58 UTC

[LLVMdev] RE : Question about LLVM NEON intrinsics

Hi Eli,

Thanks for the answer, it clarifies the situation for me. Do you know if there
is a pass in LLVM that could be adapted to 'legalize' intrinsic calls?
Or should I define my own intrinsics for unsupported types?

Best Regards
Seb
________________________________________
From: Eli Friedman [eli.friedman at gmail.com]
Sent: Friday, September 21, 2012 11:54
To: Sebastien DELDON-GNB
Cc: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] Question about LLVM NEON intrinsics

On Fri, Sep 21, 2012 at 1:28 AM, Sebastien DELDON-GNB
<sebastien.deldon at st.com> wrote:
> Hi all,
>
> I would like to know whether the LLVM NEON intrinsics are designed to
> support only 'legal' types for NEON units.
> Running llc -march=arm -mcpu=cortex-a9 vmax4.ll -o vmax4.s on the
> following .ll code:
>
> ; ModuleID = 'vmax.ll'
> target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-n32"
> target triple = "armv7-none-linux-androideabi"
>
> define void @vmaxf32(<4 x float> *%C, <4 x float>* %A, <4 x float>* %B) nounwind {
>     %tmp1 = load <4 x float>* %A
>     %tmp2 = load <4 x float>* %B
>     %tmp3 = call <4 x float> @llvm.arm.neon.vmaxs.v4f32(<4 x float> %tmp1, <4 x float> %tmp2)
>     store <4 x float> %tmp3, <4 x float>* %C
>     ret void
> }
>
> declare <4 x float> @llvm.arm.neon.vmaxs.v4f32(<4 x float>, <4 x float>) nounwind readnone
>
> I get the following generated code:
>
> ...
> vmaxf32:                                @ @vmaxf32
> @ BB#0:
>         vld1.64 {d16, d17}, [r2]
>         vld1.64 {d18, d19}, [r1]
>         vmax.f32        q8, q9, q8
>         vst1.64 {d16, d17}, [r0]
>         bx      lr
> ...
>
> Now if I use <16 x float> vectors instead of <4 x float>:
>
> define void @vmaxf32(<16 x float> *%C, <16 x float>* %A, <16 x float>* %B) nounwind {
>     %tmp1 = load <16 x float>* %A
>     %tmp2 = load <16 x float>* %B
>     %tmp3 = call <16 x float> @llvm.arm.neon.vmaxs.v16f32(<16 x float> %tmp1, <16 x float> %tmp2)
>     store <16 x float> %tmp3, <16 x float>* %C
>     ret void
> }
>
> declare <16 x float> @llvm.arm.neon.vmaxs.v16f32(<16 x float>, <16 x float>) nounwind readnone
>
> llc fails with the following message:
>
> SplitVectorResult #0: 0x2258350: v16f32 = llvm.arm.neon.vmaxs 0x2258250, 0x2258050, 0x2258150 [ORD=3] [ID=0]
>
> LLVM ERROR: Do not know how to split the result of this operator!
>
> Is this a bug? If so, I'm happy to get some directions on how to fix it.

No... platform-specific intrinsics have platform-specific semantics,
including what types they're defined for. NEON doesn't have 16 x float
vectors, at least not for that sort of operation.

> If not, I would like to know how to determine the valid types for a
> given LLVM intrinsic.

The ARM reference manual is probably your best bet for ARM intrinsics.

-Eli

Jim Grosbach

2012-Sep-21 16:54 UTC

[LLVMdev] Question about LLVM NEON intrinsics

On Sep 21, 2012, at 2:58 AM, Sebastien DELDON-GNB
<sebastien.deldon at st.com> wrote:
> Hi Eli,
>
> Thanks for the answer, it clarifies the situation for me. Do you know if
> there is a pass in LLVM that could be adapted to 'legalize' intrinsic
> calls?
> Or should I define my own intrinsics for unsupported types?

You should never generate these sorts of intrinsics with non-legal types.
It's the job of the front end to make sure that they are only called with
legal types. Yes, this is different than normal LLVM IR.
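
For the <16 x float> case in this thread, one way a front end (or a hand edit of the .ll file) might follow that rule is sketched below: split each operand into four legal <4 x float> quarters with shufflevector, call llvm.arm.neon.vmaxs.v4f32 on each quarter, and shuffle the results back together. The function name @vmaxf32_x16 and all value names are made up for illustration. The syntax follows the 2012-era IR used earlier in the thread (typed-pointer loads). The sketch assumes the generic load, store and shufflevector operations on <16 x float> are handled by the normal type legalizer. Treat it as an untested sketch, not as the way any particular front end does it.

```llvm
; Sketch: element-wise max of two <16 x float> vectors using only the
; legal v4f32 form of the NEON intrinsic.
define void @vmaxf32_x16(<16 x float>* %C, <16 x float>* %A, <16 x float>* %B) nounwind {
  %a = load <16 x float>* %A
  %b = load <16 x float>* %B

  ; Split each 16-element operand into four 4-element quarters.
  %a0 = shufflevector <16 x float> %a, <16 x float> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
  %a1 = shufflevector <16 x float> %a, <16 x float> undef, <4 x i32> <i32 4, i32 5, i32 6, i32 7>
  %a2 = shufflevector <16 x float> %a, <16 x float> undef, <4 x i32> <i32 8, i32 9, i32 10, i32 11>
  %a3 = shufflevector <16 x float> %a, <16 x float> undef, <4 x i32> <i32 12, i32 13, i32 14, i32 15>
  %b0 = shufflevector <16 x float> %b, <16 x float> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
  %b1 = shufflevector <16 x float> %b, <16 x float> undef, <4 x i32> <i32 4, i32 5, i32 6, i32 7>
  %b2 = shufflevector <16 x float> %b, <16 x float> undef, <4 x i32> <i32 8, i32 9, i32 10, i32 11>
  %b3 = shufflevector <16 x float> %b, <16 x float> undef, <4 x i32> <i32 12, i32 13, i32 14, i32 15>

  ; One call per quarter, each with a type the intrinsic is defined for.
  %r0 = call <4 x float> @llvm.arm.neon.vmaxs.v4f32(<4 x float> %a0, <4 x float> %b0)
  %r1 = call <4 x float> @llvm.arm.neon.vmaxs.v4f32(<4 x float> %a1, <4 x float> %b1)
  %r2 = call <4 x float> @llvm.arm.neon.vmaxs.v4f32(<4 x float> %a2, <4 x float> %b2)
  %r3 = call <4 x float> @llvm.arm.neon.vmaxs.v4f32(<4 x float> %a3, <4 x float> %b3)

  ; Reassemble: 4+4 -> 8, 4+4 -> 8, then 8+8 -> 16.
  %lo = shufflevector <4 x float> %r0, <4 x float> %r1, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
  %hi = shufflevector <4 x float> %r2, <4 x float> %r3, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
  %r = shufflevector <8 x float> %lo, <8 x float> %hi, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>

  store <16 x float> %r, <16 x float>* %C
  ret void
}

declare <4 x float> @llvm.arm.neon.vmaxs.v4f32(<4 x float>, <4 x float>) nounwind readnone
```

Only the intrinsic calls are restricted to legal types here. The wide load, store and shuffles are ordinary IR, which is the distinction drawn above between target intrinsics and normal LLVM IR.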

Sebastien DELDON-GNB

2012-Sep-21 17:16 UTC

[LLVMdev] Question about LLVM NEON intrinsics

Hi Jim,

Thanks for the answer, it confirms what I first thought.
Best Regards
Seb
