hameeza ahmed via llvm-dev
2017-Aug-06 19:03 UTC
[llvm-dev] VBROADCAST Implementation Issues
I am trying to implement a vector shuffle for v64i32. Is the following
correct?

def VSHUFFLE_256B : I<0xE8, MRMDestReg, (outs VR_2048:$dst),
      (ins VR_2048:$src1, VRPIM_2048:$src2),
      "VSHUFFLE_256B\t{$src1, $src2, $dst|$dst, $src1, $src2}",
      [(set VR_2048:$dst, (shufflevector (v64i32 VR_2048:$src1),
                                         (v64i32 VR_2048:$src2)))]>, TA;

Please help.
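For reference when checking the pattern: LLVM IR's shufflevector takes two input vectors plus a constant mask, and each mask element selects one lane from the concatenation of the two inputs. The pattern above lists only the two vector operands. Below is a minimal C++ sketch of those semantics for 64 x i32. The names Vec64, Mask64 and shuffle64 are hypothetical and not part of LLVM, and -1 is used here to mark an undefined lane, which this model simply leaves as 0.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for a 64 x i32 vector and its shuffle mask.
using Vec64 = std::array<std::int32_t, 64>;
using Mask64 = std::array<int, 64>;

// Model of shufflevector for two 64-lane inputs: mask values 0..63 pick a
// lane from a, 64..127 pick a lane from b, and -1 marks an undefined lane
// (left as 0 in this sketch).
Vec64 shuffle64(const Vec64 &a, const Vec64 &b, const Mask64 &mask) {
  Vec64 out{};
  for (std::size_t i = 0; i < out.size(); ++i) {
    const int m = mask[i];
    if (m < 0)
      continue; // undefined lane, keep the zero default
    assert(m < 128 && "mask index out of range");
    out[i] = (m < 64) ? a[static_cast<std::size_t>(m)]
                      : b[static_cast<std::size_t>(m - 64)];
  }
  return out;
}
```

With a mask of 2*i for lane i, the first 32 result lanes come from the even lanes of the first input and the last 32 from the even lanes of the second.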
On Sun, Aug 6, 2017 at 11:48 PM, hameeza ahmed <hahmed2305 at gmail.com>
wrote:
> I managed to get rid of the above error by using VT.is2048BitVector(),
> which was already implemented.
>
> Next I will try to define other vector sizes, such as
> VT.is4096BitVector().
>
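The assertions quoted further down only admit 128-, 256- and 512-bit vectors, which is why a wider type trips them. As a rough illustration of the size check involved, here is a self-contained C++ sketch. VecTy is a hypothetical stand-in and not llvm::EVT; only the arithmetic is meant to carry over.

```cpp
#include <cassert>

// Hypothetical stand-in for a vector value type: lane count and lane width.
struct VecTy {
  unsigned NumElts;
  unsigned EltBits;

  unsigned sizeInBits() const { return NumElts * EltBits; }
  bool is512BitVector() const { return sizeInBits() == 512; }
  bool is2048BitVector() const { return sizeInBits() == 2048; }
  bool is4096BitVector() const { return sizeInBits() == 4096; }
};

// Same shape as the quoted assertion condition, widened to accept 2048 bits.
// A 4096-bit type would still be rejected until it is added here as well.
bool isSupportedWidth(const VecTy &VT) {
  const unsigned Bits = VT.sizeInBits();
  return Bits == 128 || Bits == 256 || Bits == 512 || Bits == 2048;
}
```

In this model v64i32 is 64 * 32 = 2048 bits, while a 4096-bit type would be, for example, 128 lanes of 32 bits.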
> On Sun, Aug 6, 2017 at 11:11 PM, hameeza ahmed <hahmed2305 at gmail.com>
> wrote:
>
>> Thank you. Actually, I have to implement both i32 and i64, so I have now
>> implemented two instructions: one broadcastS, the other broadcastD.
>> When broadcasting from memory to a register I was not getting this error
>> with a single instruction plus extra patterns for i64, i32, etc., but I
>> implemented single and double versions of that as well.
>>
>> I am trying to compile matrix multiplication code for a larger vector
>> size, so I need to add many new instructions to my backend, such as
>> shuffle and gather. For now I am getting the following error.
>>
>>
>> Legalizing: t208: v64i32 = BUILD_VECTOR
>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>,
>> Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>, Constant:i32<-1>
>> llc: /lib/Target/X86/X86ISelLowering.cpp:5525: llvm::SDValue
>> getOnesVector(llvm::EVT, const llvm::X86Subtarget &, llvm::SelectionDAG &,
>> const llvm::SDLoc &): Assertion `(VT.is128BitVector() ||
>> VT.is256BitVector() || VT.is512BitVector()) && "Expected a 128/256/512-bit
>> vector type"' failed.
>>
>> I tried including is2048BitVector() and others. I also added these types
>> for EVT in vectortype.h, but the backend would not compile and I got
>> errors.
>>
>> Please help.
>>
>> Thank You
>>
>>
>> On Sun, Aug 6, 2017 at 8:42 PM, Craig Topper <craig.topper at gmail.com>
>> wrote:
>>
>>> You need a new instruction. And your scalar register size needs to match
>>> your vector element size. So GR32 instead of GR64.
>>>
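To see why the register class matters: a broadcast replicates one scalar into every lane, so the scalar width has to equal the lane width. A 2048-bit register holds 64 lanes of 32 bits or 32 lanes of 64 bits. The C++ sketch below models that. Both lanes2048 and broadcast2048 are hypothetical helpers, with uint32_t standing in for a GR32 source and uint64_t for a GR64 source, and 8-bit bytes are assumed.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Number of lanes of type T that fit in a 2048-bit vector (assumes 8-bit bytes).
template <typename T>
constexpr std::size_t lanes2048() {
  return 2048 / (sizeof(T) * 8);
}

// Replicate one scalar into every lane of a 2048-bit vector.
template <typename T>
std::array<T, lanes2048<T>()> broadcast2048(T scalar) {
  std::array<T, lanes2048<T>()> v;
  v.fill(scalar);
  return v;
}
```

A 32-bit source yields the 64-lane shape that matches v64i32, whereas a 64-bit source yields only 32 lanes, which is the mismatch described in the reply above.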
>>> On Sun, Aug 6, 2017 at 5:44 AM hameeza ahmed <hahmed2305 at gmail.com>
>>> wrote:
>>>
>>>> Sorry to disturb you.
>>>> Now I want to implement an instruction that broadcasts the contents of
>>>> a scalar register to a vector, like this:
>>>> vpbroadcastq zmm0, rsi
>>>>
>>>> I tried implementing it as follows:
>>>>
>>>> def BROADCASTR_256B : I<0x21, MRMSrcReg, (outs VR_2048:$dst),
>>>>       (ins GR64:$src),
>>>>       "BROADCASTR_256B\t{$src, $dst|$dst, $src}",
>>>>       [(set VR_2048:$dst, (v64i32 (X86VBroadcast GR64:$src)))],
>>>>       IIC_MOV_MEM>, TA;
>>>>
>>>> def: Pat<(v64f32 (X86VBroadcast GR64:$src)),
>>>>          (BROADCASTR_256B GR64:$src)>;
>>>>
>>>> Is this fine? Also, do I need to define a new instruction such as
>>>> BROADCASTR_256B for this? Or can I reuse the previous instruction,
>>>> BROADCAST_256B (the one that broadcasts a scalar from memory to a
>>>> vector), and just define a new pattern?
>>>>
>>>> Please help.
>>>>
>>>> Thank You
>>>>
>>>>
>>>>
>>>> On Sun, Aug 6, 2017 at 5:10 AM, hameeza ahmed <hahmed2305 at gmail.com>
>>>> wrote:
>>>>
>>>>> Thank you so much.
>>>>>
>>>>> Wow, you are simply a genius.
>>>>> Initially I had not included the load in either the main instruction
>>>>> or the pattern, so I added it to both as follows:
>>>>>
>>>>> def BROADCAST_256B : I<0x31, MRMSrcMem, (outs VR_2048:$dst),
>>>>>       (ins i2048mem:$src),
>>>>>       "BROADCAST_256B\t{$src, $dst|$dst, $src}",
>>>>>       [(set VR_2048:$dst, (v64i32 (X86VBroadcast (loadi32 addr:$src))))],
>>>>>       IIC_MOV_MEM>, TA;
>>>>>
>>>>> def: Pat<(v64f32 (X86VBroadcast (loadf32 addr:$src))),
>>>>>          (BROADCAST_256B addr:$src)>;
>>>>>
>>>>> And it worked perfectly.
>>>>>
>>>>> Thank You again.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Sun, Aug 6, 2017 at 4:28 AM, Craig Topper <craig.topper at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Your pattern needs to be
>>>>>>
>>>>>> def: Pat<(v64f32 (X86VBroadcast (loadf32 addr:$src))),
>>>>>>          (BROADCAST_256B addr:$src)>;
>>>>>>
>>>>>> ~Craig
>>>>>>
>>>>>> On Sat, Aug 5, 2017 at 2:47 PM, hameeza ahmed <hahmed2305 at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> It runs fine with v64i32, but with the following pattern
>>>>>>>
>>>>>>> def: Pat<(v64f32 (X86VBroadcast addr:$src)),
>>>>>>>          (BROADCAST_256B addr:$src)>;
>>>>>>>
>>>>>>> I am getting an error.
>>>>>>> What is wrong with this pattern?
>>>>>>>
>>>>>>> On Sun, Aug 6, 2017 at 2:01 AM, hameeza ahmed <hahmed2305 at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> In x86 it is:
>>>>>>>>
>>>>>>>> def : Pat<(int_x86_avx512_vbroadcast_ss_512 addr:$src),
>>>>>>>>           (VBROADCASTSSZm addr:$src)>;
>>>>>>>>
>>>>>>>> Mine is:
>>>>>>>>
>>>>>>>> def: Pat<(v64f32 (X86VBroadcast addr:$src)),
>>>>>>>>          (BROADCAST_256B addr:$src)>;
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sun, Aug 6, 2017 at 1:59 AM, hameeza ahmed <hahmed2305 at gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> For v16f32 it is defined as:
>>>>>>>>>
>>>>>>>>> : Pat<(v16f32 (X86VBroadcast (v16f32 VR512:$src))),
>>>>>>>>>       (VBROADCASTSSZr (EXTRACT_SUBREG (v16f32 VR512:$src), sub_xmm))>;
>>>>>>>>>
>>>>>>>>> which is similar to mine.
>>>>>>>>> Why is it not working, then?
>>>>>>>>>
>>>>>>>>> On Sun, Aug 6, 2017 at 1:45 AM, Craig Topper <craig.topper at gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> You need a pattern for v64f32 too.
>>>>>>>>>>
>>>>>>>>>> ~Craig
>>>>>>>>>>
>>>>>>>>>> On Sat, Aug 5, 2017 at 1:37 PM, hameeza ahmed <hahmed2305 at gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> As you said. These are the instructions that I defined in
>>>>>>>>>>> instrinfo.td:
>>>>>>>>>>>
>>>>>>>>>>> def BROADCAST_256B : I<0x31, MRMSrcMem, (outs VR_2048:$dst),
>>>>>>>>>>>       (ins i2048mem:$src),
>>>>>>>>>>>       "BROADCAST_256B\t{$src, $dst|$dst, $src}",
>>>>>>>>>>>       [(set VR_2048:$dst, (v64i32 (X86VBroadcast addr:$src)))],
>>>>>>>>>>>       IIC_MOV_MEM>, TA;
>>>>>>>>>>>
>>>>>>>>>>> def: Pat<(v64f32 (X86VBroadcast addr:$src)),
>>>>>>>>>>>          (BROADCAST_256B addr:$src)>;
>>>>>>>>>>>
>>>>>>>>>>> On Sun, Aug 6, 2017 at 1:28 AM, hameeza ahmed <hahmed2305 at gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I did as you said;
>>>>>>>>>>>> now getting this error:
>>>>>>>>>>>>
>>>>>>>>>>>> LLVM ERROR: Cannot select: t63: v64f32 = X86ISD::VBROADCAST t62
>>>>>>>>>>>>   t62: f32,ch = load<LD4[ConstantPool]> t0, t65, undef:i64
>>>>>>>>>>>>     t65: i64 = X86ISD::Wrapper TargetConstantPool:i64<float 0x3FC99999A0000000> 0
>>>>>>>>>>>>       t64: i64 = TargetConstantPool<float 0x3FC99999A0000000> 0
>>>>>>>>>>>>     t8: i64 = undef
>>>>>>>>>>>> In function: stencil
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Sun, Aug 6, 2017 at 1:14 AM, Craig Topper <craig.topper at gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Add VT.is2048BitVector() to the assert?
>>>>>>>>>>>>>
>>>>>>>>>>>>> ~Craig
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sat, Aug 5, 2017 at 1:11 PM, hameeza ahmed <hahmed2305 at gmail.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I added the setOperationAction line in isellowering.cpp. Now I am
>>>>>>>>>>>>>> getting the following error:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> llc: /lib/Target/X86/X86ISelLowering.cpp:6801: llvm::SDValue
>>>>>>>>>>>>>> LowerVectorBroadcast(llvm::BuildVectorSDNode *, const
>>>>>>>>>>>>>> llvm::X86Subtarget &, llvm::SelectionDAG &): Assertion
>>>>>>>>>>>>>> `(VT.is128BitVector() || VT.is256BitVector() || VT.is512BitVector()) &&
>>>>>>>>>>>>>> "Unsupported vector type for broadcast."' failed.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> What should I do?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sun, Aug 6, 2017 at 12:36 AM, Craig Topper <craig.topper at gmail.com>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Well, first, have you done this for your type?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> setOperationAction(ISD::BUILD_VECTOR, v64i32, Custom);
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> ~Craig
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sat, Aug 5, 2017 at 12:29 PM, hameeza ahmed <hahmed2305 at gmail.com>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> How do I do this?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Sun, Aug 6, 2017 at 12:24 AM, Craig Topper <craig.topper at gmail.com>
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> It looks like X86TargetLowering::LowerBUILD_VECTOR is not creating a
>>>>>>>>>>>>>>>>> broadcast node for your wider vector type.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> ~Craig
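In rough terms, the lowering has to notice that every operand of the BUILD_VECTOR is the same value and replace the node with a single broadcast of that value. A simplified C++ sketch of that splat check follows. It works on plain integer node ids and is not the actual X86ISelLowering code; getSplatOperand is a hypothetical name.

```cpp
#include <cassert>
#include <vector>

// Returns true and sets splat if every operand id is identical, which is the
// case where a BUILD_VECTOR can be replaced by one broadcast of that operand.
// Returns false (leaving splat untouched) for an empty or mixed operand list.
bool getSplatOperand(const std::vector<int> &operands, int &splat) {
  if (operands.empty())
    return false;
  for (int op : operands)
    if (op != operands.front())
      return false;
  splat = operands.front();
  return true;
}
```

For the error quoted below, all 64 operands are t62, so a check of this kind would report a splat of that one load.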
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sat, Aug 5, 2017 at 12:19 PM, hameeza ahmed <hahmed2305 at gmail.com>
>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thank you.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I made the changes you mentioned and included the broadcast
>>>>>>>>>>>>>>>>>> instruction in instructioninfo.td, but I made no changes in the
>>>>>>>>>>>>>>>>>> isellowering.cpp file.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I am still getting the following error.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> LLVM ERROR: Cannot select: t29: v64f32 = BUILD_VECTOR
>>>>>>>>>>>>>>>>>> t62, t62, t62, t62, t62, t62, t62, t62,
>>>>>>>>>>>>>>>>>> t62, t62, t62, t62, t62, t62, t62, t62,
>>>>>>>>>>>>>>>>>> t62, t62, t62, t62, t62, t62, t62, t62,
>>>>>>>>>>>>>>>>>> t62, t62, t62, t62, t62, t62, t62, t62,
>>>>>>>>>>>>>>>>>> t62, t62, t62, t62, t62, t62, t62, t62,
>>>>>>>>>>>>>>>>>> t62, t62, t62, t62, t62, t62, t62, t62,
>>>>>>>>>>>>>>>>>> t62, t62, t62, t62, t62, t62, t62, t62,
>>>>>>>>>>>>>>>>>> t62, t62, t62, t62, t62, t62, t62, t62
>>>>>>>>>>>>>>>>>>   t62: f32,ch = load<LD4[ConstantPool]> t0, t64, undef:i64
>>>>>>>>>>>>>>>>>>     t64: i64 = X86ISD::Wrapper TargetConstantPool:i64<float 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>>>>       t63: i64 = TargetConstantPool<float 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>>>>     t8: i64 = undef
>>>>>>>>>>>>>>>>>>   t62: f32,ch = load<LD4[ConstantPool]> t0, t64, undef:i64
>>>>>>>>>>>>>>>>>>     t64: i64 = X86ISD::Wrapper TargetConstantPool:i64<float 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>>>>       t63: i64 = TargetConstantPool<float 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>>>>     t8: i64 = undef
>>>>>>>>>>>>>>>>>>   t62: f32,ch = load<LD4[ConstantPool]> t0, t64, undef:i64
>>>>>>>>>>>>>>>>>>     t64: i64 = X86ISD::Wrapper TargetConstantPool:i64<float 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>>>>       t63: i64 = TargetConstantPool<float 0x3FC99999A0000000> 0
>>>>>>>>>>>>>>>>>> .................
>>>>>>>>>>>>>>>>>> In function: stencil
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> How do I resolve this?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Please help.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Sat, Aug 5, 2017 at 11:19 PM, Craig Topper <craig.topper at gmail.com>
>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> You need to use X86VBroadcast, not "vbroadcast".
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> ~Craig
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Sat, Aug 5, 2017 at 10:50 AM, hameeza ahmed <hahmed2305 at gmail.com>
>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I have C code that multiplies a vector by a constant, something
>>>>>>>>>>>>>>>>>>>> like this:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> float con=0.2;
>>>>>>>>>>>>>>>>>>>> for (k = 0; k < N; k++) {
>>>>>>>>>>>>>>>>>>>>   for (i = 1; i <= N-2; i++)
>>>>>>>>>>>>>>>>>>>>     for (j = 1; j <= N-2; j++)
>>>>>>>>>>>>>>>>>>>>       b[i][j] = con * (a[i][j] + a[i-1][j] +
>>>>>>>>>>>>>>>>>>>>                        a[i+1][j] + a[i][j-1] + a[i][j+1]);
>>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Now in LLVM IR I am getting:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> %22 = fmul <64 x float> %21, <
>>>>>>>>>>>>>>>>>>>>   float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>   float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>   float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>   float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>   float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>   float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>   float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>   float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>   float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>   float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>   float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>   float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>   float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>   float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>   float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000,
>>>>>>>>>>>>>>>>>>>>   float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000, float 0x3FC99999A0000000>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> but its assembly in x86 gives:
>>>>>>>>>>>>>>>>>>>> .LCPI0_0:
>>>>>>>>>>>>>>>>>>>>         .long 1045220557 # float 0.200000003
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>         vbroadcastss zmm1, dword ptr [rip + .LCPI0_0]
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>         vmulps zmm2, zmm2, zmm1
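The two spellings of the constant are the same number. LLVM IR prints a float constant using the hexadecimal bit pattern of its double-precision value, while the assembly shows the 32-bit pattern as a decimal .long. A short C++ check of that, assuming IEEE-754 float and double; the helper names floatBits and doubleBits are made up for this note.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Raw 32-bit pattern of a float (assumes IEEE-754 single precision).
std::uint32_t floatBits(float f) {
  static_assert(sizeof(float) == sizeof(std::uint32_t), "expects 32-bit float");
  std::uint32_t bits;
  std::memcpy(&bits, &f, sizeof bits);
  return bits;
}

// Raw 64-bit pattern of a double (assumes IEEE-754 double precision).
std::uint64_t doubleBits(double d) {
  static_assert(sizeof(double) == sizeof(std::uint64_t), "expects 64-bit double");
  std::uint64_t bits;
  std::memcpy(&bits, &d, sizeof bits);
  return bits;
}
```

Here floatBits(0.2f) gives 1045220557, the .long value in the assembly, and doubleBits applied to 0.2f widened to double gives 0x3FC99999A0000000, the spelling in the IR.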
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> How was the above IR lowered into vbroadcastss?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> What would be the pattern to match here?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I want to implement a similar broadcast for a vector of 64
>>>>>>>>>>>>>>>>>>>> elements.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I tried the following code:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> def BROADCAST_DWORD : I<0x60, MRMSrcMem, (outs VREGG:$dst),
>>>>>>>>>>>>>>>>>>>>       (ins immem:$src),
>>>>>>>>>>>>>>>>>>>>       "BROADCAST_DWORD\t{$src, $dst|$dst, $src}",
>>>>>>>>>>>>>>>>>>>>       [(set VREGG:$dst, (v64i32 (vbroadcast addr:$src)))],
>>>>>>>>>>>>>>>>>>>>       IIC_MOV_MEM>, TA;
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Please help me. I am stuck at this point.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thank You
>>>>>>>>>>>>>>>>>>>> Regards
>>>> --
>>> ~Craig
>>>
>>
>>
>