Displaying 15 results from an estimated 15 matches for "4xi32".
2009 Jul 23
1
[LLVMdev] Case where VSETCC DAGCombiner hack doesn't work
...eeding into it,
giving us really atrocious code.
IMO, the solution to this is to have a legalize-types action for
vectors that corresponds to "promote" on scalars. In this case, since
X86 supports VSETCC, the 4 x i1 SETCC should "vector promote" to a
VSETCC node with a 4xi32 result, the and should vector promote to
4xi32, and the sext should vector promote as a vector sext_inreg.
I don't think that implementing this is particularly hard, but I have
plenty of other things I'm working on right now. Is anyone else
interested in working on this?
-Chris
2016 Mar 18
2
generate vectorized code
> On Mar 18, 2016, at 1:47 PM, Rail Shafigulin <rail at esenciatech.com> wrote:
>
> Yes this IR does not build or shuffle any vector. Try to write a function that takes 8 ints and a pointer to a <4xi32>, builds two vectors with the 8 ints,
>
> This might sound like a dumb question, but how does one build a vector of ints out of regular ints in IR?
See: http://llvm.org/docs/LangRef.html#vector-operations
In short, the IR has "insertelement", which maps to "INSERT_VECTOR...
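For readers coming from the C side, the exercise suggested in this thread (take 8 ints and a pointer to a <4xi32>, build two vectors, sum them, store the result) can be sketched with the GCC/Clang vector_size extension. This is only an illustrative sketch: the type name v4i32 and the function name sum8_to_vec are made up here, and whether a given backend emits real vector instructions for it depends on the target. A compiler such as clang would typically express the brace initializers as a chain of insertelement instructions in the IR, though the exact output varies with version and optimization level.

```c
/* GCC/Clang vector extension: four 32-bit ints in one 16-byte vector.
   The name v4i32 is an assumption chosen to mirror <4 x i32>. */
typedef int v4i32 __attribute__((vector_size(16)));

/* Hypothetical helper: builds two vectors from eight scalars, adds them
   lane by lane, and stores the sum through `out`. */
void sum8_to_vec(int a0, int a1, int a2, int a3,
                 int b0, int b1, int b2, int b3,
                 v4i32 *out)
{
    v4i32 a = {a0, a1, a2, a3};   /* first vector from scalars  */
    v4i32 b = {b0, b1, b2, b3};   /* second vector from scalars */
    *out = a + b;                 /* element-wise vector add, then store */
}
```

Compiling this with something like clang -O1 -S -emit-llvm is one way to inspect how the vectors are built in IR.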
2016 Mar 18
2
generate vectorized code
...ECTOR, MVT::v4i32, Expand);
> setOperationAction(ISD::EXTRACT_VECTOR_ELT, MVT::v4i32, Expand);
> setOperationAction(ISD::VECTOR_SHUFFLE, MVT::v4i32, Expand);
Yes this IR does not build or shuffle any vector. Try to write a function that takes 8 ints and a pointer to a <4xi32>, builds two vectors with the 8 ints, sum them, and store the result to the pointer.
>
> In other words I left the code as is.
>
> However if I use a .c code and run it through clang, I don't see any vector instructions. I'm puzzled. What am I doing wrong? There seems to...
2017 Sep 15
2
What should a truncating store do?
...111111011111110111111101111111. Or it can be written as a packed
vector which I think would resemble 0b00001111111111111111111111111111.
This would mean the memory layout changes depending on how/whether the
legaliser breaks large vectors down into smaller types. Is this the case?
For example, <4xi32> => <4 x i31> converts to two <2 x i32> => <2 x i31>
stores on a target with <2 x i32> legal but would not be split if <4 x i32>
were declared legal.
Thanks
Jon
On Fri, Sep 15, 2017 at 7:41 PM, Friedman, Eli <efriedma at codeaurora.org>
wrote:
>...
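The two candidate memory layouts discussed in this excerpt can be compared with a small scalar model. This is a sketch under assumptions: the excerpt is truncated, so the bit patterns are read here as four 7-bit elements truncated from 8-bit ones, element 0 is placed in the low bits, and the function names are invented for illustration. It does not describe what the LLVM code generator actually does.

```c
#include <stdint.h>

/* Lays out n elements of elt_bits bits each, one every stride_bits bits,
   starting from the low end of a 32-bit word (assumed bit order).
   Requires elt_bits < 32 and n * stride_bits <= 32. */
static uint32_t layout_bits(const uint8_t *elts, int n,
                            int elt_bits, int stride_bits)
{
    uint32_t word = 0;
    uint32_t mask = (1u << elt_bits) - 1u;
    for (int i = 0; i < n; i++)
        word |= ((uint32_t)elts[i] & mask) << (i * stride_bits);
    return word;
}

/* Padded layout: every element keeps its original slot width,
   so the unused high bits of each slot stay zero. */
uint32_t layout_padded(const uint8_t *elts, int n, int elt_bits, int slot_bits)
{
    return layout_bits(elts, n, elt_bits, slot_bits);
}

/* Packed layout: elements are laid end to end with no padding bits. */
uint32_t layout_packed(const uint8_t *elts, int n, int elt_bits)
{
    return layout_bits(elts, n, elt_bits, elt_bits);
}
```

With four all-ones 7-bit elements, the padded form yields 0x7F7F7F7F and the packed form yields 0x0FFFFFFF, matching the two patterns quoted above. If the packed form were chosen, splitting the vector into two halves before storing would change which bits land in which bytes, which is the concern raised in the question.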
2018 Mar 20
1
Polly -polly-prevect-width
I'm using Polly with vec-width=16. By default my IR emits <16xi32> and the
remainder as <4xi32>. I want my IR to emit <16xi32> and the remainder
as <8xi32>. How can I do this?
I'm trying to use -polly-prevect-width.
Please help.
2018 Mar 22
1
TargetOpcode::KILL confusion
...explain the semantics of TargetOpcode::KILL? Specifically, in this example, which register is killed? Would it be legal for operands 0 and 1 to refer to different registers?
128B %R3<def> = KILL %R3, %R3_1<imp-use>, %R3_23<imp-use>
(In my out-of-tree target, %R3 is a <4xi32> register, %R3_1 is an i32 sub-register of %R3, and %R3_23 is a <2xi32> sub-register of %R3).
Thanks,
Nick
2015 Jan 16
3
[LLVMdev] Overloaded intrinsics: name explosion
Philip Reames wrote:
>> 1. Introduce aAny.
>
> Having a generic any type seems fine. I assume you'd create something like
> an llvm_any_type in Intrinsics.td?
That's precisely what ifavpAny is about: integer, float, array,
vector, pointer Any. aAny is meant for array-Any, and I wonder why so
few people care about arrays. I'll go ahead with Any and
llvm_any_type.
2010 May 31
1
[LLVMdev] Error with instruction selection
...ary
register. The physical return register is then overwritten in the next
call. (This is visible when calling "llc -view-isel-dags
-view-sched-dags". The first graph is OK, the second is not.)
The problem goes away if I:
-have the getPtr return anything else than <4xfloat>* or <4xi32>* (e.g.
<4xfloat> or float* work just fine)
-do not load from or store to the pointer - e.g. just returning the
pointer works.
-target any other processor than CellSPU (ok, some backends assert on
this code, and the PIC assembly I didn't understand :) )
Any explanation on what is going...
2016 Mar 18
4
generate vectorized code
...di.amini at apple.com>
> wrote:
>
>>
>> On Mar 18, 2016, at 1:47 PM, Rail Shafigulin <rail at esenciatech.com>
>> wrote:
>>
>> Yes this IR does not build or shuffle any vector. Try to write a function
>>> that takes 8 ints and a pointer to a <4xi32>, builds two vectors with the 8
>>> ints,
>>>
>>
>> This might sound like a dumb question, but how does one build a vector of
>> ints out of regular ints in IR?
>>
>>
>> See: http://llvm.org/docs/LangRef.html#vector-operations
>>
>>...
2017 Sep 15
2
What should a truncating store do?
...written as a packed
>> vector which I think would resemble 0b00001111111111111111111111111111.
>>
>> This would mean the memory layout changes depending on how/whether the
>> legaliser breaks large vectors down into smaller types. Is this the case?
>> For example, <4xi32> => <4 x i31> converts to two <2 x i32> => <2 x i31>
>> stores on a target with <2 x i32> legal but would not be split if <4 x i32>
>> were declared legal.
>>
>
> Vectors get complicated; I don't recall all the details of what the c...
2018 Jul 23
2
KNL Vectorization with larger vector width
...is. Each time I debug it, it returns vectorized IR in gdb.
My goal is simple: when I give my target name to opt, it should vectorize
using the highest vector width supported by my target, which is 2048.
So $ opt -O3 -mytarget 1.ll -o 1_opt.ll
1_opt.ll should emit <2048xi32>,
<1024xi32>, ..., <32xi32>, etc.
How can I achieve this? Please help.
Thank you
Regards
On Fri, Jul 13, 2018 at 12:40 AM, Hal Finkel <hfinkel at anl.gov> wrote:
>
> On 07/12/2018 02:32 PM, hameeza ahmed via llvm-dev wrote:
>
> Hello,
>
> If we pass march=knl,...
2016 Mar 18
2
generate vectorized code
> On Mar 18, 2016, at 12:52 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:
>
>>
>> On Mar 18, 2016, at 12:45 PM, Rail Shafigulin <rail at esenciatech.com> wrote:
>>
>> On Thu, Mar 17, 2016 at 2:41 PM, Rail Shafigulin <rail at esenciatech.com> wrote:
>> On Thu,
2010 May 31
0
[LLVMdev] Finding Merge nodes in CFG (ambika@cse.iitb.ac.in)
...al return register is then overwritten in the next
> call. (This is visible when calling "llc -view-isel-dags
> -view-sched-dags". The first graph is OK, the second is not.)
>
> The problem goes away if I:
> -have the getPtr return anything else than <4xfloat>* or <4xi32>* (e.g.
> <4xfloat> or float* work just fine)
> -do not load from or store to the pointer - e.g. just returning the
> pointer works.
> -target any other processor than CellSPU (ok, some backends assert on
> this code, and the PIC assembly I didn't understand :) )
>
>...
2008 Sep 30
4
[LLVMdev] Generalizing shuffle vector
...ype(4) )) float float4;
typedef __attribute__(( ext_vector_type(8) )) float float8;
float8 f8;
float4 f4a, f4b, f4c;
f4a = f8.hi;
f8.hi = f4b; f8.lo = f4c;
where hi and lo represent the high half and low half of the vector.
The outgoing IR is
%f4a = shufflevector <8xf32>%f8, undef, <4xi32> <0, 1, 2, 3>
%f8 = shufflevector <4xf32>%f4b, <4xf32>%f4c, <8xi32> <0, 1, 2, 3,
4, 5, 6, 7>
The problem with generating insert and extracts is that we can
generate poor code
%tmp16 = extractelement <4 x float> %f4b, i32 0
%f8a = insert...
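The generalized shuffles in the IR above, where the result length differs from the operand length, can be modeled with a scalar loop. The sketch below is only a model of the selection rule (result lane i takes element mask[i] of the two operands concatenated); the name shuffle_model and its signature are invented for illustration, and it says nothing about how a backend would lower such a node.

```c
#include <stddef.h>

/* Scalar model of a two-operand shuffle: result lane i takes the element at
   index mask[i] from the concatenation of `a` and `b` (each n_in lanes).
   `b` may be NULL to stand in for an undef second operand, provided the
   mask never selects from it. */
void shuffle_model(const float *a, const float *b, size_t n_in,
                   const int *mask, size_t n_out, float *out)
{
    for (size_t i = 0; i < n_out; i++) {
        size_t idx = (size_t)mask[i];
        out[i] = (idx < n_in) ? a[idx] : b[idx - n_in];
    }
}
```

Calling it with an 8-lane input, a NULL second operand, and the mask {0, 1, 2, 3} mirrors the first shuffle above (8 lanes in, 4 out); calling it with two 4-lane inputs and the mask {0, ..., 7} mirrors the second (4 lanes in, 8 out).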
2017 Sep 25
0
What should a truncating store do?
...111111011111110111111101111111. Or it can be written as a packed vector which I think would resemble 0b00001111111111111111111111111111.
This would mean the memory layout changes depending on how/whether the legaliser breaks large vectors down into smaller types. Is this the case? For example, <4xi32> => <4 x i31> converts to two <2 x i32> => <2 x i31> stores on a target with <2 x i32> legal but would not be split if <4 x i32> were declared legal.
Vectors get complicated; I don't recall all the details of what the code generator currently does/is supp...