Displaying 20 results from an estimated 265 matches for "shufflevector".
2019 Nov 26
4
LangRef semantics for shufflevector with undef mask is incorrect
Hi,
This is a follow up on a discussion around shufflevector with undef mask in
https://reviews.llvm.org/D70641 and
https://bugs.llvm.org/show_bug.cgi?id=43958.
The current semantics of shufflevector in
http://llvm.org/docs/LangRef.html#shufflevector-instruction states:
"If the shuffle mask is undef, the result vector is undef. If any element of
th...
2019 Nov 27
2
LangRef semantics for shufflevector with undef mask is incorrect
...ro de 2019 01:10
To: Nuno Lopes <nuno.lopes at ist.utl.pt>; LLVMdev <llvm-dev at lists.llvm.org>
Cc: spatel at rotateright.com; Juneyoung Lee <juneyoung.lee at sf.snu.ac.kr>;
zhengyang-liu at hotmail.com; John Regehr <regehr at cs.utah.edu>
Subject: RE: LangRef semantics for shufflevector with undef mask is
incorrect
The shuffle mask of a shufflevector is special: it's required to be a
constant in a specific form. From LangRef: "The shuffle mask operand is
required to be a constant vector with either constant integer or undef
values." So really, we can resolve this...
2017 Mar 30
2
InstructionSimplify: adding a hook for shufflevector instructions
As Sanjay noted in D31426<https://reviews.llvm.org/D31426#712701>, InstructionSimplify is missing the following simplification:
This function:
define <4 x i32> @splat_operand(<4 x i32> %x) {
%splat = shufflevector <4 x i32> %x, <4 x i32> undef, <4 x i32> zeroinitializer
%shuf = shufflevector <4 x i32> %splat, <4 x i32> undef, <4 x i32> <i32 0, i32 3, i32 2, i32 1>
ret <4 x i32> %shuf
}
can be simplified to:
define <4 x i32> @splat_operand(<4 x i...
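A minimal Python sketch of why the second shuffle above is redundant. It assumes a simplified model in which undef lanes are represented by None; the helper name shufflevector is a stand-in for illustration, not an LLVM API. The snippet is cut off before the simplified IR, so this only checks the underlying fact that permuting a splat yields the same splat.

```python
def shufflevector(v1, v2, mask):
    """Toy model of a fixed-length shufflevector.

    Mask indices 0..n-1 select from v1, n..2n-1 select from v2.
    A None mask element stands in for an undef mask element (assumption).
    """
    assert len(v1) == len(v2)
    both = list(v1) + list(v2)
    return [None if i is None else both[i] for i in mask]

x = [7, 8, 9, 10]
undef = [None] * 4  # stand-in for an undef operand

# %splat: zeroinitializer mask broadcasts lane 0 of %x
splat = shufflevector(x, undef, [0, 0, 0, 0])
# %shuf: permutes the lanes of the splat
shuf = shufflevector(splat, undef, [0, 3, 2, 1])

# Every lane of a splat is identical, so any permutation leaves it unchanged.
assert shuf == splat
```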
2019 Dec 09
2
[PATCH] D70246: [InstCombine] remove identity shuffle simplification for mask with undefs
Sanjay,
I'm looking at some missed optimizations caused by D70246. Here's a test case:
define <4 x float> @f(i32 %t32, <4 x float>* %t24) {
.entry:
%t43 = insertelement <3 x i32> undef, i32 %t32, i32 2
%t44 = bitcast <3 x i32> %t43 to <3 x float>
%t45 = shufflevector <3 x float> %t44, <3 x float> undef, <4 x i32>
<i32 0, i32 undef, i32 undef, i32 undef>
%t46 = shufflevector <3 x float> %t44, <3 x float> undef, <4 x i32>
<i32 undef, i32 1, i32 undef, i32 undef>
%t47 = shufflevector <3 x float> %t44, <3...
2008 Sep 30
0
[LLVMdev] Generalizing shuffle vector
Hi Mon Ping,
Generalizing shufflevector would be great. I have an additional
suggestion below.
On 29-Sep-08, at 11:11 PM, Mon Ping Wang wrote:
> I am proposing to extend the shuffle vector definition to be
> <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32>
>...
2008 Sep 30
2
[LLVMdev] Generalizing shuffle vector
I agree further generalization seems like a very good idea. But I'd
like to see what Mon Ping proposed implemented first so we have a
better idea of the implementation cost.
Thanks,
Evan
On Sep 30, 2008, at 6:44 AM, Stefanus Du Toit wrote:
> Hi Mon Ping,
>
> Generalizing shufflevector would be great. I have an additional
> suggestion below.
>
> On 29-Sep-08, at 11:11 PM, Mon Ping Wang wrote:
>> I am proposing to extend the shuffle vector definition to be
>> <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, ...
2019 Nov 27
3
LangRef semantics for shufflevector with undef mask is incorrect
On 11/27/19 2:10 AM, Eli Friedman via llvm-dev wrote:
The shuffle mask of a shufflevector is special: it's required to be a constant in a specific form. From LangRef: "The shuffle mask operand is required to be a constant vector with either constant integer or undef values." So really, we can resolve this any way we want; "undef" in this context doesn't hav...
2017 Mar 30
2
InstructionSimplify: adding a hook for shufflevector instructions
Thanks, Sanjay, that makes sense. The opportunity for improving instcombining splat sounds promising.
Another question about shuffle simplification. This is a testcase from test/Transforms/InstCombine/vec_shuffle.ll:
define <4 x i32> @test10(<4 x i32> %tmp5) nounwind {
%tmp6 = shufflevector <4 x i32> %tmp5, <4 x i32> undef, <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
%tmp7 = shufflevector <4 x i32> %tmp6, <4 x i32> undef, <4 x i32> zeroinitializer
ret <4 x i32> %tmp7
}
opt -instcombine will combine to:
define <4 x i32>...
2014 Sep 26
2
[LLVMdev] Canonicalizing vector masking.
...x i32> %row3i to <4 x float>
This arises from code which expands a vector of scale factors into the
diagonal of a 4x4 diagonal matrix. This code pattern is coming from
intrinsics which are explicitly doing the masking like this.
My question is: should we canonicalize this to:
%row0 = shufflevector <4 x float> %input, <4 x float> zeroinitializer, <4
x i32> <i32 0, i32 4, i32 4, i32 4>
%row1 = shufflevector <4 x float> %input, <4 x float> zeroinitializer, <4
x i32> <i32 4, i32 1, i32 4, i32 4>
%row2 = shufflevector <4 x float> %input, ...
2016 Aug 28
2
IR canonicalization: vector select or shufflevector?
A vector select with a constant vector condition operand:
define <4 x i32> @foo(<4 x i32> %a, <4 x i32> %b) {
%sel = select <4 x i1> <i1 true, i1 false, i1 false, i1 true>, <4 x i32>
%a, <4 x i32> %b
ret <4 x i32> %sel
}
...is equivalent to a shufflevector:
define <4 x i32> @goo(<4 x i32> %a, <4 x i32> %b) {
%shuf = shufflevector <4 x i32> %a, <4 x i32> %b, <4 x i32> <i32 0, i32
5, i32 6, i32 3>
ret <4 x i32> %shuf
}
For the goal of canonicalization in IR, which of these should we prefer?
Some ba...
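The equivalence claimed above can be checked with a small Python model. This is a sketch under assumptions: the function names are hypothetical, vectors are plain lists, and undef is not modelled. A select lane that takes %a keeps index i, and a lane that takes %b becomes index i + n into the concatenation of the two operands.

```python
def shufflevector(v1, v2, mask):
    """Toy model: indices 0..n-1 pick from v1, n..2n-1 pick from v2."""
    assert len(v1) == len(v2)
    both = list(v1) + list(v2)
    return [both[i] for i in mask]

def vector_select(cond, a, b):
    """Toy model of a lane-wise select with a vector condition."""
    return [x if c else y for c, x, y in zip(cond, a, b)]

def select_to_shuffle_mask(cond):
    """Build the shuffle mask that reproduces a constant-condition select."""
    n = len(cond)
    return [i if c else i + n for i, c in enumerate(cond)]

a = [10, 11, 12, 13]
b = [20, 21, 22, 23]
cond = [True, False, False, True]

mask = select_to_shuffle_mask(cond)  # matches the mask in @goo: 0, 5, 6, 3
assert vector_select(cond, a, b) == shufflevector(a, b, mask)
```

The reverse direction only works when the mask keeps every lane in place, which is why a shuffle is the more general of the two forms.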
2016 Aug 29
2
IR canonicalization: vector select or shufflevector?
I have a slight preference towards shufflevector, because it makes
sequences of shuffles, where only some of the shuffles can be converted
into selects (because the input and output vector sizes of the others don't
match) simpler to reason about.
I'm not sure this is a particularly good reason, though.
On Mon, Aug 29, 2016 at 8:19 AM, P...
2017 Jan 17
2
IR canonicalization: shufflevector or vector trunc?
...s this a valid argument to not canonicalize the IR?
On Mon, Jan 16, 2017 at 10:16 AM, Rackover, Zvi <zvi.rackover at intel.com>
wrote:
> Suppose we prefer the ‘trunc’ form, then what about cases such as:
>
> define <2 x i16> @shuffle(<16 x i16> %x) {
>
> %shuf = shufflevector <16 x i16> %x, <16 x i16> undef, <2 x i32> <i32 0,
> i32 8>
>
> ret <2 x i16> %shuf
>
> }
>
>
>
> Will the ‘shufflevector’ be canonicalized to a ‘trunc’ of a vector of i128?
>
> define <2 x i16> @trunc(<16 x i16> %x) {
>...
2017 Jan 21
2
IR canonicalization: shufflevector or vector trunc?
...:128-n8:16:32:64-S128" ; little-endian
define <4 x i32> @shuffle(<4 x i64> %x) {
%y = shl <4 x i64> %x, <i64 32, i64 32, i64 32, i64 32> ; low half of
each elt is zero
%bc = bitcast <4 x i64> %y to <8 x i32> ; even index elements are all zero
%trunc = shufflevector <8 x i32> %bc, <8 x i32> undef, <4 x i32> <i32 0,
i32 2, i32 4, i32 6>
ret <4 x i32> %trunc
}
define <4 x i32> @trunc(<4 x i64> %x) {
%y = shl <4 x i64> %x, <i64 32, i64 32, i64 32, i64 32> ; low half of each
elt is zero
%trunc = trunc <...
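A rough Python check of the little-endian relationship between the two functions. It is only a sketch: the helper name is hypothetical, the shl step from the snippet is left out, and the bitcast is modelled with struct packing. On a little-endian layout the even i32 lanes of the bitcast are the low halves of the i64 elements, so selecting lanes 0, 2, 4, 6 matches a trunc to i32.

```python
import struct

def bitcast_i64x4_to_i32x8_le(v):
    """Model bitcast <4 x i64> to <8 x i32> on a little-endian target."""
    return list(struct.unpack("<8I", struct.pack("<4Q", *v)))

x = [0x1111111122222222, 0x3333333344444444,
     0x5555555566666666, 0x7777777788888888]

lanes = bitcast_i64x4_to_i32x8_le(x)
# Shuffle mask <0, 2, 4, 6>: pick the even i32 lanes.
shuffled = [lanes[i] for i in (0, 2, 4, 6)]
# trunc <4 x i64> to <4 x i32>: keep the low 32 bits of each element.
truncated = [e & 0xFFFFFFFF for e in x]

assert shuffled == truncated
```

On a big-endian layout the low halves would sit in the odd lanes instead, which is why the snippet marks the datalayout as little-endian.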
2017 Jan 13
2
IR canonicalization: shufflevector or vector trunc?
Right - I think that case looks like this for little endian:
define <2 x i32> @zextshuffle(<2 x i16> %x) {
%zext_shuffle = shufflevector <2 x i16> %x, <2 x i16> zeroinitializer, <4
x i32> <i32 0, i32 2, i32 1, i32 2>
%bc = bitcast <4 x i16> %zext_shuffle to <2 x i32>
ret <2 x i32> %bc
}
define <2 x i32> @zextvec(<2 x i16> %x) {
%zext = zext <2 x i16> %x to <2 x i3...
2008 Sep 30
4
[LLVMdev] Generalizing shuffle vector
Hi,
The current definition of shuffle vector is
<result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <n x
i32> <mask> ; yields <n x <ty>>
The first two operands of a 'shufflevector' instruction are vectors
with types that match each other and types that match the result of
the instructio...
2016 Aug 29
2
IR canonicalization: vector select or shufflevector?
...ces at lists.llvm.org] *On Behalf Of *Michael
> Kuperstein via llvm-dev
> *Sent:* 29 August 2016 19:28
> *To:* Philip Reames <listmail at philipreames.com>
> *Cc:* llvm-dev <llvm-dev at lists.llvm.org>
> *Subject:* Re: [llvm-dev] IR canonicalization: vector select or
> shufflevector?
>
>
>
> I have a slight preference towards shufflevector, because it makes
> sequences of shuffles, where only some of the shuffles can be converted
> into selects (because the input and output vector sizes of the others don't
> match) simpler to reason about.
>
>
> ...
2008 Sep 30
0
[LLVMdev] Generalizing shuffle vector
Hi,
I agree that the more general shufflevector is more useful. I
narrowed the original proposal a little bit because of the concern for
the implementation cost. However, the slightly narrowed definition
will probably require falling back to generating inserts and extracts
for complex masks so it is possible that there will be no extra...
2017 Mar 14
3
llvm-stress crash
...%A4 = alloca double
%A3 = alloca float
%A2 = alloca i8
%A1 = alloca double
%A = alloca i64
%L = load i8, i8* %0
store i8 33, i8* %0
%E = extractelement <8 x i1> zeroinitializer, i32 2
br label %CF261
CF261: ; preds = %BB
%Shuff = shufflevector <2 x i16> zeroinitializer, <2 x i16> zeroinitializer, <2 x i32> <i32 undef, i32 3>
%I = insertelement <8 x i8> zeroinitializer, i8 69, i32 3
%B = udiv i8 -99, 33
%Tr = trunc i64 -1 to i32
%Sl = select i1 true, i64* %2, i64* %2
%L5 = load i64, i64* %Sl
store...
2020 Jan 30
7
[RFC] Extending shufflevector for vscale vectors (SVE etc.)
...les are allowed; we're considering allowing more different kinds of shuffles. The issue is, essentially, that a shuffle mask is a simple list of integers, and that isn't enough to express a scalable operation. For example, concatenating two fixed-length vectors currently looks like this:
shufflevector <2 x i32> %v1, <2 x i32> %v2, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
(Note that despite the syntax, the mask is really just a list of constant integers, not a general constant expression.)
There isn't any obvious way to extend this to variable shuffles. The mask would...
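To illustrate the point about masks being plain integer lists, here is a small Python sketch. The names are hypothetical and this is a toy model, not LLVM's representation. The concatenation mask for two n-element vectors is 0..2n-1, so its length depends on n; when n is a multiple of a vscale that is unknown at compile time, no single fixed list can express it.

```python
def shufflevector(v1, v2, mask):
    """Toy model: indices 0..n-1 pick from v1, n..2n-1 pick from v2."""
    assert len(v1) == len(v2)
    both = list(v1) + list(v2)
    return [both[i] for i in mask]

def concat_mask(n):
    """Mask that concatenates two n-element vectors."""
    return list(range(2 * n))

# Fixed-length case from the snippet: two <2 x i32> inputs, mask <0, 1, 2, 3>.
assert concat_mask(2) == [0, 1, 2, 3]
assert shufflevector([1, 2], [3, 4], concat_mask(2)) == [1, 2, 3, 4]

# The same operation on 4-element inputs needs a different, longer mask.
assert concat_mask(4) == [0, 1, 2, 3, 4, 5, 6, 7]
```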
2016 Oct 10
2
[arm, aarch64] Alignment checking in interleaved access pass
...h
opens up new cases.
To give a bit more insight, here's a simple example of where the data is
still continuous: [0 .. 32) , but it needs to be split to use multiple
VSTns/STns. This is what Halide generates for aarch64:
%uglygep242243 = bitcast i8* %uglygep242 to <16 x i32>*
%114 = shufflevector <16 x i32> %112, <16 x i32> %113, <4 x i32> <i32 0,
i32 1, i32 2, i32 3>
%115 = shufflevector <16 x i32> %112, <16 x i32> %113, <4 x i32> <i32 8,
i32 9, i32 10, i32 11>
%116 = shufflevector <16 x i32> %112, <16 x i32> %113, <4 x i32...