Displaying 20 results from an estimated 30000 matches similar to: "[LLVMdev] Gather load in LLVM IR"
2014 Jan 21
2
[LLVMdev] Gather load in LLVM IR
Hi Evan, all,
The most obvious thing to me would be to extend the load instruction to
have an additional form that takes a vector of pointers instead of a
single pointer.
This form would return a vector of values instead of a single value.
If a gather instruction is not available on the target, then the load
could be lowered to a series of scalar loads and insertelements.
Thanks,
Nick
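
The fallback lowering described above (a series of scalar loads, each followed by an insert into the result vector) can be sketched in plain C. This is only an illustration under stated assumptions: the function name `gather4_scalarized`, the fixed width of 4 lanes, and the `int` element type are hypothetical and not taken from the thread or from LLVM.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 4 /* assumed vector width, chosen only for illustration */

/* Scalarized fallback for a 4-lane gather: one scalar load per lane,
 * followed by writing the loaded value into its lane of the result
 * (the role an insertelement would play in the IR). */
void gather4_scalarized(const int *ptrs[LANES], int result[LANES]) {
    assert(ptrs != NULL && result != NULL);
    for (size_t lane = 0; lane < LANES; ++lane) {
        int loaded = *ptrs[lane]; /* scalar load through this lane's pointer */
        result[lane] = loaded;    /* insert the value into the result lane */
    }
}
```

A target with a native gather instruction could replace the whole loop with a single instruction; the loop is what remains when no such instruction is available.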
On
2013 Sep 05
2
[LLVMdev] Optimisation pass to move an alloca'd array to a global constant array
Hi All,
I was wondering if there is an optimisation pass that moves a stack
allocated array, initialised with constant values, to a global constant
array.
And if there is such a pass, what requirements are there for it to operate?
My optimised IR is below. As you can see an array of 5 integers is
created with alloca, then each element is stored to in turn. It would
be nice if this array was
2016 Jan 18
3
error of using GATHER intrinsic
Hi all,
I am using gather intrinsic to load a value from the same address twice at
the same time. Basically, I used my own pass to change the following
bitcode:
%a = getelementptr inbounds [100 x double], [100 x double]* %A, i32 0, i64 0
%1 = load double, double* %a, align
to:
%a = getelementptr inbounds [100 x double], [100 x double]* %A, i32 0, i64 0
%splat.a = insertelement <2 x
2011 Nov 29
4
[LLVMdev] [llvm-commits] Vectors of Pointers and Vector-GEP
----- Original Message -----
> "Rotem, Nadav" <nadav.rotem at intel.com> writes:
>
> > David,
> >
> > Thanks for the support! I sent a detailed email with the overall
> > plan. But just to reiterate, the GEP would look like this:
> >
> > %PV = getelementptr <4 x i32*> %base, <4 x i32> <i32 1, i32 2, i32
> > 3, i32
2016 Jan 20
3
error of using GATHER intrinsic
> On Jan 20, 2016, at 12:59 PM, Tim Northover via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Hi Zhi,
>
> On 18 January 2016 at 11:28, zhi chen via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>> Any idea about this error? Or could anyone give me an example how to use the
>> gather intrinsic if there is something wrong with the way I am using it?
2016 Jan 23
2
how to force llvm generate gather intrinsic
Hi,
I used clang -O3 -c -emit-llvm on the following code to generate a bitcode,
say a.bc. I read the .ll file and didn't see any gather intrinsic. Also, I
used opt -O3 -mcpu=core-avx2/-mcpu=skx, but there is still no gather
intrinsic generated.
int foo(int A[800], int B[800], int C[800]) {
  for (int i = 0; i < 800; i++) {
    A[B[i]] = i + 5;
  }
  for (int i = 0; i < 800;
2011 Nov 30
2
[LLVMdev] [llvm-commits] Vectors of Pointers and Vector-GEP
Yes, indeed I can always fall back to intrinsics.
But still, I believe that the case I described is in its essence quite commonplace, so it should be a first-class citizen in the LLVM IR. AVX2 is the target ISA I'm thinking of too, BTW.
Let's forget 3D, and imagine something as trivial as a vectorized i32 => float table lookup. I'd expect that the IR would look something like:
2016 Jan 23
3
how to force llvm generate gather intrinsic
Thanks for your response, Sanjay. I know there are intrinsics available in
C/C++. But the problem is that I want to instrument my code at the IR level
and generate those instructions. I don't want to touch the source code.
Best,
Zhi
On Fri, Jan 22, 2016 at 4:54 PM, Sanjay Patel <spatel at rotateright.com>
wrote:
> I was just looking at the related masked load/store operations, and
2016 Feb 25
2
how to force llvm generate gather intrinsic
Yes, masked load/store/gather/scatter are completed.
- Elena
From: zhi chen [mailto:zchenhn at gmail.com]
Sent: Thursday, February 25, 2016 01:20
To: Demikhovsky, Elena <elena.demikhovsky at intel.com>
Cc: Sanjay Patel <spatel at rotateright.com>; Nema, Ashutosh <Ashutosh.Nema at amd.com>; llvm-dev <llvm-dev at lists.llvm.org>
Subject: Re: [llvm-dev] how to
2016 Feb 25
2
how to force llvm generate gather intrinsic
It seems that http://reviews.llvm.org/D15690 only implemented
gather/scatter for AVX-512, but not for AVX/AVX2. Is there any plan to
enable gather for AVX/2? Thanks.
Best,
Zhi
On Thu, Feb 25, 2016 at 8:28 AM, Sanjay Patel <spatel at rotateright.com>
wrote:
> I don't think gather has been enabled for AVX2 as of r261875.
> Masked load/store were enabled for AVX with:
>
2016 Feb 25
0
how to force llvm generate gather intrinsic
I don't think gather has been enabled for AVX2 as of r261875.
Masked load/store were enabled for AVX with:
http://reviews.llvm.org/D16528 / http://reviews.llvm.org/rL258675
On Wed, Feb 24, 2016 at 11:39 PM, Demikhovsky, Elena <
elena.demikhovsky at intel.com> wrote:
> Yes, masked load/store/gather/scatter are completed.
>
>
>
> - Elena
>
>
>
>
2017 May 05
2
load instruction to gather intrinsics
The frontend would generate the load in the IR. I am using IRBuilder to
generate gather. I know it is mainly for discontinuous memory locations.
It's a long story why I want to use this. I want to gather some memory
locations. Suppose there is an array A that I manually duplicated somewhere
with an offset x. Now we have two arrays, A and A', where A'[i] - A[i] =
offset. I want to
2011 Nov 29
0
[LLVMdev] [llvm-commits] Vectors of Pointers and Vector-GEP
Hi Jose,
The proposed IR change neither helps nor hinders the use case you mentioned. The case of a base + vector-index should be easily addressed by an intrinsic. The pointer-vector proposal is meant to support full scatter/gather instructions (such as the AVX2 gather instructions).
Nadav
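
The two addressing forms discussed here, a scalar base plus a vector of indices versus a full vector of pointers, can be contrasted with a small C sketch. It treats the base + index form as a special case of the pointer-vector form: per-lane addresses are computed first (roughly what a vector GEP would express) and the gather then loads through them. All names (`lane_addresses`, `gather_ptrs`, `gather_base_index`), the 4-lane width, and the `float`/`int32_t` types are assumptions made for illustration, not anything defined by the proposal.

```c
#include <stddef.h>
#include <stdint.h>

#define LANES 4 /* assumed vector width, chosen only for illustration */

/* Per-lane address computation: ptrs[i] = base + idx[i]. */
void lane_addresses(const float *base, const int32_t idx[LANES],
                    const float *ptrs[LANES]) {
    for (size_t lane = 0; lane < LANES; ++lane) {
        ptrs[lane] = base + idx[lane];
    }
}

/* General form: gather through an arbitrary vector of pointers. */
void gather_ptrs(const float *ptrs[LANES], float out[LANES]) {
    for (size_t lane = 0; lane < LANES; ++lane) {
        out[lane] = *ptrs[lane];
    }
}

/* Special case: one uniform base pointer plus a vector of 32-bit indices,
 * expressed in terms of the general pointer-vector gather. */
void gather_base_index(const float *base, const int32_t idx[LANES],
                       float out[LANES]) {
    const float *ptrs[LANES];
    lane_addresses(base, idx, ptrs);
    gather_ptrs(ptrs, out);
}
```

With a table of floats as `base`, `gather_base_index` performs the kind of vectorized i32-to-float table lookup mentioned elsewhere in the thread, one lane at a time.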
-----Original Message-----
From: Jose Fonseca [mailto:jfonseca at vmware.com]
Sent: Tuesday, November
2016 Feb 26
0
how to force llvm generate gather intrinsic
No. The gather operation is slow on AVX2 processors.
- Elena
From: zhi chen [mailto:zchenhn at gmail.com]
Sent: Thursday, February 25, 2016 20:48
To: Sanjay Patel <spatel at rotateright.com>
Cc: Demikhovsky, Elena <elena.demikhovsky at intel.com>; Nema, Ashutosh <Ashutosh.Nema at amd.com>; llvm-dev <llvm-dev at lists.llvm.org>
Subject: Re: [llvm-dev] how to force
2016 Jan 23
2
how to force llvm generate gather intrinsic
Thanks, Sanjay, for highlighting this. A few days back I faced a similar problem
while generating a masked store in AVX1 mode; I found it is only supported under
AVX2, and otherwise we scalarize it.
> 1) I did not switch-on masked_load/store to AVX1, I can do this.
Yes, Elena, this should be supported for FP types in AVX1 mode (for INT types, I doubt x86 has masked_load/store instructions in AVX1 mode).
2016 Jan 23
2
how to force llvm generate gather intrinsic
> Can we legalize the same set of masked load/store operations for AVX1 as AVX2?
Yes, of course.
- Elena
From: Sanjay Patel [mailto:spatel at rotateright.com]
Sent: Saturday, January 23, 2016 18:42
To: Nema, Ashutosh <Ashutosh.Nema at amd.com>
Cc: Demikhovsky, Elena <elena.demikhovsky at intel.com>; zhi chen <zchenhn at gmail.com>; llvm-dev <llvm-dev at
2016 Feb 26
2
how to force llvm generate gather intrinsic
If I'm understanding correctly, you're saying that vgather* is slow on all
of Excavator, Haswell, Broadwell, and Skylake (client). Therefore, we will
not generate it for any of those machines.
Even if that's true, we should not define "gatherIsSlow()" as "hasAVX2() &&
!hasAVX512()". It could break for some hypothetical future processor that
manages to
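
The design point in this message, that "gather is slow" should not be derived from which instruction sets a CPU has, can be sketched as follows. This is a hypothetical C model, not LLVM's actual subtarget code: the struct `cpu_features`, its fields, and both function names are invented for illustration; only the names gatherIsSlow(), hasAVX2() and hasAVX512() come from the message itself.

```c
#include <stdbool.h>

/* Hypothetical feature record for one CPU model. */
struct cpu_features {
    bool has_avx2;
    bool has_avx512;
    bool slow_gather; /* explicit tuning flag, set per CPU model */
};

/* The definition the message argues against: slowness inferred from ISA bits. */
bool gather_is_slow_derived(const struct cpu_features *f) {
    return f->has_avx2 && !f->has_avx512;
}

/* Alternative: consult a flag that is set only for CPUs known to be slow. */
bool gather_is_slow_explicit(const struct cpu_features *f) {
    return f->slow_gather;
}
```

For a hypothetical future processor with AVX2, no AVX-512, and a fast gather, the derived predicate would still report "slow", while the explicit flag would not.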
2016 Feb 24
0
how to force llvm generate gather intrinsic
Hi Elena,
Are the masked_load and gather working now?
Best,
Zhi
On Sat, Jan 23, 2016 at 12:06 PM, Demikhovsky, Elena <
elena.demikhovsky at intel.com> wrote:
> > Can we legalize the same set of masked load/store operations for AVX1
> > as AVX2?
>
> Yes, of course.
>
>
>
> - Elena
>
>
>
> *From:* Sanjay Patel [mailto:spatel at
2016 Feb 26
0
how to force llvm generate gather intrinsic
That makes great sense. It would be great to have a profitability mode that
decides whether gathers are worth using. It would also be good to have a
compiler option that lets users force LLVM to generate gather instructions
regardless of whether they are faster or slower.
Best,
Zhi
On Fri, Feb 26, 2016 at 12:49 PM, Sanjay Patel <spatel at rotateright.com>
wrote:
> If I'm understanding
2011 Dec 10
2
[LLVMdev] [cfe-dev] GEP index type
Eli,
I understand the need to widen unsigned types. However, I ran into a problem with the current GEP/subscript that clang has.
AVX2 gather instructions rely on a 64-bit base pointer and a vector of 32-bit indices. Usually, when vectorizing programs, it is possible to detect that the GEP base pointer is uniform and that the index is variant (and needs to be vectorized). This works really nice