Displaying 20 results from an estimated 4536 matches for "gathered".
2017 May 05
2
load instruction to gather intrinsics
The frontend would generate the load in the IR. I am using IRBuilder to
generate the gather. I know it is mainly for non-contiguous memory
locations; it's a long story why I want to use it here. I want to gather
some memory locations. Suppose there is an array A that I manually
duplicated somewhere with an offset x. Now we have two arrays A and A',
where A'[i] - A[i] = offset. I want to
2016 Jan 23
3
how to force llvm generate gather intrinsic
Thanks for your response, Sanjay. I know there are intrinsics available in
C/C++. But the problem is that I want to instrument my code at the IR level
and generate those instructions. I don't want to touch the source code.
Best,
Zhi
On Fri, Jan 22, 2016 at 4:54 PM, Sanjay Patel <spatel at rotateright.com>
wrote:
> I was just looking at the related masked load/store operations, and
2017 May 05
2
load instruction to gather intrinsics
Hi All,
Can I change a vector load to a gather intrinsic? If so, how can I do it? For
example, I want to change the following IR code
%1 = load <2 x i64>* %arrayidx1, align 8
to
%1 = call <2 x i64> @llvm.masked.gather.v2i64(<2 x i64*> %arrayidx1,
i32 8, <2 x i1> <i1 true, i1 true>, <2 x i64> undef)
Basically, I am not sure how to get two consecutive
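One way to obtain the two consecutive element addresses is to index the load's
base pointer with a constant vector, since a GEP with a vector index produces a
vector of pointers. The sketch below uses the C++ IRBuilder API of a recent
LLVM (where CreateMaskedGather takes the result type explicitly); the helper
name loadToGather and the surrounding setup are illustrative, not code from
this thread.

#include "llvm/IR/Constants.h"
#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Instructions.h"
using namespace llvm;

// Rewrites "load <2 x i64>, ptr %p" as llvm.masked.gather. Sketch only.
static Value *loadToGather(LoadInst *LI) {
  IRBuilder<> Builder(LI);                              // insert before the load
  auto *VecTy = cast<FixedVectorType>(LI->getType());   // <2 x i64>
  Type *EltTy = VecTy->getElementType();                // i64

  // Build a <2 x ptr> holding the addresses of elements 0 and 1: a GEP with a
  // vector index splats the scalar base pointer across the lanes. (On older,
  // typed-pointer releases the base would first be bitcast to i64*.)
  uint64_t Lanes[] = {0, 1};
  Value *Idx = ConstantDataVector::get(Builder.getContext(), Lanes);
  Value *Ptrs = Builder.CreateGEP(EltTy, LI->getPointerOperand(), Idx);

  // All-true mask, undef pass-through, alignment taken from the original load.
  Value *Mask = Constant::getAllOnesValue(
      FixedVectorType::get(Builder.getInt1Ty(), 2));
  Value *Gather = Builder.CreateMaskedGather(VecTy, Ptrs, LI->getAlign(), Mask,
                                             UndefValue::get(VecTy));
  LI->replaceAllUsesWith(Gather);
  LI->eraseFromParent();
  return Gather;
}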
2016 Jan 23
2
how to force llvm generate gather intrinsic
Hi,
I used clang -O3 -c -emit-llvm on the following code to generate a bitcode,
say a.bc. I read the .ll file and didn't see any gather intrinsic. I also
used opt -O3 -mcpu=core-avx2/-mcpu=skx, but there is still no gather
intrinsic generated.
int foo(int A[800], int B[800], int C[800]) {
for (int i = 0; i < 800; i++) {
A[B[i]] = i + 5;
}
for (int i = 0; i < 800;
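Note that the first loop above stores through A[B[i]], which is a scatter
pattern; a gather corresponds to an indexed load. For comparison, a loop of the
following shape is the kind the loop vectorizer can map to llvm.masked.gather
(a sketch, not code from the thread):

// Indexed-load (gather) pattern: each iteration reads A[B[i]], a potentially
// non-contiguous address. Compile with, for example,
//   clang -O3 -mcpu=skx -S -emit-llvm gather_pattern.c
// and search the output for @llvm.masked.gather.*; whether it appears still
// depends on the vectorizer's cost model and on what it can prove about
// aliasing.
void gather_pattern(const int *A, const int *B, int *C, int n) {
  for (int i = 0; i < n; i++)
    C[i] = A[B[i]];
}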
2016 Dec 09
5
TableGen - Help to implement a form of gather/scatter operations for Mips MSA
Hello.
I read on page 4 of http://www.cs.fsu.edu/~whalley/cda5155/chap4.pdf that gather and
scatter operations exist for Mips, named LVI and SVI, respectively.
Has anyone thought of implementing gather and scatter operations in the LLVM
Mips back end (as part of the MSA vector instructions)?
If so, can you share with me the TableGen spec? (I tried to start from LD_DESC_BASE,
but it
2016 Jan 20
3
error of using GATHER intrinsic
> On Jan 20, 2016, at 12:59 PM, Tim Northover via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Hi Zhi,
>
> On 18 January 2016 at 11:28, zhi chen via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>> Any idea about this error? Or could anyone give me an example how to use the
>> gather intrinsic if there is something wrong with the way I am using it?
2016 Feb 26
2
how to force llvm generate gather intrinsic
If I'm understanding correctly, you're saying that vgather* is slow on all
of Excavator, Haswell, Broadwell, and Skylake (client). Therefore, we will
not generate it for any of those machines.
Even if that's true, we should not define "gatherIsSlow()" as "hasAVX2() &&
!hasAVX512()". It could break for some hypothetical future processor that
manages to
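A hypothetical sketch of the alternative being argued for: record gather speed
as an explicit per-CPU attribute instead of deriving it from the ISA level. The
structure and names below are invented for illustration and are not the actual
X86 subtarget code.

// Gate gather generation on an explicit "fast gather" attribute instead of
// inferring it from ISA feature bits. All names are invented for illustration.
struct SubtargetInfo {
  bool HasAVX2 = false;
  bool HasAVX512 = false;
  bool HasFastGather = false;   // set per CPU model, not derived from ISA bits
};

// Fragile: misclassifies a future AVX2-only CPU that happens to have a fast
// gather unit.
static bool gatherIsSlowByISA(const SubtargetInfo &ST) {
  return ST.HasAVX2 && !ST.HasAVX512;
}

// Preferred: each CPU model states the property directly.
static bool gatherIsSlow(const SubtargetInfo &ST) {
  return !ST.HasFastGather;
}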
2016 Feb 25
2
how to force llvm generate gather intrinsic
It seems that http://reviews.llvm.org/D15690 only implemented
gather/scatter for AVX-512, but not for AVX/AVX2. Is there any plan to
enable gather for AVX/AVX2? Thanks.
Best,
Zhi
On Thu, Feb 25, 2016 at 8:28 AM, Sanjay Patel <spatel at rotateright.com>
wrote:
> I don't think gather has been enabled for AVX2 as of r261875.
> Masked load/store were enabled for AVX with:
>
2016 Feb 26
0
how to force llvm generate gather intrinsic
That makes great sense. It would be great if we had a profitability model
to decide when gathers are worth using. It would also be good to have a
compiler option that lets users make LLVM generate the gather instructions
regardless of whether they are faster or slower.
Best,
Zhi
On Fri, Feb 26, 2016 at 12:49 PM, Sanjay Patel <spatel at rotateright.com>
wrote:
> If I'm understanding
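A hypothetical sketch of the user override suggested above; the flag name and
the hook it feeds are invented for illustration and are not an existing LLVM
option.

#include "llvm/Support/CommandLine.h"
using namespace llvm;

// Invented flag: let the user override the "gathers are slow" heuristic.
static cl::opt<bool> ForceGatherIntrinsics(
    "force-gather-intrinsics", cl::Hidden, cl::init(false),
    cl::desc("Emit llvm.masked.gather even if the target reports gathers as slow"));

// The cost-model hook (TargetSaysFast stands in for the real target query)
// would consult the flag before rejecting gathers.
static bool shouldUseGather(bool TargetSaysFast) {
  return TargetSaysFast || ForceGatherIntrinsics;
}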
2016 Aug 29
2
GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics
Hello everyone,
I think I have found a GVN / alias analysis related bug, but before
opening an issue on the tracker I wanted to see if I am missing something.
I have the following testcase:
define spir_kernel void @test(<2 x i32*> %in1, <2 x i32*> %in2, i32* %out) {
> entry:
> ; Just some temporary storage
> %tmp.0 = alloca i32
> %tmp.1 = alloca i32
> %tmp.i =
2016 Feb 26
0
how to force llvm generate gather intrinsic
No. The gather operation is slow on AVX2 processors.
- Elena
From: zhi chen [mailto:zchenhn at gmail.com]
Sent: Thursday, February 25, 2016 20:48
To: Sanjay Patel <spatel at rotateright.com>
Cc: Demikhovsky, Elena <elena.demikhovsky at intel.com>; Nema, Ashutosh <Ashutosh.Nema at amd.com>; llvm-dev <llvm-dev at lists.llvm.org>
Subject: Re: [llvm-dev] how to force
2019 Jul 12
4
Unexpected behaviour when comparing (==) long quoted expressions
Hi everyone:
I'm one of the interns at RStudio this summer working on a project that
helps teachers grade student code. I found an unexpected behaviour with
the == operator when comparing quote()d expressions.
Example 1:
u <- quote(tidyr::gather(key = key, value = value,
    new_sp_m014:newrel_f65, na.rm = TRUE))
s <- quote(tidyr::gather(key = key, value = value,
2014 Jan 21
2
[LLVMdev] Gather load in LLVM IR
Hi Evan, all,
The most obvious thing to me would be to extend the load instruction to
have an additional form that takes a vector of pointers instead of a
single pointer.
This form would return a vector of values instead of a single value.
If a gather instruction is not available on the target, then the load
could be lowered to a series of scalar loads and insertelements.
Thanks,
Nick
On
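A sketch of the scalarized lowering described above, written against the
current C++ IRBuilder for concreteness; the helper name and parameters are
illustrative, not code from this thread.

#include "llvm/IR/Constants.h"
#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/IRBuilder.h"
using namespace llvm;

// Expand a gather (Ptrs of type <N x ptr>) into N scalar loads plus
// insertelement, for targets without a native gather instruction.
// Sketch only; per-lane alignment and masking are omitted.
static Value *scalarizeGather(IRBuilder<> &Builder, Value *Ptrs, Type *EltTy) {
  unsigned N = cast<FixedVectorType>(Ptrs->getType())->getNumElements();
  Value *Result = UndefValue::get(FixedVectorType::get(EltTy, N));
  for (unsigned i = 0; i != N; ++i) {
    Value *Ptr = Builder.CreateExtractElement(Ptrs, Builder.getInt32(i));
    Value *Elt = Builder.CreateLoad(EltTy, Ptr);   // one scalar load per lane
    Result = Builder.CreateInsertElement(Result, Elt, Builder.getInt32(i));
  }
  return Result;
}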
2016 Feb 25
2
how to force llvm generate gather intrinsic
Yes, masked load/store/gather/scatter are completed.
- Elena
From: zhi chen [mailto:zchenhn at gmail.com]
Sent: Thursday, February 25, 2016 01:20
To: Demikhovsky, Elena <elena.demikhovsky at intel.com>
Cc: Sanjay Patel <spatel at rotateright.com>; Nema, Ashutosh <Ashutosh.Nema at amd.com>; llvm-dev <llvm-dev at lists.llvm.org>
Subject: Re: [llvm-dev] how to
2016 Feb 25
0
how to force llvm generate gather intrinsic
I don't think gather has been enabled for AVX2 as of r261875.
Masked load/store were enabled for AVX with:
http://reviews.llvm.org/D16528 / http://reviews.llvm.org/rL258675
On Wed, Feb 24, 2016 at 11:39 PM, Demikhovsky, Elena <
elena.demikhovsky at intel.com> wrote:
> Yes, masked load/store/gather/scatter are completed.
>
> - Elena
2011 Jun 13
2
[LLVMdev] Haswell New Instructions
On Jun 13, 2011, at 4:41 AM, Nicolas Capens wrote:
>
> So I was wondering whether in LLVM a gather operation is best represented with a 'load' instruction taking vector operands, or whether it's better to define it as a separate 'gather' instruction. What would be the pros and cons of each approach, and what do you think should be the long-term goals for the LLVM
2020 Aug 18
3
[PATCH V2 1/2] Add new flush_iotlb_range and handle freelists when using iommu_unmap_fast
Add a flush_iotlb_range to allow flushing of an iova range instead of a
full flush in the dma-iommu path.
Allow the iommu_unmap_fast to return newly freed page table pages and
pass the freelist to queue_iova in the dma-iommu ops path.
This patch is useful for iommu drivers (in this case the intel iommu
driver) which need to wait for the ioTLB to be flushed before newly
free/unmapped page table
2016 Dec 12
0
TableGen - Help to implement a form of gather/scatter operations for Mips MSA
Hi Alex,
> On 9 Dec 2016, at 01:52, Alex Susu via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>
> Hello.
> I read on page 4 of
2016 Aug 29
2
GVN / Alias Analysis issue with llvm.masked.scatter/gather intrinsics
This is definitely a bug in AA.
225 for (auto I = CS2.arg_begin(), E = CS2.arg_end(); I != E; ++I) {
226 const Value *Arg = *I;
227 if (!Arg->getType()->isPointerTy())
-> 228 continue;
229 unsigned CS2ArgIdx = std::distance(CS2.arg_begin(), I);
230 auto CS2ArgLoc = MemoryLocation::getForArgument(CS2,
CS2ArgIdx, TLI);
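The continue marked with the arrow above is taken whenever an argument is not a
plain pointer, and a <2 x i32*> gather/scatter operand is a vector of pointers,
so its memory is never considered. A minimal sketch of the kind of type check
that would also cover pointer vectors (an illustration of the problem, not
necessarily the fix that was committed to LLVM):

#include "llvm/IR/Type.h"
using namespace llvm;

// isPointerTy() is false for <2 x i32*>, so the argument loop above skips the
// gather/scatter address operand; isPtrOrPtrVectorTy() also accepts vectors
// of pointers.
static bool mayReferenceMemory(const Type *Ty) {
  return Ty->isPtrOrPtrVectorTy();
}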