similar to: load instruction to gather intrinsics

Displaying 20 results from an estimated 4000 matches similar to: "load instruction to gather intrinsics"

2017 May 05
2
load instruction to gather intrinsics
The frontend would generate the load in the IR. I am using IRBuilder to generate gather. I know it is mainly for discontinuous memory locations. It's a long story why I want to use this. I want to gather some memory locations. Suppose there are an array A, I manually duplicated it somewhere with an offset x. Now, we have two arrays A and A', where A'[i] - A[i] = offset. I want to
2015 Apr 16
3
[LLVMdev] double* to <2 x double>*
Does anyone know how to instrument *double* to <2 x doulbe>**, e.g., 2.2 --> <2.2, 2.2>? For example, I want to change the following IR code %arrayidx1 = getelementptr inbounds [100 x double]* @main.B, i32 0, i32 %i.021 %1 = load double* %arrayidx1, align 4, !tbaa !0 to: %arrayidx1 = getelementptr inbounds [100 x double]* @main.B, i32 0, i32 %i.021 %1 = bitcast double* %arrayidx1
2014 Oct 17
2
[LLVMdev] opt -O2 leads to incorrect operation (possibly a bug in the DSE)
Hi all, Consider the following example: define void @fn(i8* %buf) #0 { entry: %arrayidx = getelementptr i8* %buf, i64 18 tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* %arrayidx, i8* %buf, i64 18, i32 1, i1 false) %arrayidx1 = getelementptr i8* %buf, i64 18 store i8 1, i8* %arrayidx1, align 1 tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* %buf, i8* %arrayidx, i64 18, i32 1, i1 false)
2016 Mar 16
3
RFC: A change in InstCombine canonical form
=== PROBLEM === (See this bug https://llvm.org/bugs/show_bug.cgi?id=26445) IR contains code for loading a float from float * and storing it to a float * address. After canonicalization of load in InstCombine [1], new bitcasts are added to the IR (see bottom of the email for code samples). This prevents select speculation in SROA to work. Also after SROA we have bitcasts from int32 to float.
2019 Aug 08
2
Suboptimal code generated by clang+llc in quite a common scenario (?)
I found a something that I quite not understand when compiling a common piece of code using the -Os flags. I found it while testing my own backend but then I got deeper and found that at least the x86 is affected as well. This is the referred code: char pp[3]; char *scscx = pp; int tst( char i, char j, char k ) { scscx[0] = i; scscx[1] = j; scscx[2] = k; return 0; } The above gets
2016 Jan 20
2
error of using GATHER intrinsic
Got it. Thanks. I will try it with the trunk version. On Wed, Jan 20, 2016 at 1:36 PM, Tim Northover <t.p.northover at gmail.com> wrote: > Hi Zhi, > On 20 January 2016 at 13:33, zhi chen <zchenhn at gmail.com> wrote: > > Thanks for your response. The attached is the .bc file after my pass. I > > could generate the assembly with -mcpu=skx but not with
2016 Jan 18
3
error of using GATHER intrinsic
Hi all, I am using gather intrinsic to load a value from the same address twice at the same time. Basically, I used my own pass to changed the following bitcode: %a = getelementptr inbounds [100 x double], [100 x double]* %A, i32, 0, i64 0 %1 = load double, double* a, align to: %a = getelementptr inbounds [100 x double], [100 x double]* %A, i32, 0, i64 0 %splat.a = insertelement <2 x
2016 Mar 16
2
RFC: A change in InstCombine canonical form
On Wed, Mar 16, 2016 at 8:34 AM, Mehdi Amini via llvm-dev < llvm-dev at lists.llvm.org> wrote: > Hi, > > How do it interact with the "typeless pointers" work? > Right - the goal of the typeless pointer work is to fix all these bugs related to "didn't look through bitcasts" in optimizations. Sometimes that's going to mean more work (because the code
2013 Nov 11
2
[LLVMdev] What's the Alias Analysis does clang use ?
Hi, LLVM community: I found basicaa seems not to tell must-not-alias for __restrict__ arguments in c/c++. It only compares two pointers and the underlying objects they point to. I wonder how clang does alias analysis for c/c++ keyword restrict. let assume we compile the following code: $cat myalias.cc float foo(float * __restrict__ v0, float * __restrict__ v1, float * __restrict__ v2, float *
2017 Sep 13
2
RFC phantom memory intrinsic
Hi Michael, >Interesting approach but how do you handle more complex offsets, e.g., when the pointer is part of an aggregate? Only one offset does not seem enough to handle generic cases. Yes, correct, this a little bit changed example is not working. #include <x86intrin.h> __m256d vsht_d4_fold(const double* ptr, unsigned long long i) { __m256d foo = (__m256d){ ptr[i], ptr[i+1],
2016 Jan 20
2
error of using GATHER intrinsic
Hi Tim, Thanks for your response. The attached is the .bc file after my pass. I could generate the assembly with -mcpu=skx but not with -mcpu=core-avx2. Could you please take a look? BTW, I am using LLVM-3.7. Best, Zhi On Wed, Jan 20, 2016 at 1:21 PM, Tim Northover <t.p.northover at gmail.com> wrote: > > Only typo that caught my eye is ‘llvm.masked.gather.v8f64’ which should >
2013 Nov 12
0
[LLVMdev] What's the Alias Analysis does clang use ?
Hi, Your problem is that the function arguments, which are makes as noalias, are not being directly used as the base objects of the array accesses: > %v0.addr = alloca float*, align 8 > %v1.addr = alloca float*, align 8 > %v2.addr = alloca float*, align 8 > %t.addr = alloca float*, align 8 ... > store float* %v0, float** %v0.addr, align 8 > store float* %v1, float** %v1.addr,
2016 Jan 23
3
how to force llvm generate gather intrinsic
Thanks for your response, Sanjay. I know there are intrinsics available in C/C++. But the problem is that I want to instrument my code at the IR level and generate those instructions. I don't want to touch the source code. Best, Zhi On Fri, Jan 22, 2016 at 4:54 PM, Sanjay Patel <spatel at rotateright.com> wrote: > I was just looking at the related masked load/store operations, and
2016 Jan 23
2
how to force llvm generate gather intrinsic
Hi, I used clang -O3 -c -emit-llvm on the follow code to generate a bitcode, say a.bc. I read the .ll file and didn't see any gather intrinsic. Also, I used opt -O3 -mcpu=core-avx2/-mcpu=skx, but there is still no gather intrinsic generated. int foo(int A[800], int B[800], int C[800]) { for (int i = 0; i < 800; i++) { A[B[i]] = i + 5; } for (int i = 0; i < 800;
2017 Sep 13
2
RFC phantom memory intrinsic
Hi Michael, >I have a case where InstCombine removes a store and your approach would be >valuable for me if the entire access to an aggregate could be restored. Yes, no problem and we could add the aggregate pointer to this new intrinsic and in my particular case I should ignore it, but I am looking now at "speculation_marker" metadata and I am still not sure how to implement it
2016 Jan 20
3
error of using GATHER intrinsic
> On Jan 20, 2016, at 12:59 PM, Tim Northover via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > Hi Zhi, > > On 18 January 2016 at 11:28, zhi chen via llvm-dev > <llvm-dev at lists.llvm.org> wrote: >> Any idea about this error? Or could anyone give me an example how to use the >> gather intrinsic if there is something wrong with the way I am using it?
2017 Sep 12
3
RFC phantom memory intrinsic
Hi, For PR21780 solution, I plan to add a new functionality to restore memory operations that was once deleted, in this particular case it is the load operations that were deleted by InstCombine, please note that once the load was removed there is no way to restore it back and that prevents us from vectorizing the shuffle operation. There are probably more similar issues where this approach could
2016 Feb 25
2
how to force llvm generate gather intrinsic
It seems that http://reviews.llvm.org/D15690 only implemented gather/scatter for AVX-512, but not for AVX/AVX2. Is there any plan to enable gather for AVX/2? Thanks. Best, Zhi On Thu, Feb 25, 2016 at 8:28 AM, Sanjay Patel <spatel at rotateright.com> wrote: > I don't think gather has been enabled for AVX2 as of r261875. > Masked load/store were enabled for AVX with: >
2016 Feb 25
2
how to force llvm generate gather intrinsic
Yes, masked load/store/gather/scatter are completed. - Elena From: zhi chen [mailto:zchenhn at gmail.com] Sent: Thursday, February 25, 2016 01:20 To: Demikhovsky, Elena <elena.demikhovsky at intel.com> Cc: Sanjay Patel <spatel at rotateright.com>; Nema, Ashutosh <Ashutosh.Nema at amd.com>; llvm-dev <llvm-dev at lists.llvm.org> Subject: Re: [llvm-dev] how to
2017 Sep 26
0
RFC phantom memory intrinsic
On 09/13/2017 04:46 PM, Dinar Temirbulatov via llvm-dev wrote: > Hi Michael, >> I have a case where InstCombine removes a store and your approach would be >> valuable for me if the entire access to an aggregate could be restored. > Yes, no problem and we could add the aggregate pointer to this new > intrinsic and in my particular case I should ignore it, but I am > looking