thr3ads.net - similar to: "error of using GATHER intrinsic"

Displaying 20 results from an estimated 900 matches similar to: "error of using GATHER intrinsic"

2016 Jan 20

error of using GATHER intrinsic

> On Jan 20, 2016, at 12:59 PM, Tim Northover via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Hi Zhi,
>
> On 18 January 2016 at 11:28, zhi chen via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> Any idea about this error? Or could anyone give me an example how to use the
>> gather intrinsic if there is something wrong with the way I am using it?

2016 Jan 20

error of using GATHER intrinsic

Hi Tim, Thanks for your response. The attached is the .bc file after my pass. I could generate the assembly with -mcpu=skx but not with -mcpu=core-avx2. Could you please take a look? BTW, I am using LLVM-3.7. Best, Zhi
On Wed, Jan 20, 2016 at 1:21 PM, Tim Northover <t.p.northover at gmail.com> wrote:
> Only typo that caught my eye is ‘llvm.masked.gather.v8f64’ which should

2016 Jan 20

error of using GATHER intrinsic

Got it. Thanks. I will try it with the trunk version.
On Wed, Jan 20, 2016 at 1:36 PM, Tim Northover <t.p.northover at gmail.com> wrote:
> Hi Zhi,
> On 20 January 2016 at 13:33, zhi chen <zchenhn at gmail.com> wrote:
> > Thanks for your response. The attached is the .bc file after my pass. I
> > could generate the assembly with -mcpu=skx but not with

2016 Jan 23

how to force llvm generate gather intrinsic

Thanks for your response, Sanjay. I know there are intrinsics available in C/C++. But the problem is that I want to instrument my code at the IR level and generate those instructions. I don't want to touch the source code. Best, Zhi
On Fri, Jan 22, 2016 at 4:54 PM, Sanjay Patel <spatel at rotateright.com> wrote:
> I was just looking at the related masked load/store operations, and

2016 Jan 23

how to force llvm generate gather intrinsic

Hi, I used clang -O3 -c -emit-llvm on the following code to generate a bitcode file, say a.bc. I read the .ll file and didn't see any gather intrinsic. I also used opt -O3 with -mcpu=core-avx2 and with -mcpu=skx, but there is still no gather intrinsic generated.
int foo(int A[800], int B[800], int C[800]) {
  for (int i = 0; i < 800; i++) {
    A[B[i]] = i + 5;
  }
  for (int i = 0; i < 800;
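
For context, the first loop in the snippet stores through an index array, which is a scatter pattern; a gather is the corresponding indexed load. The sketch below contrasts the two access patterns in plain C. The helper names `scatter_store` and `gather_load` and the bounds parameters are made up for illustration and do not come from the thread. Whether a given LLVM version turns either loop into `llvm.masked.scatter` or `llvm.masked.gather` depends on the target and the vectorizer's cost model, so this only shows the shapes involved.

```c
#include <assert.h>
#include <stddef.h>

/* Scatter: indexed store, the same shape as A[B[i]] = i + 5 in the post.
   Hypothetical helper; every idx[i] must be a valid index into dst. */
void scatter_store(int *dst, size_t dst_len, const int *idx, size_t n) {
    for (size_t i = 0; i < n; i++) {
        assert(idx[i] >= 0 && (size_t)idx[i] < dst_len);
        dst[idx[i]] = (int)i + 5;
    }
}

/* Gather: indexed load, out[i] = src[idx[i]].
   Hypothetical helper; every idx[i] must be a valid index into src. */
void gather_load(int *out, const int *src, size_t src_len,
                 const int *idx, size_t n) {
    for (size_t i = 0; i < n; i++) {
        assert(idx[i] >= 0 && (size_t)idx[i] < src_len);
        out[i] = src[idx[i]];
    }
}
```

If the goal is to see a gather rather than a scatter, a loop of the second form is the one to inspect in the emitted IR.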

2016 Feb 25

how to force llvm generate gather intrinsic

It seems that http://reviews.llvm.org/D15690 only implemented gather/scatter for AVX-512, but not for AVX/AVX2. Is there any plan to enable gather for AVX/AVX2? Thanks. Best, Zhi
On Thu, Feb 25, 2016 at 8:28 AM, Sanjay Patel <spatel at rotateright.com> wrote:
> I don't think gather has been enabled for AVX2 as of r261875.
> Masked load/store were enabled for AVX with:

2016 Feb 25

how to force llvm generate gather intrinsic

Yes, masked load/store/gather/scatter are completed.
- Elena
From: zhi chen [mailto:zchenhn at gmail.com]
Sent: Thursday, February 25, 2016 01:20
To: Demikhovsky, Elena <elena.demikhovsky at intel.com>
Cc: Sanjay Patel <spatel at rotateright.com>; Nema, Ashutosh <Ashutosh.Nema at amd.com>; llvm-dev <llvm-dev at lists.llvm.org>
Subject: Re: [llvm-dev] how to

2016 Feb 26

how to force llvm generate gather intrinsic

If I'm understanding correctly, you're saying that vgather* is slow on all of Excavator, Haswell, Broadwell, and Skylake (client). Therefore, we will not generate it for any of those machines. Even if that's true, we should not define "gatherIsSlow()" as "hasAVX2() && !hasAVX512()". It could break for some hypothetical future processor that manages to
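
The snippet above is cut off, so the following is only one possible reading of the concern: deriving "gather is slow" from the ISA level ties a performance property to a feature check, whereas an explicit per-CPU tuning flag keeps the two separate. The sketch below models both in plain C. The struct, its fields, and both function names are hypothetical and are not LLVM's actual subtarget API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical CPU description; the field names are illustrative only. */
struct cpu_features {
    bool has_avx2;
    bool has_avx512;
    bool slow_gather; /* assumed per-CPU tuning flag */
};

/* The definition criticized in the thread: infer slowness from ISA level. */
bool gather_is_slow_by_isa(const struct cpu_features *cpu) {
    assert(cpu != NULL);
    return cpu->has_avx2 && !cpu->has_avx512;
}

/* Alternative sketch: consult an explicit tuning flag set per CPU model. */
bool gather_is_slow_by_flag(const struct cpu_features *cpu) {
    assert(cpu != NULL);
    return cpu->slow_gather;
}
```

With the flag-based version, a CPU that has AVX2 but no AVX-512 can still be described as having a usable gather, which the ISA-based predicate cannot express.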

2016 Feb 25

how to force llvm generate gather intrinsic

I don't think gather has been enabled for AVX2 as of r261875. Masked load/store were enabled for AVX with: http://reviews.llvm.org/D16528 / http://reviews.llvm.org/rL258675
On Wed, Feb 24, 2016 at 11:39 PM, Demikhovsky, Elena <elena.demikhovsky at intel.com> wrote:
> Yes, masked load/store/gather/scatter are completed.
>
> - Elena

2016 Feb 26

how to force llvm generate gather intrinsic

That makes great sense. It would be great to have a profitability mode that decides whether gathers are worth using. It would also be good to have a compiler option that lets users make LLVM generate gather instructions regardless of whether they are faster or slower. Best, Zhi
On Fri, Feb 26, 2016 at 12:49 PM, Sanjay Patel <spatel at rotateright.com> wrote:
> If I'm understanding

2016 Feb 26

how to force llvm generate gather intrinsic

No. The gather operation is slow on AVX2 processors.
- Elena
From: zhi chen [mailto:zchenhn at gmail.com]
Sent: Thursday, February 25, 2016 20:48
To: Sanjay Patel <spatel at rotateright.com>
Cc: Demikhovsky, Elena <elena.demikhovsky at intel.com>; Nema, Ashutosh <Ashutosh.Nema at amd.com>; llvm-dev <llvm-dev at lists.llvm.org>
Subject: Re: [llvm-dev] how to force

2016 Feb 24

how to force llvm generate gather intrinsic

Hi Elena, Are the masked_load and gather working now? Best, Zhi
On Sat, Jan 23, 2016 at 12:06 PM, Demikhovsky, Elena <elena.demikhovsky at intel.com> wrote:
> > Can we legalize the same set of masked load/store operations for AVX1 as AVX2?
>
> Yes, of course.
>
> - Elena
>
> From: Sanjay Patel [mailto:spatel at

2017 May 05

load instruction to gather intrinsics

Hi All, Can I change a vector load to a gather intrinsic? If so, how can I do it? For example, I want to change the following IR code
%1 = load <2 x i64>* %arrayidx1, align 8
to
%1 = call <2 x i64> @llvm.masked.gather.v2i64(<2 x i64*> %arrayidx1, i32 8, <2 x i1> <i1 true, i1 true>, <2 x i64> undef)
Basically, I am not sure how to get two consecutive
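
To make the relationship between the two forms concrete, here is a scalar model in plain C of what a masked gather computes, plus a helper that builds the pointer vector for the contiguous case. It is a sketch of the semantics only, not of the IRBuilder calls, and the names `masked_gather_i64` and `consecutive_ptrs` are hypothetical. Note that the gather takes one pointer per lane, so replacing a two-lane vector load needs the pointers base + 0 and base + 1 rather than a single pointer to a vector. The IR in the post passes `undef` as the pass-through operand; the C model needs a concrete value there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of a masked gather: lane i loads *ptrs[i] when mask[i] is
   set, otherwise it takes the pass-through value passthru[i]. */
void masked_gather_i64(int64_t *out, const int64_t *const *ptrs,
                       const bool *mask, const int64_t *passthru,
                       size_t lanes) {
    for (size_t i = 0; i < lanes; i++) {
        if (mask[i]) {
            assert(ptrs[i] != NULL);
            out[i] = *ptrs[i];
        } else {
            out[i] = passthru[i];
        }
    }
}

/* Pointer vector for a contiguous access: base + 0, base + 1, ... */
void consecutive_ptrs(const int64_t **ptrs, const int64_t *base,
                      size_t lanes) {
    for (size_t i = 0; i < lanes; i++) {
        ptrs[i] = base + i;
    }
}
```

With an all-true mask and consecutive pointers, the result matches an ordinary two-element load from the base address.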

2016 Jan 23

how to force llvm generate gather intrinsic

> Can we legalize the same set of masked load/store operations for AVX1 as AVX2?
Yes, of course.
- Elena
From: Sanjay Patel [mailto:spatel at rotateright.com]
Sent: Saturday, January 23, 2016 18:42
To: Nema, Ashutosh <Ashutosh.Nema at amd.com>
Cc: Demikhovsky, Elena <elena.demikhovsky at intel.com>; zhi chen <zchenhn at gmail.com>; llvm-dev <llvm-dev at

2017 May 05

load instruction to gather intrinsics

The frontend would generate the load in the IR. I am using IRBuilder to generate the gather. I know it is mainly for non-contiguous memory locations. It's a long story why I want to use this. I want to gather some memory locations. Suppose there is an array A that I manually duplicated somewhere at an offset x. Now we have two arrays, A and A', where A'[i] - A[i] = offset. I want to

2016 Jan 23

how to force llvm generate gather intrinsic

Thanks, Sanjay, for highlighting this. A few days back I faced a similar problem while generating a masked store in AVX1 mode: I found it is only supported under AVX2; otherwise we scalarize it.
> 1) I did not switch-on masked_load/store to AVX1, I can do this.
Yes, Elena, this should be supported for FP types in AVX1 mode (for INT types, I doubt X86 has a masked_load/store instruction in AVX1 mode).
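
As an illustration of what "scalarize" means here, the sketch below writes a masked store as a per-lane loop in plain C: each lane is stored only if its mask bit is set. This models the semantics and is not the code LLVM emits; the function name `masked_store_f32` is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Scalarized masked store: only lanes whose mask bit is set are written;
   the other destination lanes keep their previous contents. */
void masked_store_f32(float *dst, const float *src, const bool *mask,
                      size_t lanes) {
    assert(dst != NULL && src != NULL && mask != NULL);
    for (size_t i = 0; i < lanes; i++) {
        if (mask[i]) {
            dst[i] = src[i];
        }
    }
}
```

A native masked-store instruction does the same thing for all lanes at once, which is why falling back to this per-lane form costs performance.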

2014 Sep 18

[LLVMdev] troubles with ISD::FPOWI

Hi, I'm stumped by how to handle fpowi. Here is the context: my architecture has i64, f32, and f64 registers. No i32. For calls & returns, we promote i32 to i64. There is no support in the architecture to perform fpowi - it has to go through the runtime. I'm using gfortran + dragonegg + llvm3.4 to generate .ll files via plugin. The fortran expression REAL = REAL ** INTEGER*4

2012 Jul 01

btrfs_print_tree?

Hi, Does anyone know where btrfs_print_tree is invoked? Thanks. -- Regards, Zhi Yong Wu

2015 Nov 19

Get timestamp and processor ID in the IR

Hi Hal, Thanks for the pointer. Is it possible to get the processor ID on the X86 architecture? There is a library call in Linux, sched_getcpu(), to get the ID. Also, is it possible to get the program counter in the IR? Best, Zhi
On Thu, Nov 19, 2015 at 1:33 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> Hi Zhi,
>
> There is no standard (architecture-independent) way to get the processor

2015 May 14

[LLVMdev] how to disable sse and avx

Thanks, Mats. Actually, it is able to generate the assembly now if I use the following command:
clang++ -O3 -S -mllvm --x86-asm-syntax=intel -mno-sse -o test_nosse.s test.cpp
However, when I use
g++ -O3 -o test_nosse test_nosse.s -lm
to generate the executable, it gives me the following errors:
Error: too many memory references for `sub'
Error: too many memory references for `mov'
Error:
