thr3ads.net - similar to: "InstCombine question on combineLoadToOperationType"

[LLVMdev] Legalizing v32i1, v64i1 for Haswell pext/pdep instructions

2014 May 18

I have a group of students working with me on some LLVM projects related to our Parabix research. One interesting issue that has come up for us is code generation support for the Haswell new instructions pext and pdep. These instructions shuffle bits within a 64-bit word, either gathering all selected bits to the beginning (pext) or scattering some initial bits throughout (pdep). A natural
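As a rough illustration of the gather/scatter behaviour described in this excerpt, here is a minimal software sketch of the two operations on 64-bit words. The function names `pext` and `pdep` mirror the instruction names, but the bit-by-bit loop is only an assumed reference model for reading along, not how LLVM or the hardware implements them.

```python
MASK64 = (1 << 64) - 1  # model a 64-bit word


def pext(value: int, mask: int) -> int:
    """Gather the bits of `value` selected by `mask` into the low-order bits."""
    result = 0
    out_pos = 0
    mask &= MASK64
    while mask:
        lowest = mask & -mask      # isolate the lowest set bit of the mask
        if value & lowest:
            result |= 1 << out_pos
        out_pos += 1
        mask &= mask - 1           # clear that bit and continue
    return result


def pdep(value: int, mask: int) -> int:
    """Scatter the low-order bits of `value` to the positions set in `mask`."""
    result = 0
    in_pos = 0
    mask &= MASK64
    while mask:
        lowest = mask & -mask      # next destination position
        if (value >> in_pos) & 1:
            result |= lowest
        in_pos += 1
        mask &= mask - 1
    return result


if __name__ == "__main__":
    print(bin(pext(0b10110010, 0b11110000)))
    print(bin(pdep(0b1011, 0b11110000)))
```

In this model the two operations are inverses on the masked bits: `pdep(pext(v, m), m)` equals `v & m`.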

Redundant ptrtoint/inttoptr instructions

2020 Jul 02

Hi all, We noticed a lot of unnecessary ptrtoint instructions that stand in the way of some of our optimizations; the code pattern looks like this: bb1: %int1 = ptrtoint %struct.s* %ptr1 to i64 bb2: %int2 = ptrtoint %struct.s* %ptr2 to i64 bb3: %phi.node = phi i64 [ %int1, %bb1 ], [%int2, %bb2 ] %ptr = inttoptr i64 %phi.node to %struct.s* In short, the pattern above arises due to: 1.

AVX512 instruction generated when JIT compiling for an avx2 architecture

2016 Jun 23

On 06/23/2016 12:56 PM, Craig Topper wrote: > Can you check what value "getHostCPUName" returned? getHostCPUName() = skylake > > On Thu, Jun 23, 2016 at 9:53 AM, Frank Winter via llvm-dev > <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: > > With LLVM 3.8 the JIT compiler engine generates an AVX512 > instruction although I

AVX512 instruction generated when JIT compiling for an avx2 architecture

2016 Jun 23

With LLVM 3.8 the JIT compiler engine generates an AVX512 instruction although I target an 'avx2' CPU (intel Core I7). I just downloaded the most recent 3.8 and still it happens. It happens with this input module: target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" define void @module_cFFEMJ(i64 %lo, i64 %hi, i64 %myId, i1 %ordered, i64 %start, i32* noalias align 32

Question about VectorLegalizer::ExpandStore() with v4i1

2016 Jun 28

On Tue, Jun 28, 2016 at 2:45 AM, jingu kang via llvm-dev <llvm-dev at lists.llvm.org> wrote: > Hi All, > > Can someone comment below question whether it is wrong or not please? > > 2016-06-25 7:52 GMT+01:00 jingu kang <jaykang10 at gmail.com>: >> Hi All, >> >> I have a problem with VectorLegalizer::ExpandStore() with v4i1. >> >> Let's

[X86][AVX512] RFC: make i1 illegal in the Codegen

2017 Jan 24

Hi All, AVX-512 introduced the K mask registers and masked operations which make a natural choice for legalizing vectors of i1's. For example, define <8 x i32> @foo(<8 x i32>%a, <8 x i32*> %p) { %r = call <8 x i32> @llvm.masked.gather.v8i32(<8 x i32*> %p, i32 4, <8 x i1> <i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true>,

[RFC] Adding ARC backend

2017 Sep 01

Hi Pete, Thanks for your kind response! I migrated AVR target for lld https://reviews.llvm.org/D32991 it is very beginning, only support R_AVR_CALL reloc, and ARC is more complex than AVR, I will learn it from binutils, also ARC related doc, then try to implement it. Sent from my iPhone ------------------ Original ------------------ From: Pete Couperus <Peter.J.Couperus at synopsys.com> Date:

[LLVMdev] vselect on ARM/NEON

2012 Oct 11

Hello, We've run into a couple of cases where we'd like to use select on vector types, but vselect handling is absent from the ARM backend. Would there be any potential harm by marking VSELECT as Expand on ARM targets with NEON? Adding this seems to fix the following PR's: http://llvm.org/bugs/show_bug.cgi?id=13831 http://llvm.org/bugs/show_bug.cgi?id=13961 Thanks! Pete

[LLVMdev] Unaligned vector memory access for ARM/NEON.

2012 Sep 05

Hello Jim, Thank you for the response. I may be confused about the alignment rules here. I had been looking at the ARM RVCT Assembler Guide, which seems to indicate vld1.16 operates on 16-bit aligned data, unless I am misinterpreting their table (Table 5-11 in ARM DUI 0204H, pg 5-70,5-71). Prior to the table, It does mention the accesses need to be "element" aligned, where I took

[LLVMdev] Unaligned vector memory access for ARM/NEON.

2012 Sep 06

Hello, Thanks again. We did try overestimating the alignment, and saw the vldr you reference here. It looks like a recent change (r161962?) did enable vld1 generation for this case (great!) on darwin, but not linux. I'm not sure if the effect of lowering load <4 x i16>* align 2 to vld1.16 this was intentional in this change or not. If so, my question is what is the preferable way to

[LLVMdev] Unaligned vector memory access for ARM/NEON.

2012 Sep 06

On Sep 6, 2012, at 2:48 PM, David Peixotto <dpeixott at codeaurora.org> wrote: > Hi Pete, > > We ran into the same issue with generating vector loads/stores for vectors > with less than word alignment. It seems we took a similar approach to > solving the problem by modifying the logic in allowsUnalignedMemoryAccesses. > > As you and Jim mentioned, it looks like the

[RFC] Adding ARC backend

2017 Sep 01

Hi Pete, > https://reviews.llvm.org/D36331 Congratulations! > Following shortly: > * Clang driver and target triple support. great, then it is able to generate ELF by $ /opt/llvm-svn/bin/clang -c --target=arc hello.c -o hello.o -mmcu=XXX and do you plan to implement ARC target for lld[1]? it is a good testcase: flash them directly to the chip[2], or simulator[3]. 1. ARC

[LLVMdev] Unaligned vector memory access for ARM/NEON.

2012 Sep 05

Hello all, I am a first time writer here, but am a happy LLVM tinkerer. It is a pleasure to use :). We have come across some sub-optimal behavior when LLVM lowers loads for vectors with small integers, i.e. load <4 x i16>* %a, align 2, using a sequence of scalar loads rather than a single vld1 on armv7 linux with NEON. Looking at the code in svn, it appears the ARM backend is capable of

[LLVMdev] Unaligned vector memory access for ARM/NEON.

2012 Sep 05

Hmmm. Well, it's entirely possible that it's LLVM that's confused about the alignment requirements here. :) I think I see, in general, where. I twiddled the IR to give it higher alignment (16 bytes) and get: extend: @ @extend @ BB#0: vldr d16, [r0] vmovl.s16 q8, d16 vstmia r1, {d16, d17} vldr d16, [r0, #8] add r0, r1, #16 vmovl.s16 q8, d16 vstmia

[LLVMdev] Unaligned vector memory access for ARM/NEON.

2012 Sep 06

Hi Pete, We ran into the same issue with generating vector loads/stores for vectors with less than word alignment. It seems we took a similar approach to solving the problem by modifying the logic in allowsUnalignedMemoryAccesses. As you and Jim mentioned, it looks like the vld1/vst1 instructions should support element aligned access for any armv7 implementation (I'm looking at Table A3-1

[LLVMdev] vselect on ARM/NEON

2012 Oct 11

Seems reasonable to me. Plain 'SELECT' is already marked expand for vector types. I bet that just didn't get updates when VSELECT was introduced. -Jim On Oct 11, 2012, at 10:25 AM, Peter Couperus <peter.couperus at st.com> wrote: > Hello, > > We've run into a couple of cases where we'd like to use select on vector types, but vselect handling is absent from

[LLVMdev] Unaligned vector memory access for ARM/NEON.

2012 Sep 07

On Sep 6, 2012, at 4:40 PM, David Peixotto <dpeixott at codeaurora.org> wrote: > -----Original Message----- > From: Bob Wilson [mailto:bob.wilson at apple.com] > Sent: Thursday, September 06, 2012 3:39 PM > To: David Peixotto > Cc: 'Peter Couperus'; 'Jim Grosbach'; 'Jakob Olesen'; llvmdev at cs.uiuc.edu > Subject: Re: [LLVMdev] Unaligned vector

[LLVMdev] vselect on ARM/NEON

2012 Oct 11

If you mark VSELECT as 'expand' then it will be expanded to a sequence of AND/OR/XOR, which is pretty efficient (found in LegalizeVectorOps.cpp ExpandVSELECT). On Oct 11, 2012, at 11:05 AM, Jim Grosbach <grosbach at apple.com> wrote: > Seems reasonable to me. Plain 'SELECT' is already marked expand for vector types. I bet that just didn't get updates when VSELECT
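The bitwise idea behind such an expansion can be sketched lane by lane. This is only an illustration of the select identity `(a AND m) OR (b AND (m XOR all-ones))`, assuming every mask lane is either all ones or all zeros; the function name `vselect_expanded` and the `lane_bits` parameter are hypothetical, and the code is not taken from LegalizeVectorOps.cpp.

```python
def vselect_expanded(mask, a, b, lane_bits=32):
    """Lane-wise select built only from AND, OR and XOR.

    Assumes each mask lane is all ones (pick from `a`) or all zeros
    (pick from `b`), as a vector comparison would produce.
    """
    all_ones = (1 << lane_bits) - 1
    out = []
    for m, x, y in zip(mask, a, b):
        not_m = m ^ all_ones               # NOT expressed as XOR with all-ones
        out.append((x & m) | (y & not_m))  # keep x where m is set, y elsewhere
    return out


if __name__ == "__main__":
    mask = [0xFFFFFFFF, 0, 0xFFFFFFFF, 0]
    print(vselect_expanded(mask, [1, 2, 3, 4], [10, 20, 30, 40]))
```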

[LLVMdev] Unaligned vector memory access for ARM/NEON.

2012 Sep 06

-----Original Message----- From: Bob Wilson [mailto:bob.wilson at apple.com] Sent: Thursday, September 06, 2012 3:39 PM To: David Peixotto Cc: 'Peter Couperus'; 'Jim Grosbach'; 'Jakob Olesen'; llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] Unaligned vector memory access for ARM/NEON. On Sep 6, 2012, at 2:48 PM, David Peixotto <dpeixott at codeaurora.org> wrote:

r267690 - [Clang][BuiltIn][AVX512]Adding intrinsics for vmovntdqa vmovntpd vmovntps instruction set

2016 May 15

Hi , In the future, we will address this issue. Regards Michael Zuckerman From: Eric Christopher [mailto:echristo at gmail.com] Sent: Sunday, May 01, 2016 19:54 To: Zuckerman, Michael <michael.zuckerman at intel.com>; Craig Topper <craig.topper at gmail.com> Cc: llvm-dev at lists.llvm.org Subject: Re: [llvm-dev] r267690 - [Clang][BuiltIn][AVX512]Adding intrinsics for vmovntdqa
