thr3ads.net - similar to: "[LLVMdev] Legalizing v32i1, v64i1 for Haswell pext/pdep instructions"

Displaying 20 results from an estimated 100 matches similar to: "[LLVMdev] Legalizing v32i1, v64i1 for Haswell pext/pdep instructions"

[LLVMdev] SIMD Projects with LLVM

2014 Apr 03

Hi everyone. After lurking for a while, this is my first post to the list. I am working with some graduate students on the general topic of compiler support for SIMD programming and specific projects related to LLVM and my own Parabix technology (parabix.costar.sfu.ca). Right now we have a few course projects on the go and already a question arising out of one of them (SSE2 Hoisting).

Beginner question: Calling intrinsic

2018 Jan 07

Hello, I’m not sure if this is the right place to ask beginner questions. If not, please direct me to the appropriate place. I’m writing my first LLVM program and I’m trying to call an intrinsic, but failing. So far this is what I have:
declare ccc i32 @llvm.x86.bmi.pdep.32(i32, i32)
@.str2 = private unnamed_addr constant [4 x i8] c"%d\0A\00", align 1
declare i32 @printf(i8*, ...)

Beginner question: Calling intrinsic

2018 Jan 07

Hi John, What target are you trying to compile this for? I imagine this is just a case of using an X86 intrinsic on a non-X86 back end. Or is this an intrinsic you added and didn't provide a selection pattern for? In any case, this intrinsic makes it into the selection DAG and the instruction selector tries to select a sequence of instructions for it. However, it fails to find a pattern that

Beginner question: Calling intrinsic

2018 Jan 08

If you are using x86, you probably need to pass something like -mcpu=haswell or -mattr=bmi2 to enable support for the intrinsic. It seems that -mcpu=native doesn't work for lli, so it can't just autodetect your CPU.
~Craig
On Sun, Jan 7, 2018 at 7:54 AM, Nemanja Ivanovic via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Hi John,
> What target are you trying to compile this
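
For readers unfamiliar with the operation behind `llvm.x86.bmi.pdep.32`, here is a minimal reference model in Python of what parallel bit deposit computes, together with its counterpart, parallel bit extract, from the thread title. The function names `pdep32` and `pext32` are hypothetical, chosen for illustration. This is a sketch of the semantics only, not how LLVM lowers the intrinsic or how the hardware implements it.

```python
def pdep32(src: int, mask: int) -> int:
    """Deposit the low-order bits of src into the positions where mask has 1s."""
    result = 0
    k = 0  # index of the next source bit to deposit
    for i in range(32):
        if (mask >> i) & 1:
            if (src >> k) & 1:
                result |= 1 << i
            k += 1
    return result


def pext32(src: int, mask: int) -> int:
    """Extract the bits of src at the positions where mask has 1s, packed low."""
    result = 0
    k = 0  # index of the next result bit to fill
    for i in range(32):
        if (mask >> i) & 1:
            if (src >> i) & 1:
                result |= 1 << k
            k += 1
    return result


if __name__ == "__main__":
    mask = 0b11010
    deposited = pdep32(0b101, mask)
    print(bin(deposited))                # 0b10010
    print(bin(pext32(deposited, mask)))  # 0b101
```

For a fixed mask, `pext32` undoes `pdep32` on the low bits, which makes the pair handy for checking a compiled version against this model.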

InstCombine question on combineLoadToOperationType

2016 Nov 16

Hello, Context: We have a backend where v32i1 is a Legal type, but the storage for v32i1 is not 32-bits/uses a different instruction sequence. We ran into an issue because combineLoadToOperationType changed v32i1 loads into i32 loads, so a sequence like:
define void @bits(<32 x i1>* %A, <32 x i1>* %B) {
  %a = load <32 x i1>, <32 x i1>* %A
  store <32 x i1> %a,

Rotates, once again

2018 May 15

Thanks for writing this up. I'd like to have this intrinsic too. Another argument for having the intrinsic is shown in PR37426: https://bugs.llvm.org/show_bug.cgi?id=37426 Vectorization goes overboard because the throughput cost model used by the vectorizers doesn't match the 6 IR instructions that correspond to 1 x86 rotate instruction. Instead, we have:
$ opt 37426prevectorize.ll -S

How to loop a Vobis sound ?

2006 Jan 17

The sound file plays correctly the first time, but when I rewind to the initial position and then copy the PCM data to the buffer, OpenAL reports an error. It seems that OpenAL doesn't recognize the PCM data. The OpenAL error number: AL_INVALID_VALUE 0xA003
void Buffer::PCMData (ALuint id, ALenum eFormat, ALvoid *data, ALsizei size, ALsizei freq) { // Copy

[LLVMdev] Vector select/compare support in LLVM

2011 Mar 10

"Rotem, Nadav" <nadav.rotem at intel.com> writes: > One of the arguments for packing masks is that it reduces > vector-registers pressure. Auto-vectorizing compilers maintain > multiple masks for different execution paths (for each loop nesting, > etc). Saving masks in xmm registers may result in vector-register > pressure which will cause spilling of these

[LLVMdev] Vector select/compare support in LLVM

2011 Mar 10

Hi David, The MOVMSKPS instruction is cheap (2 cycles). Not to be confused with VMASKMOV, the AVX masked move, which is expensive. One of the arguments for packing masks is that it reduces vector-register pressure. Auto-vectorizing compilers maintain multiple masks for different execution paths (for each loop nesting, etc). Saving masks in xmm registers may result in vector-register

[LLVMdev] Vector select/compare support in LLVM

2011 Mar 14

David, The problem with the sparse representation is that it is word-width dependent. For 32-bit data types, the mask is the 32nd bit, while for 64-bit types the mask is the 64th bit. How would you legalize the mask for the following code?
%mask = cmp nge <4 x float> %A, %B ; <4 x i1>
%val = select <4 x i1> %mask, <4 x double> %X, %Y ; <4 x
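
The width dependence described in this message can be sketched in Python, assuming the sparse mask bit of each lane is its most significant bit (the bit MOVMSKPS reads for 32-bit lanes). The helper `pack_mask` and its parameters are hypothetical names for illustration; this models the idea of packing a sparse mask, not any LLVM legalization code.

```python
from typing import List


def pack_mask(lanes: List[int], lane_bits: int) -> int:
    """Collect the top bit of each lane into one packed integer, lane 0 in bit 0."""
    packed = 0
    for i, lane in enumerate(lanes):
        top_bit = (lane >> (lane_bits - 1)) & 1
        packed |= top_bit << i
    return packed


if __name__ == "__main__":
    lanes = [0xFFFFFFFF, 0x00000000, 0x80000000, 0x7FFFFFFF]
    # Read as 32-bit lanes, lanes 0 and 2 have their top bit set.
    print(bin(pack_mask(lanes, 32)))  # 0b101
    # The same values read as 64-bit lanes have no top bit set.
    print(bin(pack_mask(lanes, 64)))  # 0b0
```

The second call shows the issue raised in the message: a mask produced by a 32-bit compare does not carry over unchanged to 64-bit lanes, whereas the packed form is independent of lane width.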

Question about VectorLegalizer::ExpandStore() with v4i1

2016 Jun 28

On Tue, Jun 28, 2016 at 2:45 AM, jingu kang via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Hi All,
>
> Can someone please comment on whether the question below is wrong or not?
>
> 2016-06-25 7:52 GMT+01:00 jingu kang <jaykang10 at gmail.com>:
>> Hi All,
>>
>> I have a problem with VectorLegalizer::ExpandStore() with v4i1.
>>
>> Let's

[LLVMdev] Vector select/compare support in LLVM

2011 Mar 10

After I implemented a new type of legalization (the packing of i1 vectors), I found that x86 does not have a way to load packed masks into SSE registers. So, I guess that legalizing of <4 x i1> to <4 x i32> is the way to go.
Cheers, Nadav
-----Original Message-----
From: Rotem, Nadav
Sent: Thursday, March 10, 2011 11:04
To: 'David A. Greene'
Cc: llvmdev at cs.uiuc.edu

Rotates, once again

2018 May 14

Hi everyone! I recently ran into some interesting issues with generation of rotate instructions - the details are in the bug tracker (https://bugs.llvm.org/show_bug.cgi?id=37387 and related bugs) for those interested - and it brought up the issue of rotates in the IR again. Now this is a proposal that has been made (and been rejected) several times, but I've been told that this time round we

[LLVMdev] Vector select/compare support in LLVM

2011 Mar 10

Hey, I am currently forced to create the BLENDVPS intrinsic as an external call (via Intrinsic::x86_sse41_blendvps) which has the following signature (from IntrinsicsX86.td):
def int_x86_sse41_blendvps : GCCBuiltin<"__builtin_ia32_blendvps">,
    Intrinsic<[llvm_v4f32_ty],[llvm_v4f32_ty, llvm_v4f32_ty, llvm_v4f32_ty],[IntrNoMem]>
Thus, it expects the mask (first operand if

[Bug 1092] New: nft v0.6 segfault in must_print_eq_op at expression.c:520 during 'nft monitor trace' in netdev filter

2016 Oct 20

https://bugzilla.netfilter.org/show_bug.cgi?id=1092
Bug ID: 1092
Summary: nft v0.6 segfault in must_print_eq_op at expression.c:520 during 'nft monitor trace' in netdev filter
Product: nftables
Version: unspecified
Hardware: x86_64
OS: All
Status: NEW

[LLVMdev] GSoc 2009 (Bad Subject in the previous email)

2009 Mar 27

Dear all, I am a PhD student of Computer Science at Simon Fraser University (http://www.cs.sfu.ca) interested in applying to GSoC. My PhD is focused on theoretical computer science, but since Sep. 2008 I have started working on software projects again. Currently I am working in the COSTAR lab (http://costar.sfu.ca/) on a high-performance regular expression engine based on parallel bit streams

[LLVMdev] GSoc 2009 (Bad Subject in the previous email)

2009 Mar 30

Hi Ehsan, All of the projects you have listed are quite interesting. If I were to advocate for one, it would be #2. I think the scope of work is perfect for GSoc. I'd encourage you to send out a more concrete proposal when you're ready.
Thanks, Evan
On Mar 27, 2009, at 2:35 PM, Ehsan Amiri wrote:
> Dear all
>
> I am a PhD student of Computer Science at Simon Fraser University

[LLVMdev] GSoc 2009 (Bad Subject in the previous email)

2009 Apr 01

Hi Evan, Thanks for the email. I had a look at the gcc implementation of TBAA, and I think the three main steps in implementing TBAA for LLVM will be these: April 20 ~ May 23: I will read the gcc implementation in depth and play around with the LLVM code. May 23 ~ July 6: Implementation and testing of a simple version of TBAA that does not work with all aggregate types. I think part of the coding required for

MEMORY PROBLEM WHEN GENERATING PERMUTATIONS

2015 Mar 12

Good afternoon, friends. I'm back again with a problem. I have the following array: > MuestraS [1] 1 0 0 0 1 0 1 1 1 1 1 0 I want to generate all possible permutations and then take a small random sample, and I have to do this several times, increasing the length of the array "MuestraS". The problem is that when generating the permutations with this array of 12

[LLVMdev] Vector select/compare support in LLVM

2011 Mar 09

"Rotem, Nadav" <nadav.rotem at intel.com> writes: > I can think of two ways to represent masks in x86: sparse and > packed. In the sparse method, the masks are kept in <4 x 32bit> > registers, which are mapped to xmm registers. This is the ‘native’ way > of using masks. This argues for the sparse representation, I think. > _Sparse_ After my discussion with
