thr3ads.net - search: "insert_vector

Displaying 20 results from an estimated 29 matches for "insert_vector_elts".

[LLVMdev] Advices Required: Best practice to share logic between DAG combine and target lowering?

2013 Jul 01

On Jul 1, 2013, at 11:52 AM, Eli Friedman <eli.friedman at gmail.com> wrote: > On Mon, Jul 1, 2013 at 11:30 AM, Quentin Colombet <qcolombet at apple.com> wrote: > Hi, > > ** Problematic ** > I am looking for advices to share some logic between DAG combine and target lowering. > > Basically, I need to know if a bitcast that is about to be inserted during target

[LLVMdev] Advices Required: Best practice to share logic between DAG combine and target lowering?

2013 Jul 01

...neInfo is not > providing the required information). > > What did I miss? > > Err, wait, sorry, my fault; I missed that you only insert the bitcasts on the other side of the branch. You should be able to do it the other way, though: generate the build_vector unconditionally, and pull insert_vector_elts out of it in a DAGCombine. (At this point, you know whether DAGCombine will remove the bit casts because if it could, it would have already done it.) -Eli

[LLVMdev] Advices Required: Best practice to share logic between DAG combine and target lowering?

2013 Jul 01

On Mon, Jul 1, 2013 at 11:30 AM, Quentin Colombet <qcolombet at apple.com> wrote: > Hi, > > ** Problematic ** > I am looking for advices to share some logic between DAG combine and > target lowering. > > Basically, I need to know if a bitcast that is about to be inserted during > target specific isel lowering will be eliminated during DAG combine. > > Let me

[LLVMdev] Advices Required: Best practice to share logic between DAG combine and target lowering?

2013 Jul 01

Hi, ** Problematic ** I am looking for advices to share some logic between DAG combine and target lowering. Basically, I need to know if a bitcast that is about to be inserted during target specific isel lowering will be eliminated during DAG combine. Let me know if there is another, better supported, approach for this kind of problems. ** Motivating Example ** The motivating example comes

[LLVMdev] More AVX Advice Needed

2009 Dec 02

On Wednesday 02 December 2009 17:24, Eli Friedman wrote: > On Wed, Dec 2, 2009 at 3:08 PM, David Greene <dag at cray.com> wrote: > > On Wednesday 02 December 2009 16:51, Eli Friedman wrote: > >> On Wed, Dec 2, 2009 at 2:44 PM, David Greene <dag at cray.com> wrote: > >> > I'm working on some of the AVX insert/extract instructions. They're >

Instruction selection problems due to SelectionDAGBuilder

2016 Aug 02

Hello. I'm having problems at instruction selection with my back end with the following basic-block due to a vector add with immediate constant vector (obtained by vectorizing a simple C program doing vector sum map): vector.ph: ; preds = %vector.memcheck50 %.splatinsert = insertelement <8 x i64> undef, i64 %i.07.unr, i32 0

[LLVMdev] More AVX Advice Needed

2009 Dec 02

On Wed, Dec 2, 2009 at 3:08 PM, David Greene <dag at cray.com> wrote: > On Wednesday 02 December 2009 16:51, Eli Friedman wrote: >> On Wed, Dec 2, 2009 at 2:44 PM, David Greene <dag at cray.com> wrote: >> > I'm working on some of the AVX insert/extract instructions. They're >> > stupid. They do not operate on ymm registers, meaning we have to

[LLVMdev] More AVX Advice Needed

2009 Dec 02

On Wednesday 02 December 2009 16:51, Eli Friedman wrote: > On Wed, Dec 2, 2009 at 2:44 PM, David Greene <dag at cray.com> wrote: > > I'm working on some of the AVX insert/extract instructions. They're > > stupid. They do not operate on ymm registers, meaning we have to > > use VINSERTF128/VEXTRACTF128 and then do the real operation. > > > > Anyway,

[LLVMdev] Modeling GPU vector registers, again (with my implementation)

2009 Feb 16

Alex, From my experience in working with GPU vector registers; there is no support for swizzles in the manner that you would normally code them, and in my case I have 6^4 permutations on src registers and 24 combinations in the dst registers. The way that I ended up handling this was to have different register classes for 1, 2, 3 and 4 component vectors. This made the generic cases very simple

generate vectorized code

2016 Mar 18

> On Mar 18, 2016, at 1:47 PM, Rail Shafigulin <rail at esenciatech.com> wrote: > > Yes this IR does not build or shuffle any vector. Try to write a function that takes 8 ints and a pointer to a <4xi32>, builds two vectors with the 8 ints, > > This might sound like a dumb question, but how does one build a vector of ints out of regular ints in IR? See:

[LLVMdev] Modeling GPU vector registers, again (with my implementation)

2009 Feb 16

Evan Cheng-2 wrote: > > Well, how many possible permutations are there? Is it possible to > model each case as a separate physical register? > > Evan > I don't think so. There are 4x4x4x4 = 256 permutations. For example: * xyzw: default * zxyw * yyyy: splat Even if I can model each of these 256 cases as a separate physical register, how can I model the use of r0.xyzw in
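
The 4x4x4x4 = 256 count quoted above can be checked with a short Python sketch. It assumes a four-component register whose components are named x, y, z and w, and treats a swizzle as any four-letter selection with repetition allowed (only 24 of the 256 are permutations in the strict sense). The helper names `all_swizzles` and `apply_swizzle` are hypothetical and are not part of LLVM.

```python
from itertools import product

COMPONENTS = "xyzw"  # assumed component names of a 4-wide register


def all_swizzles():
    """Every 4-letter selection over x, y, z, w, repetition allowed."""
    return ["".join(p) for p in product(COMPONENTS, repeat=4)]


def apply_swizzle(reg, swizzle):
    """Return the register's values reordered according to the swizzle string."""
    return [reg[COMPONENTS.index(c)] for c in swizzle]


swizzles = all_swizzles()
r0 = [10, 20, 30, 40]           # illustrative register contents
splat_y = apply_swizzle(r0, "yyyy")
rotated = apply_swizzle(r0, "zxyw")
```

Here `len(swizzles)` is the number of distinct source swizzles a backend would have to represent if each one were modelled separately.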

[LLVMdev] X86 - Help on fixing a poor code generation bug

2013 Dec 05

Hi all, I noticed that the x86 backend tends to emit unnecessary vector insert instructions immediately after SSE scalar fp instructions like addss/mulss. For example: ///////////////////////////////// __m128 foo(__m128 A, __m128 B) { return _mm_add_ss(A, B); } ///////////////////////////////// produces the sequence: addss %xmm0, %xmm1 movss %xmm1, %xmm0 which could be easily optimized into

[LLVMdev] Subregister coalescing

2010 Jul 28

...gister & ops, *but* not vector loads. All the vector register elements are directly accessible, so VI1 reg (Vector Integer 1) has I4, I5, I6 and I7 as its (integer) subregisters. Subregisters of the same reg *never* overlap. Therefore, vector loads are lowered to scalar loads followed by a chain of INSERT_VECTOR_ELTs. Then we select those to INSERT_SUBREG; everything is fine up to that point. Status before live analysis is (non-related instrs removed): 36 %reg16388<def> = LDWr %reg16384, 0; mem:LD4[<unknown>] 68 %reg16392<def> = INSERT_SUBREG %reg16392<undef>, %reg16388<kill>, 1 76 %re...
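
The lowering described above, scalar loads followed by a chain of INSERT_VECTOR_ELT nodes, can be modelled in a few lines of Python. This is a toy model under stated assumptions, not LLVM code: `UNDEF`, `insert_vector_elt` and `lower_vector_load` are hypothetical names, memory is a plain list, and each insert returns a new vector value, much as each DAG node produces a new value.

```python
UNDEF = None  # stand-in for an undefined vector lane (assumption of this sketch)


def insert_vector_elt(vec, value, index):
    """Return a copy of vec with lane `index` replaced by `value`."""
    out = list(vec)
    out[index] = value
    return out


def lower_vector_load(memory, base, num_lanes):
    """Model a vector load as one scalar load per lane plus an insert chain."""
    vec = [UNDEF] * num_lanes
    for lane in range(num_lanes):
        scalar = memory[base + lane]                # one scalar load
        vec = insert_vector_elt(vec, scalar, lane)  # next link of the chain
    return vec


memory = [7, 11, 13, 17, 19]   # illustrative memory contents
loaded = lower_vector_load(memory, 1, 4)
```

Each call to `insert_vector_elt` depends on the previous one, which is the chain shape the snippet refers to before the inserts are selected to INSERT_SUBREG.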

[LLVMdev] X86 - Help on fixing a poor code generation bug

2013 Dec 05

Hi Andrea, Thanks for working on this. I can see two approaches to solving this problem. The first one (that you suggested) is to catch this pattern after register allocation. The second approach is to eliminate this redundancy during instruction selection. Can you please look into catching this pattern during iSel? The idea is that ADDSS does an ADD plus BLEND operations, and you can easily

[LLVMdev] Generalizing shuffle vector

2008 Sep 30

On Mon, Sep 29, 2008 at 8:11 PM, Mon Ping Wang <wangmp at apple.com> wrote: > The problem with generating insert and extracts is that we can generate poor > code > %tmp16 = extractelement <4 x float> %f4b, i32 0 > %f8a = insertelement <8 x float> %f8a, float %tmp16, i32 0 > %tmp18 = extractelement <4 x float> %f4b, i32 1 > %f8c

generate vectorized code

2016 Mar 18

On Fri, Mar 18, 2016 at 2:03 PM, Rail Shafigulin <rail at esenciatech.com> wrote: > On Fri, Mar 18, 2016 at 1:53 PM, Mehdi Amini <mehdi.amini at apple.com> > wrote: > >> >> On Mar 18, 2016, at 1:47 PM, Rail Shafigulin <rail at esenciatech.com> >> wrote: >> >> Yes this IR does not build or shuffle any vector. Try to write a function

[LLVMdev] Subregister coalescing

2010 Jul 28

On Jul 28, 2010, at 12:25 PM, Carlos Sánchez de La Lama wrote: > Which after register coalescing gets transformed into: > > 36 %reg16404:1<def> = LDWr %reg16384, 0; mem:LD4[<unknown>] > 76 %reg16394<def> = LDWr %reg16386<kill>, 0; mem:LD4[<unknown>] > 124 %reg16404<def> = INSERT_SUBREG %reg16404, %reg16394<kill>, 2 > 132

[LLVMdev] Generalizing shuffle vector

2008 Sep 30

Hi, The current definition of shuffle vector is <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <n x i32> <mask> ; yields <n x <ty>> The first two operands of a 'shufflevector' instruction are vectors with types that match each other and types that match the result of the instruction. The third
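
The definition quoted above can be read as: each mask element picks one element out of the concatenation of v1 and v2. A minimal Python sketch of that reading follows. The function name `shufflevector` mirrors the instruction but is otherwise a hypothetical helper, and using `None` for an undefined mask element is an assumption of this sketch.

```python
def shufflevector(v1, v2, mask):
    """Select elements from v1 followed by v2 according to mask indices."""
    if len(v1) != len(v2):
        raise ValueError("v1 and v2 must have the same length")
    combined = list(v1) + list(v2)
    result = []
    for m in mask:
        if m is None:                       # undefined mask element (assumed encoding)
            result.append(None)
        elif 0 <= m < len(combined):
            result.append(combined[m])
        else:
            raise ValueError("mask index out of range")
    return result


v1 = [1, 2, 3, 4]
v2 = [5, 6, 7, 8]
interleaved = shufflevector(v1, v2, [0, 4, 1, 5])
splat = shufflevector(v1, v2, [1, 1, 1, 1])
```

Indices 0 to n-1 refer to v1 and indices n to 2n-1 refer to v2, so a splat of one lane is a mask that repeats a single index.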

generate vectorized code

2016 Mar 18

> On Mar 18, 2016, at 1:37 PM, Rail Shafigulin <rail at esenciatech.com> wrote: > >> I think you created a cycle, this is easy to do with SelectionDAG :) >> Basically SelectionDAG will iterate until it does not see anything to change. So if you insert a transformation on a pattern A, that generates pattern B, while you have another transformation that matches B and
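
The cycle described above can be reproduced with a toy rewriter that keeps applying rules until nothing changes. The sketch assumes the truncated sentence describes two rules that undo each other; the rule tables, `run_to_fixpoint` and the step cap are all hypothetical and only illustrate the failure mode, not how SelectionDAG is implemented.

```python
def run_to_fixpoint(node, rules, max_steps=100):
    """Apply rewrite rules until no rule matches.

    Returns (final_node, converged). converged is False when a pattern
    repeats, which means the rules would loop forever without a guard.
    """
    seen = [node]
    for _ in range(max_steps):
        if node not in rules:
            return node, True        # nothing left to change
        node = rules[node]
        if node in seen:
            return node, False       # came back to an earlier pattern
        seen.append(node)
    return node, False


cyclic_rules = {"A": "B", "B": "A"}  # assumed pair of rules that undo each other
safe_rules = {"A": "B", "B": "C"}    # each rule makes progress toward C

cyclic_result = run_to_fixpoint("A", cyclic_rules)
safe_result = run_to_fixpoint("A", safe_rules)
```

With the cyclic table the rewriter returns to pattern A and reports that it did not converge, while the safe table settles on C.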

[LLVMdev] [PATCH] Add new phase to legalization to handle vector operations

2009 May 20

Per subject, this patch adds an additional pass to handle vector operations; the idea is that this allows removing the code from LegalizeDAG that handles illegal types, which should be a significant simplification. There are still some issues with this patch, but does the approach look sane? -Eli -------------- next part -------------- Index: lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp
