search for: insert_vector_elts

Displaying 20 results from an estimated 29 matches for "insert_vector_elts".

Did you mean: insert_vector_elt
2013 Jul 01
3
[LLVMdev] Advices Required: Best practice to share logic between DAG combine and target lowering?
On Jul 1, 2013, at 11:52 AM, Eli Friedman <eli.friedman at gmail.com> wrote: > On Mon, Jul 1, 2013 at 11:30 AM, Quentin Colombet <qcolombet at apple.com> wrote: > Hi, > > ** Problematic ** > I am looking for advices to share some logic between DAG combine and target lowering. > > Basically, I need to know if a bitcast that is about to be inserted during target
2013 Jul 01
0
[LLVMdev] Advices Required: Best practice to share logic between DAG combine and target lowering?
...neInfo is not > providing the require information). > > What did I miss? > > Err, wait, sorry, my fault; I missed that you only insert the bitcasts on the other side of the branch. You should be able to do it the other way, though: generate the build_vector unconditionally, and pull insert_vector_elts out of it in a DAGCombine. (At this point, you know whether DAGCombine will remove the bit casts because if it could, it would have already done it.) -Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130...
2013 Jul 01
0
[LLVMdev] Advices Required: Best practice to share logic between DAG combine and target lowering?
On Mon, Jul 1, 2013 at 11:30 AM, Quentin Colombet <qcolombet at apple.com>wrote: > Hi, > > ** Problematic ** > I am looking for advices to share some logic between DAG combine and > target lowering. > > Basically, I need to know if a bitcast that is about to be inserted during > target specific isel lowering will be eliminated during DAG combine. > > Let me
2013 Jul 01
3
[LLVMdev] Advices Required: Best practice to share logic between DAG combine and target lowering?
Hi, ** Problematic ** I am looking for advices to share some logic between DAG combine and target lowering. Basically, I need to know if a bitcast that is about to be inserted during target specific isel lowering will be eliminated during DAG combine. Let me know if there is another, better supported, approach for this kind of problems. ** Motivating Example ** The motivating example comes
2009 Dec 02
1
[LLVMdev] More AVX Advice Needed
On Wednesday 02 December 2009 17:24, Eli Friedman wrote: > On Wed, Dec 2, 2009 at 3:08 PM, David Greene <dag at cray.com> wrote: > > On Wednesday 02 December 2009 16:51, Eli Friedman wrote: > >> On Wed, Dec 2, 2009 at 2:44 PM, David Greene <dag at cray.com> wrote: > >> > I'm working on some of the AVX insert/extract instructions.  They're >
2016 Aug 02
2
Instruction selection problems due to SelectionDAGBuilder
Hello. I'm having problems at instruction selection with my back end with the following basic-block due to a vector add with immediate constant vector (obtained by vectorizing a simple C program doing vector sum map): vector.ph: ; preds = %vector.memcheck50 %.splatinsert = insertelement <8 x i64> undef, i64 %i.07.unr, i32 0
2009 Dec 02
0
[LLVMdev] More AVX Advice Needed
On Wed, Dec 2, 2009 at 3:08 PM, David Greene <dag at cray.com> wrote: > On Wednesday 02 December 2009 16:51, Eli Friedman wrote: >> On Wed, Dec 2, 2009 at 2:44 PM, David Greene <dag at cray.com> wrote: >> > I'm working on some of the AVX insert/extract instructions.  They're >> > stupid.  They do not operate on ymm registers, meaning we have to
2009 Dec 02
2
[LLVMdev] More AVX Advice Needed
On Wednesday 02 December 2009 16:51, Eli Friedman wrote: > On Wed, Dec 2, 2009 at 2:44 PM, David Greene <dag at cray.com> wrote: > > I'm working on some of the AVX insert/extract instructions.  They're > > stupid.  They do not operate on ymm registers, meaning we have to > > use VINSERTF128/VEXTRACTF128 and then do the real operation. > > > > Anyway,
2009 Feb 16
0
[LLVMdev] Modeling GPU vector registers, again (with my implementation)
Alex, From my experience in working with GPU vector registers; there is no support for swizzles in the manner that you would normally code them, and in my case I have 6^4 permutations on src registers and 24 combinations in the dst registers. The way that I ended up handling this was to have different register classes for 1, 2, 3 and 4 component vectors. This made the generic cases very simple
2016 Mar 18
2
generate vectorized code
> On Mar 18, 2016, at 1:47 PM, Rail Shafigulin <rail at esenciatech.com> wrote: > > Yes this IR does not build or shuffle any vector. Try to write a function that takes 8 ints and a pointer to a <4xi32>, builds two vectors with the 8 ints, > > This might sound like a dumb question, but how does one build a vector of ints out of regular ints in IR? See:
2009 Feb 16
2
[LLVMdev] Modeling GPU vector registers, again (with my implementation)
Evan Cheng-2 wrote: > > Well, how many possible permutations are there? Is it possible to > model each case as a separate physical register? > > Evan > I don't think so. There are 4x4x4x4 = 256 permutations. For example: * xyzw: default * zxyw * yyyy: splat Even if can model each of these 256 cases as a separate physical register, how can I model the use of r0.xyzw in
2013 Dec 05
3
[LLVMdev] X86 - Help on fixing a poor code generation bug
Hi all, I noticed that the x86 backend tends to emit unnecessary vector insert instructions immediately after sse scalar fp instructions like addss/mulss. For example: ///////////////////////////////// __m128 foo(__m128 A, __m128 B) { _mm_add_ss(A, B); } ///////////////////////////////// produces the sequence: addss %xmm0, %xmm1 movss %xmm1, %xmm0 which could be easily optimized into
2010 Jul 28
3
[LLVMdev] Subregister coalescing
...gister & ops, *but* not vector loads. All the vector register elements are directly accesible, so VI1 reg (Vector Integer 1) has I4, I5, I6 and I7 as its (integer) subregisters. Subregisters of same reg *never* overlap. Therefore, vector loads are lowered to scalar loads followed by a chain of INSERT_VECTOR_ELTs. Then we select those to INSERT_SUBREG, everything fine to that point. Status before live analisys is (non-related instrs removed): 36 %reg16388<def> = LDWr %reg16384, 0; mem:LD4[<unknown>] 68 %reg16392<def> = INSERT_SUBREG %reg16392<undef>, %reg16388<kill>, 1 76 %re...
2013 Dec 05
0
[LLVMdev] X86 - Help on fixing a poor code generation bug
Hi Andrea, Thanks for working on this. I can see two approaches to solving this problem. The first one (that you suggested) is to catch this pattern after register allocation. The second approach is to eliminate this redundancy during instruction selection. Can you please look into catching this pattern during iSel? The idea is that ADDSS does an ADD plus BLEND operations, and you can easily
2008 Sep 30
0
[LLVMdev] Generalizing shuffle vector
On Mon, Sep 29, 2008 at 8:11 PM, Mon Ping Wang <wangmp at apple.com> wrote: > The problem with generating insert and extracts is that we can generate poor > code > %tmp16 = extractelement <4 x float> %f4b, i32 0 > %f8a = insertelement <8 x float> %f8a, float %tmp16, i32 0 > %tmp18 = extractelement <4 x float> %f4b, i32 1 > %f8c
2016 Mar 18
4
generate vectorized code
On Fri, Mar 18, 2016 at 2:03 PM, Rail Shafigulin <rail at esenciatech.com> wrote: > On Fri, Mar 18, 2016 at 1:53 PM, Mehdi Amini <mehdi.amini at apple.com> > wrote: > >> >> On Mar 18, 2016, at 1:47 PM, Rail Shafigulin <rail at esenciatech.com> >> wrote: >> >> Yes this IR does not build or shuffle any vector. Try to write a function
2010 Jul 28
0
[LLVMdev] Subregister coalescing
On Jul 28, 2010, at 12:25 PM, Carlos Sánchez de La Lama wrote: > Which after register coalescing gets transformed into: > > 36 %reg16404:1<def> = LDWr %reg16384, 0; mem:LD4[<unknown>] > 76 %reg16394<def> = LDWr %reg16386<kill>, 0; mem:LD4[<unknown>] > 124 %reg16404<def> = INSERT_SUBREG %reg16404, %reg16394<kill>, 2 > 132
2008 Sep 30
4
[LLVMdev] Generalizing shuffle vector
Hi, The current definition of shuffle vector is <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <n x i32> <mask> ; yields <n x <ty>> The first two operands of a 'shufflevector' instruction are vectors with types that match each other and types that match the result of the instruction. The third
2016 Mar 18
2
generate vectorized code
> On Mar 18, 2016, at 1:37 PM, Rail Shafigulin <rail at esenciatech.com> wrote: > >> I think you created a cycle, this is easy to do with SelectionDAG :) >> Basically SelecitonDAG will iterate until it does not see anything to change. So if you insert a transformation on a pattern A, that generates pattern B, while you have another transformation that matches B and
2009 May 20
2
[LLVMdev] [PATCH] Add new phase to legalization to handle vector operations
Per subject, this patch adding an additional pass to handle vector operations; the idea is that this allows removing the code from LegalizeDAG that handles illegal types, which should be a significant simplification. There are still some issues with this patch, but does the approach look sane? -Eli -------------- next part -------------- Index: lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp